C# - Regex to remove HTML attributes and tags except allowed ones


I need to validate input text that contains HTML tags against specific rules.

        string result = string.Empty;
        string acceptableTags = "h1|h2|h3|h4|h5|h6|br|img|video|cut|a";
        string acceptableAtributes = "alt|href|height|width|align|valign|src|class|id|name|title";
        // Pass 1: remove every tag whose name is not in acceptableTags.
        string stringPattern = @"</?(?(?=" + acceptableTags + @")notag|[a-zA-Z0-9]+)(?:\s[a-zA-Z0-9\-]+=?(?:(["",']?).*?\1?)?)*\s*/?>";
        result = Regex.Replace(msg, stringPattern, "");
        // Pass 2: remove every attribute whose name is not in acceptableAtributes.
        stringPattern = @"\s(?!(" + acceptableAtributes + @"))\w+(\s*=\s*[""|']?[/.,#?\w\s:;-]+[""|']?)";
        result = Regex.Replace(result, stringPattern, "");
        return result;

This is my working code. For example, it removes the onload attribute here:

<img src="pic.jpg" onload=" alert(123)"> 

but not here, where there is no whitespace before onload:

<img src="pic.jpg"onload="alert(123)"> 

P.S. It would be better to have a single regex for this, but I do not know regex well.

This library works for this purpose.
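If a library is not an option, the no-whitespace case from the question can be handled by not relying on `\s` in front of each attribute. The following is only a minimal sketch, not a full sanitizer: it matches each tag once, then walks the attributes inside the tag and rebuilds it from the allowed ones. The class and method names (`HtmlAllowlist`, `Sanitize`, `RewriteTag`) are made up for illustration; the allowlists are the ones from the question. It assumes reasonably well-formed markup (a tag with an unterminated quote is left untouched) and it does not inspect attribute values, so something like `href="javascript:..."` still passes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class HtmlAllowlist
{
    // Same allowlists as in the question.
    private static readonly HashSet<string> AllowedTags = new HashSet<string>(
        "h1|h2|h3|h4|h5|h6|br|img|video|cut|a".Split('|'),
        StringComparer.OrdinalIgnoreCase);

    private static readonly HashSet<string> AllowedAttributes = new HashSet<string>(
        "alt|href|height|width|align|valign|src|class|id|name|title".Split('|'),
        StringComparer.OrdinalIgnoreCase);

    // Group 1: optional "/", group 2: tag name, group 3: raw attribute text.
    // Quoted values are consumed whole, so a ">" inside quotes does not end the tag.
    private static readonly Regex TagPattern = new Regex(
        @"<(/?)([a-zA-Z][-a-zA-Z0-9]*)((?:""[^""]*""|'[^']*'|[^>""'])*)>");

    // One attribute: a name, optionally followed by "=" and a quoted or bare value.
    // No leading whitespace is required, so src="a"onload="b" yields two matches.
    private static readonly Regex AttributePattern = new Regex(
        @"([a-zA-Z_:][-a-zA-Z0-9_:.]*)(?:\s*=\s*(?:""[^""]*""|'[^']*'|[^\s""'>]+))?");

    public static string Sanitize(string html)
    {
        return TagPattern.Replace(html, RewriteTag);
    }

    private static string RewriteTag(Match tag)
    {
        string name = tag.Groups[2].Value;

        // Disallowed tags are dropped; their inner text is left in place.
        if (!AllowedTags.Contains(name)) return string.Empty;

        // Closing tags carry no attributes.
        if (tag.Groups[1].Value == "/") return "</" + name + ">";

        string rawAttributes = tag.Groups[3].Value;
        var kept = new List<string>();
        int end = 0;
        foreach (Match attribute in AttributePattern.Matches(rawAttributes))
        {
            end = attribute.Index + attribute.Length;
            if (AllowedAttributes.Contains(attribute.Groups[1].Value))
                kept.Add(attribute.Value);
        }

        // A "/" left over after the last attribute marks a self-closing tag.
        bool selfClosing = rawAttributes.IndexOf('/', end) >= 0;

        return "<" + name
            + (kept.Count > 0 ? " " + string.Join(" ", kept) : "")
            + (selfClosing ? " /" : "")
            + ">";
    }
}
```

With this sketch, `HtmlAllowlist.Sanitize` returns `<img src="pic.jpg">` for both inputs from the question, with and without whitespace before onload. Rebuilding the tag from the kept attributes, rather than deleting the unwanted ones in place, avoids gluing neighbouring attributes together when one in the middle is removed.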

