C# - Regex to remove HTML attributes and tags except allowed ones
I need to validate input text containing HTML tags against specific rules.
    string result = string.Empty;
    string acceptabletags = "h1|h2|h3|h4|h5|h6|br|img|video|cut|a";
    string acceptableatributes = "alt|href|height|width|align|valign|src|class|id|name|title";

    // Pass 1: remove every tag that is not in the allowed list.
    string stringpattern = @"</?(?(?=" + acceptabletags + @")notag|[a-zA-Z0-9]+)(?:\s[a-zA-Z0-9\-]+=?(?:(["",']?).*?\1?)?)*\s*/?>";
    result = Regex.Replace(msg, stringpattern, "");

    // Pass 2: remove every attribute that is not in the allowed list.
    stringpattern = @"\s(?!(" + acceptableatributes + @"))\w+(\s*=\s*[""|']?[/.,#?\w\s:;-]+[""|']?)";
    result = Regex.Replace(result, stringpattern, "");
    return result;
This is working code. For example, it removes the onload attribute here:
<img src="pic.jpg" onload=" alert(123)">
but not here, where there is no space before the attribute:
<img src="pic.jpg"onload="alert(123)">
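To make the two cases easy to reproduce, here is a minimal console sketch. The class and method names (HtmlFilter, Sanitize) are placeholders I made up for the repro; the patterns are the same ones as in the code above. As far as I can tell, the attribute pattern starts with \s, so it can only match an attribute that has whitespace in front of it.

```csharp
using System;
using System.Text.RegularExpressions;

// HtmlFilter and Sanitize are hypothetical names used only for this repro.
public static class HtmlFilter
{
    public static string Sanitize(string msg)
    {
        string acceptabletags = "h1|h2|h3|h4|h5|h6|br|img|video|cut|a";
        string acceptableatributes = "alt|href|height|width|align|valign|src|class|id|name|title";

        // Pass 1: strip every tag whose name does not start with an allowed tag name.
        string stringpattern = @"</?(?(?=" + acceptabletags + @")notag|[a-zA-Z0-9]+)(?:\s[a-zA-Z0-9\-]+=?(?:(["",']?).*?\1?)?)*\s*/?>";
        string result = Regex.Replace(msg, stringpattern, "");

        // Pass 2: strip attributes that are not on the allowed list.
        // The leading \s means an attribute is only matched when whitespace precedes it.
        stringpattern = @"\s(?!(" + acceptableatributes + @"))\w+(\s*=\s*[""|']?[/.,#?\w\s:;-]+[""|']?)";
        return Regex.Replace(result, stringpattern, "");
    }
}

public static class Program
{
    public static void Main()
    {
        // Space before onload: the attribute name is matched and stripped.
        Console.WriteLine(HtmlFilter.Sanitize("<img src=\"pic.jpg\" onload=\" alert(123)\">"));

        // No space before onload: pass 2 finds nothing to match, so the input comes back unchanged.
        Console.WriteLine(HtmlFilter.Sanitize("<img src=\"pic.jpg\"onload=\"alert(123)\">"));
    }
}
```

Running this prints the filtered version of each tag, so the difference between the two inputs can be compared side by side.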
P.S. It would be better to have a single regex for this, but I do not know regex well.