Regular expression to remove an HTML attribute (style="") from any HTML tag?

Date: 2021-05-13 19:34:45

I'm looking for a regex pattern that will look for an attribute within an HTML tag. Specifically, I'd like to find all instances of ...

style=""

... and remove it from the HTML tag that contains it. Obviously this would include anything within the double quotes as well.

I'm using Classic ASP to do this. I already have a function set up for a different regex pattern that looks for all HTML tags in a string and removes them. It works great. Now I just need another pattern specifically for removing all of the style attributes.

Any help would be greatly appreciated.

5 answers

#1


14  

I think this might do it:

/style="[a-zA-Z0-9:;\.\s\(\)\-\,]*"/gi

You could also put these in capturing groups if you wanted to replace only some parts:

/(style=")([a-zA-Z0-9:;\.\s\(\)\-\,]*)(")/gi

Working Example: http://regexr.com?2up30
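As a rough illustration of the capturing-group variant, here is a minimal Python sketch (Python rather than Classic ASP, purely for illustration; `STYLE_GROUPS`, `blank_style_value` and `sample` are made-up names). It keeps groups 1 and 3 and drops group 2, so the attribute stays but its value is emptied.

```python
import re

# Same pattern as above, with its three capturing groups.
STYLE_GROUPS = re.compile(r'(style=")([a-zA-Z0-9:;\.\s\(\)\-\,]*)(")', re.IGNORECASE)

def blank_style_value(html: str) -> str:
    """Keep style=" and the closing quote (groups 1 and 3); drop the value (group 2)."""
    return STYLE_GROUPS.sub(r'\1\3', html)

sample = '<p style="color: red; margin: 0">hi</p>'
print(blank_style_value(sample))  # <p style="">hi</p>
```

Replacing with an empty string instead of `\1\3` would remove the whole attribute.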

#2


22  

Perhaps a simpler expression is:

 style="[^\"]*"

This matches everything between the double quotes, that is, any run of characters other than a double quote.
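A minimal Python sketch of this negated-class pattern, for illustration only (the question is about Classic ASP). The leading `\s+` is an addition not in the answer: it assumes the attribute is always preceded by whitespace, which removes the leftover space and avoids touching names such as `data-style`. `STYLE_ATTR` and `strip_style` are hypothetical names.

```python
import re

# style="..." where the value is any run of characters other than a double quote.
# The leading \s+ (an assumption added here) consumes the whitespace before the attribute.
STYLE_ATTR = re.compile(r'\s+style="[^"]*"', re.IGNORECASE)

def strip_style(html: str) -> str:
    """Remove every double-quoted style attribute from the string."""
    return STYLE_ATTR.sub('', html)

sample = '<div class="a" style="color: #fff">x</div><p style="">y</p>'
print(strip_style(sample))  # <div class="a">x</div><p>y</p>
```

Because the value is matched with `[^"]*`, the match always stops at the first closing quote, and an empty `style=""` is removed as well.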

#3


0  

This works in Perl. You may need to adjust the regex slightly to match Classic ASP's regex rules, but it should work for any tag.

$file=~ s/(<\s*[a-z][a-z0-9]*.*\s)(style\s*=\s*".*?")([^<>]*>)/$1 $3/sig;

Here $file holds the contents of an HTML file.

The same thing in C# (.NET):

      using System.Text.RegularExpressions;

      string subjectString = "<html style=\"something\"> ";

      // Remove the style="..." attribute and keep the rest of the tag.
      string resultString = Regex.Replace(subjectString, @"(<\s*[a-z][a-z0-9]*.*\s)(style\s*=\s*"".*?"")([^<>]*>)", "$1 $3", RegexOptions.Singleline | RegexOptions.IgnoreCase);

Result: <html >
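The idea of only touching `style` when it sits inside a tag can also be sketched in two steps: find each tag, then strip the attribute within it. This is a different technique from the single regex above, shown in Python for illustration; `TAG`, `STYLE` and `strip_style_in_tags` are hypothetical names, and the sketch assumes no attribute value contains a literal `<` or `>`.

```python
import re

TAG = re.compile(r'<[a-z][^<>]*>', re.IGNORECASE)             # an opening tag
STYLE = re.compile(r'\s+style\s*=\s*"[^"]*"', re.IGNORECASE)  # style = "..." inside it

def strip_style_in_tags(html: str) -> str:
    """Remove style attributes inside tags; text outside tags is left alone."""
    return TAG.sub(lambda m: STYLE.sub('', m.group(0)), html)

sample = '<html style="something"> text with style="x" <b  style = "y" id="k">'
print(strip_style_in_tags(sample))  # <html> text with style="x" <b id="k">
```

Using a callback for the replacement keeps each pattern short and avoids a greedy `.*` spanning several tags.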

#4


0  

This expression works for me:

/style=".+"/ig
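One thing to check with `.+` is that it is greedy: when another quoted attribute follows on the same line, the match may run to the last double quote. The Python sketch below (illustrative only; `sample`, `greedy` and `lazy` are made-up names) compares it with a lazy `.*?`, which stops at the first closing quote.

```python
import re

sample = '<p style="color: red" class="note">text</p>'

# Greedy: .+ extends to the LAST double quote on the line.
greedy = re.sub(r'style=".+"', '', sample, flags=re.IGNORECASE)

# Lazy: .*? stops at the FIRST closing double quote.
lazy = re.sub(r'style=".*?"', '', sample, flags=re.IGNORECASE)

print(greedy)  # <p >text</p>
print(lazy)    # <p  class="note">text</p>
```

In the greedy case the `class` attribute is removed along with `style`.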

#5


0  

I tried Jason Gennaro's regular expression and modified it slightly:

/style="[a-zA-Z0-9:;&\."\s\(\)\-\,]*|\\/ig

This regular expression also handles some specific cases with &quot; inside the attribute value, for example:

 <div class="frame" style="font-family: Monaco, Consolas, &quot;Courier New&quot;, monospace; font-size: 12px; background-color: rgb(245, 245, 245);">some text</div>
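For comparison, a quick Python check (illustrative only) suggests that the negated-class pattern from answer #2 also copes with this input, since `&quot;` is an entity and contains no literal double-quote character. The `\s+` prefix is an addition that removes the preceding whitespace; `html` and `cleaned` are made-up names.

```python
import re

html = ('<div class="frame" style="font-family: Monaco, Consolas, '
        '&quot;Courier New&quot;, monospace; font-size: 12px; '
        'background-color: rgb(245, 245, 245);">some text</div>')

# Negated-class pattern from answer #2, with \s+ added to eat the preceding whitespace.
cleaned = re.sub(r'\s+style="[^"]*"', '', html, flags=re.IGNORECASE)
print(cleaned)  # <div class="frame">some text</div>
```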
