How do I get the entire content inside the <body> tag with a regular expression?

Time: 2022-11-11 04:16:03

How can I grab the entire content inside <body> tag with regex?

For instance,

<html><body><p><a href="#">xx</a></p>

<p><a href="#">xx</a></p></body></html> 

I want to return this only,

<p><a href="#">xx</a></p>

<p><a href="#">xx</a></p>

Or are there any better ideas? Maybe DOM, but then I have to use saveHTML(), which returns the doctype and body tag as well...

HTML Purifier is a pain to use, so I decided not to use it. I thought regex could be the next best option for my disaster.

2 solutions

#1


21  

preg_match("/<body[^>]*>(.*?)<\/body>/is", $html, $matches);

$matches[1] will be the contents of the body tag
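As a quick check, the pattern above can be run against the sample markup from the question. This is only a minimal sketch: the variable `$body` and the null fallback are illustrative additions, not part of the original answer. Note that a regex like this can be fooled by a literal `</body>` inside a comment or script, so it is best suited to markup you control.

```php
<?php
// Sample input copied from the question (two paragraphs separated by a blank line).
$html = "<html><body><p><a href=\"#\">xx</a></p>\n\n<p><a href=\"#\">xx</a></p></body></html>";

// i: match <BODY> as well as <body>; s: let "." span newlines inside the body.
if (preg_match('/<body[^>]*>(.*?)<\/body>/is', $html, $matches)) {
    $body = $matches[1];   // text captured between the opening and closing tags
} else {
    $body = null;          // hypothetical fallback when no <body> element is present
}

echo $body;
```

Checking the return value of `preg_match` before reading `$matches[1]` avoids an undefined-index notice when the input has no body tag.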

#2


1  

// $match[0] is the full <body>...</body> match; $match[1] is the content between the tags.
preg_match("~<body.*?>(.*?)<\/body>~is", $html, $match);
print_r($match);
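For comparison, the DOM route mentioned in the question can also return just the inner markup, because `DOMDocument::saveHTML()` accepts an optional node argument and then serializes only that node. The following is a rough sketch under that assumption (PHP 5.3.6 or later); the variable names `$dom`, `$body`, and `$inner` are illustrative, and the sample input is a simplified version of the one in the question.

```php
<?php
// Simplified sample input based on the question.
$html = '<html><body><p><a href="#">xx</a></p><p><a href="#">xx</a></p></body></html>';

// Collect libxml warnings internally instead of printing them for imperfect markup.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// getElementsByTagName returns a node list; item(0) is null if there is no <body>.
$body = $dom->getElementsByTagName('body')->item(0);

$inner = '';
if ($body !== null) {
    foreach ($body->childNodes as $child) {
        // Serializing each child separately leaves out the doctype and the <body> wrapper.
        $inner .= $dom->saveHTML($child);
    }
}

echo $inner;
```

Compared with the regex answers, this relies on the parser to find the body element, so it should be less sensitive to attribute quirks or a stray `</body>` string inside the content.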
