How can I grab the entire content inside the <body> tag with regex?
For instance,
<html><body><p><a href="#">xx</a></p>
<p><a href="#">xx</a></p></body></html>
I want to return this only,
<p><a href="#">xx</a></p>
<p><a href="#">xx</a></p>
Or are there any better ideas? Maybe DOM, but then I have to use saveHTML(), which also returns the doctype and the body tag...
HTML Purifier is a pain to use, so I decided not to use it. I thought regex could be the next best option for my disaster.
2 solutions
#1
21
preg_match("/<body[^>]*>(.*?)<\/body>/is", $html, $matches);
$matches[1] will be the contents of the body tag.
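As a quick check, the pattern above can be run against the sample from the question. This is only a sketch: $html is the question's example markup, and falling back to null when no body tag is found is my own assumption about how a miss should be handled.

```php
<?php
// Sample input taken from the question (two paragraphs split by a newline).
$html = '<html><body><p><a href="#">xx</a></p>' . "\n"
      . '<p><a href="#">xx</a></p></body></html>';

// i = case-insensitive match, s = let "." match newlines as well.
// [^>]* skips any attributes on the opening <body> tag.
if (preg_match('/<body[^>]*>(.*?)<\/body>/is', $html, $matches) === 1) {
    $body = $matches[1];   // first capture group: everything inside <body>
} else {
    $body = null;          // assumption: treat "no match" as null
}

echo $body, PHP_EOL;
```

Checking the return value of preg_match() before reading $matches[1] avoids an undefined-index notice when the input has no body tag.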
#2
1
preg_match("~<body.*?>(.*?)<\/body>~is", $html, $match);
print_r($match);
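For what it's worth, the DOM route mentioned in the question may not need to return the doctype and body tag at all. A rough sketch, assuming PHP 5.3.6 or later (where DOMDocument::saveHTML() accepts a node argument) and using the question's sample as input: serialize each child of the body element separately and join the pieces. The variable names here are just for illustration.

```php
<?php
// Sample input taken from the question.
$html = '<html><body><p><a href="#">xx</a></p>' . "\n"
      . '<p><a href="#">xx</a></p></body></html>';

// Collect libxml parse warnings quietly instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Grab the first <body> element, if the parser produced one.
$bodyNode = $doc->getElementsByTagName('body')->item(0);

$inner = '';
if ($bodyNode !== null) {
    // Serialize each child node on its own, so neither the doctype
    // nor the <body> wrapper ends up in the result.
    foreach ($bodyNode->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
}

echo $inner, PHP_EOL;
```

Unlike the regex, this goes through an HTML parser, so it should cope better with things like a literal "</body>" inside a comment or script, at the cost of the parser possibly normalizing the markup.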