I've been using a script for some time and it worked fine until I ran into this problem.
The script is supposed to delete all p tags from an HTML source. It only partly works: it removes some of the p tags but leaves others in place.
I don't understand why it does that.
$doc = new DOMDocument();
$a = <<<FAIL
<html><body>
<div style="clear:both"></div>
<p class="articletitle">hoo</p>
<p class="articletext">hmmm</p>
<p class="articletext">hmmmm</p>
<p align="center"></p>
</body></html>
FAIL;
$doc->loadHTML($a);
$list = $doc->getElementsByTagName("p");
$c = 0; // counts the removed nodes
foreach ($list as $l) {
    $l->parentNode->removeChild($l);
    $c++;
}
echo $doc->saveHTML() . $c;
The script returns:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body>
<div style="clear:both"></div>
<p class="articletext">hmmm</p>
<p align="center"></p>
So two of the p tags are left in the output.
Can you please help me find out why it's skipping some tags?
1 Answer
#1
10
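Try it this way. The DOMNodeList returned by getElementsByTagName is live, so each removal shrinks the list while foreach keeps advancing its position, and every other p is skipped. Removing item(0) until the list is empty avoids that: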
$doc->loadHTML($a);
$list = $doc->getElementsByTagName("p");
while ($list->length > 0) {
    $p = $list->item(0);
    $p->parentNode->removeChild($p);
}
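An alternative to the while loop, offered only as a sketch and not as part of the original answer, is to copy the node list into a plain array before removing anything. The array is a fixed snapshot, so foreach visits every p even as the document changes. The helper name removeElementsByTagName is hypothetical; the DOM calls and iterator_to_array are standard PHP.

```php
<?php
// Hypothetical helper (not from the original answer): removes every element
// with the given tag name and returns how many nodes were removed.
function removeElementsByTagName(DOMDocument $doc, string $tagName): int
{
    // iterator_to_array() copies the live DOMNodeList into a plain array,
    // so later removals cannot shift the items being iterated.
    $nodes = iterator_to_array($doc->getElementsByTagName($tagName));
    $removed = 0;
    foreach ($nodes as $node) {
        // Guard against nodes that have no parent to detach from.
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }
    return $removed;
}

// Same markup as in the question.
$html = <<<HTML
<html><body>
<div style="clear:both"></div>
<p class="articletitle">hoo</p>
<p class="articletext">hmmm</p>
<p class="articletext">hmmmm</p>
<p align="center"></p>
</body></html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$count = removeElementsByTagName($doc, 'p');
echo $doc->saveHTML() . $count;
```

Iterating the live list backwards with a for loop, from length - 1 down to 0, would work as well. The snapshot version is shown because it keeps the foreach shape of the original script.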