$html = file_get_contents("test.html");
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$body = $xpath->query('//body');
I want to loop through all elements inside the body tag of an HTML file and print the "style" attribute of each element. How can I do this?
3 answers
#1
10
You can use my RecursiveDOMIterator for this:
Code (compacted)
// Note: on PHP 8.1+, these untyped methods trigger deprecation notices;
// add return types or #[\ReturnTypeWillChange] to silence them.
class RecursiveDOMIterator implements RecursiveIterator
{
    protected $_position;
    protected $_nodeList;
    public function __construct(DOMNode $domNode)
    {
        $this->_position = 0;
        $this->_nodeList = $domNode->childNodes;
    }
    // Descend into the node the iterator currently points at.
    public function getChildren() { return new self($this->current()); }
    public function key()         { return $this->_position; }
    public function next()        { $this->_position++; }
    public function rewind()      { $this->_position = 0; }
    public function valid()
    {
        return $this->_position < $this->_nodeList->length;
    }
    public function hasChildren()
    {
        return $this->current()->hasChildNodes();
    }
    public function current()
    {
        return $this->_nodeList->item($this->_position);
    }
}
Usage:
$dom = new DOMDocument;
$dom->loadHTMLFile('http://*.com/questions/4431142/');
$dit = new RecursiveIteratorIterator(
    new RecursiveDOMIterator($dom),
    RecursiveIteratorIterator::SELF_FIRST
);
foreach($dit as $node) {
    if($node->nodeType === XML_ELEMENT_NODE && $node->hasAttribute('style')) {
        printf(
            'Element %s - Styles: %s%s',
            $node->nodeName,
            $node->getAttribute('style'),
            PHP_EOL
        );
    }
}
Output:
Element div - Styles: margin-top: 8px; height:24px;
Element div - Styles: margin-top: 8px; height:24px; display:none;
Element a - Styles: font-size: 200%; margin-left: 30px;
Element div - Styles: display:none
Element div - Styles: display:none
Element span - Styles: color:#FE7A15;font-size:140%
Element span - Styles: color:#FE7A15;font-size:140%
Element span - Styles: color:#FE7A15;font-size:140%
Element span - Styles: color:#E8272C;font-size:140%
Element span - Styles: color:#00AFEF;font-size:140%
Element span - Styles: color:#969696;font-size:140%
Element span - Styles: color:#46937D;font-size:140%
Element span - Styles: color:#C0D0DC;font-size:140%
Element span - Styles: color:#000;font-size:140%
Element span - Styles: color:#dd4814;font-size:140%
Element span - Styles: color:#9ce4fe;font-size:140%
Element span - Styles: color:#cf4d3f;font-size:140%
Element span - Styles: color:#f4f28d;font-size:140%
Element span - Styles: color:#0f3559;font-size:140%
Element span - Styles: color:#f2f2f2;font-size:140%
Element span - Styles: color:#037187;font-size:140%
Element span - Styles: color:#f1e7cc;font-size:140%
Element span - Styles: color:#e1cdae;font-size:140%
Element span - Styles: color:#a2d9f6;font-size:140%
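The usage above walks the whole document. If you only want elements inside <body>, as the question asks, one option is to start the iterator at the body node instead. The following is a minimal sketch under assumptions that are not part of the answer: the inline $html sample is made up for illustration, the return types assume PHP 8.0 or later (for mixed), and the $styles array is just a convenient place to collect the lines.

```php
<?php
// Same iterator as above, with return types added (assumes PHP 8.0+).
class RecursiveDOMIterator implements RecursiveIterator
{
    protected $_position = 0;
    protected $_nodeList;

    public function __construct(DOMNode $domNode)
    {
        $this->_nodeList = $domNode->childNodes;
    }
    public function getChildren(): ?RecursiveIterator { return new self($this->current()); }
    public function key(): mixed      { return $this->_position; }
    public function next(): void      { $this->_position++; }
    public function rewind(): void    { $this->_position = 0; }
    public function valid(): bool     { return $this->_position < $this->_nodeList->length; }
    public function hasChildren(): bool { return $this->current()->hasChildNodes(); }
    public function current(): mixed  { return $this->_nodeList->item($this->_position); }
}

// Hypothetical sample markup, used only to make the sketch reproducible.
$html = '<html><body><div style="margin-top: 8px">'
      . '<span style="color:#000">x</span><p>no style</p></div></body></html>';

$dom = new DOMDocument;
$dom->loadHTML($html);

// Start at <body> so that only its descendants are visited.
$body = $dom->getElementsByTagName('body')->item(0);

$styles = [];
$dit = new RecursiveIteratorIterator(
    new RecursiveDOMIterator($body),
    RecursiveIteratorIterator::SELF_FIRST
);
foreach ($dit as $node) {
    // Text nodes have no attributes, so check the node type first.
    if ($node->nodeType === XML_ELEMENT_NODE && $node->hasAttribute('style')) {
        $styles[] = sprintf(
            'Element %s - Styles: %s',
            $node->nodeName,
            $node->getAttribute('style')
        );
    }
}
echo implode(PHP_EOL, $styles), PHP_EOL;
```

With this sample, the div and the span are reported and the p is skipped because it has no style attribute.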
#2
8
Another option would be to use XPath to find only the elements that descend from <body> and have a style attribute, like:
$dom = new DOMDocument;
$dom->loadHTMLFile('https://*.com/questions/4431142/');
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('/html/body//*[@style]');
foreach($nodes as $node) {
    printf(
        'Element %s - Styles: %s%s',
        $node->nodeName,
        $node->getAttribute('style'),
        PHP_EOL
    );
}
The output is the same as in Gordon's answer; the only important line is the $nodes = … one.
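One detail worth checking: /html/body//*[@style] matches descendants of <body> but not <body> itself. If the body's own style attribute matters, the descendant-or-self axis should cover it. The following is a small sketch on made-up markup; the $html sample and the namesFor() helper are hypothetical names introduced for illustration, and libxml_use_internal_errors() is there only because real-world HTML often produces parser warnings.

```php
<?php
// Hypothetical sample markup, used only to make the sketch reproducible.
$html = '<html><body style="margin:0"><div style="display:none">'
      . '<a style="font-size: 200%" href="#">link</a></div><p>plain</p></body></html>';

// Collect libxml parse warnings internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Hypothetical helper: run a query and return the matched element names.
function namesFor(DOMXPath $xpath, string $query): array
{
    $names = [];
    foreach ($xpath->query($query) as $node) {
        $names[] = $node->nodeName;
    }
    return $names;
}

// Descendants of <body> that carry a style attribute.
$inside = namesFor($xpath, '/html/body//*[@style]');

// The same, but also considering <body> itself.
$withBody = namesFor($xpath, '/html/body/descendant-or-self::*[@style]');

echo implode(', ', $inside), PHP_EOL;   // div, a
echo implode(', ', $withBody), PHP_EOL; // body, div, a
```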
#3
0
I did it recursively like this. I'm not sure if it's the most efficient way. I tried the method on this web page and it worked fine.
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$body = $xpath->query('//body')->item(0);
recursePrintStyles($body);
function recursePrintStyles($node)
{
    if ($node->nodeType !== XML_ELEMENT_NODE)
    {
        return;
    }
    echo $node->tagName;
    echo "\t";
    echo $node->getAttribute('style');
    echo "\n";
    foreach ($node->childNodes as $childNode)
    {
        recursePrintStyles($childNode);
    }
}
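Note that the function above prints every element, including those with no style attribute (they show an empty value). If that is not wanted, a variant could check hasAttribute() first and return the lines instead of echoing them. This is only a sketch: collectStyles() and the sample $html are hypothetical names introduced here, not part of the answer.

```php
<?php
// Hypothetical variant: return "tag<TAB>style" lines for styled elements only.
function collectStyles(DOMNode $node): array
{
    // Skip text, comment, and other non-element nodes.
    if ($node->nodeType !== XML_ELEMENT_NODE) {
        return [];
    }
    $found = [];
    if ($node->hasAttribute('style')) {
        $found[] = $node->tagName . "\t" . $node->getAttribute('style');
    }
    // Recurse into the children and append their results in document order.
    foreach ($node->childNodes as $childNode) {
        $found = array_merge($found, collectStyles($childNode));
    }
    return $found;
}

// Hypothetical sample markup, used only to make the sketch reproducible.
$html = '<html><body><div style="color:red"><span>no style</span>'
      . '<em style="font-weight:bold">x</em></div></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$body = $dom->getElementsByTagName('body')->item(0);

// item(0) returns null when the document has no <body>.
$lines = $body !== null ? collectStyles($body) : [];
echo implode("\n", $lines), "\n";
```

Here the span is skipped, and only the div and em lines are printed.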