How to get the text of a child node with PHP DOMDocument

Date: 2022-05-25 13:39:02

I've been writing PHP code to scrape information from a site. So far I can get the href attribute, but I can't find a way to get the text from the child node "span". Can someone help me?

HTML:

<a class="js-publication" href="publication/247931167"> 
    <span class="publication-title">An approach for textual authoring</span> 
</a>

This is how I currently get the href:

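    $dom = new DOMDocument();
    // "@" suppresses the warnings libxml emits for malformed real-world HTML.
    @$dom->loadHTMLFile($curPage);
    $anchors = $dom->getElementsByTagName('a');
    foreach ($anchors as $element) {
        $class_ = $element->getAttribute('class');
        // strpos() returns false when the needle is absent, so compare against false.
        if (false !== strpos($class_, 'js-publication')) {
            $href = $element->getAttribute('href');
            if (0 === stripos($href, 'publication/')) {
                echo $href; // link to the publication
                echo "\n";
            }
        }
    }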

1 Answer

#1

You can use DOMXPath:

$html = <<<LOL
<a class="js-publication" href="publication/247931167">
    <span class="publication-title">An approach for textual authoring</span>
</a>
LOL;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->query("//a[@class='js-publication']") as $element) {
    echo $element->getAttribute('href'), "\n";
    // textContent includes the whitespace around the <span>, so trim it.
    echo trim($element->textContent), "\n";
}
// Output:
// publication/247931167
// An approach for textual authoring
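
One caveat with the query above: `@class='js-publication'` only matches when the attribute is exactly that string, so an anchor that carries a second class is skipped. A minimal sketch of a token-wise class test follows. The extra class name `extra` and the `$results` array are made up for illustration and do not come from the question.

```php
<?php
// Hypothetical markup: the anchor carries a second class ("extra").
$html = <<<HTML
<a class="js-publication extra" href="publication/247931167">
    <span class="publication-title">An approach for textual authoring</span>
</a>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Pad the normalized class list with spaces, then look for " js-publication "
// so that the class matches as a whole token, wherever it sits in the list.
$query = "//a[contains(concat(' ', normalize-space(@class), ' '), ' js-publication ')]";

$results = [];
foreach ($xpath->query($query) as $anchor) {
    $results[] = [
        'href'  => $anchor->getAttribute('href'),
        'title' => trim($anchor->textContent), // strip the surrounding whitespace
    ];
}
print_r($results);
```

Collecting the pairs into an array instead of echoing them makes it easier to reuse the data later, for example to write it to a file.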

Or, without the loop, if you only want one element:

// item(0) is the documented way to fetch the first node of a DOMNodeList.
echo $xpath->query("//a[@class='js-publication']/span")->item(0)->textContent, "\n";
echo $xpath->query("//a[@class='js-publication']")->item(0)->getAttribute('href'), "\n";
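
Taking the first node directly fails if the query matches nothing, because there is no node to read from. One possible alternative, sketched here on the same markup, is `DOMXPath::evaluate()` with the XPath functions `string()` and `normalize-space()`, which return a plain PHP string and give an empty string when nothing matches. The class name `no-such-class` and the variable names are illustrative only.

```php
<?php
$html = <<<HTML
<a class="js-publication" href="publication/247931167">
    <span class="publication-title">An approach for textual authoring</span>
</a>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// evaluate() returns a typed result; string-valued expressions yield a PHP string.
// normalize-space() also trims and collapses whitespace in the text.
$title = $xpath->evaluate("normalize-space(//a[@class='js-publication']/span)");
$href  = $xpath->evaluate("string(//a[@class='js-publication']/@href)");

// An expression that matches no node yields an empty string rather than an error.
$missing = $xpath->evaluate("string(//a[@class='no-such-class']/@href)");

echo $href, PHP_EOL, $title, PHP_EOL;
```

Checking for `''` is then enough to detect a missing element.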

Ideone Demo
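
For comparison, the loop from the question can also reach the span without XPath, since `getElementsByTagName()` called on an element searches only that element's descendants. This is a rough sketch under the same sample markup. The `$titles` array is an illustrative name, and the substring test on the class attribute is deliberately loose.

```php
<?php
$html = <<<HTML
<a class="js-publication" href="publication/247931167">
    <span class="publication-title">An approach for textual authoring</span>
</a>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);

$titles = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    // Skip anchors whose class attribute does not mention js-publication.
    if (strpos($anchor->getAttribute('class'), 'js-publication') === false) {
        continue;
    }
    // Search only inside this anchor for its <span> children.
    foreach ($anchor->getElementsByTagName('span') as $span) {
        $titles[$anchor->getAttribute('href')] = trim($span->textContent);
    }
}
print_r($titles);
```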
