How do I use PHP to get the value of the "rel" attribute from a page's HTML source? [duplicate]

This question already has an answer here:

How do you parse and process HTML/XML in PHP? 28 answers

I want the value of the rel attribute of the anchor tag associated with the search domain.

I have to use the domain "blog.zeit.de/berlinjournal" instead of "http://blog.zeit.de/berlinjournal/". Using this domain, I want to find the rel value.

@Sam Onela, the code is not working for this domain. Please help me solve this error.

My code is:

$domain = 'blog.zeit.de/berlinjournal';
$handle = fopen($domain, 'r');
$content = stream_get_contents($handle);
fclose($handle);
if (strpos($content, $domain) !== false) {
    echo 'true'; // true if $domain is found in the page source
}

See the image below for a clearer idea.

1 solution

#1

Create an instance of DOMDocument, call the loadHTML() method, then use simplexml_import_dom() to get an instance of a SimpleXMLElement, on which the xpath() method can be used to query for that anchor tag.

You may also notice warnings printed to the screen when loading the HTML. To suppress them and have libxml collect the errors internally, call libxml_use_internal_errors(true); thanks to @dewsworld for this answer.

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($content);
$xml = simplexml_import_dom($doc);
$results = $xml->xpath("//a[@href='$domain']");
if (!empty($results)) { // xpath() may return false or null instead of an array
    echo 'rel: '.$results[0]['rel'].'<br>';
}

See it demonstrated in this phpfiddle.
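
For readers who want to try the exact-match query without fetching a page, here is a minimal self-contained sketch. The sample markup and the example.com URLs are assumptions made up for illustration; they stand in for whatever $content and $domain hold in the real script.

```php
<?php
// Hypothetical sample markup standing in for the fetched page source.
$content = '<html><body>'
    . '<a href="http://example.com/" rel="nofollow">Example</a>'
    . '<a href="http://example.org/">No rel here</a>'
    . '</body></html>';
// Hypothetical search value; it must match the href exactly.
$domain = 'http://example.com/';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($content);
libxml_clear_errors(); // discard the collected warnings

$xml = simplexml_import_dom($doc);
// Exact match: the href attribute must equal $domain character for character.
$results = $xml->xpath("//a[@href='$domain']");

$rel = null;
if (!empty($results)) {
    // Attribute access yields a SimpleXMLElement, so cast it to a string.
    $rel = (string) $results[0]['rel'];
}
echo 'rel: ' . var_export($rel, true) . PHP_EOL;
```

Because the comparison is exact, a search value without the scheme (such as "example.com/") would not match this href; that case is what the contains() variant in the update is for.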

Update

Since the HTML of the original URL has changed and the requirement is now to find the rel attribute of a different anchor tag, that can be done with the contains() xpath function.

$searchDomain = 'rballutschinski.wordpress.com/';
if (strpos($content, $searchDomain) !== false) {
    $doc = new DOMDocument();
    $doc->loadHTML($content);
    $xml = simplexml_import_dom($doc);
    // Match any anchor whose href contains the search domain
    $results = $xml->xpath("//a[contains(@href,'$searchDomain')]");
    if (!empty($results)) {
        $rel = $results[0]['rel'];
    }
}

See a demonstration in this phpfiddle.
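
The contains() lookup can also be wrapped in a small function. The sketch below is one possible way to do that, not the code from the fiddle: the helper name findRelForDomain and the sample markup are hypothetical, and it queries with DOMXPath directly rather than going through simplexml_import_dom(). It also assumes the search string contains no single quote, since it is interpolated into the XPath expression.

```php
<?php
/**
 * Hypothetical helper: return the rel value of the first anchor whose href
 * contains $needle, or null when there is no such anchor or it has no rel.
 */
function findRelForDomain(string $html, string $needle): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    // Collect parser warnings internally, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($doc);
    // Assumption: $needle contains no single quote.
    $nodes = $xpath->query("//a[contains(@href, '$needle')]");
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    $anchor = $nodes->item(0);
    return $anchor->hasAttribute('rel') ? $anchor->getAttribute('rel') : null;
}

// Made-up sample markup for illustration.
$html = '<p><a href="https://rballutschinski.wordpress.com/about" rel="nofollow noopener">Blog</a> '
    . '<a href="https://example.org/">Other</a></p>';

var_dump(findRelForDomain($html, 'rballutschinski.wordpress.com/'));
var_dump(findRelForDomain($html, 'example.org'));
```

Returning null for both "no matching anchor" and "anchor without rel" keeps the caller simple; if the two cases need to be told apart, the function would have to return something richer.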
