What is wrong with this XPath query?

Time: 2020-12-22 00:09:24

I want to build a list of headings from this URL: http://www.2dplay.com/action-games.htm

My query is as follows:

    $gamelist = $xpath->query('//div[@id="wrapper"]//div[@id="body_wrap"]//div[@id="content"]//table[@id="cat_games"]//tbody//tr//td//h2//a');
    foreach ($gamelist as $e) {
        echo $e->nodeValue;
        echo "<br/>";
    }

It returns no results. If I cut the query short at table[@id="cat_games"], it returns all of the text in a single node. Any help will be greatly appreciated.

1 Solution

#1


Note that the id attribute of an element must be unique within a document, according to section C.8 of the W3C XHTML 1.0 specification. XHTML 1.0 is a reformulation of HTML 4 in XML 1.0, so the definition in HTML 4 section 7.5.2 applies here too.

Since the document you are parsing is declared as XHTML 1.0 and the table element has an id attribute, you do not need to provide the full path to the element you want. You can address the table directly instead:

//table[@id="cat_games"]/tr/td/h2/a
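
To see the difference between the two queries, here is a minimal sketch that runs both against inline markup. The HTML is a hypothetical stand-in for the real page (the game titles and links are made up), and it assumes the page source has no <tbody> element, which would explain the empty result. The helper name collectTitles is also only illustrative.

```php
<?php
// Collects the trimmed text of every node matched by $query.
function collectTitles(DOMXPath $xpath, string $query): array
{
    $titles = [];
    foreach ($xpath->query($query) as $node) {
        $titles[] = trim($node->nodeValue);
    }
    return $titles;
}

// Hypothetical sample markup; note that the table has no <tbody>.
$html = <<<HTML
<html><body>
<div id="wrapper"><div id="body_wrap"><div id="content">
<table id="cat_games">
  <tr><td><h2><a href="/game-one.htm">Game One</a></h2></td></tr>
  <tr><td><h2><a href="/game-two.htm">Game Two</a></h2></td></tr>
</table>
</div></div></div>
</body></html>
HTML;

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Query anchored on the table id, with rows as direct children of the table.
$direct = collectTitles($xpath, '//table[@id="cat_games"]/tr/td/h2/a');

// The tail of the original query, which requires a <tbody> step.
$withTbody = collectTitles($xpath, '//table[@id="cat_games"]//tbody//tr//td//h2//a');

print_r($direct);    // the two sample titles
print_r($withTbody); // empty, because the markup contains no <tbody>
```

Browsers insert <tbody> into the tree they display, so a path copied from a browser inspector can include a step that the parsed source does not have.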

The query above assumes the rows are direct children of the table. If you are concerned that the table structure may change (e.g., a <tbody> tag may be added later), you can use a more generic query:

//table[@id="cat_games"]//h2/a
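
As a rough illustration of why the generic form is more robust, the sketch below applies it to two hypothetical fragments of the same table, one without and one with <tbody>. The function name extractGameTitles and the sample markup are assumptions made for this example, not part of the original page.

```php
<?php
// Returns the link texts matched by the generic query in the given HTML.
function extractGameTitles(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $titles = [];
    // "//h2/a" matches the links at any depth below the table.
    foreach ($xpath->query('//table[@id="cat_games"]//h2/a') as $link) {
        $titles[] = trim($link->nodeValue);
    }
    return $titles;
}

// Hypothetical fragments: the same table without and with <tbody>.
$plain = '<table id="cat_games"><tr><td><h2><a href="#">Game One</a></h2></td></tr></table>';
$wrapped = '<table id="cat_games"><tbody><tr><td><h2><a href="#">Game One</a></h2></td></tr></tbody></table>';

foreach ([$plain, $wrapped] as $fragment) {
    foreach (extractGameTitles($fragment) as $title) {
        echo $title;
        echo "<br/>";
    }
}
```

Both fragments yield the same title, so the query keeps working whether or not the rows are wrapped in <tbody>.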
