PHP XPath search returns 0 results

Time: 2022-09-10 19:18:00

Below I have a PHP script that needs to search through an XML file and find the ID for <AnotherChild>. At the moment it returns 0 results and I can't figure out why. If anyone can see why it's returning 0 results, I'd really appreciate an explanation.

XML:

<TransXChange xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.transxchange.org.uk/" xsi:schemaLocation="http://www.transxchange.org.uk/ http://www.transxchange.org.uk/schema/2.1/TransXChange_general.xsd" CreationDateTime="2013-07-12T18:12:21.8122032+01:00" ModificationDateTime="2013-07-12T18:12:21.8122032+01:00" Modification="new" RevisionNumber="3" FileName="swe_44-611A-1-y10.xml" SchemaVersion="2.1">
    <Node1>...</Node1>
    <Node2>...</Node2>
    <Node3>...</Node3>
    <Node4>...</Node4>
    <Node5>...</Node5>
    <Node6>...</Node6>
    <Node7>
        <Child>
            <id>ABCDEFG123</id>
        </Child>
        <AnotherChild>
            <id>ABCDEFG124</id>
        </AnotherChild>
    </Node7>
    <Node8>...</Node8>
</TransXChange>

PHP:

<?php

  $xmldoc = new DOMDocument();
  $xmldoc->load("directory1/directory2/file.xml");

  $xpathvar = new DOMXPath($xmldoc);
  $xpathvar->registerNamespace('transXchange', 'http://www.transxchange.org.uk/');

  $queryResult = $xpathvar->query('//AnotherChild/id');
  foreach($queryResult as $result) {
    echo $result->textContent;
  }
?>

Thanks

2 solutions

#1


9  

The two questions linked in the comments do answer this question, but in my opinion they don't make it clear enough why, so I'm adding this answer to follow up on my explanation in chat.


Consider the following XML document:

<root>
  <child>
    <grandchild>foo</grandchild>
  </child>
</root>

This has no xmlns attributes at all, which means you can query //grandchild and get the result you expect. No node belongs to a namespace, so everything can be addressed without registering a namespace in XPath.

Now consider this:

<root xmlns="http://www.bar.com/">
  <child>
    <grandchild>foo</grandchild>
  </child>
</root>

This declares a default namespace of http://www.bar.com/, which every unprefixed element in the document inherits, and as a result you must use that namespace to address a member node.
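
To make the difference concrete, here is a minimal sketch that runs the same unprefixed query against both documents. The countMatches() helper is hypothetical and exists only for this illustration; the XML strings are the two examples above, collapsed onto one line each.

```php
<?php
// The same structure twice; only the second declares a default namespace.
$plain = '<root><child><grandchild>foo</grandchild></child></root>';
$namespaced = '<root xmlns="http://www.bar.com/"><child><grandchild>foo</grandchild></child></root>';

// Hypothetical helper (not part of the original post): counts the nodes
// matched by an XPath query with no namespaces registered.
function countMatches(string $xml, string $query): int
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);
    return $xpath->query($query)->length;
}

$plainCount = countMatches($plain, '//grandchild');
$namespacedCount = countMatches($namespaced, '//grandchild');

echo $plainCount, "\n";      // 1: the element is in no namespace
echo $namespacedCount, "\n"; // 0: the element is in http://www.bar.com/
```

The second count is the same "0 results" symptom described in the question: an unprefixed name in XPath only matches elements that are in no namespace.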

As you have already figured out, the way to do this is to use DOMXPath::registerNamespace(), but the crucial point that you missed is that (in PHP's XPath implementation) every namespace must be registered with a prefix, and you must use that prefix to address nodes that belong to it. It is not possible to register a namespace in XPath with an empty prefix.

So, given the second example above, let's look at how we would execute the original //grandchild query:

<?php

    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('bar', 'http://www.bar.com/');

    $nodes = $xpath->query('//bar:grandchild');
    foreach($nodes as $node) {
        // do stuff with $node
    }

Note how we registered the namespace using its URI, and we specified a prefix. Even though the original XML did not contain this prefix, we use the prefix in the query.

To understand why, let's look at another piece of XML:

<baz:root xmlns:baz="http://www.bar.com/">
  <baz:child>
    <baz:grandchild>foo</baz:grandchild>
  </baz:child>
</baz:root>

This document is semantically identical to the second, so the code sample would work equally well with either. The prefix is separate from the namespace. Note that even though this uses a baz: prefix in the document, the XPath uses the bar: prefix. This is because the thing that identifies the namespace is the URI, not the prefix.
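
As a rough check of that claim, the sketch below runs the same bar:-prefixed query against both the default-namespace document and the baz:-prefixed one. The variable names ($defaultNs, $prefixedNs, $found) are made up for this illustration.

```php
<?php
// Two spellings of the same document: a default namespace versus a baz: prefix.
$defaultNs = '<root xmlns="http://www.bar.com/"><child><grandchild>foo</grandchild></child></root>';
$prefixedNs = '<baz:root xmlns:baz="http://www.bar.com/"><baz:child><baz:grandchild>foo</baz:grandchild></baz:child></baz:root>';

$found = [];
foreach ([$defaultNs, $prefixedNs] as $xml) {
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $xpath = new DOMXPath($doc);
    // The prefix 'bar' is our own choice; only the URI has to match the document.
    $xpath->registerNamespace('bar', 'http://www.bar.com/');

    foreach ($xpath->query('//bar:grandchild') as $node) {
        $found[] = $node->textContent;
    }
}

print_r($found); // one 'foo' from each document
```

Both documents yield a match, even though neither of them uses the bar: prefix itself.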

So when a document uses a namespace, we must work with the namespace, not against it, by registering the namespace in XPath and using the prefix we registered it against to refer to any nodes that belong to that namespace.

For completeness, when we apply these principles to your original document, the query that you would use with the code in the question is:

//transXchange:AnotherChild/transXchange:id
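
Put together with the code from the question, that might look like the sketch below. It assumes a trimmed copy of the question's XML (only Node7 and the default namespace declaration are kept) loaded from a string with loadXML() instead of from the file path, so that it runs on its own; the $ids array is added just to collect the output.

```php
<?php
// Trimmed copy of the question's document: only <Node7> is kept.
$xml = <<<'XML'
<TransXChange xmlns="http://www.transxchange.org.uk/">
    <Node7>
        <Child>
            <id>ABCDEFG123</id>
        </Child>
        <AnotherChild>
            <id>ABCDEFG124</id>
        </AnotherChild>
    </Node7>
</TransXChange>
XML;

$xmldoc = new DOMDocument();
$xmldoc->loadXML($xml);

$xpathvar = new DOMXPath($xmldoc);
$xpathvar->registerNamespace('transXchange', 'http://www.transxchange.org.uk/');

// Every element step in the path carries the registered prefix.
$ids = [];
foreach ($xpathvar->query('//transXchange:AnotherChild/transXchange:id') as $result) {
    $ids[] = $result->textContent;
}

echo implode("\n", $ids), "\n"; // ABCDEFG124
```

Note that both AnotherChild and id need the prefix, because both elements inherit the default namespace from the root.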

#2


2  

To fix this problem I first registered the namespace:

$xpathvar->registerNamespace('transXchange', 'http://www.transxchange.org.uk/');

And then modified the query like so:

$queryResult = $xpathvar->query('//transXchange:AnotherChild/transXchange:id');

This returned the ID successfully.
