How to query XML using namespaces and XPath in Java?

Date: 2022-02-03 01:34:42

When my XML looks like this (no xmlns), I can easily query it with an XPath expression such as /workbook/sheets/sheet[1]:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook>
  <sheets>
    <sheet name="Sheet1" sheetId="1" r:id="rId1"/>
  </sheets>
</workbook>

But when it looks like this, I can't:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <sheets>
    <sheet name="Sheet1" sheetId="1" r:id="rId1"/>
  </sheets>
</workbook>

Any ideas?

6 Solutions

#1


61  

In the second example XML file the elements are bound to a namespace. Your XPath is attempting to address elements that are bound to the default "no namespace" namespace, so they don't match.

The preferred method is to register the namespace with a namespace-prefix. It makes your XPath much easier to develop, read, and maintain.

However, it is not mandatory that you register the namespace and use the namespace-prefix in your XPath.

You can formulate an XPath expression that uses a generic match for an element and a predicate filter that restricts the match for the desired local-name() and the namespace-uri(). For example:

/*[local-name()='workbook'
    and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']
  /*[local-name()='sheets'
      and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']
  /*[local-name()='sheet'
      and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main'][1]

As you can see, it produces an extremely long and verbose XPath statement that is very difficult to read (and maintain).

You could also just match on the local-name() of the element and ignore the namespace. For example:

/*[local-name()='workbook']/*[local-name()='sheets']/*[local-name()='sheet'][1]

However, you run the risk of matching the wrong elements. If your XML mixes vocabularies that use the same local-name() (which may not be an issue in this instance), your XPath could match the wrong elements and select the wrong content.
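
To make that risk concrete, here is a minimal sketch. It assumes a made-up second vocabulary (urn:example:other) that also defines a sheet element; that namespace, the sample document, and the class name LocalNameDemo are illustrative and not part of the original question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocalNameDemo {
    static final String MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Hypothetical document: two elements share the local name "sheet"
    // but live in different namespaces.
    static final String XML =
        "<workbook xmlns='" + MAIN_NS + "' xmlns:o='urn:example:other'>"
        + "<sheets><sheet name='Sheet1'/><o:sheet name='Other'/></sheets>"
        + "</workbook>";

    /** Counts the nodes selected by the given XPath expression in the sample document. */
    static int count(String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed so local-name()/namespace-uri() see the namespaces
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String parent = "/*[local-name()='workbook']/*[local-name()='sheets']";
        // Ignores the namespace, so both sheet elements match.
        System.out.println(count(parent + "/*[local-name()='sheet']")); // 2
        // Also checks the namespace, so only the spreadsheet one matches.
        System.out.println(count(parent + "/*[local-name()='sheet' and namespace-uri()='" + MAIN_NS + "']")); // 1
    }
}
```

The local-name()-only expression selects both elements, while the one that also tests namespace-uri() selects only the spreadsheet sheet.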

#2


55  

Your problem is the default namespace. Check out this article for how to deal with namespaces in your XPath: http://www.edankert.com/defaultnamespaces.html

One of the conclusions they draw is:

"So, to be able to use XPath expressions on XML content defined in a (default) namespace, we need to specify a namespace prefix mapping."

Note that this doesn't mean that you have to change your source document in any way (though you're free to put the namespace prefixes in there if you so desire). Sounds strange, right? What you will do is create a namespace prefix mapping in your Java code and use that prefix in your XPath expression. Here, we'll map the prefix spreadsheet to the document's default namespace.

XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();

// There's no default implementation of NamespaceContext in the JDK, so we define one inline.
xpath.setNamespaceContext(new NamespaceContext() {
    public String getNamespaceURI(String prefix) {
        // The NamespaceContext contract specifies IllegalArgumentException for a null prefix.
        if (prefix == null) throw new IllegalArgumentException("Null prefix");
        else if ("spreadsheet".equals(prefix)) return "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
        else if ("xml".equals(prefix)) return XMLConstants.XML_NS_URI;
        return XMLConstants.NULL_NS_URI;
    }

    // This method isn't necessary for XPath processing.
    public String getPrefix(String uri) {
        throw new UnsupportedOperationException();
    }

    // This method isn't necessary for XPath processing either.
    public Iterator<String> getPrefixes(String uri) {
        throw new UnsupportedOperationException();
    }
});

// note that all the elements in the expression are prefixed with our namespace mapping!
XPathExpression expr = xpath.compile("/spreadsheet:workbook/spreadsheet:sheets/spreadsheet:sheet[1]");

// assuming you've got your XML document in a variable named doc...
Node result = (Node) expr.evaluate(doc, XPathConstants.NODE);

And voilà: the matching element is now stored in the result variable.

Caveat: if you're parsing your XML as a DOM with the standard JAXP classes, be sure to call setNamespaceAware(true) on your DocumentBuilderFactory. Otherwise, this code won't work!
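
For completeness, here is a self-contained sketch that puts the parsing step and the XPath step together. The class name NamespaceAwareDemo, the method firstSheetName, and the inline XML string are assumptions made for illustration; the prefix spreadsheet is the one chosen above.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class NamespaceAwareDemo {
    static final String MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    static final String XML =
        "<workbook xmlns='" + MAIN_NS + "'"
        + " xmlns:r='http://schemas.openxmlformats.org/officeDocument/2006/relationships'>"
        + "<sheets><sheet name='Sheet1' sheetId='1' r:id='rId1'/></sheets></workbook>";

    /** Returns the name attribute of the first sheet, or null if the XPath matched nothing. */
    static String firstSheetName(boolean namespaceAware) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(namespaceAware); // the caveat above: this should be true
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    if (prefix == null) throw new IllegalArgumentException("Null prefix");
                    if ("spreadsheet".equals(prefix)) return MAIN_NS;
                    return XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    throw new UnsupportedOperationException();
                }
                public Iterator<String> getPrefixes(String uri) {
                    throw new UnsupportedOperationException();
                }
            });

            Element sheet = (Element) xpath.evaluate(
                "/spreadsheet:workbook/spreadsheet:sheets/spreadsheet:sheet[1]",
                doc, XPathConstants.NODE);
            return sheet == null ? null : sheet.getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstSheetName(true));  // Sheet1
        // With namespace awareness off, the same expression is not expected to match.
        System.out.println(firstSheetName(false));
    }
}
```

Passing false lets you reproduce the failure described in the caveat without changing anything else.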

#3


33  

All namespaces that you intend to select from in the source XML must be associated with a prefix in the host language. In Java/JAXP this is done by specifying the URI for each namespace prefix using an instance of javax.xml.namespace.NamespaceContext. Unfortunately, there is no implementation of NamespaceContext provided in the SDK.

Fortunately, it's very easy to write your own:

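import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

public class SimpleNamespaceContext implements NamespaceContext {

    // Maps each prefix used in XPath expressions to its namespace URI.
    private final Map<String, String> prefMap = new HashMap<String, String>();

    public SimpleNamespaceContext(final Map<String, String> prefMap) {
        this.prefMap.putAll(prefMap);
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("Null prefix");
        }
        // The NamespaceContext contract expects NULL_NS_URI, not null, for an unbound prefix.
        String uri = prefMap.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    // Not needed for XPath evaluation.
    @Override
    public String getPrefix(String uri) {
        throw new UnsupportedOperationException();
    }

    // Not needed for XPath evaluation.
    @Override
    public Iterator<String> getPrefixes(String uri) {
        throw new UnsupportedOperationException();
    }

}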

Use it like this:

XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
HashMap<String, String> prefMap = new HashMap<String, String>() {{
    put("main", "http://schemas.openxmlformats.org/spreadsheetml/2006/main");
    put("r", "http://schemas.openxmlformats.org/officeDocument/2006/relationships");
}};
SimpleNamespaceContext namespaces = new SimpleNamespaceContext(prefMap);
xpath.setNamespaceContext(namespaces);
XPathExpression expr = xpath
        .compile("/main:workbook/main:sheets/main:sheet[1]");
Object result = expr.evaluate(doc, XPathConstants.NODESET);

Note that even though the first namespace does not specify a prefix in the source document (i.e. it is the default namespace) you must associate it with a prefix anyway. Your expression should then reference nodes in that namespace using the prefix you've chosen, like this:

/main:workbook/main:sheets/main:sheet[1]

The prefix names you choose to associate with each namespace are arbitrary; they do not need to match what appears in the source XML. This mapping is just a way to tell the XPath engine that a given prefix name in an expression correlates with a specific namespace in the source document.
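
As a quick check of that point, the sketch below evaluates the same query twice with different prefix names and reads the r:id attribute of the first sheet. The class name PrefixChoiceDemo, the method relationshipId, and the prefixes x and rel are made up for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrefixChoiceDemo {
    static final String MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    static final String REL_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static final String XML =
        "<workbook xmlns='" + MAIN_NS + "' xmlns:r='" + REL_NS + "'>"
        + "<sheets><sheet name='Sheet1' sheetId='1' r:id='rId1'/></sheets></workbook>";

    /** Reads the relationship id of the first sheet using caller-chosen prefix names. */
    static String relationshipId(String mainPrefix, String relPrefix) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

            // The prefixes exist only in this mapping; the document never sees them.
            final Map<String, String> uriByPrefix = new HashMap<>();
            uriByPrefix.put(mainPrefix, MAIN_NS);
            uriByPrefix.put(relPrefix, REL_NS);

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    if (prefix == null) throw new IllegalArgumentException("Null prefix");
                    String uri = uriByPrefix.get(prefix);
                    return uri != null ? uri : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    throw new UnsupportedOperationException();
                }
                public Iterator<String> getPrefixes(String uri) {
                    throw new UnsupportedOperationException();
                }
            });

            String expression = "/" + mainPrefix + ":workbook/" + mainPrefix + ":sheets/"
                + mainPrefix + ":sheet[1]/@" + relPrefix + ":id";
            return xpath.evaluate(expression, doc); // string value of the selected attribute
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(relationshipId("main", "r"));  // rId1
        System.out.println(relationshipId("x", "rel"));   // rId1
    }
}
```

Both calls select the same attribute, because only the namespace URIs are compared against the document.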

#4


2  

If you are using Spring, it already contains org.springframework.util.xml.SimpleNamespaceContext.

        import org.springframework.util.xml.SimpleNamespaceContext;
        ...

        XPathFactory xPathfactory = XPathFactory.newInstance();
        XPath xpath = xPathfactory.newXPath();
        SimpleNamespaceContext nsc = new SimpleNamespaceContext();

        nsc.bindNamespaceUri("a", "http://some.namespace.com/nsContext");
        xpath.setNamespaceContext(nsc);

        XPathExpression xpathExpr = xpath.compile("//a:first/a:second");

        String result = (String) xpathExpr.evaluate(object, XPathConstants.STRING);

#5


0  

Make sure that you are referencing the namespace in your XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
             xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
             xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"       >

#6


0  

I've written a simple NamespaceContext implementation (here), that takes a Map<String, String> as input, where the key is a prefix, and the value is a namespace.

It follows the NamespaceContext specification, and you can see how it works in the unit tests.

Map<String, String> mappings = new HashMap<>();
mappings.put("foo", "http://foo");
mappings.put("foo2", "http://foo");
mappings.put("bar", "http://bar");

NamespaceContext context = new SimpleNamespaceContext(mappings);

context.getNamespaceURI("foo");    // "http://foo"
context.getPrefix("http://foo");   // "foo" or "foo2"
context.getPrefixes("http://foo"); // ["foo", "foo2"]

Note that it has a dependency on Google Guava.
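
If you would rather avoid the extra dependency, a map-backed context that also answers getPrefix and getPrefixes can be sketched with the JDK alone. This is not the linked implementation; the class name MapNamespaceContext and its details are assumptions, written against the NamespaceContext Javadoc.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

final class MapNamespaceContext implements NamespaceContext {
    // LinkedHashMap keeps prefixes in the iteration order of the supplied map.
    private final Map<String, String> uriByPrefix = new LinkedHashMap<>();

    MapNamespaceContext(Map<String, String> mappings) {
        uriByPrefix.putAll(mappings);
        // Bindings that the NamespaceContext contract fixes for every context.
        uriByPrefix.put(XMLConstants.XML_NS_PREFIX, XMLConstants.XML_NS_URI);
        uriByPrefix.put(XMLConstants.XMLNS_ATTRIBUTE, XMLConstants.XMLNS_ATTRIBUTE_NS_URI);
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = uriByPrefix.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        // Any bound prefix is acceptable; this returns the first one found.
        Iterator<String> prefixes = getPrefixes(namespaceURI);
        return prefixes.hasNext() ? prefixes.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("namespaceURI must not be null");
        }
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : uriByPrefix.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                prefixes.add(entry.getKey());
            }
        }
        return prefixes.iterator();
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new LinkedHashMap<>();
        mappings.put("foo", "http://foo");
        mappings.put("foo2", "http://foo");
        mappings.put("bar", "http://bar");

        NamespaceContext context = new MapNamespaceContext(mappings);
        System.out.println(context.getNamespaceURI("foo"));  // http://foo
        System.out.println(context.getPrefix("http://foo")); // foo
    }
}
```

The reverse lookup is a linear scan, which should be fine for the handful of prefixes a typical XPath query needs.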
