How do I extract a child element from XML into a String in Java?

Time: 2022-01-26 07:10:32

If I have an XML document like

<root>   
   <element1>
        <child attr1="blah">
           <child2>blahblah</child2>
        </child>
   </element1> 
</root>

I want to get an XML string with the first child element. My output string would be

<element1>
    <child attr1="blah">
       <child2>blahblah</child2>
    </child>
</element1>

There are many approaches, and I would like to see some ideas. I've been trying to use the Java XML APIs for this, but it's not clear that there is a good way to do it.

thanks

8 solutions

#1


7  

You're right, with the standard XML API, there's not a good way - here's one example (may be bug ridden; it runs, but I wrote it a long time ago).

import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import org.w3c.dom.*;
import java.io.*;

public class Proc
{
    public static void main(String[] args) throws Exception
    {
        // Parse the input document
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new File("in.xml"));

        // Set up the transformer to write the output string
        TransformerFactory tFactory = TransformerFactory.newInstance();
        Transformer transformer = tFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter sw = new StringWriter();
        StreamResult result = new StreamResult(sw);

        // Find the first child that is an element - this could be done with XPath as well
        NodeList nl = doc.getDocumentElement().getChildNodes();
        DOMSource source = null;
        for(int x = 0; x < nl.getLength(); x++)
        {
            Node e = nl.item(x);
            if(e instanceof Element)
            {
                source = new DOMSource(e);
                break;
            }
        }

        // Guard against a root that has no element children
        if(source == null)
        {
            System.err.println("No child element found");
            return;
        }

        // Do the transformation and output
        transformer.transform(source, result);
        System.out.println(sw.toString());
    }
}

It might seem that you could get the first child just by calling doc.getDocumentElement().getFirstChild(). The problem is that any whitespace between the root and the child element creates a Text node in the tree, so you get that Text node instead of the element node. The output from this program is:

D:\home\tmp\xml>java Proc
<?xml version="1.0" encoding="UTF-8"?>
<element1>
        <child attr1="blah">
           <child2>blahblah</child2>
       </child>
   </element1>

I think you can suppress the xml version string if you don't need it, but I'm not sure on that. I would probably try to use a third party XML library if at all possible.
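The declaration can indeed be left out by setting the standard OutputKeys.OMIT_XML_DECLARATION output property on the Transformer. Below is a minimal sketch of that, combined with a check of the whitespace pitfall described above. The class name FirstChildNoDeclaration and its helper methods are made up for illustration, and the document is parsed from an inline string rather than in.xml.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; none of these names come from the original answer.
class FirstChildNoDeclaration {

    // Parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the first child that is an Element, skipping Text nodes; null if there is none.
    static Element firstElementChild(Element parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }

    // Serializes a node without the leading <?xml ...?> declaration.
    static String toXmlString(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter sw = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(sw));
        return sw.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root>\n  <element1><child attr1=\"blah\">"
                + "<child2>blahblah</child2></child></element1>\n</root>");
        Element root = doc.getDocumentElement();

        // The whitespace after <root> is a Text node, so getFirstChild() is not the element.
        System.out.println(root.getFirstChild().getNodeType() == Node.TEXT_NODE);

        // Serialize the first real element child, with no XML declaration.
        System.out.println(toXmlString(firstElementChild(root)));
    }
}
```

The helper walks siblings with getNextSibling() instead of indexing a NodeList; both work, this is just a shorter way to write the same loop.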

#2


5  

Since this is the top Google result, here is a version for those of you who just want the basics:

public static String serializeXml(Element element) throws Exception
{
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    StreamResult result = new StreamResult(buffer);

    DOMSource source = new DOMSource(element);
    TransformerFactory.newInstance().newTransformer().transform(source, result);

    // The transformer writes UTF-8 unless another encoding is requested,
    // so decode with the same charset instead of the platform default
    return buffer.toString("UTF-8");
}

I use this for debugging, which is most likely what you need it for.

#3


3  

I would recommend JDOM. It's a Java XML library that makes dealing with XML much easier than the standard W3C approach.

#4


1  

XMLBeans is an easy to use (once you get the hang of it) tool to deal with XML without having to deal with the annoyances of parsing.

It requires that you have a schema for the XML file, but it also provides a tool to generate a schema from an existing XML file (depending on your needs, the generated one is probably fine).

#5


0  

If your XML has a schema backing it, you could use XMLBeans or JAXB to generate POJOs that help you marshal/unmarshal the XML.

http://xmlbeans.apache.org/ https://jaxb.dev.java.net/

#6


0  

Since the question is really about finding the first occurrence of one string inside another, I would use String class methods instead of an XML parser:

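public static String getElementAsString(String xml, String tagName){
    int beginIndex = xml.indexOf("<" + tagName);
    int closeIndex = xml.indexOf("</" + tagName, beginIndex);
    if (beginIndex == -1 || closeIndex == -1) {
        return null; // opening or closing tag not found
    }
    // "</" + tagName + ">" is tagName.length() + 3 characters long
    int endIndex = closeIndex + tagName.length() + 3;
    return xml.substring(beginIndex, endIndex);
}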
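To make the behaviour of this indexOf approach concrete, here is a small sketch that runs the same index arithmetic on a flattened version of the question's document and on a second, made-up input with nested elements of the same name. The class name StringExtractDemo and both sample strings are assumptions for illustration only.

```java
// Hypothetical demo class; the method body repeats the indexOf logic from the answer
// so that this sketch is self-contained.
class StringExtractDemo {

    static String getElementAsString(String xml, String tagName) {
        int beginIndex = xml.indexOf("<" + tagName);
        // "</" + tagName + ">" is tagName.length() + 3 characters long
        int endIndex = xml.indexOf("</" + tagName, beginIndex) + tagName.length() + 3;
        return xml.substring(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        // The question's document on one line: the whole <element1> block is returned.
        String xml = "<root><element1><child attr1=\"blah\">"
                + "<child2>blahblah</child2></child></element1></root>";
        System.out.println(getElementAsString(xml, "element1"));

        // Nested elements with the same name: the search stops at the first closing tag.
        String nested = "<a><b><b>x</b></b></a>";
        System.out.println(getElementAsString(nested, "b")); // prints <b><b>x</b> (outer </b> is cut off)
    }
}
```

So the string approach is fine for quick jobs on input you control, but it has no notion of nesting, and a tag name that is a prefix of another (for example element and element1) would also match.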

#7


0  

public String getXML(String xmlContent, String tagName){
    String startTag = "<" + tagName + ">";
    String endTag = "</" + tagName + ">";
    int startPosition = xmlContent.indexOf(startTag);
    if (startPosition == -1){
        return null; // start tag not found
    }
    int endPosition = xmlContent.indexOf(endTag, startPosition);
    if (endPosition == -1){
        return null; // end tag not found
    }
    // Skip past the start tag so only the element's inner content is returned
    startPosition += startTag.length();
    return xmlContent.substring(startPosition, endPosition);
}

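Pass your XML as a string to this method and, in your case, pass 'element1' as the tagName parameter. Note that it returns the content between the tags, not the tags themselves, and it only matches a start tag that has no attributes.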

#8


0  

You can use the following function to extract an XML block as a string by passing a suitable XPath expression:

import java.io.StringWriter;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.*;
import org.w3c.dom.*;

private static String nodeToString(Node node) throws TransformerException
{
    StringWriter buf = new StringWriter();
    Transformer xform = TransformerFactory.newInstance().newTransformer();
    // Leave out the <?xml ...?> declaration
    xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    xform.transform(new DOMSource(node), new StreamResult(buf));
    return buf.toString();
}

public static void main(String[] args) throws Exception
{
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    // inputFile is not defined in this snippet; it must be a File or InputStream for your XML
    Document doc = dBuilder.parse(inputFile);

    XPath xPath = XPathFactory.newInstance().newXPath();
    // Other example expressions: "A/B[id = '1']", "//*[@type='t1']"
    Node result = (Node) xPath.evaluate("A/B/C", doc, XPathConstants.NODE);

    System.out.println(nodeToString(result));
}
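Applied to the document from the question, the same idea might look like the sketch below. The expression /root/*[1] selects the first element child of root, which sidesteps the whitespace Text node problem. The class name XPathExtractDemo, the extract helper, and the inline sample string are assumptions made for illustration, not part of the answer above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class wrapping the XPath approach in one helper.
class XPathExtractDemo {

    // Returns the first node matching the expression as an XML string, or null if nothing matches.
    static String extract(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        Node node = (Node) xPath.evaluate(expression, doc, XPathConstants.NODE);
        if (node == null) {
            return null; // the expression matched nothing
        }

        Transformer xform = TransformerFactory.newInstance().newTransformer();
        xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter buf = new StringWriter();
        xform.transform(new DOMSource(node), new StreamResult(buf));
        return buf.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root>\n  <element1><child attr1=\"blah\">"
                + "<child2>blahblah</child2></child></element1>\n</root>";
        // "*[1]" is the first element child, regardless of surrounding whitespace.
        System.out.println(extract(xml, "/root/*[1]"));
    }
}
```

Checking for null before serializing is a design choice of this sketch: XPathConstants.NODE yields null when there is no match, and returning null lets the caller decide how to handle that.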
