Processing an XML file in Java: confusion about nodes

Time: 2023-02-10 23:47:26

I am trying to parse an XML file in Java and it works just fine, but I do not really understand why. I have the following code (I have only kept the important parts):

// Imports used by this snippet
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

Document document = builder.parse(new File(fileName));

// Direct children of the root <Dictionary> element
NodeList nodeList = document.getDocumentElement().getChildNodes();

for (int i = 0; i < nodeList.getLength(); i++) {
    Node node = nodeList.item(i);

    if (node.getNodeType() == Node.ELEMENT_NODE) {
        Element elem = (Element) node;

        // Get the value of all sub-elements.
        String original = elem.getElementsByTagName("Original")
                .item(0).getChildNodes().item(0).getNodeValue();

        String translation = elem.getElementsByTagName("Translation")
                .item(0).getChildNodes().item(0).getNodeValue();

        int score = Integer.parseInt(elem.getElementsByTagName("Score")
                .item(0).getChildNodes().item(0).getNodeValue());
    }
}

My XML is a simple one:

<?xml version="1.0" encoding="UTF-8"?>
    <Dictionary>
         <Word>
              <Original>die Unterwäsche</Original>
              <Translation>Bielizna</Translation>
              <Score>-4</Score>
         </Word>
         <Word>
              <Original>die Müche</Original>
              <Translation>Fatyga, trud</Translation>
              <Score>0</Score>
         </Word>
         <Word>
              <Original>wetten</Original>
              <Translation>założyć się</Translation>
              <Score>-6</Score>
         </Word>
         <Word>
              <Original>umsonst</Original>
              <Translation>Bez powodu</Translation>
              <Score>0</Score>
         </Word>
    </Dictionary>

The big question is: why do I get 9 nodes when calling nodeList.getLength()? I printed them, and 4 are elements (which seems fine), while the other 5 are text nodes, but I do not really understand what they are. And why is the Node cast to Element?

The second thing is this part:

elem.getElementsByTagName("Score")
         .item(0).getChildNodes().item(0).getNodeValue()

I am calling item(0) on the node that was found, but again, what is it in practice?

I would really appreciate your help. I am quite a beginner and I have been struggling with this for a while now. A step-by-step guide explaining what is what, with reference to the parts of my XML, would mean the world to me.

1 solution

#1


Why do I get 9 nodes when calling nodeList.getLength()?

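The 9 nodes are the direct children of <Dictionary> (the root element itself is not in the list):

4 <Word> elements
5 whitespace-only text nodes (the line breaks and indentation before, between, and after the <Word> elements)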

5 others are text nodes, but I do not really get what they are

They are the whitespace (line breaks and indentation) between the tags, which the parser keeps as text nodes:

<?xml version="1.0" encoding="UTF-8"?>
<Dictionary>                         <-- whitespace text node after this tag
    <Word>
        <Original>...
        <Translation>...
        <Score>...
    </Word>                          <-- whitespace text node after this tag
    <Word>
        <Original>...
        <Translation>...
        <Score>...
    </Word>                          <-- whitespace text node after this tag
    <Word>
        <Original>...
        <Translation>...
        <Score>...
    </Word>                          <-- whitespace text node after this tag
    <Word>
        <Original>...
        <Translation>...
        <Score>...
    </Word>                          <-- whitespace text node after this tag
</Dictionary>
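
To see those text nodes for yourself, here is a minimal sketch that counts the direct children of the root by type. The class name ChildNodeDump, the helper countChildren, and the shortened two-word XML string are made up for illustration; they are not from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ChildNodeDump {
    // Hypothetical, shortened copy of the dictionary: two <Word> entries instead of four.
    static final String XML =
          "<Dictionary>\n"
        + "    <Word><Original>wetten</Original><Score>-6</Score></Word>\n"
        + "    <Word><Original>umsonst</Original><Score>0</Score></Word>\n"
        + "</Dictionary>";

    // Returns {element count, whitespace-only text node count} for the root's direct children.
    static int[] countChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            int elements = 0;
            int blanks = 0;
            for (int i = 0; i < children.getLength(); i++) {
                Node n = children.item(i);
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    elements++;
                } else if (n.getNodeType() == Node.TEXT_NODE
                        && n.getNodeValue().trim().isEmpty()) {
                    blanks++; // line break plus indentation between two tags
                }
            }
            return new int[] { elements, blanks };
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        int[] c = countChildren(XML);
        // Prints "2 elements, 3 whitespace-only text nodes"
        System.out.println(c[0] + " elements, " + c[1] + " whitespace-only text nodes");
    }
}
```

With n <Word> elements on separate lines you get n + 1 whitespace text nodes, which is where 4 + 5 = 9 comes from in the original file.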

And why is the Node cast to Element?

To answer this last part, I refer you to another post: What's the difference between an element and a node in XML? In short, an Element is one kind of Node, and the cast gives you access to element-only methods such as getElementsByTagName.
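
As for the item(0) calls in the question, the chain can be unpacked one step at a time. This is only a sketch: the class name ScoreLookup, the helper firstScore, and the one-line XML string are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ScoreLookup {
    // Hypothetical one-entry dictionary, written without whitespace between tags.
    static final String XML =
        "<Dictionary><Word><Original>wetten</Original><Score>-6</Score></Word></Dictionary>";

    static int firstScore(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // No whitespace in XML above, so the first child of the root is the <Word> element.
            Node first = doc.getDocumentElement().getFirstChild();

            // Node has no getElementsByTagName; Element (a sub-interface of Node) does.
            Element word = (Element) first;

            // All <Score> elements below this <Word>; here there is exactly one.
            NodeList scores = word.getElementsByTagName("Score");

            // First item(0): the <Score> element itself.
            Node scoreElement = scores.item(0);

            // Second item(0): the text node inside <Score>, whose value is "-6".
            Node scoreText = scoreElement.getChildNodes().item(0);

            return Integer.parseInt(scoreText.getNodeValue());
        } catch (Exception e) {
            throw new IllegalStateException("Could not read score", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstScore(XML)); // prints -6
    }
}
```

So the first item(0) picks the first matching element and the second item(0) picks the text node inside it. If the element only contains text, scoreElement.getTextContent() returns the same string in one call.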
