Using a SAX parser, how do I parse an XML file that has tags with the same name inside different elements?

Time: 2021-05-20 19:44:16

Is it possible to use path expressions with a SAX parser? I have an XML file in which a few tags share the same name but sit inside different elements. Is there any way to differentiate between them? Here is the XML:

<Schools>
    <School>
        <ID>335823</ID> 
        <Name>Fairfax High School</Name> 
        <Student>
            <ID>4195653</ID>
            <Name>Will Turner</Name>
        </Student>
        <Student>
            <ID>4195654</ID>
            <Name>Bruce Paltrow</Name>
        </Student>
        <Student>
            <ID>4195655</ID>
            <Name>Santosh Gowswami</Name>
        </Student>
    </School>
    <School>
        <ID>335824</ID> 
        <Name>FallsChurch High School</Name> 
        <Student>
            <ID>4153</ID>
            <Name>John Singer</Name>
        </Student>
        <Student>
            <ID>4154</ID>
            <Name>Shane Warne</Name>
        </Student>
        <Student>
            <ID>4155</ID>
            <Name>Eddie Diaz</Name>
        </Student>
    </School>
</Schools>

I want to differentiate a student's Name and ID from a school's Name and ID.

Thanks for the response.

I have created a student POJO with the fields school_id, school_name, student_id and student_name, plus getter and setter methods for them. Below is my temporary parser implementation. When I parse the XML, I need to put the school name and ID and the student name and ID into the POJO and return it. Can you tell me how I should implement the stack to differentiate them? This is my parser framework:

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class HandleXML extends DefaultHandler {

    private student info;
    private boolean school_id = false;
    private boolean school_name = false;
    private boolean student_id = false;
    private boolean student_name = false;
    private boolean student = false;
    private boolean school = false;


    public HandleXML(student record) {
        super();
        this.info = record;
        school_id = false;
        school_name = false;
        student_id = false;
        student_name = false;
        student = false;
        school = false;
    }

    @Override
    public void startElement(String uri, String localName,
            String qName, Attributes attributes)
            throws SAXException {
        if (qName.equalsIgnoreCase("student")) {
            student = true;
        }
        if (qName.equalsIgnoreCase("school")) {
            school = true;
        }
        if (qName.equalsIgnoreCase("school_id")) {
            school_id = true;
        }
        if (qName.equalsIgnoreCase("student_id")) {
            student_id = true;
        }
        if (qName.equalsIgnoreCase("school_name")) {
            school_name = true;
        }
        if (qName.equalsIgnoreCase("student_name")) {
            student_name = true;
        }
    }

    @Override
    public void endElement(String uri, String localName,
            String qName)
            throws SAXException {
    }

    @Override
    public void characters(char ch[], int start, int length)
            throws SAXException {

        String data = new String(ch, start, length);

    }
}

5 answers

#1


13  

Well, I haven't touched SAX in Java for years, so here's my take on it:

package play.xml.sax;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Stack;

public class Test1 {
    public static void main(String[] args) {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SchoolsHandler handler = new SchoolsHandler();
        try {
            SAXParser sp = spf.newSAXParser();
            sp.parse("schools.xml", handler);
            System.out.println("Number of read schools: " + handler.getSchools().size());
        } catch (SAXException se) {
            se.printStackTrace();
        } catch (ParserConfigurationException pce) {
            pce.printStackTrace();
        } catch (IOException ie) {
            ie.printStackTrace();
        }
    }
}

class SchoolsHandler extends DefaultHandler {
    private static final String TAG_SCHOOLS = "Schools";
    private static final String TAG_SCHOOL = "School";
    private static final String TAG_STUDENT = "Student";
    private static final String TAG_ID = "ID";
    private static final String TAG_NAME = "Name";

    private final Stack<String> tagsStack = new Stack<String>();
    private final StringBuilder tempVal = new StringBuilder();

    private List<School> schools;
    private School school;
    private Student student;

    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        pushTag(qName);
        tempVal.setLength(0);
        if (TAG_SCHOOLS.equalsIgnoreCase(qName)) {
            schools = new ArrayList<School>();
        } else if (TAG_SCHOOL.equalsIgnoreCase(qName)) {
            school = new School();
        } else if (TAG_STUDENT.equalsIgnoreCase(qName)) {
            student = new Student();
        }
    }

    public void characters(char ch[], int start, int length) {
        tempVal.append(ch, start, length);
    }

    public void endElement(String uri, String localName, String qName) {
        String tag = peekTag();
        if (!qName.equals(tag)) {
            throw new InternalError();
        }

        popTag();
        String parentTag = peekTag();

        if (TAG_ID.equalsIgnoreCase(tag)) {
            int id = Integer.valueOf(tempVal.toString().trim());
            if (TAG_STUDENT.equalsIgnoreCase(parentTag)) {
                student.setId(id);
            } else if (TAG_SCHOOL.equalsIgnoreCase(parentTag)) {
                school.setId(id);
            }
        } else if (TAG_NAME.equalsIgnoreCase(tag)) {
            String name = tempVal.toString().trim();
            if (TAG_STUDENT.equalsIgnoreCase(parentTag)) {
                student.setName(name);
            } else if (TAG_SCHOOL.equalsIgnoreCase(parentTag)) {
                school.setName(name);
            }
        } else if (TAG_STUDENT.equalsIgnoreCase(tag)) {
            school.addStudent(student);
        } else if (TAG_SCHOOL.equalsIgnoreCase(tag)) {
            schools.add(school);
        }
    }

    public void startDocument() {
        pushTag("");
    }

    public List<School> getSchools() {
        return schools;
    }

    private void pushTag(String tag) {
        tagsStack.push(tag);
    }

    private String popTag() {
        return tagsStack.pop();
    }

    private String peekTag() {
        return tagsStack.peek();
    }
}

class School {
    private int id;
    private String name;
    private List<Student> students = new ArrayList<Student>();

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public void addStudent(Student student) {
        students.add(student);
    }

    public List<Student> getStudents() {
        return students;
    }
}

class Student {
    private int id;
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }
}

schools.xml contains your example XML. Note that I crammed everything into a single file, but only because I was just playing around.

#2


14  

In a SAX parser you are given each element in document order. You have to maintain a stack to track nesting (push onto the stack when handling startElement, and pop when handling endElement). You can then differentiate the different <Name> elements by what is currently on the stack.

Alternatively, just keep a variable that records whether you are currently inside a <School> element or a <Student> element, so you know which kind of <Name> you are seeing.
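The stack idea above can also be read as a poor man's path expression: join the open element names into a path string and compare it against the paths you care about. The following is a minimal sketch under that assumption, not a built-in SAX feature; the class name PathTrackingHandler and its fields are hypothetical, and it assumes a non-namespace-aware parser so that qName is the plain tag name.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: tracks the path of open elements, root first.
class PathTrackingHandler extends DefaultHandler {
    private final Deque<String> path = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();
    final List<String> schoolNames = new ArrayList<>();
    final List<String> studentNames = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        path.addLast(qName);   // push on open
        text.setLength(0);     // start collecting text for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Build the path of the element being closed, e.g. /Schools/School/Name
        String current = "/" + String.join("/", path);
        if (current.equals("/Schools/School/Name")) {
            schoolNames.add(text.toString().trim());
        } else if (current.equals("/Schools/School/Student/Name")) {
            studentNames.add(text.toString().trim());
        }
        path.removeLast();     // pop on close
    }

    static PathTrackingHandler parse(String xml) throws Exception {
        PathTrackingHandler handler = new PathTrackingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Schools><School><ID>335823</ID><Name>Fairfax High School</Name>"
                + "<Student><ID>4195653</ID><Name>Will Turner</Name></Student>"
                + "</School></Schools>";
        PathTrackingHandler h = parse(xml);
        System.out.println("Schools: " + h.schoolNames);
        System.out.println("Students: " + h.studentNames);
    }
}
```

Comparing full paths costs a string join per closing tag; for very large documents, peeking at only the parent entry on the stack is cheaper.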

#3


2  

Yes, understanding XML with a SAX parser is generally a bit more complicated than working with DOM. Basically, you need to maintain state/context in your SAX handler so that you can differentiate between those situations.

Note that the other key to implementing a SAX handler is understanding that a value may be split across multiple characters events.
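To illustrate the point about split values, here is a small sketch that buffers text in a StringBuilder and only reads it in endElement. The class BufferingHandler and the helper simulateSplit are hypothetical names; the helper calls the callbacks by hand to imitate a parser that delivers one value in two chunks, since where a real parser splits is up to the implementation.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: accumulates text instead of trusting a single characters() call.
class BufferingHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    final List<String> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        buffer.setLength(0);                 // reset at each opening tag
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length);    // may be called several times per value
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Name".equals(qName)) {
            names.add(buffer.toString().trim());  // the value is complete only here
        }
    }

    // Imitates a parser that reports one <Name> value as two separate chunks.
    static List<String> simulateSplit(String first, String second) {
        BufferingHandler h = new BufferingHandler();
        h.startElement("", "", "Name", new AttributesImpl());
        h.characters(first.toCharArray(), 0, first.length());
        h.characters(second.toCharArray(), 0, second.length());
        h.endElement("", "", "Name");
        return h.names;
    }

    public static void main(String[] args) {
        // prints [Fairfax High School]
        System.out.println(simulateSplit("Fairfax High ", "School"));
    }
}
```

A handler that did `new String(ch, start, length)` and stored it directly would keep only the last chunk, "School", in this scenario.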

#4


1  

SAX is event based: via callbacks you can read the XML document serially. SAX is good for reading large XML documents because the whole document is not loaded into memory. You might want to look at XPath, e.g.

XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
// XPath is case-sensitive, so element names must match the XML exactly.
String expression = "/Schools/School/ ...";
// Compile the expression to get an XPathExpression object.
XPathExpression xPathExpression = xPath.compile(expression);
// xmlDocument is the already-parsed document, e.g. an org.w3c.dom.Document.
Object result = xPathExpression.evaluate(xmlDocument);
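For completeness, a self-contained sketch of the XPath route might look like the following. It assumes the document is first loaded into a DOM tree, which means the whole file is held in memory and the memory advantage of SAX is given up. The class name XPathNames and the method select are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathNames {
    // Returns the trimmed text of every node matched by the expression.
    static List<String> select(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xPath.compile(expression)
                .evaluate(doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add(nodes.item(i).getTextContent().trim());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Schools><School><ID>335823</ID><Name>Fairfax High School</Name>"
                + "<Student><ID>4195653</ID><Name>Will Turner</Name></Student>"
                + "</School></Schools>";
        // The path itself tells school names and student names apart.
        System.out.println(select(xml, "/Schools/School/Name"));
        System.out.println(select(xml, "/Schools/School/Student/Name"));
    }
}
```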

#5


0  

private boolean isInStudentNode;
...

public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
    // Entering a Student node
    if (qName.equalsIgnoreCase("Student")) {
        isInStudentNode = true;
    }
    ...
}

public void endElement(String uri, String localName, String qName) throws SAXException {
    // Leaving a Student node
    if (qName.equalsIgnoreCase("Student")) {
        isInStudentNode = false;
        ...
    }

    // Leaving a Name node (belongs to either a school or a student)
    if (qName.equalsIgnoreCase("Name")) {
        if (isInStudentNode) student.setName(...);
        else school.setName(...);
    }
}

This works for me.
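A fuller sketch of this flag approach, shaped toward the flat record the question describes (school_id, school_name, student_id, student_name), could look like this. The class FlatRecordHandler and the comma-joined row format are hypothetical stand-ins for the asker's POJO, and the sketch assumes each school's ID and Name appear before its Student elements, as they do in the sample XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: emits one flat row per student.
class FlatRecordHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private boolean inStudent;
    private String schoolId, schoolName, studentId, studentName;
    final List<String> rows = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("Student".equals(qName)) {
            inStudent = true;
        }
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        switch (qName) {
            case "ID":
                if (inStudent) studentId = value; else schoolId = value;
                break;
            case "Name":
                if (inStudent) studentName = value; else schoolName = value;
                break;
            case "Student":
                // Assumes the school's ID and Name were already seen.
                rows.add(schoolId + "," + schoolName + "," + studentId + "," + studentName);
                inStudent = false;
                break;
            default:
                break;
        }
    }

    static List<String> parse(String xml) throws Exception {
        FlatRecordHandler handler = new FlatRecordHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Schools><School><ID>335823</ID><Name>Fairfax High School</Name>"
                + "<Student><ID>4195653</ID><Name>Will Turner</Name></Student>"
                + "</School></Schools>";
        parse(xml).forEach(System.out::println);
    }
}
```

A single boolean is enough here because the nesting is only two levels deep; with deeper or recursive nesting, the stack from the earlier answers is the safer choice.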
