Built-in JSON or XML data parser for Java

Time: 2022-10-29 13:42:20

I want to read data that's stored in a file. I haven't decided yet what format to store it in, but I'm looking for a format that's easy to parse. Initially I thought I'd go with JSON, but it seems Java doesn't have a built-in parser for JSON.

The data stored will be a bunch of records, each record composed of a set of fields. So it's not simple enough to be stored in a text file that can be read line by line. This is why I think I need something like JSON. But I don't want to add external libraries just to parse the format. Any suggestions? I'm new to Java.

8 answers

#1


16  

While Java may not have a standard JSON parsing library, there are several libraries available that are fast, reliable, and easy to use. Many also allow you to use standard object binding methodologies such as JAXB to define your deserialization mappings using annotations.

I prefer Jackson myself. Google Gson is also popular, and you can see how some people compare the two in this question.

You might want to be less afraid of using external libraries. It's almost always better to leverage an existing library that has the functionality you want, rather than to write your own. And with tools like Maven or Ivy to automatically calculate and download dependencies from your project definition, there's really no reason to fear using libraries.

That being said, with the current state of Java XML support, you should find XML equally accessible. This answer provides a simple example of using javax.xml.parsers.DocumentBuilder to generate a DOM.

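As a rough illustration of the DocumentBuilder approach for the kind of data in the question (a bunch of records, each with a set of fields), here is a minimal sketch that uses only JDK classes. The class name DomRecords and the <records>/<record> element names are made up for this example; adapt them to whatever layout you choose.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: reads <record> elements into field-name -> value maps.
final class DomRecords {

    static List<Map<String, String>> parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The whole document is loaded into memory as a DOM tree.
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<Map<String, String>> records = new ArrayList<>();
            NodeList recordNodes = doc.getElementsByTagName("record");
            for (int i = 0; i < recordNodes.getLength(); i++) {
                Map<String, String> fields = new LinkedHashMap<>();
                NodeList children = recordNodes.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Skip whitespace text nodes; keep only child elements.
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        fields.put(child.getNodeName(), child.getTextContent().trim());
                    }
                }
                records.add(fields);
            }
            return records;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped so callers of this sketch do not need checked exceptions.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<records>"
                + "<record><name>Ada</name><age>36</age></record>"
                + "<record><name>Linus</name><age>28</age></record>"
                + "</records>";
        System.out.println(parse(xml));
    }
}
```

Running main prints [{name=Ada, age=36}, {name=Linus, age=28}]. For a real file you would pass a File or InputStream to builder.parse instead of a string.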

#2


10  

As many others have pointed out, Java doesn't ship a standard JSON-parsing library as part of the JDK, so if you want to use JDK-bundled tech with absolutely NO dependencies, you have 4 XML parsing choices:

  • XPathFactory - XPath-based parsing. Reads the entire XML into an in-memory data structure and allows you to execute queries on it using the XPath expression language. This is probably the slowest and most memory-intensive option, BUT one of the most convenient ways to query your data. You wouldn't write a stock-trading app using this, but if you just need data from a big config file, it is very handy (although for configs, there are many other specific libs that are easier than rolling your own).
  • DocumentBuilder - DOM-based parsing. Reads the entire XML into an in-memory data structure you can query and traverse as needed. 2nd slowest and fairly memory-intensive, but necessary if you want/need the XML DOM to stick around in memory so you can operate on it. Also handy if you want to read, query, make changes and write the DOM back out as a modified XML file.
  • SAXParser - SAX-based parsing. Almost the fastest. Parses through the XML top-to-bottom, calling stubbed methods in your ContentHandler implementation (provided at parse time) every time the appropriate element is hit. It is basically like a chatty person telling you everything they are doing AS they do it. It is up to you to implement the stubbed-out methods to actually do something with the data it is passing you as it finds it.
  • XMLStreamReader - StAX-based parsing. Fastest parsing method with the lowest overhead. This is the new golden child of XML parsing in Java. It is similar to SAX, but instead of calling stubbed methods every time it finds something new, it rips across the XML file and notifies the caller of its modified state as it sees new content, but does nothing WITH the content until you ask for it. For example, it'll say something like "Now I'm looking at an open tag... now a close tag... now some chars... now a comment..." and unless you ask it for information about the elements it is hitting (attributes, characters, etc.) it never actually parses and processes them out of the stream; it just skips them.

NOW, all that being said, working with these APIs, especially if you are new, isn't the most intuitive thing in the world. If you've done XML parsing in Java before, you'll be fine though.

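To give a feel for the XPath option from the list above, here is a small sketch using only javax.xml.xpath from the JDK. The class name XPathQuery and the sample document are invented for illustration; the "/library/book/title" path is just an example expression.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: returns the text of every node matched by an XPath expression.
final class XPathQuery {

    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // evaluate() parses the source into memory, then runs the query on it.
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            // Wrapped so callers of this sketch do not need checked exceptions.
            throw new IllegalArgumentException("Bad expression or XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<library>"
                + "<book><title>Dune</title></book>"
                + "<book><title>Emma</title></book>"
                + "</library>";
        System.out.println(select(xml, "/library/book/title"));
    }
}
```

Running main prints [Dune, Emma]. Each call re-parses the XML, which is fine for occasional lookups but is one reason this approach is the slowest of the four.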

If you WILL consider a tiny 3rd party JAR though, I am going to point you at my Simple Java XML Parser (SJXP) library. It gives you the ease of XPath with the performance of STAX parsing; honestly (I am being unbiased, seriously) -- it is awesome.

I spent more than a year working on this while writing a really robust feed-parsing system. It started off as a SAX-based system and then moved to STAX, and the more I worked on it, the more I realized how easily I could abstract away the pain of STAX with simple rules.

You can look at the Usage example, but you essentially define rules to match: a rule like "/library/book/title" will parse the contents of every matching tag. You can also parse attributes and even namespace-qualified values (yes, it supports namespaces too!).

Here is an RSS feed parser example:

IRule linkRule = new DefaultRule(Type.CHARACTER, "/rss/channel/item/link") {
    @Override
    public void handleParsedCharacters(XMLParser parser, String text, Object userObject) {
        // Store the link, or do something equivalently fancy
    }
};

Then you just pass that rule to the parser when you create it, like this:

XMLParser parser = new XMLParser(linkRule);

And you are done; just give the parser your XML files via the parse method and you'll get callbacks every time that path is matched.

I have benchmarked, profiled and optimized the overhead of the library on top of STAX to the point that it is not measurable. The actual path matching is done via cached hash codes, so I am not even doing string comparisons inside the parser.

It's really fast and it works on Android.

If you want to do JSON instead, I'd strongly recommend using GSON. Jackson is faster, but the API is 37x more complex than the GSON API. You'll spend more time figuring out exactly which classes you need to use in Jackson than you will with GSON.

Also, since the last GSON release and the rewrite of the stream parser, the speed gap has closed quite a bit; you can use their stream parser implementation to get near-Jackson parsing speeds if that is critical.

That being said, if you need ULTIMATE speed above and beyond anything and that is priority #1, then use Jackson.

#3


6  

I'm using GSON (http://code.google.com/p/google-gson/) for parsing JSON. It is very easy to use:

Gson gson = new Gson();
String xyzAsString = gson.toJson(xyz);

To deserialize JSON, use:

Gson gson = new Gson();
Classname xyz = gson.fromJson(JSONedString, Classname.class);

For more examples, please look here: https://sites.google.com/site/gson/gson-user-guide

#4


5  

You've already accepted, but everyone seems to be missing the fact that Java does have a standard JSON API: the javax.json package (JSR 353), introduced with Java EE 7. Note that it is part of Java EE rather than the Java SE JDK, so a standalone application still needs the API and an implementation JAR on its classpath.

#5


1  

Java provides SAXParser for parsing XML.

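For reference, a minimal SAXParser sketch might look like the following. It uses only JDK classes; the class name SaxNames and the <name> element it collects are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the text of every <name> element via SAX callbacks.
final class SaxNames {

    static List<String> names(String xml) {
        List<String> names = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <name> element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("name".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver text in several chunks, so append each one.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    names.add(text.toString());
                    text = null;
                }
            }
        };

        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped so callers of this sketch do not need checked exceptions.
            throw new IllegalStateException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<records>"
                + "<record><name>Ada</name></record>"
                + "<record><name>Linus</name></record>"
                + "</records>";
        System.out.println(names(xml));
    }
}
```

Running main prints [Ada, Linus]. Nothing is kept in memory except what the handler chooses to store, which is why SAX suits large files.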

#6


1  

If you're programming in NetBeans, you can use a DTD to generate an XML scanner. Just right-click the DTD file and pick "Generate DOM scanner".

#7


0  

javax.json is the Java package for JSON. Note also that there is an extremely lightweight Java alternative to SAX, called StAX (Streaming API for XML).

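Here is a rough sketch of what StAX looks like in practice, using XMLStreamReader from the JDK. The class name StaxRecords and the <records>/<record> layout are assumptions for this example, and it assumes every field element contains plain text only (getElementText fails on nested elements).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: pulls <record> elements out of a stream, one event at a time.
final class StaxRecords {

    static List<Map<String, String>> parse(String xml) {
        List<Map<String, String>> records = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            Map<String, String> current = null; // the record being filled, if any

            while (reader.hasNext()) {
                int event = reader.next(); // the caller pulls the next event
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("record".equals(name)) {
                        current = new LinkedHashMap<>();
                    } else if (current != null) {
                        // Reads the text and advances to this field's end tag.
                        current.put(name, reader.getElementText().trim());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "record".equals(reader.getLocalName())) {
                    records.add(current);
                    current = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Wrapped so callers of this sketch do not need checked exceptions.
            throw new IllegalStateException("Could not parse XML", e);
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<records>"
                + "<record><name>Ada</name><age>36</age></record>"
                + "<record><name>Linus</name><age>28</age></record>"
                + "</records>";
        System.out.println(parse(xml));
    }
}
```

Running main prints [{name=Ada, age=36}, {name=Linus, age=28}]. Unlike SAX, the loop is driven by your code, so you can stop reading as soon as you have what you need.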

JSON vs. XML in the app you suggest, in my opinion, depends a lot more on what you're going to do with the data and how you're going to process it. For example, if you're sending the data to a web page and need to use object notation to process it using JavaScript, then JSON is the obvious choice. If you just want to display it, then you might want to consider XHTML and let your backend choose what's being displayed. If you're transferring data between various industry computers in B2B applications, then you likely need to use XML and tags defined by industry standards.

#8


-1  

JSON is great, better than XML.

Why don't you want to add external libraries? If you really cannot use one, you can write your own parser. Implementing a parser is not too difficult.
