How can I parse slightly malformed JSON with Python?

Date: 2022-02-13 23:15:04

I have the following JSON string coming from an external input source:

{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}

This is a malformed JSON string (the keys "id" and "value" must be in quotes), but I need to parse it anyway. I have tried simplejson and json-py, and it seems neither can be configured to parse such strings.

I am running Python 2.5 on Google App Engine, so C-based solutions such as python-cjson are not an option.

The input format could be changed to XML or YAML instead of the JSON shown above, but I am using JSON throughout the project, and changing the format in one specific place would not be ideal.

For now I've switched to XML and am parsing the data successfully, but I would welcome any solution that lets me switch back to JSON.

4 answers

#1


33  

Since YAML (>= 1.2) is a superset of JSON, and YAML flow mappings also accept unquoted keys, a YAML parser can read this string:

>>> import yaml
>>> s = '{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}'
>>> yaml.safe_load(s)  # safe_load avoids building arbitrary objects from untrusted input
{'id': 17893, 'value': '82363549923gnyh49c9djl239pjm01223'}

#2


13  

You can use demjson.

>>> import demjson
>>> demjson.decode('{foo:3}')
{u'foo': 3}

#3


1  

You could use a string parser to fix the input first. A regex could do it, provided the input never gets more complicated than this example.
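
As a rough illustration of the regex idea, here is a minimal sketch. It assumes a Python version that ships the standard `json` module (on Python 2.5, simplejson would take its place), and the helper names `quote_bare_keys` and `loads_relaxed` are made up for this example. It also assumes that no string value contains text that looks like `, name:` or `{name:`, because the regex would wrongly rewrite it.

```python
import json
import re

# A bare identifier that directly follows "{" or "," and is followed by ":".
# Keys that are already quoted do not match, so they are left untouched.
_BARE_KEY = re.compile(r'([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)(\s*:)')


def quote_bare_keys(text):
    """Wrap unquoted object keys in double quotes."""
    return _BARE_KEY.sub(r'\1"\2"\3', text)


def loads_relaxed(text):
    """Quote bare keys, then hand the result to the regular JSON parser."""
    return json.loads(quote_bare_keys(text))


if __name__ == "__main__":
    s = '{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}'
    print(loads_relaxed(s))
```

Because the fix-up is a plain text substitution, it is only as reliable as the assumption about the input; anything beyond flat objects with simple values is better served by a real parser.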

#4


0  

Pyparsing includes a JSON parser example, and its source is available online. You could modify the definition of memberDef to allow an unquoted string for the member name, and then use the result to parse your not-quite-JSON source text.

That page also has information about, and a link to, my article in the August 2008 issue of Python Magazine, which covers this parser in much more detail. The page shows some sample JSON, along with code that accesses the parsed results as if they were a deserialized object.
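
The memberDef change might look roughly like the sketch below. This is not the code of the pyparsing example itself: it is a cut-down grammar written for illustration that handles only flat objects with string and number values, and the names `bare_name`, `member`, `obj` and `parse_relaxed` are assumptions of this sketch. It assumes the third-party pyparsing package is installed.

```python
import pyparsing as pp

LBRACE, RBRACE, COLON, COMMA = map(pp.Suppress, "{}:,")

# Double-quoted string; the second positional argument is the escape character.
json_string = pp.QuotedString('"', "\\")
# Integers and floats, converted to Python numbers by pyparsing_common.
json_number = pp.pyparsing_common.number()
# The relaxation: an identifier-like member name without quotes.
bare_name = pp.Word(pp.alphas + "_", pp.alphanums + "_")

value = json_string | json_number
# A member name may be either a quoted string or a bare name.
member = pp.Group((json_string | bare_name) + COLON + value)
obj = LBRACE + pp.Optional(member + pp.ZeroOrMore(COMMA + member)) + RBRACE


def parse_relaxed(text):
    """Parse a flat object whose keys may be unquoted into a dict."""
    # The second argument (parseAll) makes trailing garbage an error.
    results = obj.parseString(text, True)
    return dict((key, val) for key, val in results)


if __name__ == "__main__":
    s = '{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}'
    print(parse_relaxed(s))
```

Nested objects, arrays, and the true/false/null literals are left out to keep the sketch short; the full example grammar handles them with recursive definitions.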
