I'm trying to do some simple JSON parsing using Python 3's built-in json module, and from reading a bunch of other questions on SO and googling, it seems this is supposed to be pretty straightforward. However, I think I'm getting a string returned instead of the expected dictionary.
Firstly, here is the JSON I am trying to get values from. It's just some output from Twitter's API:
[{'in_reply_to_status_id_str': None, 'in_reply_to_screen_name': None, 'retweeted': False, 'in_reply_to_status_id': None, 'contributors': None, 'favorite_count': 0, 'in_reply_to_user_id': None, 'coordinates': None, 'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', 'geo': None, 'retweet_count': 0, 'text': 'Tweeting a url \nhttp://t.co/QDVYv6bV90', 'created_at': 'Mon Sep 01 19:36:25 +0000 2014', 'entities': {'symbols': [], 'user_mentions': [], 'urls': [{'expanded_url': 'http://www.isthereanappthat.com', 'display_url': 'isthereanappthat.com', 'url': 'http://t.co/QDVYv6bV90', 'indices': [16, 38]}], 'hashtags': []}, 'id_str': '506526005943865344', 'in_reply_to_user_id_str': None, 'truncated': False, 'favorited': False, 'lang': 'en', 'possibly_sensitive': False, 'id': 506526005943865344, 'user': {'profile_text_color': '333333', 'time_zone': None, 'entities': {'description': {'urls': []}}, 'url': None, 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'protected': False, 'default_profile_image': True, 'utc_offset': None, 'default_profile': True, 'screen_name': 'KickzWatch', 'follow_request_sent': False, 'following': False, 'profile_background_color': 'C0DEED', 'notifications': False, 'description': '', 'profile_sidebar_border_color': 'C0DEED', 'geo_enabled': False, 'verified': False, 'friends_count': 40, 'created_at': 'Mon Sep 01 16:29:18 +0000 2014', 'is_translator': False, 'profile_sidebar_fill_color': 'DDEEF6', 'statuses_count': 4, 'location': '', 'id_str': '2784389341', 'followers_count': 4, 'favourites_count': 0, 'contributors_enabled': False, 'is_translation_enabled': False, 'lang': 'en', 'profile_image_url': 'http://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 'profile_image_url_https': 'https://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 'id': 
2784389341, 'profile_use_background_image': True, 'listed_count': 0, 'profile_background_tile': False, 'name': 'Maktub Destiny', 'profile_link_color': '0084B4'}, 'place': None}]
I assigned this string to a variable named json_string like so:
json_string = json.dumps(output)
jason = json.loads(json_string)
Then, when I try to get a specific key from the "jason" dictionary:
print(jason['hashtags'])
I'm getting an error:
TypeError: string indices must be integers
I want to be able to convert the JSON output to a dictionary, then use a jason[key_name] call to get values by key. Is there something obvious that I'm missing here?
This is my first time working with Python, after coming from Java. I absolutely love the language and think it's very powerful. So, any help on this would be greatly appreciated!
2 Solutions
#1
5
OK, first you should pretty-print your object so that you can read it:
>>> from pprint import pprint
>>> output = [{'in_reply_to_status_id_str': None, 'in_reply_to_screen_name': None, 'retweeted': False, 'in_reply_to_status_id': None, 'contributors': None, 'favorite_count': 0, 'in_reply_to_user_id': None, 'coordinates': None, 'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', 'geo': None, 'retweet_count': 0, 'text': 'Tweeting a url \nhttp://t.co/QDVYv6bV90', 'created_at': 'Mon Sep 01 19:36:25 +0000 2014', 'entities': {'symbols': [], 'user_mentions': [], 'urls': [{'expanded_url': 'http://www.isthereanappthat.com', 'display_url': 'isthereanappthat.com', 'url': 'http://t.co/QDVYv6bV90', 'indices': [16, 38]}], 'hashtags': []}, 'id_str': '506526005943865344', 'in_reply_to_user_id_str': None, 'truncated': False, 'favorited': False, 'lang': 'en', 'possibly_sensitive': False, 'id': 506526005943865344, 'user': {'profile_text_color': '333333', 'time_zone': None, 'entities': {'description': {'urls': []}}, 'url': None, 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'protected': False, 'default_profile_image': True, 'utc_offset': None, 'default_profile': True, 'screen_name': 'KickzWatch', 'follow_request_sent': False, 'following': False, 'profile_background_color': 'C0DEED', 'notifications': False, 'description': '', 'profile_sidebar_border_color': 'C0DEED', 'geo_enabled': False, 'verified': False, 'friends_count': 40, 'created_at': 'Mon Sep 01 16:29:18 +0000 2014', 'is_translator': False, 'profile_sidebar_fill_color': 'DDEEF6', 'statuses_count': 4, 'location': '', 'id_str': '2784389341', 'followers_count': 4, 'favourites_count': 0, 'contributors_enabled': False, 'is_translation_enabled': False, 'lang': 'en', 'profile_image_url': 'http://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 'profile_image_url_https': 'https://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 
'id': 2784389341, 'profile_use_background_image': True, 'listed_count': 0, 'profile_background_tile': False, 'name': 'Maktub Destiny', 'profile_link_color': '0084B4'}, 'place': None}]
>>> pprint(output)
[{'contributors': None,
  'coordinates': None,
  'created_at': 'Mon Sep 01 19:36:25 +0000 2014',
  'entities': {'hashtags': [],
               'symbols': [],
               'urls': [{'display_url': 'isthereanappthat.com',
                         'expanded_url': 'http://www.isthereanappthat.com',
                         'indices': [16, 38],
                         'url': 'http://t.co/QDVYv6bV90'}],
               'user_mentions': []},
  'favorite_count': 0,
  'favorited': False,
  'geo': None,
  'id': 506526005943865344,
  'id_str': '506526005943865344',
  'in_reply_to_screen_name': None,
  'in_reply_to_status_id': None,
  'in_reply_to_status_id_str': None,
  'in_reply_to_user_id': None,
  'in_reply_to_user_id_str': None,
  'lang': 'en',
  'place': None,
  'possibly_sensitive': False,
  'retweet_count': 0,
  'retweeted': False,
  'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
  'text': 'Tweeting a url \nhttp://t.co/QDVYv6bV90',
  'truncated': False,
  'user': {'contributors_enabled': False,
           'created_at': 'Mon Sep 01 16:29:18 +0000 2014',
           'default_profile': True,
           'default_profile_image': True,
           'description': '',
           'entities': {'description': {'urls': []}},
           'favourites_count': 0,
           'follow_request_sent': False,
           'followers_count': 4,
           'following': False,
           'friends_count': 40,
           'geo_enabled': False,
           'id': 2784389341,
           'id_str': '2784389341',
           'is_translation_enabled': False,
           'is_translator': False,
           'lang': 'en',
           'listed_count': 0,
           'location': '',
           'name': 'Maktub Destiny',
           'notifications': False,
           'profile_background_color': 'C0DEED',
           'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
           'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
           'profile_background_tile': False,
           'profile_image_url': 'http://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png',
           'profile_image_url_https': 'https://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png',
           'profile_link_color': '0084B4',
           'profile_sidebar_border_color': 'C0DEED',
           'profile_sidebar_fill_color': 'DDEEF6',
           'profile_text_color': '333333',
           'profile_use_background_image': True,
           'protected': False,
           'screen_name': 'KickzWatch',
           'statuses_count': 4,
           'time_zone': None,
           'url': None,
           'utc_offset': None,
           'verified': False}}]
From looking at this you can see that output is a list which contains a single dict. To access that dict you need:
>>> first_elem = output[0]
You will also see that the hashtags key in first_elem is contained in a second-level dict under the key entities:
>>> entities = first_elem['entities']
>>> pprint(entities)
{'hashtags': [],
 'symbols': [],
 'urls': [{'display_url': 'isthereanappthat.com',
           'expanded_url': 'http://www.isthereanappthat.com',
           'indices': [16, 38],
           'url': 'http://t.co/QDVYv6bV90'}],
 'user_mentions': []}
Now you are able to access hashtags:
>>> entities['hashtags']
[]
In this case it just happens to be an empty list.
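The same lookup can be written in one go. Here is a minimal sketch, assuming a trimmed-down stand-in for the tweet list above; the get_hashtags helper is hypothetical and only there to show how to tolerate tweets that lack an entities key:

```python
# Trimmed-down stand-in for the tweet list shown above (hypothetical sample data).
output = [{
    "text": "Tweeting a url",
    "entities": {
        "hashtags": [],
        "urls": [{"display_url": "isthereanappthat.com"}],
    },
}]

def get_hashtags(tweets):
    """Return the hashtags list of each tweet, using [] when a key is missing."""
    return [tweet.get("entities", {}).get("hashtags", []) for tweet in tweets]

hashtags_per_tweet = get_hashtags(output)
print(hashtags_per_tweet)

# Direct chained indexing works too: list index first, then dict keys.
first_url = output[0]["entities"]["urls"][0]["display_url"]
print(first_url)
```

Chained indexing raises KeyError or IndexError as soon as a level is missing, while the .get() version falls back to an empty list, so which one you want depends on how much you trust the data.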
To convert to JSON, note the comment:
>>> import json
>>> # Make sure output is the list object not a string representing the object
>>> json_string = json.dumps(output)
>>> jason = json.loads(json_string)
>>> jason[0]['entities']['hashtags']
[]
I think your problem is that output was already a string before you passed it to json.dumps, so json.loads just gives you that string back, not a list of dicts.
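To see why this produces the error from the question, here is a small sketch. It assumes, as a guess at what happened, that output had already been turned into text with str(); the sample data is a hypothetical stand-in for the real tweet list:

```python
import json

output = [{"entities": {"hashtags": []}}]  # a real list of dicts (stand-in data)
as_text = str(output)                      # the same data, already flattened to a str

# Round-tripping the list gives a list back.
round_trip_list = json.loads(json.dumps(output))

# Round-tripping a str gives a str back: dumps() only wraps it in JSON quotes.
round_trip_str = json.loads(json.dumps(as_text))

print(type(round_trip_list).__name__)
print(type(round_trip_str).__name__)

error_message = None
try:
    round_trip_str["hashtags"]  # indexing a str with a str key
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

A quick print(type(output)) before calling json.dumps is usually enough to tell which of the two cases you are in.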
And @Dan's answer is correct: this is not valid JSON. It is, however, a valid Python literal (a list containing a dict), and I'm assuming that you got it from Twitter using Python and then printed it.
#2
5
First off, your JSON example is not valid JSON; the Twitter API would not output this, because it would break every conforming JSON consumer.
- jsonlint shows the first, obvious syntax error: single-quoted rather than double-quoted strings.
- Secondly, you have None where JSON requires null, False instead of false, and True instead of true.
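Both points can be checked from Python itself. The following is a sketch under the assumption that all you have left is the printed Python representation as text; python_repr is a shortened, hypothetical stand-in for the data in the question. json.loads rejects it, while ast.literal_eval from the standard library can read it back, after which json.dumps produces real JSON:

```python
import ast
import json

# Shortened stand-in for the printed Python data from the question (hypothetical).
python_repr = "[{'place': None, 'retweeted': False, 'entities': {'hashtags': []}}]"

# Single quotes and None/False are not JSON, so the JSON parser refuses this text.
try:
    json.loads(python_repr)
    parsed_as_json = True
except json.JSONDecodeError:
    parsed_as_json = False
print(parsed_as_json)

# literal_eval safely evaluates Python literals (no arbitrary code execution).
data = ast.literal_eval(python_repr)

# Serialising the real objects yields double quotes, null and false.
json_string = json.dumps(data)
print(json_string)
```

If you still have access to the original objects or the raw HTTP response, using those directly is preferable to parsing printed text.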
Your alleged "JSON" example appears to have been pre-decoded into Python :). When I use a snippet of real JSON, it works exactly as expected:
import json
json_string = r"""
[{"actual_json_key":"actual_json_value"}]
"""
jason = json.loads(json_string)
print(jason[0]["actual_json_key"])