Convert multi-line JSON to a Python dictionary

Time: 2021-03-07 18:15:17

I currently have this data in a file made up of multiple JSON rows (about 13k rows, but the example below is shortened):

{"first_name":"John","last_name":"Smith","age":30}
{"first_name":"Tim","last_name":"Johnson","age":34}

I have the following code:

import json
import codecs

with open('brief.csv') as f:
    for line in f:
        tweet = codecs.open('brief.csv', encoding='utf8').read()
        data = json.loads(tweet)
print data
print data.keys()
print data.values()

If I only have one row of data in my file, this works great. However, I can't figure out how to go row by row and turn each row into a dictionary. When I run this on multiple lines, I get the ValueError(errmsg("Extra data", s, end, len(s))) error, because the code only wants to deal with one pair of curly braces, i.e. the first row. Ultimately I want to be able to select certain keys (like first_name and age) and print only those values from my file.

Any idea how to accomplish this?

2 Answers

#1


1  

You're reading the whole file once for each line... try something like this:

import json
import codecs

tweets = []

with codecs.open('brief.csv', encoding='utf8') as f:
    for line in f.readlines():
        tweets.append(json.loads(line))

print tweets

for tweet in tweets:
    print tweet.keys()
    print tweet['last_name']
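
To see why reading the whole file into json.loads fails while the per-line loop works, here is a small sketch. It assumes Python 3 (where print is a function) and uses an in-memory string in place of brief.csv; the variable names text, error and rows are only illustrative.

```python
import json

# Two JSON objects, one per line, mirroring the sample rows in the question.
text = (
    '{"first_name":"John","last_name":"Smith","age":30}\n'
    '{"first_name":"Tim","last_name":"Johnson","age":34}\n'
)

# Parsing the whole text at once fails: json.loads expects a single JSON
# document, so everything after the first object counts as "extra data".
try:
    json.loads(text)
    error = None
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    error = str(exc)

# Parsing one line at a time succeeds, because each line on its own
# is a complete JSON document.
rows = [json.loads(line) for line in text.splitlines() if line.strip()]

print(error)
print(rows[1]["last_name"])
```

So the fix is less about the file handling and more about handing json.loads exactly one object per call.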

#2


0  

Maybe you can try something simpler, like this:

>>> import simplejson as json 
>>> with open("brief.csv") as f:
...     for line in f:
...         data = json.loads(line)
...         print data
...         print data.values()
...         print data.keys()

{'first_name': 'John', 'last_name': 'Smith', 'age': 30}
['John', 'Smith', 30]
['first_name', 'last_name', 'age']
{'first_name': 'Tim', 'last_name': 'Johnson', 'age': 34}
['Tim', 'Johnson', 34]
['first_name', 'last_name', 'age']
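
The question also asked about printing only certain keys, such as first_name and age. A rough Python 3 sketch of that step follows. The helper names parse_json_lines and select_keys are made up for this example and are not part of either answer, and the sample list stands in for the lines of brief.csv.

```python
import json


def parse_json_lines(lines):
    """Turn an iterable of JSON-encoded lines into a list of dictionaries."""
    records = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, which json.loads would reject
            records.append(json.loads(line))
    return records


def select_keys(records, keys):
    """Keep only the requested keys from each record; missing keys become None."""
    return [{key: record.get(key) for key in keys} for record in records]


# Stand-in for the file contents; an open file object can be passed instead.
sample = [
    '{"first_name":"John","last_name":"Smith","age":30}\n',
    '{"first_name":"Tim","last_name":"Johnson","age":34}\n',
]
for row in select_keys(parse_json_lines(sample), ["first_name", "age"]):
    print(row["first_name"], row["age"])
```

With the real file, the open file object from open('brief.csv', encoding='utf8') could be passed to parse_json_lines in place of sample, since iterating over a file yields its lines.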
