I am just trying to load this JSON file (which contains non-ASCII characters) as a Python dictionary with Unicode encoding, but I still get this error:
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 90: ordinal not in range(128)
JSON file content = "tooltip":{ "dxPivotGrid-sortRowBySummary": "Sort\"{0}\"byThisRow",}
import sys
import json
data = []
with open('/Users/myvb/Desktop/Automation/pt-PT.json') as f:
    for line in f:
        data.append(json.loads(line.encode('utf-8','replace')))
4 solutions
#1
2
As near as I can tell, you have several problems. The first is the file encoding. When you open a file in text mode without specifying an encoding, Python 3 opens it with whatever locale.getpreferredencoding(False) returns. Since that may vary (especially on Windows machines), it's a good idea to explicitly use encoding="utf-8" for most JSON files. Because of your error message, I suspect that the file was opened with an ascii encoding.
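To see which encoding open() would pick on a given machine, something like the following can be run. This is only a diagnostic sketch; the variable names are made up for illustration.

```python
import locale
import sys

# Encoding that open() falls back to in text mode when encoding= is omitted.
default_text_encoding = locale.getpreferredencoding(False)

# Encoding used for file *names*; this is a separate setting.
filename_encoding = sys.getfilesystemencoding()

print("default text encoding:", default_text_encoding)
print("file name encoding:", filename_encoding)
```

If the first value is an ASCII variant (for example because the process runs under a C/POSIX locale), a UTF-8 file read without encoding="utf-8" can fail on the first non-ASCII byte.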
Next, the file is decoded from utf-8 into Python strings as it is read by the file object. Each line has already been decoded to a str and is ready for json to read. When you do line.encode('utf-8','replace'), you encode the line back into a bytes object, which json.loads (that is, "load string") could not handle before Python 3.6 and which is unnecessary on any version.
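A small sketch of the str/bytes distinction, using a made-up one-line document rather than the real file:

```python
import json

text = '{"navbar": "Operações de grupo"}'  # a str, as a text-mode file yields it

data = json.loads(text)         # a str can be parsed directly
raw = text.encode("utf-8")      # bytes: the form the text has on disk
cedilla = "ç".encode("utf-8")   # two bytes, the first of which is 0xc3

# Decoding those bytes as ASCII fails in the same way as in the question.
ascii_failed = False
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_failed = True
    print(exc)
```

The 0xc3 byte in the error message is consistent with a UTF-8 file being decoded as ASCII, since characters such as ç and õ start with that byte in UTF-8.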
Finally, "tooltip":{ "navbar":"Operações de grupo"} isn't valid JSON on its own, but it does look like one line of a pretty-printed JSON file containing a single JSON object. My guess is that you should read the entire file as one JSON object.
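To illustrate why parsing line by line breaks on a pretty-printed file, here is a sketch with a hypothetical document shaped like the one in the question (the real file's full content is not shown, so this is an assumption):

```python
import json

# Hypothetical pretty-printed document, similar in shape to the file in the question.
document = """{
  "tooltip": {
    "navbar": "Operações de grupo"
  }
}"""

# Parsing line by line fails: no single line is a complete JSON value.
failed_lines = 0
for line in document.splitlines():
    try:
        json.loads(line)
    except json.JSONDecodeError:
        failed_lines += 1

# Parsing the whole text at once succeeds.
data = json.loads(document)
print(failed_lines, data["tooltip"]["navbar"])
```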
Putting it all together you get:
import json
with open('/Users/myvb/Desktop/Automation/pt-PT.json', encoding="utf-8") as f:
    data = json.load(f)
From its name, it's possible that this file is encoded in a legacy Portuguese code page instead. If so, "cp860" (the DOS Portuguese code page) or "cp1252" (the Windows Western European code page) may work better.
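If the encoding really is unknown, one option is to try a few candidates in order. The helper below is a hypothetical sketch, not something from the json module; its name and the candidate list (including cp1252, which is my own addition) are assumptions.

```python
import json

def load_json_with_fallback(path, encodings=("utf-8", "cp1252", "cp860")):
    """Hypothetical helper: return (data, encoding) for the first encoding that decodes."""
    for encoding in encodings:
        try:
            with open(path, encoding=encoding) as f:
                return json.load(f), encoding
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError("none of the candidate encodings could decode " + path)

# Usage (path taken from the question):
# data, used = load_json_with_fallback('/Users/myvb/Desktop/Automation/pt-PT.json')
```

utf-8 goes first because it is strict and rejects most non-UTF-8 input. The single-byte code pages accept almost any byte sequence, so a wrong guess there can succeed silently and give garbled accented characters; it is worth checking the result by eye.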
#2
0
I had the same problem. What worked for me was creating a regular expression and cleaning every line of the JSON file with it:
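import re
# Replaces every run of characters outside this set (including non-ASCII letters) with a space.
REGEXP = r"[^A-Za-z0-9':.;\-?!]+"
new_file_line = re.sub(REGEXP, ' ', old_file_line).strip()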
#3
0
Having a file with content similar to yours, I can read the file in one simple shot:
>>> import json
>>> fname = "data.json"
>>> with open(fname) as f:
...     data = json.load(f)
...
>>> data
{'tooltip': {'navbar': 'Operações de grupo'}}
#4
0
You don't need to read each line. You have two options:
import json
data = []
with open('/Users/myvb/Desktop/Automation/pt-PT.json', encoding='utf-8') as f:
    data.append(json.load(f))
Or, you can load all lines and pass them to the json module:
import json
data = []
with open('/Users/myvb/Desktop/Automation/pt-PT.json', encoding='utf-8') as f:
    data.append(json.loads(''.join(f.readlines())))
Obviously, the first suggestion is the best.