How do I convert a dict to a unicode JSON string?

Posted: 2021-01-30 07:09:22

This doesn't appear to be possible to me using the standard library json module. When using json.dumps it will automatically escape all non-ASCII characters then encode the string to ASCII. I can specify that it not escape non-ASCII characters, but then it crashes when it tries to convert the output to ASCII.

The problem is - I don't want ASCII! I just want my JSON string back as a unicode (or UTF-8) string. Are there any convenient ways to do that?

Here's an example to demonstrate what I want:

d = {'navn': 'Åge', 'stilling': 'Lærling'}
json.dumps(d, output_encoding='utf8')
# => '{"stilling": "Lærling", "navn": "Åge"}'

But of course, there is no such option as output_encoding, so here's the actual output:

d = {'navn': 'Åge', 'stilling': 'Lærling'}
json.dumps(d)
# => '{"stilling": "L\\u00e6rling", "navn": "\\u00c5ge"}'

So to summarize - I want to convert a Python dict to a UTF-8 JSON string without any escapes. How can I do that?


I'll accept solutions like:

  • Hacks (pre- and post-processing the input to dumps to achieve the desired effect)
  • Subclassing JSONEncoder (I have no idea how it works and the documentation isn't very helpful; see the sketch after this list)
  • Third-party libraries available on PyPI
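
A minimal sketch of the JSONEncoder route, assuming only that the ASCII escaping needs to be switched off (ensure_ascii is a constructor argument of json.JSONEncoder, so nothing actually has to be overridden):

import json

# ensure_ascii=False tells the encoder to leave non-ASCII characters unescaped
encoder = json.JSONEncoder(ensure_ascii=False)
d = {'navn': 'Åge', 'stilling': 'Lærling'}
print(encoder.encode(d))  # {"navn": "Åge", "stilling": "Lærling"} (key order may vary on older Pythons)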

2 Answers

#1



Requirements

  • Make sure your Python files are encoded in UTF-8. Otherwise your non-ASCII characters will become question marks (?). Notepad++ has excellent encoding options for this.

  • Make sure that you have the appropriate fonts included. If you want to display Japanese characters then you need to install Japanese fonts.

  • Make sure that your IDE supports displaying unicode characters. Otherwise you might get a UnicodeEncodeError thrown.

Example:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 22-23: character maps to <undefined>

PyScripter works for me. It's included with "Portable Python" at http://portablepython.com/wiki/PortablePython3.2.1.1

  • Make sure you're using Python 3+, since this version offers better unicode support.

Problem

json.dumps() escapes all non-ASCII characters by default.

Solution

Read the update at the bottom. Or...

Replace each escaped character with the parsed unicode character.

I created a simple lambda function called getStringWithDecodedUnicode that does just that.

import re
getStringWithDecodedUnicode = lambda str : re.sub( r'\\u([\da-f]{4})', (lambda x : chr( int( x.group(1), 16 ) )), str )

Here's getStringWithDecodedUnicode as a regular function.

def getStringWithDecodedUnicode( value ):
    findUnicodeRE = re.compile( r'\\u([\da-f]{4})' )
    def getParsedUnicode(x):
        return chr( int( x.group(1), 16 ) )

    return findUnicodeRE.sub( getParsedUnicode, str( value ) )

Example

testJSONWithUnicode.py (Using PyScripter as the IDE)

import re
import json
getStringWithDecodedUnicode = lambda str : re.sub( r'\\u([\da-f]{4})', (lambda x : chr( int( x.group(1), 16 ) )), str )

data = {"Japan":"日本"}
jsonString = json.dumps( data )
print( "json.dumps({0}) = {1}".format( data, jsonString ) )
jsonString = getStringWithDecodedUnicode( jsonString )
print( "Decoded Unicode: %s" % jsonString )

Output

json.dumps({'Japan': '日本'}) = {"Japan": "\u65e5\u672c"}
Decoded Unicode: {"Japan": "日本"}
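
Note that json.dumps escapes characters outside the Basic Multilingual Plane (emoji, for example) as UTF-16 surrogate pairs, so the regex above decodes each escape into a lone surrogate half rather than the original character. A quick check, reusing the same lambda (the emoji is just an arbitrary non-BMP example):

import re
import json
getStringWithDecodedUnicode = lambda str : re.sub( r'\\u([\da-f]{4})', (lambda x : chr( int( x.group(1), 16 ) )), str )

escaped = json.dumps({"emoji": "😀"})           # the emoji is escaped as the surrogate pair \ud83d\ude00
decoded = getStringWithDecodedUnicode(escaped)
print(decoded == '{"emoji": "😀"}')             # False: decoded contains two lone surrogates
print(json.dumps({"emoji": "😀"}, ensure_ascii=False) == '{"emoji": "😀"}')  # True

This is one more reason to prefer the ensure_ascii=False route described in the update below.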

Update

Or... just pass ensure_ascii=False as an option for json.dumps.

Note: You need to meet the requirements that I outlined at the beginning or else this isn't going to work.

import json
data = {'navn': 'Åge', 'stilling': 'Lærling'}
result = json.dumps(data, ensure_ascii=False)
print( result ) # prints '{"stilling": "Lærling", "navn": "Åge"}'
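
If actual UTF-8 bytes are needed rather than a Python str (to write the JSON to a file, for instance), encode the unescaped string, or open the file with an explicit encoding; a minimal sketch (the filename is just an example):

import json

data = {'navn': 'Åge', 'stilling': 'Lærling'}

# UTF-8 bytes in memory
utf8_bytes = json.dumps(data, ensure_ascii=False).encode('utf-8')

# ...or written straight to a file
with open('out.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False)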

#2



ensure_ascii=False is the best solution IMHO.

If you are using Python 2.7, here is an example Python file:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# example.py
from __future__ import unicode_literals
from json import dumps as json_dumps
d = {'navn': 'Åge', 'stilling': 'Lærling'}
print json_dumps(d, ensure_ascii=False).encode('utf-8')
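
Under Python 3 the __future__ import and the trailing .encode('utf-8') are unnecessary, since str is already Unicode; a minimal sketch of an equivalent script (example_py3.py is a hypothetical name):

#!/usr/bin/env python3
# example_py3.py
from json import dumps as json_dumps

d = {'navn': 'Åge', 'stilling': 'Lærling'}
print(json_dumps(d, ensure_ascii=False))  # prints the JSON with Å and æ intact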
