This question is based on a side-effect of that one.
All my .py files have the # -*- coding: utf-8 -*- encoding declaration on the first line, including my api.py.
As I mentioned in the related question, I use HttpResponse to return the API documentation. I set the encoding like this:
HttpResponse(cy_content, content_type='text/plain; charset=utf-8')
Everything works, and when I call my API service there are no encoding problems, except in the string that pprint builds from a dictionary.
Since some values in my dict contain Turkish characters, pprint converts them to their escaped byte equivalents, like:
API_STATUS = {
    1: 'müşteri',
    2: 'some other status message'
}
my_str = 'Here is the documentation part that contains Turkish chars like işüğçö'
my_str += pprint.pformat(API_STATUS, indent=4, width=1)
return HttpResponse(my_str, content_type='text/plain; charset=utf-8')
My plain-text output looks like this:
Here is the documentation part that contains Turkish chars like işüğçö
{
1: 'm\xc3\xbc\xc5\x9fteri',
2: 'some other status message'
}
I have tried decoding and encoding the pprint output with different encodings, with no success. What is the best practice for overcoming this problem?
2 solutions
#1
36
pprint appears to use repr by default; you can work around this by overriding PrettyPrinter.format:
# coding=utf8
import pprint

class MyPrettyPrinter(pprint.PrettyPrinter):
    def format(self, object, context, maxlevels, level):
        # Return unicode strings as raw UTF-8 text instead of their repr
        if isinstance(object, unicode):
            return (object.encode('utf8'), True, False)
        return pprint.PrettyPrinter.format(self, object, context, maxlevels, level)

d = {'foo': u'işüğçö'}
pprint.pprint(d)             # {'foo': u'i\u015f\xfc\u011f\xe7\xf6'}
MyPrettyPrinter().pprint(d)  # {'foo': işüğçö}
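To see why repr is the source of the escapes, here is a minimal sketch. It assumes Python 3 (the answer's code above is Python 2), where the bytes type plays the role of Python 2's 8-bit str. The names text_status and byte_status are made up for illustration.

```python
import pprint

# Assumption: Python 3. A text string keeps its printable non-ASCII
# characters in repr(); a UTF-8 encoded bytes value does not.
text_status = {1: 'müşteri'}
byte_status = {1: 'müşteri'.encode('utf-8')}

# pformat builds its output from repr() of each scalar value.
text_repr = pprint.pformat(text_status)  # {1: 'müşteri'}
byte_repr = pprint.pformat(byte_status)  # {1: b'm\xc3\xbc\xc5\x9fteri'}
```

The second result shows the same byte escapes as the output in the question, which suggests the escaping comes from the representation of the byte string and not from the HTTP response encoding.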
#2
1
You should use unicode strings instead of 8-bit ones:
API_STATUS = {
    1: u'müşteri',
    2: u'some other status message'
}
my_str = u'Here is the documentation part that contains Turkish chars like işüğçö'
my_str += pprint.pformat(API_STATUS, indent=4, width=1)
The pprint module is designed to print any kind of nested structure in a readable way. To do that, it prints each object's representation (its repr) rather than converting it to a string, so you will end up with the escape syntax whether you use unicode strings or not. But if you're using unicode in your document, you really should be using unicode literals!
Anyway, thg435 has given you a solution for changing this behaviour of pformat.
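The difference between a representation and the string itself can be sketched as follows. This assumes Python 3, where the built-in ascii() produces ASCII-only escapes comparable to what Python 2's repr gives for a unicode string; the variable names are illustrative only.

```python
import ast

s = 'işüğçö'

# ascii() returns an ASCII-only representation, quotes included:
# 'i\u015f\xfc\u011f\xe7\xf6'
escaped = ascii(s)

# The escapes are only notation: evaluating the literal gives back
# the original text unchanged.
restored = ast.literal_eval(escaped)
```

The escaped form is a faithful spelling of the same string, so nothing is lost; it is just not what you want to show to a human reader.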