I have a dict that's fed with URL responses, like:
>>> d
{
0: {'data': u'<p>found "\u62c9\u67cf \u591a\u516c \u56ed"</p>'},
1: {'data': u'<p>some other data</p>'},
...
}
While using an xml.etree.ElementTree function on these data values (d[0]['data']), I get the most famous error message:
UnicodeEncodeError: 'ascii' codec can't encode characters...
What should I do to this Unicode string to make it suitable for the ElementTree parser?
PS. Please don't send me links to Unicode and Python explanations. I've already read them all and, unfortunately, can't make use of them as others hopefully can.
1 solution
#1
24
You'll have to encode it manually, to UTF-8:
ElementTree.fromstring(d[0]['data'].encode('utf-8'))
as the API only takes encoded bytes as input. UTF-8 is a good default for such data.
It'll be able to decode to unicode again from there:
>>> from xml.etree import ElementTree
>>> p = ElementTree.fromstring(u'<p>found "\u62c9\u67cf \u591a\u516c \u56ed"</p>'.encode('utf8'))
>>> p.text
u'found "\u62c9\u67cf \u591a\u516c \u56ed"'
>>> print p.text
found "拉柏 多公 园"
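To apply the same fix to every entry in the dict, something like the following sketch might help. The helper name parse_fragment and the sample dict are assumptions made for illustration, not part of the original code. On Python 3, fromstring also accepts text directly, so the explicit encode there is harmless rather than required.

```python
# -*- coding: utf-8 -*-
from xml.etree import ElementTree

# unicode on Python 2, str on Python 3
text_type = type(u'')


def parse_fragment(data):
    """Parse an XML fragment given as text or as UTF-8 bytes.

    Hypothetical helper, named here only for illustration.
    """
    if isinstance(data, text_type):
        # Encode text explicitly so no implicit ASCII encoding is attempted.
        data = data.encode('utf-8')
    return ElementTree.fromstring(data)


# Sample data mirroring the shape of d in the question (assumed values).
d = {
    0: {'data': u'<p>found "\u62c9\u67cf \u591a\u516c \u56ed"</p>'},
    1: {'data': u'<p>some other data</p>'},
}

# Map each key to the text of its parsed root element.
texts = {key: parse_fragment(value['data']).text for key, value in d.items()}
```

Each value in texts is a Unicode string again, so it can be printed or processed further without another decode step.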