Creating an XML node from a Unicode string (encoding declaration not supported)?

Date: 2021-01-31 20:16:06

I have a database field that stores an XML document as a Unicode string. However, when I fetch the field and try to create an lxml node from it, I get the following error:

node = etree.fromstring(self.xml)
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

The text I currently have (self.xml) contains Japanese characters, among others. How can I create the node?

1 Answer

#1


4  

If you have a Unicode string, encode it to UTF-8 bytes and pass lxml a parser configured for UTF-8:

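from lxml import etree
# Parse as UTF-8 bytes; the parser's encoding overrides the one named in the XML declaration.
utf8_parser = etree.XMLParser(encoding='utf-8')
node = etree.fromstring(self.xml.encode('utf-8'), parser=utf8_parser)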
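The error message also mentions a second route: passing an XML fragment without a declaration. The following is only a rough sketch of that alternative, not part of the original answer. The helper name `strip_xml_declaration` and the sample document are made up for illustration, and the standard library's `xml.etree.ElementTree` stands in for lxml so the snippet runs without extra packages; with lxml installed, the resulting `fragment` string could be passed to `etree.fromstring` instead.

```python
import re
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree in this sketch

# Matches a leading XML declaration such as <?xml version="1.0" encoding="UTF-8"?>.
# The \s after "xml" keeps it from matching processing instructions like <?xml-stylesheet ...?>.
_XML_DECLARATION = re.compile(r'^\s*<\?xml\s[^>]*\?>\s*')


def strip_xml_declaration(text):
    """Return the XML text without its leading declaration (hypothetical helper)."""
    return _XML_DECLARATION.sub('', text, count=1)


# Made-up sample standing in for the value fetched from the database (self.xml).
xml_text = '<?xml version="1.0" encoding="UTF-8"?>\n<title lang="ja">日本語のタイトル</title>'

fragment = strip_xml_declaration(xml_text)
node = ET.fromstring(fragment)  # with lxml: etree.fromstring(fragment)
print(node.tag, node.get('lang'), node.text)
```

Stripping the declaration keeps the text as a Unicode string throughout, whereas the encode-to-bytes approach above leaves the stored document unchanged; which one fits better probably depends on whether the declaration needs to be preserved elsewhere.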
