Can I use libxml2 with Unicode?

Time: 2021-02-08 20:15:37

Can I use libxml2 with Unicode? I want to read and write XML files written in Unicode. Is that possible using libxml2 with C++?

2 answers

#1


3  

libxml2 uses UTF-8 internally to store values, and it converts input from the encoding specified in the XML encoding declaration to UTF-8 using iconv. So yes, libxml2 can work with Unicode in a certain sense.

But if your real question is whether libxml2 accepts wchar_t* as input, then the answer is no. You'll have to convert it to an 8-bit encoding (UTF-8 is probably the safer bet, since it can encode every Unicode code point).
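
As a rough illustration of that conversion step, here is a minimal sketch in standard C++ that encodes a std::wstring as UTF-8. The helper name wide_to_utf8 is hypothetical and is not part of libxml2. The sketch assumes wchar_t holds either UTF-16 code units (as on Windows) or UTF-32 code points (as on most Unix-like systems), and it rejects unpaired surrogates rather than guessing.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a libxml2 function): encode a wide string as UTF-8.
// Assumption: wchar_t carries UTF-16 code units or UTF-32 code points.
std::string wide_to_utf8(const std::wstring& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(in[i]);

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 >= in.size())
                throw std::invalid_argument("unpaired high surrogate");
            std::uint32_t lo = static_cast<std::uint32_t>(in[i + 1]);
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            ++i;  // the low surrogate has been consumed
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        if (cp > 0x10FFFF)
            throw std::invalid_argument("value outside the Unicode range");

        // Emit 1 to 4 bytes depending on the size of the code point.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The resulting std::string holds plain UTF-8 bytes. Since libxml2's xmlChar is an unsigned char, those bytes can then be handed to the library with a pointer cast, for example reinterpret_cast<const xmlChar*>(s.c_str()).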

#2


3  

It would appear that the answer is yes.

Use this XML declaration for UTF-8 content*:

<?xml version="1.0" encoding="UTF-8"?>

*which is what I assume you mean by "unicode," since Unicode is a character set and UTF-8 is only one of its encodings.
