I want to save Unicode data from an XML string into a database using this code:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xmlText);
using (XmlNodeReader xmlReader = new XmlNodeReader(xmlDoc))
{
    DataTable dt = new DataTable();
    dt.TableName = "sms";
    dt.Columns.Add("rowID");
    dt.Columns.Add("origAddr");
    dt.Columns.Add("time");
    dt.Columns.Add("message");
    dt.ReadXml(xmlReader);
    return dt;
}
But when I save the DataTable to the database, my Unicode characters appear as question marks (???????).
My database collation is correct, and other Unicode characters are stored correctly.
I apologize for my poor English :)
3 solutions
#1
1
I'll start things off with an educated guess.
Your database, or your table, uses a character set that is not full Unicode. The characters that are stored as question marks fall outside the database or table character set; the characters that are stored correctly happen to fall within it.
Alternatively, your XmlDocument or DataTable objects are converting the characters they read into a character set that is less than full Unicode.
Give the extra information requested in the comments, and I'll see if I can improve this answer.
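To illustrate the guess above, here is a minimal sketch of how text turns into question marks when it passes through a character set that cannot represent it. It uses Encoding.ASCII as a stand-in for a non-Unicode database or column character set; the class name, method name, and sample string are hypothetical, and nothing here touches a real database.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Encodes the text to bytes and decodes it again with the same encoding.
    // Characters the encoding cannot represent come back as '?'.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        // Hypothetical sample: four Arabic letters followed by ASCII text.
        string sample = "سلام abc";

        // ASCII stands in for a narrow, non-Unicode character set.
        Console.WriteLine(RoundTrip(sample, Encoding.ASCII)); // prints "???? abc"

        // UTF-8 can represent every character, so the text survives unchanged.
        Console.WriteLine(RoundTrip(sample, Encoding.UTF8) == sample); // prints "True"
    }
}
```

If the stored value shows the same pattern (ASCII characters intact, everything else replaced by "?"), the loss is probably happening at a conversion step like this one, for example a non-Unicode column or parameter type, rather than inside XmlDocument or DataTable, which hold .NET strings.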
#2
0
Usually this happens when your source text is not stored as Unicode. For example, if you read your XML data from a text file, and the file is stored as ANSI (using a code page) or as Unicode without a BOM (Byte Order Mark, or signature), non-ASCII characters may not be read correctly.
To solve this, open your source XML file in a text editor (for example Notepad++), change the encoding to Unicode or UTF-8, and save the file.
You can also open the file in Notepad and save it as Unicode (File/Save As -> Encoding: Unicode or UTF-8). Make sure the characters are displayed correctly when you open the file in Notepad.
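As a rough illustration of the missing-BOM case described above, the sketch below writes a UTF-16 file without a BOM and then reads it back twice: once letting the framework pick the encoding, and once naming it explicitly. The class name, helper name, and sample XML are hypothetical; the element names are borrowed from the question's column names.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
    // Writes the text as UTF-16 (little endian) WITHOUT a byte order mark
    // and returns the path of the temporary file.
    public static string WriteWithoutBom(string text)
    {
        string path = Path.GetTempFileName();
        var utf16NoBom = new UnicodeEncoding(bigEndian: false, byteOrderMark: false);
        File.WriteAllBytes(path, utf16NoBom.GetBytes(text));
        return path;
    }

    public static void Main()
    {
        // Hypothetical sample document.
        string xml = "<sms><message>سلام</message></sms>";
        string path = WriteWithoutBom(xml);
        try
        {
            // No BOM is present, so ReadAllText falls back to UTF-8 and misreads the bytes.
            string guessed = File.ReadAllText(path);

            // Naming the encoding explicitly recovers the original text.
            string explicitRead = File.ReadAllText(path, Encoding.Unicode);

            Console.WriteLine(guessed == xml);      // prints "False"
            Console.WriteLine(explicitRead == xml); // prints "True"
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

So, besides re-saving the file with a BOM, passing the file's actual encoding when reading it is another option, assuming you know what that encoding is.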
#3
0
Use XmlTextReader to read the XML and check whether the error persists.
// _pathXml is the path (or URL) of the XML file to read.
using (XmlTextReader stream = new XmlTextReader(_pathXml))
{
    while (stream.Read())
    {
        // TODO: save each element
    }
}
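One possible way to do the check suggested above is to pull a single value out with XmlTextReader and compare it with what you expect, before any database code runs. This is only a sketch: it reads from a string instead of a file path, and the class name, method name, and XML layout (a "message" element inside "sms") are assumptions based on the column names in the question.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ReaderCheck
{
    // Returns the text of the first <message> element, or null if there is none.
    public static string ReadFirstMessage(string xmlText)
    {
        using (var reader = new XmlTextReader(new StringReader(xmlText)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "message")
                {
                    return reader.ReadElementContentAsString();
                }
            }
        }
        return null;
    }

    public static void Main()
    {
        // Hypothetical sample document.
        string xml = "<sms><message>سلام</message></sms>";
        string message = ReadFirstMessage(xml);

        // If this prints "True", the characters survived XML parsing,
        // and the loss most likely happens later, when writing to the database.
        Console.WriteLine(message == "سلام");
    }
}
```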