I have a bunch of UTF-8 encoded flat files that need to be imported into a SQL Server 2008 R2 database. Bulk insert is not able to identify the delimiters, nor does it seem to accept UTF-8.
I understand that there are a number of articles on how SQL Server 2008 deals with UTF-8 encoding, but I'm looking for updated answers, as most of those articles are old.
Is there anything I can do to get these flat files into the database, either by converting them before the insert or with a process that runs during the insert?
I want to stay away from manually converting each one. Furthermore, the SSIS packages that I've attempted to create can read and separate the data; they just can't seem to move it. :(
The flat files are generated by Java. Attempts to switch the Java environment from UTF-8 to another encoding have been unsuccessful.
NOTE
I have no intention of storing UTF-8 data. My delimiter is coming in funky because it's UTF-8. SQL Server cannot read the characters when separating the columns and rows. That's it.
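To illustrate why a delimiter can come in "funky": a minimal Python sketch, assuming the delimiter is a non-ASCII character (the `§` below is a hypothetical stand-in, since the question does not say which character is used) and that the reader treats the bytes as a single-byte code page such as Windows-1252.

```python
# Hypothetical delimiter; the actual character in the files is not stated.
delimiter = "§"

# In UTF-8 this single character is stored as two bytes.
utf8_bytes = delimiter.encode("utf-8")

# A reader that assumes a single-byte code page (cp1252 here, as an
# assumption) sees two characters where the writer intended one.
misread = utf8_bytes.decode("cp1252")

# Splitting a misread row on the intended delimiter leaves a stray
# character at the end of the preceding column.
row = "a§b".encode("utf-8").decode("cp1252")
columns = row.split(delimiter)

print(utf8_bytes)  # b'\xc2\xa7'
print(misread)     # Â§
print(columns)     # ['aÂ', 'b']
```

If the delimiter is plain ASCII (comma, tab, pipe), its bytes are identical in UTF-8 and cp1252, so a problem of this kind would more likely point to a byte order mark at the start of the file or to non-ASCII data next to the delimiter.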
3 solutions
#1
8
Not true, you simply need to choose code page 65001 (UTF-8).
#2
1
- Convert your data file to UTF-16 Little Endian (it must be Little Endian).
- Use bcp with the -w option.
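The conversion step could be scripted rather than done by hand. Below is a minimal Python sketch, not a definitive implementation: the function names, the `*.txt` pattern, and the directory arguments are all hypothetical, and writing a byte order mark is an assumption that is worth checking against your bcp version (set `write_bom=False` to omit it).

```python
from pathlib import Path


def utf8_to_utf16le(src, dst, write_bom=True, chunk_size=1 << 20):
    """Re-encode one UTF-8 text file as UTF-16 Little Endian."""
    # "utf-8-sig" drops a leading UTF-8 BOM if one is present.
    # newline="" keeps the original row terminators untouched.
    with open(src, "r", encoding="utf-8-sig", newline="") as fin, \
         open(dst, "w", encoding="utf-16-le", newline="") as fout:
        if write_bom:
            # Assumption: the consumer accepts a UTF-16 LE BOM (FF FE).
            fout.write("\ufeff")
        # Copy in chunks so large flat files are not loaded into memory.
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)


def convert_directory(src_dir, dst_dir, pattern="*.txt"):
    """Convert every matching file in src_dir into dst_dir (hypothetical layout)."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    converted = []
    for path in sorted(Path(src_dir).glob(pattern)):
        target = out / path.name
        utf8_to_utf16le(path, target)
        converted.append(target)
    return converted
```

Calling `convert_directory("incoming", "converted")` (both directory names are placeholders) would produce one UTF-16 LE copy per input file, and each copy can then be loaded with bcp and the -w option.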
#3
-1
Microsoft has always been crap regarding encoding, especially in SQL Server. Here is your solution.