As far as I understand, the unicode_ci collations in MySQL (utf8_unicode_ci in particular) are meant to support all characters regardless of locale.
I need to achieve the same with SQL Server 2008 R2. My database is going to contain data in very different languages (not limited to Latin-based alphabets). I am not going to use non-Unicode strings at all. What collation should I choose?
1 solution
#1
7
You might as well go with Latin1_General_CI_AI
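The reason is that Unicode data is stored in NVARCHAR columns. SQL Server is more flexible than MySQL here in that it can mix VARCHAR (1 byte per character in Latin1 code pages) and NVARCHAR (2 bytes per character) data. NVARCHAR is always stored as Unicode whatever the collation, so to match utf8, any collation would do; the collation only decides how strings are compared and sorted. As for CI: every collation in 2008, apart from the binary ones, is available in a case-insensitive variant (in the UI it is the "case sensitive" checkbox, left unchecked for insensitive).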
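As a minimal sketch of the point above, assuming a hypothetical table `dbo.Articles` (the table and column names are made up for illustration), an NVARCHAR column declared with Latin1_General_CI_AI can hold text from unrelated scripts. `DATALENGTH` reports the storage size in bytes, which makes the 2-bytes-per-character storage visible.

```sql
-- Hypothetical table; the names are illustrative only.
CREATE TABLE dbo.Articles (
    Id    INT IDENTITY(1,1) PRIMARY KEY,
    -- NVARCHAR stores Unicode regardless of the collation chosen here.
    Title NVARCHAR(200) COLLATE Latin1_General_CI_AI NOT NULL
);

-- The N prefix marks the literals as Unicode; without it the text is
-- first converted to the database's code page and may be damaged.
INSERT INTO dbo.Articles (Title)
VALUES (N'Привет'), (N'こんにちは'), (N'Hello');

-- LEN counts characters, DATALENGTH counts bytes.
SELECT Title,
       LEN(Title)        AS Characters,
       DATALENGTH(Title) AS Bytes
FROM dbo.Articles;
```

Remember the `N` prefix on string literals: it is easy to forget and is a common cause of non-Latin text turning into question marks.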
The last part of the name (AI, accent insensitive) and some other options, such as width sensitivity, are just additional tuning in SQL Server.
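To see what these suffixes do in practice, here is a small sketch that compares the same pair of strings under different collations using the `COLLATE` clause. The sample strings are arbitrary. I would expect only the first comparison to report a match, but treat that as something to check on your own instance.

```sql
SELECT
    -- Case- and accent-insensitive: accents and case are both ignored.
    CASE WHEN N'resume' = N'RÉSUMÉ' COLLATE Latin1_General_CI_AI
         THEN 'equal' ELSE 'different' END AS ci_ai,
    -- Case-insensitive but accent-sensitive: É is not treated as E.
    CASE WHEN N'resume' = N'RÉSUMÉ' COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'different' END AS ci_as,
    -- Case-sensitive: lower case is not treated as upper case.
    CASE WHEN N'resume' = N'RESUME' COLLATE Latin1_General_CS_AS
         THEN 'equal' ELSE 'different' END AS cs_as;
```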
Point #2 from http://forums.mysql.com/read.php?103,187048,188748 says:
utf8_unicode_ci is fine for all these languages: Russian, Bulgarian, Belarusian, Macedonian, Serbian, and Ukrainian.
If you require sorting for a particular language (languages handle accents and letter order differently), you need that language's specific dictionary order; see http://msdn.microsoft.com/en-us/library/ms144250.aspx. Otherwise, Latin1_General is based on Latin/US English rules.
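If a single query needs language-specific ordering, one option is to override the column collation at query time rather than change the column. The sketch below first lists the available collations through `sys.fn_helpcollations()`, then sorts a few sample words with a language-specific collation. The Ukrainian words and the choice of `Ukrainian_CI_AS` are only an illustration, not a recommendation.

```sql
-- List the collations this instance offers for a given family.
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE N'Ukrainian%';

-- Sort sample data with a language-specific dictionary order,
-- independent of the collation the data is normally stored with.
SELECT v.Word
FROM (VALUES (N'яблуко'), (N'їжак'), (N'ґанок')) AS v(Word)
ORDER BY v.Word COLLATE Ukrainian_CI_AS;
```

An index built on the column under its own collation generally cannot be used to satisfy an `ORDER BY` under a different collation, so this approach suits occasional queries better than hot paths.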