SQL Server 2008 - Differences between collation types

I'm installing a new SQL Server 2008 server and am having trouble finding usable information about the different collations. I have searched SQL Server Books Online (BOL) and Google for an answer but can't find anything useful.

What is the difference between the Windows Collation "Finnish_Swedish_100" and "Finnish_Swedish"?

I suppose that the "_100" version is an updated collation in SQL Server 2008, but if that is the case, what has changed from the older version?

Is it usually a good thing to have "Accent-sensitive" enabled? I know that it depends on the task and all that, but are there any well-known pros and cons to consider?

In which cases should the "Binary" and "Binary-code point" parameters be enabled?

7 answers

#1

The _100 suffix indicates a collation that is new in SQL Server 2008; those with _90 were introduced in SQL Server 2005, and those with no suffix date from SQL Server 2000. I don't know what the differences are, and I can't find any documentation. Unless you are running linked server queries against a SQL Server of a different version, I'd be tempted to go with the _100 one. Sorry I can't help with the differences.
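
One way to see which Finnish_Swedish variants a given instance offers, and which collation version each one belongs to, is to query the built-in sys.fn_helpcollations() function together with COLLATIONPROPERTY. This is only a sketch for listing the candidates; it does not show what actually changed between the versions.

```sql
-- List every Finnish_Swedish collation available on this instance,
-- along with the collation version that COLLATIONPROPERTY reports for it.
-- [_] escapes the underscore, which is otherwise a LIKE wildcard.
SELECT name,
       COLLATIONPROPERTY(name, 'Version') AS collation_version,
       description
FROM sys.fn_helpcollations()
WHERE name LIKE 'Finnish[_]Swedish%'
ORDER BY collation_version, name;
```

The description column spells out the case, accent, kana and width sensitivity of each name, which makes it easier to compare the unsuffixed and _100 variants side by side.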

#2

The letters ÅÄÖ/åäö are not mixed up with A and O just because the collation is set to AI (accent-insensitive). That does happen, however, with â and other "combinations" that are not individual letters of the Swedish alphabet: â will or will not be mixed with a depending on that setting.

Since I have a lot of old databases that I still need to communicate with, including through linked servers, I chose Finnish_Swedish_CI_AS now that I'm installing SQL Server 2008. That was the default setting for Finnish_Swedish when the Windows collations first appeared in SQL Server.
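
The claim about å versus â can be checked directly with a few equality comparisons. This is a minimal sketch; the column aliases are made up for illustration, and the results are best verified on your own instance rather than taken from here.

```sql
-- Compare individual letters under accent-insensitive (AI) and
-- accent-sensitive (AS) variants of the Finnish_Swedish collation.
-- The N prefix keeps the literals Unicode.
SELECT
    CASE WHEN N'å' = N'a' COLLATE Finnish_Swedish_CI_AI
         THEN 'equal' ELSE 'different' END AS a_ring_vs_a_ai,
    CASE WHEN N'â' = N'a' COLLATE Finnish_Swedish_CI_AI
         THEN 'equal' ELSE 'different' END AS a_circumflex_vs_a_ai,
    CASE WHEN N'â' = N'a' COLLATE Finnish_Swedish_CI_AS
         THEN 'equal' ELSE 'different' END AS a_circumflex_vs_a_as;
```

If the description above holds, only the middle column should report 'equal'.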

#3

To address question 3 (information taken from MSDN; the wording is theirs, the formatting mine):

Binary (_BIN):

Sorts and compares data in SQL Server tables based on the bit patterns defined for each character.

Binary sort order is case-sensitive and accent-sensitive.

Binary is also the fastest sorting order.

If this option is not selected, SQL Server follows sorting and comparison rules as defined in dictionaries for the associated language or alphabet.

Binary-code point (_BIN2):

For Unicode data: Sorts and compares data in SQL Server tables based on Unicode code points.

For non-Unicode data: will use comparisons identical to binary sorts.

The advantage of using a Binary-code point sort order is that no data resorting is required in applications that compare sorted SQL Server data. As a result, a Binary-code point sort order provides simpler application development and possible performance increases.

For more information, see Guidelines for Using BIN and BIN2 Collations.
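
To make the difference between dictionary and code-point ordering concrete, the following sketch sorts the same handful of letters twice. The table variable @Letters and its contents are invented for illustration; Finnish_Swedish_BIN2 is assumed to be available, as it normally is on SQL Server 2005 and later.

```sql
DECLARE @Letters TABLE (Value nvarchar(10) NOT NULL);

INSERT INTO @Letters (Value)
VALUES (N'a'), (N'B'), (N'Z'), (N'ä'), (N'å');

-- Dictionary order: follows the Finnish/Swedish alphabet rules,
-- ignoring case because of CI
SELECT Value FROM @Letters ORDER BY Value COLLATE Finnish_Swedish_CI_AS;

-- Code-point order: B (U+0042), Z (U+005A), a (U+0061),
-- ä (U+00E4), å (U+00E5)
SELECT Value FROM @Letters ORDER BY Value COLLATE Finnish_Swedish_BIN2;
```

The second result matches what an application would get by comparing the strings code point by code point, which is the "no data resorting" advantage mentioned in the quoted text.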

#4

Use the query below to try it out yourself.

As you can see, å, ä, etc. do not count as accented characters, and are sorted according to the Swedish alphabet when using the Finnish/Swedish collation.

However, accents are only taken into account with the AS collation. With the AI collation, the order is the same as if there were no accents at all.

CREATE TABLE #Test (
    Number int identity,
    Value nvarchar(20) NOT NULL
);
GO

-- The N prefix makes these Unicode literals, so the accented characters
-- are stored as typed regardless of the database's default code page
INSERT INTO #Test VALUES (N'àá');
INSERT INTO #Test VALUES (N'áa');
INSERT INTO #Test VALUES (N'aa');
INSERT INTO #Test VALUES (N'aà');

INSERT INTO #Test VALUES (N'áb');
INSERT INTO #Test VALUES (N'ab');

-- w is considered an accented version of v
INSERT INTO #Test VALUES (N'wa');
INSERT INTO #Test VALUES (N'va');
INSERT INTO #Test VALUES (N'zz');
INSERT INTO #Test VALUES (N'åä');
GO

SELECT Number, Value FROM #Test ORDER BY Value COLLATE Finnish_Swedish_CI_AS;
SELECT Number, Value FROM #Test ORDER BY Value COLLATE Finnish_Swedish_CI_AI;
GO

DROP TABLE #Test;
GO

#5

To address your question 1: accent sensitivity is a good thing to have enabled for Finnish-Swedish. Otherwise your "å"s and "ä"s will be sorted as "a"s and your "ö"s as "o"s (assuming you will be using those kinds of international characters).

More here: http://msdn.microsoft.com/en-us/library/ms143515.aspx (discusses both binary codepoint and accent sensitivity)

#6

On Questions 2 and 3

Accent sensitivity is something I would suggest turning OFF if you are accepting user-entered data, and ON if you have clean, sanitized data. Not being Finnish myself, I don't know how many words differ only in whether they contain ó, ô, õ or ö. But if users are entering the data, you can be sure that they will NOT be consistent in their usage, and you want to be able to match their entries anyway. If you are gathering data from a dataset whose content and consistency you know, then you will want to turn accent sensitivity ON, because you know that the differences are intentional.
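
The choice does not have to be all-or-nothing: a COLLATE clause can override the collation for a single comparison. The sketch below assumes a hypothetical @Customer table whose column is stored accent-sensitive, and matches inconsistently typed names with an accent-insensitive comparison only where that is wanted.

```sql
-- Hypothetical table: the column itself is accent-sensitive
DECLARE @Customer TABLE (
    Name nvarchar(50) COLLATE Finnish_Swedish_CI_AS NOT NULL
);

INSERT INTO @Customer (Name)
VALUES (N'André'), (N'Andre'), (N'Andrè');

-- Uses the column's own collation: accents must match
SELECT Name FROM @Customer WHERE Name = N'Andre';

-- Overrides the collation for this comparison only: accents are ignored
SELECT Name FROM @Customer
WHERE Name = N'Andre' COLLATE Finnish_Swedish_CI_AI;
```

Keeping the stored collation accent-sensitive preserves the distinction in the data, while the per-query override handles inconsistent user input. Note that a collation override on an indexed column may prevent an index seek.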

The same considerations apply to question 3 (I'm mostly getting this from the link Tomalak provided). If the data is case- and accent-sensitive, then you want _BIN, because it will sort faster. If the data is irregular and not case- or accent-sensitive, then you will want _BIN2, because it is designed for Unicode data.

#7

To address question 2:

Yes, if accents are grammatically required in the given language.
