I logged into MariaDB/MySQL and entered:
SHOW COLLATION;
Among the available collations I see utf8mb4_unicode_ci and utf8mb4_unicode_520_ci. What is the difference between these two collations, and which should we be using?
2 answers
#1
19
Well, you will need to read the documentation. I can't tell you which one you should be using, because every project is different.
10.1.3 Collation Naming Conventions
MySQL collation names follow these conventions:
A collation name starts with the name of the character set with which it is associated, followed by one or more suffixes indicating other collation characteristics. For example, utf8_general_ci and latin1_swedish_ci are collations for the utf8 and latin1 character sets, respectively.
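The character-set prefix can be checked against the server's own metadata. The query below is a minimal sketch, assuming a MySQL or MariaDB server that exposes the standard information_schema.COLLATIONS table; the latin1 filter is only an illustrative choice, and on recent MySQL releases the utf8 collations may be listed under the utf8mb3 name instead.

```sql
-- List each latin1 collation next to the character set it belongs to.
-- The collation name is expected to begin with the character set name.
SELECT COLLATION_NAME, CHARACTER_SET_NAME
FROM information_schema.COLLATIONS
WHERE COLLATION_NAME LIKE 'latin1%'
ORDER BY COLLATION_NAME;
```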
A language-specific collation includes a language name. For example, utf8_turkish_ci and utf8_hungarian_ci sort characters for the utf8 character set using the rules of Turkish and Hungarian, respectively.
Case sensitivity for sorting is indicated by _ci (case insensitive), _cs (case sensitive), or _bin (binary; character comparisons are based on character binary code values). For example, latin1_general_ci is case insensitive, latin1_general_cs is case sensitive, and latin1_bin uses binary code values.
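A quick way to see these suffixes in action is to compare the same two strings under different collations. This is a minimal sketch, assuming a server that provides the utf8mb4 character set with the utf8mb4_unicode_ci and utf8mb4_bin collations; the column aliases are made up for illustration.

```sql
-- Compare 'a' with 'A' under a case-insensitive and a binary collation.
-- The _utf8mb4 introducer fixes the character set of each literal, so the
-- COLLATE clause is valid whatever the connection character set is.
SELECT
  _utf8mb4'a' COLLATE utf8mb4_unicode_ci = _utf8mb4'A' AS ci_equal,   -- 1: case is ignored
  _utf8mb4'a' COLLATE utf8mb4_bin        = _utf8mb4'A' AS bin_equal;  -- 0: code values differ
```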
For Unicode, collation names may include a version number to indicate the version of the Unicode Collation Algorithm (UCA) on which the collation is based. UCA-based collations without a version number in the name use the version-4.0.0 UCA weight keys. For example:
utf8_unicode_ci (with no version named) is based on UCA 4.0.0 weight keys (http://www.unicode.org/Public/UCA/4.0.0/allkeys-4.0.0.txt).
utf8_unicode_520_ci is based on UCA 5.2.0 weight keys (http://www.unicode.org/Public/UCA/5.2.0/allkeys.txt).
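To check which UCA versions a server offers for utf8mb4, and to probe one place where they can differ, something like the following could be run. It is a sketch under assumptions: both collations from the question must be installed, and the two test characters (U+1F363 and U+1F37A, written as UTF-8 hex so the connection character set does not matter) are an illustrative choice, not taken from the quoted manual page. No expected results are given here, because the outcome depends on how the server's weight tables treat supplementary characters.

```sql
-- List the UCA-based utf8mb4 collations. Per the naming convention above,
-- a name without a version number is based on UCA 4.0.0.
SHOW COLLATION LIKE 'utf8mb4_unicode%';

-- Compare two characters outside the Basic Multilingual Plane under each
-- collation. A result of 1 means the collation treats them as equal.
SELECT
  _utf8mb4 X'F09F8DA3' COLLATE utf8mb4_unicode_ci
    = _utf8mb4 X'F09F8DBA' AS uca400_equal,
  _utf8mb4 X'F09F8DA3' COLLATE utf8mb4_unicode_520_ci
    = _utf8mb4 X'F09F8DBA' AS uca520_equal;
```

If the two columns disagree, that is a concrete case where switching collations would change comparison, uniqueness, and ordering results for existing data.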
For Unicode, the xxx_general_mysql500_ci collations preserve the pre-5.1.24 ordering of the original xxx_general_ci collations and permit upgrades for tables created before MySQL 5.1.24. For more information, see Section 2.11.3, “Checking Whether Tables or Indexes Must Be Rebuilt”, and Section 2.11.4, “Rebuilding or Repairing Tables or Indexes”.
Source: https://dev.mysql.com/doc/refman/5.6/en/charset-collation-names.html
#2
2
To see a bit more discussion of the actual differences, you can go to https://dev.mysql.com/worklog/task/?id=2673 and click "High Level Architecture".