I have a couple of tables that are set to the latin1 character set, but I suspect that some values actually encoded as utf8 have been erroneously inserted into them.
MySQL makes this a little more complicated because it silently converts everything based on your connection settings.
How can I test my hypothesis that there are some utf8-encoded bytes in a latin1 column in MySQL?
1 solution
#1
If you find strings of 2 bytes which match the following bit pattern:
110xxxxx 10xxxxxx
chances are that these are UTF-8 characters. They could also be two consecutive non-ASCII latin-1 characters (such as 'Ä' followed by a symbol or something unprintable), but that is unlikely.
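As a rough illustration of that check, here is a minimal Python sketch. It assumes you have already pulled the stored bytes out of MySQL without any charset conversion (for example by selecting HEX(col) and decoding the result with bytes.fromhex); the function name find_two_byte_utf8 is made up for this example.

```python
def find_two_byte_utf8(raw: bytes) -> list[tuple[int, bytes]]:
    """Return (offset, pair) for each byte pair matching 110xxxxx 10xxxxxx."""
    hits = []
    i = 0
    while i < len(raw) - 1:
        lead, cont = raw[i], raw[i + 1]
        # Top three bits 110 mark a 2-byte lead; top two bits 10 mark a continuation.
        if (lead & 0xE0) == 0xC0 and (cont & 0xC0) == 0x80:
            hits.append((i, raw[i:i + 2]))
            i += 2  # skip past the matched pair
        else:
            i += 1
    return hits


# UTF-8 bytes stored in a latin1 column: "é" shows up as the pair C3 A9.
print(find_two_byte_utf8("café".encode("utf-8")))
# Genuine latin-1 bytes: "é" is the single byte E9, so nothing matches.
print(find_two_byte_utf8("café".encode("latin-1")))
# The unlikely false positive: 'Ä' (C4) followed by '¤' (A4) in real latin-1.
print(find_two_byte_utf8("Ä¤".encode("latin-1")))
```

A hit only means the bytes are consistent with UTF-8, so it is worth eyeballing a few matching rows before concluding anything. The sketch also covers only the 2-byte pattern above; 3- and 4-byte sequences would need their own lead-byte masks.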