How to convert these strange characters? (Ã«, Ã, Ã¬, Ã¹, Ã)

My page often shows things like Ã«, Ã, Ã¬, Ã¹, Ã in place of normal characters.

I use UTF-8 for both the page header and the MySQL encoding. How does this happen?

3 solutions

#1

These are UTF-8 encoded characters. Use PHP's utf8_decode() to convert them to normal ISO-8859-1 characters.

#2

If you see those characters, you probably just didn't specify the character encoding properly. They are what you get when a UTF-8 multi-byte string is interpreted with a single-byte encoding such as ISO 8859-1 or Windows-1252.

In this case, Ã« is most likely the byte sequence 0xC3 0xAB, which encodes the Unicode character ë (U+00EB) in UTF-8, displayed as two separate single-byte characters.
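
To make the mechanism concrete, here is a minimal Python sketch that reproduces the garbling. The function name `misread_as_single_byte` is made up for illustration, and Latin-1 (ISO 8859-1) is assumed as the wrong encoding; Windows-1252 gives the same result for these particular bytes.

```python
def misread_as_single_byte(text: str, wrong_encoding: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode the bytes with a single-byte encoding.

    This mimics a browser or client that receives UTF-8 bytes but was told
    (or guessed) that they are ISO 8859-1.
    """
    utf8_bytes = text.encode("utf-8")        # e.g. "ë" becomes b"\xc3\xab"
    return utf8_bytes.decode(wrong_encoding)  # each byte becomes one character


if __name__ == "__main__":
    for char in "ëàìù":
        garbled = misread_as_single_byte(char)
        # repr() makes the invisible non-breaking space after "Ã" visible
        print(char, "->", repr(garbled))
```

Note that à is encoded as 0xC3 0xA0, and 0xA0 is a non-breaking space in Latin-1, which is why the question appears to show a bare Ã.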

#3

Even though utf8_decode is a useful solution, I prefer to correct the encoding errors in the table itself. In my opinion it is better to fix the bad characters themselves than to add "hacks" in the code. Simply do a replace on the affected field in the table. To correct the badly encoded characters from the OP:

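update <table> set <field> = replace(<field>, "Ã«", "ë");
update <table> set <field> = replace(<field>, "Ã¬", "ì");
update <table> set <field> = replace(<field>, "Ã¹", "ù");
-- Run this one last: a bare "Ã" pattern would also match the sequences above.
-- The garbled form of "à" is "Ã" followed by a non-breaking space (0xA0),
-- which is easily lost when copying, so check that the pattern includes it.
update <table> set <field> = replace(<field>, "Ã", "à");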

Here <table> is the name of the MySQL table and <field> is the name of the column in that table. A very good checklist for these typical badly encoded Windows-1252 to UTF-8 characters is the "Debugging Chart Mapping Windows-1252 Characters to UTF-8 Bytes to Latin-1 Characters".
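
If the list of replacements gets long, the same correction can be sketched as a reverse round trip in Python. This is only an illustration, not part of the answer above: the name `repair_mojibake` is hypothetical, Latin-1 is assumed as the encoding the text was wrongly read with, and strings that do not fit that pattern are returned unchanged.

```python
def repair_mojibake(text: str, wrong_encoding: str = "latin-1") -> str:
    """Undo a UTF-8-read-as-single-byte mix-up, if the text looks like one.

    The garbled characters are turned back into the original bytes, and
    those bytes are then decoded as UTF-8.
    """
    try:
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in the wrong encoding, or the bytes are not
        # valid UTF-8: assume the text was fine and leave it alone.
        return text


if __name__ == "__main__":
    for garbled in ["Ã«", "Ã\xa0", "Ã¬", "Ã¹", "already fine: ë"]:
        print(repr(garbled), "->", repr(repair_mojibake(garbled)))
```

A value that mixes garbled and correct characters is left untouched by this sketch, so it is worth testing on a copy of the data first.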

Remember to back up your table before trying to replace any characters with SQL!

[I know this is an answer to a very old question, but I was facing the issue once again. Some old Windows machine didn't encode the text correctly before inserting it into the utf8_general_ci collated table.]
