Say I have:
ID Title
------------------------------------------------------
| 1 | ماهر زين |
------------------------------------------------------
Currently its data type is VARCHAR(255), with the collation set to the utf8 default collation.
Based on my research, I found that a table column must have the data type NVARCHAR to be able to store Unicode or Arabic characters. So I tried to change the data type of my column to NVARCHAR, but it gives this error:
Query:
ALTER TABLE `db`.`table`
CHANGE COLUMN `NAME` `NAME` NVARCHAR(255) CHARACTER SET 'utf8' NULL DEFAULT NULL ;
Error:
Operation failed: There was an error while applying the SQL script to the database. Executing:
ALTER TABLE `db`.`table`
CHANGE COLUMN `NAME` `NAME` NVARCHAR(255) CHARACTER SET 'utf8' NULL DEFAULT NULL ;
ERROR 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CHARACTER SET 'utf8' NULL DEFAULT NULL' at line 2
SQL Statement:
ALTER TABLE `db`.`table`
CHANGE COLUMN `NAME` `NAME` NVARCHAR(255) CHARACTER SET 'utf8' NULL DEFAULT NULL
FYI: I'm doing this conversion manually in MySQL Workbench.
2 Solutions
#1
1
There is no need for NVARCHAR here, as MySQL handles Unicode fine with VARCHAR. (Actually, NVARCHAR is just VARCHAR with a predefined utf8 character set, which is presumably why the extra CHARACTER SET clause in your statement is rejected; see https://dev.mysql.com/doc/refman/5.7/en/charset-national.html)
Maybe you are confusing it with MSSQL?
#2
1
I see 3 questions from you that all seem to boil down to Arabic turning into "question marks". Search for that term in http://*.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored . It discusses the likely causes.
However, you have tried some ALTERs, and these may have made things worse. So let's do some diagnosing. Run SELECT HEX(...) as discussed in that link under "Test the data". The hex for 'ماهر', if correctly stored as utf8, should show as D985 D8A7 D987 D8B1. If you see anything different, the problem gets messier.
3F3F3F3F (hex for 4 question marks) is what you get if latin1 is involved. C399E280A6C398C2A7C399E280A1C398C2B1 would be "double encoding".
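To make the three hex patterns above concrete, here is a minimal sketch that reproduces them outside the database. It is written in Python purely for illustration (the thread itself involves Java and MySQL), and the helper name `hex_of` is made up for this example. It assumes that MySQL's latin1 behaves like Windows cp1252, which is how the double-encoded value is derived here.

```python
# Illustrative only: reproduce the three hex patterns discussed above.
text = "\u0645\u0627\u0647\u0631"  # the Arabic string 'ماهر'


def hex_of(data: bytes) -> str:
    """Upper-case hex dump, comparable to MySQL's HEX() output."""
    return data.hex().upper()


# 1) Correct storage: the plain UTF-8 bytes of the text.
utf8_hex = hex_of(text.encode("utf-8"))

# 2) Lossy conversion: Arabic letters have no latin1 equivalent,
#    so each one is replaced by '?' (0x3F).
question_hex = hex_of(text.encode("latin-1", errors="replace"))

# 3) Double encoding: the UTF-8 bytes are misread as cp1252
#    (assumed here to match MySQL's latin1) and encoded to UTF-8 again.
double_hex = hex_of(text.encode("utf-8").decode("cp1252").encode("utf-8"))

print(utf8_hex)
print(question_hex)
print(double_hex)
```

Comparing the output of SELECT HEX(...) against these three values should indicate which of the cases applies to the stored row.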
Anyway, the likely causes of question marks are:
- The bytes to be stored are not encoded as utf8/utf8mb4. Fix this. -- Dump the Arabic text in HEX from Java.
- The column in the database is not CHARACTER SET utf8 (or utf8mb4). Fix this. -- Please provide SHOW CREATE TABLE for verification.
- Also, check that the connection during reading is UTF-8. -- Theoretically, you dealt with this via &useUnicode=yes&characterEncoding=UTF-8.
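The first check in the list, dumping the bytes in hex before they reach the database, could look roughly like the sketch below. It is shown in Python for brevity, although the answer asks for the dump from Java, where the same idea applies. The function name `dump_and_check` and the cp1256 sample are assumptions made for this illustration, not part of the original answer.

```python
def dump_and_check(raw: bytes):
    """Return (hex dump, True if the bytes are valid UTF-8)."""
    hex_dump = raw.hex().upper()
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return hex_dump, False
    return hex_dump, True


text = "\u0645\u0627\u0647\u0631"  # the Arabic string 'ماهر'

# Bytes as the client should send them: UTF-8 encoded.
good = text.encode("utf-8")

# Hypothetical bad case: the same text in the Windows Arabic
# code page (cp1256), which is not valid UTF-8.
bad = text.encode("cp1256")

print(dump_and_check(good))
print(dump_and_check(bad))
```

If the dump taken on the client side already differs from D985D8A7D987D8B1, the problem lies before the database, and changing the column definition will not help.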