I'm calling json_encode() on data that comes from a MySQL database with utf8_general_ci collation. The problem is that some rows contain malformed data that I can't clean at the source, for example the symbol �. Once such a value reaches json_encode(), it fails with json_encode(): Invalid UTF-8 sequence in argument.
I've tried utf8_encode() and utf8_decode(), and even checked the data with mb_check_encoding(), but the bad bytes keep getting through and causing havoc.
Running PHP 5.3.10 on Mac. So the question is: how can I clean up invalid UTF-8 symbols, keeping the rest of the data, so that json_encode() would work?
Update. Here is a way to reproduce it:
echo json_encode(pack("H*" ,'c32e'));
11 Answers
#1
31
I had a similar error where json_encode returned a null field whenever a string contained a high-ASCII character such as a curly apostrophe, because the query was returning data in the wrong character set.
The solution was to make sure the data arrives as UTF-8 by adding:
mysql_set_charset('utf8');
after the mysql_connect() statement.
#2
23
It seems the symbol was Å, but since the data consists of surnames that shouldn't be public, only the first letter was shown, and that was done with just $lastname[0], which is wrong for multibyte strings and caused the whole hassle. I changed it to mb_substr($lastname, 0, 1) and it works like a charm.
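A minimal sketch of why byte indexing breaks here, assuming the mbstring extension is loaded. The surname is made up for illustration. "Å" is two bytes in UTF-8 (C3 85), so $lastname[0] returns only the lead byte, which on its own is not valid UTF-8:

```php
<?php
// Hypothetical surname "Åberg"; "Å" (U+00C5) is the two bytes C3 85 in UTF-8.
$lastname = "\xC3\x85berg";

// Byte indexing returns the first BYTE, not the first character.
$broken = $lastname[0];

// mb_substr() is character-aware when told the string is UTF-8.
$fixed = mb_substr($lastname, 0, 1, 'UTF-8');

echo bin2hex($broken), "\n"; // c3
echo bin2hex($fixed), "\n";  // c385

var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($fixed, 'UTF-8'));  // bool(true)
```

The encoding is passed explicitly because the default internal encoding on older PHP versions is not necessarily UTF-8.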
#3
21
The problem is that this character is UTF8, but json_encode does not handle it correctly. Moreover, there is a list of other characters (see Unicode characters list) that will trigger the same error, so stripping out this one (Å) will not fully fix the issue.
What we did was convert these characters to HTML entities like this:
htmlentities( (string) $value, ENT_QUOTES, 'utf-8', FALSE);
#4
12
Make sure that your connection charset to MySQL is UTF-8. It often defaults to ISO-8859-1, which means that the MySQL driver will convert the text to ISO-8859-1.
You can set the connection charset with mysql_set_charset, mysqli_set_charset, or the query SET NAMES 'utf8' (MySQL spells the charset name utf8, without a hyphen).
#5
5
Using this code might help. It solved my problem!
mb_convert_encoding($post["post"],'UTF-8','UTF-8');
or like this:
mb_convert_encoding($string,'UTF-8','UTF-8');
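To apply the same idea to a whole result row, something like the following sketch could be used. It assumes the mbstring extension is loaded. utf8_clean is a hypothetical helper name, not a built-in. Re-encoding from UTF-8 to UTF-8 replaces invalid byte sequences with the mbstring substitute character, so the exact replacement can vary with configuration and PHP version:

```php
<?php
// Hypothetical helper: recursively re-encode every string value so that
// invalid UTF-8 sequences are replaced by the mbstring substitute character.
function utf8_clean($value)
{
    if (is_array($value)) {
        foreach ($value as $key => $item) {
            $value[$key] = utf8_clean($item);
        }
        return $value;
    }
    if (is_string($value)) {
        return mb_convert_encoding($value, 'UTF-8', 'UTF-8');
    }
    return $value; // ints, floats, bools, null are left untouched
}

// Made-up row using the reproduction bytes from the question.
$row = array('id' => 7, 'lastname' => pack('H*', 'c32e'));

$clean = utf8_clean($row);
var_dump(mb_check_encoding($clean['lastname'], 'UTF-8')); // bool(true)
echo json_encode($clean), "\n";
```

Note that this hides the damage rather than repairing it: the original character is lost, and array keys are not cleaned in this sketch.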
#6
2
The symbol you posted is the placeholder symbol for a broken byte sequence. Basically, it's not a real symbol but an error in your string.
What is the exact byte value of the symbol? Blindly applying utf8_encode is not a good idea; it's better to find out first where the byte(s) came from and what they mean.
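One way to look at the exact bytes is bin2hex(), shown here on the reproduction string from the question (mb_check_encoding() assumes the mbstring extension is loaded):

```php
<?php
// The reproduction value from the question: byte C3 followed by "." (2E).
$value = pack('H*', 'c32e');

// Dump the raw bytes as hex to see what is actually stored.
echo bin2hex($value), "\n"; // c32e

// C3 is a UTF-8 lead byte that must be followed by a continuation byte
// (80..BF); 2E is not one, so the string is not valid UTF-8.
var_dump(mb_check_encoding($value, 'UTF-8')); // bool(false)
```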
#7
0
Another thing that throws this error when you use PHP's json_encode function is Unicode escape sequences written with an uppercase \U instead of a lowercase \u.
#8
0
json_encode works only with UTF-8 data. You'll have to ensure that your data is in UTF-8. Alternatively, you can use iconv() to convert your results to UTF-8 before feeding them to json_encode().
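A small sketch of the iconv() route, under the assumption that the source bytes really are ISO-8859-1. The sample value is made up, and the iconv extension must be available. If the real source encoding is something else, the first argument has to change accordingly:

```php
<?php
// Assumption: the driver returned ISO-8859-1 bytes; "\xC5" is "Å" in Latin-1.
$latin1 = "\xC5berg"; // hypothetical value

// Convert from the known source encoding to UTF-8.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

echo bin2hex($utf8), "\n";     // c38562657267
echo json_encode($utf8), "\n"; // "\u00c5berg"
```

This only helps when the source encoding is known. Converting bytes that are already UTF-8 this way would double-encode them.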
#9
0
Update: I solved this issue by specifying the charset in the PDO connection string, as below:
"mysql:host=$host;dbname=$db;charset=utf8"
All data received was then in the correct charset for the rest of the code to use.
#10
0
I am very late, but anyone working with Slim to build a REST API who gets the same error can solve it by adding the line marked below:
<?php
// DbConnect.php file
class DbConnect
{
    // Variable to store the database link
    private $con;

    // Class constructor
    function __construct()
    {
    }

    // This method will connect to the database
    function connect()
    {
        // Include the Constants.php file to get the database constants
        include_once dirname(__FILE__) . '/Constants.php';

        // Connect to the MySQL database
        $this->con = new mysqli(DB_HOST, DB_USERNAME, DB_PASSWORD, DB_NAME);
        mysqli_set_charset($this->con, "utf8"); // add this line

        // Check whether any error occurred while connecting
        if (mysqli_connect_errno()) {
            echo "Failed to connect to MySQL: " . mysqli_connect_error();
        }

        // Finally, return the connection link
        return $this->con;
    }
}
#11
-1
Using setLocale('fr_FR.UTF8') before json_encode solved the problem.