I have a set of keywords that are passed as JSON from a database (UTF-8 encoded), some of which may contain special characters such as é, è, and ç. They are used as part of an auto-completer. Example:
array('Coffee', 'Cappuccino', 'Café');
I should add that the array as it comes from the DB would be:
array('Coffee', 'Cappuccino', 'Café');
But JSON encodes as:
["coffee", "cappuccino", null];
If I print these via print_r(), they show up fine on a UTF-8 encoded webpage, but when I inspect the array with print_r($array);exit(); under text/plain, café comes through as "café".
If I run the values through utf8_encode() before encoding to JSON, the encoding succeeds, but what gets printed on the webpage is "café" and not "café".
Also strange: json_last_error() is reported as an undefined function, even though json_decode() and json_encode() work fine.
Any ideas on how to get UTF-8 encoded data from the database to behave the same throughout the entire process?
EDIT: Here is the PHP function that grabs the keywords and combines them into a single array:
private function get_keywords()
{
    global $db, $json;
    $output = array();
    $db->query("SELECT keywords FROM listings");
    while ($r = $db->get_array())
    {
        $split = explode(",", $r['keywords']);
        foreach ($split as $s)
        {
            $s = trim($s);
            if ($s != "" && !in_array($s, $output)) $output[] = strtolower($s);
        }
    }
    $json->echo_json($output);
}
The json::echo_json method just encodes the data, sets the header, and prints it (for use with Prototype).
EDIT: DB Connection method:
function connect()
{
    if ($this->set['sql_connect'])
    {
        $this->connection = @mysql_connect( $this->set['sql_host'], $this->set['sql_user'], $this->set['sql_pass'])
            OR $this->debug( "Connection Error", mysql_errno() .": ". mysql_error());
        $this->db = @mysql_select_db( $this->set['sql_name'], $this->connection)
            OR $this->debug( "Database Error", "Cannot Select Database '". $this->set['sql_name'] ."'");
        $this->is_connected = TRUE;
    }
    return TRUE;
}
More Updates: Simple PHP script I ran:
echo json_encode( array("Café") ); // ["Caf\u00e9"]
echo json_encode( array("Café") ); // null
5 Answers
#1
11
The reason could be the current client character set. A simple solution is to set it with mysql_query('SET CHARACTER SET utf8') before running the SELECT query.
Update (June 2014)
The mysql extension is deprecated as of PHP 5.5.0 (and was removed in PHP 7.0.0). It is now recommended to use mysqli. Also, on further reading, the above way of setting the client character set should be avoided, for reasons including security.
I haven't tested it, but this should be an OK substitute:
$mysqli = new mysqli("localhost", "my_user", "my_password", "my_db");
if (!$mysqli->set_charset('utf8')) {
    printf("Error loading character set utf8: %s\n", $mysqli->error);
} else {
    printf("Current character set: %s\n", $mysqli->character_set_name());
}
or, in procedural style, passing the connection as a parameter:
$conn = mysqli_connect("localhost", "my_user", "my_password", "my_db");
if (!mysqli_set_charset($conn, "utf8")) {
    # TODO - Error: Unable to set the character set
    exit;
}
#2
3
json_encode seems to be dropping strings that contain invalid characters. It is likely that your UTF-8 data is not arriving in the proper form from your database.
Looking at the examples you give, my wild guess would be that your database connection is not UTF-8 encoded and serves ISO-8859-1 characters instead.
Can you try a SET NAMES utf8; after initializing the connection?
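One way to test this guess before changing the connection is to check whether the strings coming back from the database are actually valid UTF-8. The following is a minimal sketch, assuming the mbstring extension is available; find_invalid_utf8 is a hypothetical helper name, and the sample rows are made up for illustration rather than taken from the asker's database.

```php
<?php
// Hypothetical helper: return the entries that are NOT valid UTF-8,
// keyed by their original position.
function find_invalid_utf8(array $values): array
{
    $bad = array();
    foreach ($values as $key => $value) {
        if (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
            $bad[$key] = $value;
        }
    }
    return $bad;
}

// Made-up sample: "Caf\xE9" is "Café" in ISO-8859-1,
// "Caf\xC3\xA9" is "Café" in UTF-8.
$rows = array('Coffee', "Caf\xE9", "Caf\xC3\xA9");
$invalid = find_invalid_utf8($rows);

echo count($invalid), " invalid value(s) at index: ",
     implode(',', array_keys($invalid)), "\n";
```

If this reports invalid values for data that is stored as UTF-8, the connection character set is a likely suspect.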
#3
3
I tried your code sample like this:
[~]> cat utf.php
<?php
$arr = array('Coffee', 'Cappuccino', 'Café');
print json_encode($arr);
[~]> php utf.php
["Coffee","Cappuccino","Caf\u00e9"]
[~]>
Based on that, I would say that if the source data really is UTF-8, then json_encode works just fine. If it's not, that's where you get null. Why it's not, I cannot tell based on this information.
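The failing case can be reproduced without a database by writing the bytes explicitly. This is a small sketch, assuming PHP 5.5 or later: there, json_encode() returns false for the whole call when a string is not valid UTF-8, whereas the older version in the question evidently emitted null for the offending element. Note that json_last_error() only exists from PHP 5.3.0 onwards.

```php
<?php
$utf8   = "Caf\xC3\xA9"; // "Café" encoded as UTF-8 (é = C3 A9)
$latin1 = "Caf\xE9";     // "Café" encoded as ISO-8859-1 (é = E9)

// Valid UTF-8 encodes cleanly.
$ok = json_encode(array($utf8));
echo $ok, "\n"; // ["Caf\u00e9"]

// ISO-8859-1 bytes are not valid UTF-8, so encoding fails.
$failed = json_encode(array($latin1));
var_dump($failed);                               // bool(false)
var_dump(json_last_error() === JSON_ERROR_UTF8); // bool(true)

// With partial output enabled, the invalid string is replaced by null.
$partial = json_encode(array('Coffee', $latin1), JSON_PARTIAL_OUTPUT_ON_ERROR);
echo $partial, "\n"; // ["Coffee",null]
```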
#4
1
Try sending your array through this function before doing json_encode():
<?php
function utf8json($inArray) {
    static $depth = 0;
    /* our return object */
    $newArray = array();
    /* safety recursion limit */
    $depth++;
    if ($depth >= 30) {
        $depth--;
        return false;
    }
    /* step through inArray */
    foreach ($inArray as $key => $val) {
        if (is_array($val)) {
            /* recurse on the nested array (the original passed $inArray here,
               which recursed on the whole array until the depth limit) */
            $newArray[$key] = utf8json($val);
        } else {
            /* encode string values; utf8_encode() assumes ISO-8859-1 input */
            $newArray[$key] = utf8_encode($val);
        }
    }
    /* leaving this level, so $depth tracks nesting rather than call count */
    $depth--;
    /* return utf8 encoded array */
    return $newArray;
}
?>
Taken from a comment on php.net at http://php.net/manual/en/function.json-encode.php.
The function basically loops through the array elements; perhaps you applied utf8_encode() to the array itself rather than to each string?
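A possible explanation for the "café" symptom in the question is double encoding: utf8_encode() treats its input as ISO-8859-1, so applying it to a string that is already UTF-8 turns each byte of é into a separate character. The sketch below converts only strings that fail a UTF-8 validity check. It is an illustration rather than a drop-in fix: to_utf8 is a hypothetical helper, it assumes the mbstring extension, and it assumes that anything that is not valid UTF-8 is ISO-8859-1.

```php
<?php
// Hypothetical helper: recursively convert array values to UTF-8,
// leaving strings that are already valid UTF-8 untouched.
function to_utf8($value)
{
    if (is_array($value)) {
        return array_map('to_utf8', $value);
    }
    if (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
        // Assumption: anything that is not valid UTF-8 is ISO-8859-1.
        return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
    }
    return $value;
}

// Made-up sample mixing ISO-8859-1 and UTF-8 encodings of "Café".
$mixed = array('Coffee', "Caf\xE9", array("Caf\xC3\xA9"));
$clean = to_utf8($mixed);

echo json_encode($clean), "\n"; // ["Coffee","Caf\u00e9",["Caf\u00e9"]]
```

Fixing the connection character set remains the cleaner route; a conversion like this only papers over mixed input.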
#5
0
My solution for encoding UTF-8 data was:
$jsonArray = addslashes(json_encode($array, JSON_FORCE_OBJECT|JSON_UNESCAPED_UNICODE));
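To make the effect of each piece visible, here is a small sketch, assuming PHP 5.4 or later (where JSON_UNESCAPED_UNICODE was introduced) and a made-up one-element array. Note that addslashes() escapes the double quotes, so its result is no longer valid JSON on its own; that is presumably only wanted when the string is embedded in another quoted context.

```php
<?php
$array = array("Caf\xC3\xA9"); // "Café" as UTF-8 bytes

$plain     = json_encode($array);
$unescaped = json_encode($array, JSON_UNESCAPED_UNICODE);
$object    = json_encode($array, JSON_FORCE_OBJECT | JSON_UNESCAPED_UNICODE);
$slashed   = addslashes($object);

echo $plain, "\n";     // ["Caf\u00e9"]
echo $unescaped, "\n"; // ["Café"]
echo $object, "\n";    // {"0":"Café"}
echo $slashed, "\n";   // {\"0\":\"Café\"}

// The backslashed quotes are not valid JSON, so decoding fails.
var_dump(json_decode($slashed)); // NULL
```

JSON_UNESCAPED_UNICODE only changes how valid UTF-8 is written out; it does not repair input that is not valid UTF-8 in the first place.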