UTF-8 values are encoded as NULL in JSON

Time: 2022-10-24 22:34:32

I have a set of keywords that are passed through via JSON from a DB (encoded UTF-8), some of which may have special characters like é, è, ç, etc. This is used as part of an auto-completer. Example:

array('Coffee', 'Cappuccino', 'Café');

I should add that the array as it comes from the DB would be:

array('Coffee', 'Cappuccino', 'Café');

But JSON encodes as:

["coffee", "cappuccino", null];

If I print these with print_r(), they show up fine on a UTF-8 encoded web page, but when I serve the output as text/plain to inspect the array (print_r($array); exit();), "café" comes through as "cafÃ©".

If I run the values through utf8_encode() before encoding to JSON, the encoding succeeds, but what gets printed on the web page is "cafÃ©" and not "café".

Also, strangely, json_last_error() is reported as an undefined function, even though json_decode() and json_encode() work fine.

Any ideas on how to get UTF-8 encoded data from the database to behave the same throughout the entire process?

EDIT: Here is the PHP function that grabs the keywords and makes them into a single array:

private function get_keywords() 
{
    global $db, $json;

    $output = array();

    $db->query("SELECT keywords FROM listings");

    while ($r = $db->get_array())
    {
        $split = explode(",", $r['keywords']);

        foreach ($split as $s)
        {
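            // Lowercase before the duplicate check so it compares like with like.
            // Note: strtolower() is byte-based, not UTF-8 aware.
            $s = strtolower(trim($s));
            if ($s != "" && !in_array($s, $output)) $output[] = $s;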
        }
    }

    $json->echo_json($output);
}

The json::echo_json method just encodes the data, sets the header and prints the result (for use with Prototype).

EDIT: DB Connection method:

function connect()
{

    if ($this->set['sql_connect'])
    {
        $this->connection = @mysql_connect( $this->set['sql_host'], $this->set['sql_user'], $this->set['sql_pass'])
                OR $this->debug( "Connection Error", mysql_errno() .": ". mysql_error());
        $this->db = @mysql_select_db( $this->set['sql_name'], $this->connection)
                OR $this->debug( "Database Error", "Cannot Select Database '". $this->set['sql_name'] ."'");

        $this->is_connected = TRUE;
    }

    return TRUE;
}

More Updates: Simple PHP script I ran:

echo json_encode( array("Café") ); // ["Caf\u00e9"]
echo json_encode( array("Café") ); // null

5 answers

#1


11  

The reason could be the current client character set. A simple solution is to set it by running mysql_query('SET CHARACTER SET utf8') before the SELECT query.

Update (June 2014)

The mysql extension is deprecated as of PHP 5.5.0, and mysqli is now the recommended replacement. Also, after further reading: the above way of setting the client character set should be avoided, for reasons including security.

I haven't tested it, but this should be an ok substitute:

$mysqli = new mysqli("localhost", "my_user", "my_password", "my_db");
if (!$mysqli->set_charset('utf8')) {
    printf("Error loading character set utf8: %s\n", $mysqli->error);
} else {
    printf("Current character set: %s\n", $mysqli->character_set_name());
}

Or, in procedural style, with the connection passed as a parameter:

$conn = mysqli_connect("localhost", "my_user", "my_password", "my_db");
if (!mysqli_set_charset($conn, "utf8")) {
    # TODO - Error: Unable to set the character set
    exit;
}
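
For what it's worth, the same idea should carry over to PDO, where the character set can be named in the DSN itself (the MySQL driver honours it as of PHP 5.3.6). This is only a sketch and not part of the original setup: the helper name build_mysql_dsn, the host, the database name and the choice of utf8mb4 (MySQL's full 4-byte UTF-8) are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that pins the connection charset.
function build_mysql_dsn($host, $dbname, $charset = 'utf8mb4')
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = build_mysql_dsn('localhost', 'my_db');
echo $dsn, "\n"; // mysql:host=localhost;dbname=my_db;charset=utf8mb4

// With a real server available, the connection would then be opened like this
// (left commented out so the sketch runs without a database):
// $pdo = new PDO($dsn, 'my_user', 'my_password');
```

Putting the charset in the DSN means it is applied when the connection is opened, so there is no separate query to forget.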

#2


3  

json_encode seems to be dropping strings that contain invalid characters. It is likely that your UTF-8 data is not arriving in the proper form from your database.

Looking at the examples you give, my wild guess would be that your database connection is not UTF-8 encoded and serves ISO-8859-1 characters instead.

Can you try running SET NAMES utf8; right after initializing the connection?

#3


3  

I tried your code sample like this:

[~]> cat utf.php 
<?php
$arr = array('Coffee', 'Cappuccino', 'Café');
print json_encode($arr);
[~]> php utf.php 
["Coffee","Cappuccino","Caf\u00e9"]
[~]>

Based on that, I would say that if the source data really is UTF-8, then json_encode works just fine. If it's not, then that's where you get null. Why it's not, I cannot tell based on this information.
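
To make the difference visible without a database, here is a minimal sketch that writes the bytes out explicitly. The helper name is_valid_utf8 is made up for this example, and the behaviour shown for the failing call assumes PHP 5.5 or later (older versions put null in place of the bad string, as in the question).

```php
<?php
// "\xC3\xA9" is "é" in UTF-8; "\xE9" is "é" in ISO-8859-1 and is not valid UTF-8.
$utf8   = "Caf\xC3\xA9";
$latin1 = "Caf\xE9";

// Hypothetical helper: preg_match() with the /u modifier fails on invalid UTF-8.
function is_valid_utf8($s)
{
    return preg_match('//u', $s) === 1;
}

var_dump(is_valid_utf8($utf8));   // bool(true)
var_dump(is_valid_utf8($latin1)); // bool(false)

echo json_encode(array($utf8)), "\n"; // ["Caf\u00e9"]

// On PHP 5.5 and later the whole call fails for the Latin-1 bytes...
$result = json_encode(array($latin1));
var_dump($result);                               // bool(false)
var_dump(json_last_error() === JSON_ERROR_UTF8); // bool(true)

// ...unless partial output is requested, which substitutes null for the bad string.
$partial = json_encode(array('Coffee', $latin1), JSON_PARTIAL_OUTPUT_ON_ERROR);
echo $partial, "\n"; // ["Coffee",null]
```

Running a check like is_valid_utf8() on the rows as they come out of the database is a quick way to confirm whether the connection is really delivering UTF-8.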

#4


1  

Try sending your array through this function before doing json_encode():

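<?php

/**
 * Recursively convert every string in an array from ISO-8859-1 to UTF-8.
 * Only use this on data that really is ISO-8859-1: running it on text that
 * is already UTF-8 double-encodes it ("é" becomes "Ã©").
 * Note: utf8_encode() is deprecated as of PHP 8.2.
 */
function utf8json($inArray, $depth = 0) {

    /* safety recursion limit */
    if ($depth >= 30) {
        return false;
    }

    /* our return object */
    $newArray = array();

    /* step through inArray */
    foreach ($inArray as $key => $val) {
        if (is_array($val)) {
            /* recurse on the nested array, not on the parent */
            $newArray[$key] = utf8json($val, $depth + 1);
        } elseif (is_string($val)) {
            /* encode string values */
            $newArray[$key] = utf8_encode($val);
        } else {
            /* leave numbers, booleans and null untouched */
            $newArray[$key] = $val;
        }
    }

    /* return utf8 encoded array */
    return $newArray;
}
?>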

Taken from a comment on php.net: http://php.net/manual/en/function.json-encode.php.

The function basically loops through the array elements. Perhaps you called utf8_encode() on the array itself rather than on each element?
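
If it is unclear whether the incoming strings are already UTF-8, one way to avoid double-encoding is to convert only the strings that fail a UTF-8 validity check. The sketch below is a heuristic, not a guarantee: the helper name to_utf8_once is invented for this example, it assumes the mbstring extension is available, and it assumes that anything invalid is ISO-8859-1.

```php
<?php
// Hypothetical helper: convert a string to UTF-8 only if it is not valid UTF-8
// already. ASSUMPTION: anything invalid is ISO-8859-1; adjust if your source differs.
function to_utf8_once($s)
{
    if (preg_match('//u', $s) === 1) {
        return $s; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

$fromLatin1 = to_utf8_once("Caf\xE9");     // Latin-1 bytes get converted
$fromUtf8   = to_utf8_once("Caf\xC3\xA9"); // valid UTF-8 passes through unchanged

var_dump($fromLatin1 === "Caf\xC3\xA9"); // bool(true)
var_dump($fromUtf8 === "Caf\xC3\xA9");   // bool(true)

// Converting unconditionally is what produces the "Ã©" mojibake:
$double = mb_convert_encoding("Caf\xC3\xA9", 'UTF-8', 'ISO-8859-1');
var_dump($double === "Caf\xC3\x83\xC2\xA9"); // bool(true)
```

Fixing the connection character set is still the cleaner solution; a per-string check like this mostly helps while tracking down where the encoding goes wrong.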

#5


0  

My solution for encoding UTF-8 data was:

$jsonArray = addslashes(json_encode($array, JSON_FORCE_OBJECT|JSON_UNESCAPED_UNICODE));
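
To show what these flags actually do, here is a small sketch with the bytes written out explicitly. It assumes PHP 5.4 or later, which is when JSON_UNESCAPED_UNICODE was added. Note that the flags only change how valid UTF-8 is written out; they do not repair input that is not UTF-8 to begin with.

```php
<?php
$array = array("Caf\xC3\xA9"); // "Café" as UTF-8 bytes

$escaped   = json_encode($array);
$unescaped = json_encode($array, JSON_UNESCAPED_UNICODE);
$asObject  = json_encode($array, JSON_FORCE_OBJECT | JSON_UNESCAPED_UNICODE);

echo $escaped, "\n";   // ["Caf\u00e9"]
echo $unescaped, "\n"; // ["Café"]
echo $asObject, "\n";  // {"0":"Café"}

// addslashes() escapes the double quotes, so the result is no longer valid JSON
// on its own; it only makes sense if the string is embedded somewhere that
// strips the backslashes again.
$slashed = addslashes($asObject);
echo $slashed, "\n";                      // {\"0\":\"Café\"}
var_dump(json_decode($slashed) === null); // bool(true)
```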
