Question: json_encode and UTF-8 (duplicate)

Date: 2023-01-05 22:12:51

This question already has an answer here:

I have a problem with the json_encode function and special characters.

For example, I tried this:

$string="Svrček";

echo "ENCODING=".mb_detect_encoding($string); //ENCODING=UTF-8

echo "JSON=".json_encode($string); //JSON="Svr\u010dek"

What can I do to display the string correctly, so JSON="Svrček"?

Thank you very much.

3 answers

#1


39  

json_encode() is not actually outputting JSON* there. It’s outputting a JavaScript string. (It outputs JSON when you give it an object or an array to encode.) That’s fine, as a JavaScript string is what you want.

In JavaScript (and in JSON), č may be escaped as \u010d. The two are equivalent. So there’s nothing wrong with what json_encode() is doing. It should work fine. I’d be very surprised if this is actually causing you any form of problem. However, if the transfer is safely in a Unicode encoding (UTF-8, usually)†, there’s no need for it either. If you want to turn off the escaping, you can do so thus: json_encode('Svrček', JSON_UNESCAPED_UNICODE). Note that the flag JSON_UNESCAPED_UNICODE was introduced in PHP 5.4.0, and is unavailable in earlier versions.
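
As a minimal sketch of the two behaviours described above (assuming the source file itself is saved as UTF-8), the following compares the default output with the JSON_UNESCAPED_UNICODE output and checks that both decode to the same string. The variable names are only illustrative.

```php
<?php
// Default behaviour: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode('Svrček');

// With JSON_UNESCAPED_UNICODE (PHP 5.4.0+), the character is emitted as raw UTF-8.
$unescaped = json_encode('Svrček', JSON_UNESCAPED_UNICODE);

echo "ESCAPED=$escaped\n";     // ESCAPED="Svr\u010dek"
echo "UNESCAPED=$unescaped\n"; // UNESCAPED="Svrček"

// Both forms decode to the same string, so a JSON consumer sees no difference.
var_dump(json_decode($escaped) === json_decode($unescaped)); // bool(true)
```

If you use the unescaped form, the response presumably needs to be served with a UTF-8 charset so the client reads the raw bytes correctly.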

By the way, contrary to what @onteria_ says, JSON does use UTF-8:

The character encoding of JSON text is always Unicode. UTF-8 is the only encoding that makes sense on the wire, but UTF-16 and UTF-32 are also permitted.


* Or, at least, it's not outputting JSON as defined in RFC 4627. However, there are other definitions of JSON, by which scalar values are allowed.

† JSON may be in UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, or UTF-32BE.

#2


9  

OK, so after you make the database connection in your PHP script, add this line and it should work; at least it solved my problem:

mysql_query('SET CHARACTER SET utf8');
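
One plausible reason this helps, assuming the database was returning text in a non-UTF-8 charset: json_encode() only accepts UTF-8 input. The sketch below uses a hypothetical ISO-8859-2 byte string to stand in for such database output; on current PHP versions the encode call fails until the string is converted to UTF-8 (the conversion assumes the mbstring extension is available).

```php
<?php
// Hypothetical input: "Svrček" as ISO-8859-2 bytes (0xE8 is č in that charset),
// standing in for what a connection with the wrong charset might return.
$latin2 = "Svr\xE8ek";

// The bytes are not valid UTF-8, so json_encode() fails.
$failed = json_encode($latin2);
$wasUtf8Error = (json_last_error() === JSON_ERROR_UTF8);
var_dump($failed);       // bool(false)
var_dump($wasUtf8Error); // bool(true)

// Converting to UTF-8 first restores the expected output.
$utf8 = mb_convert_encoding($latin2, 'UTF-8', 'ISO-8859-2');
$encoded = json_encode($utf8);
echo $encoded, "\n"; // "Svr\u010dek"
```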

#3


6  

Yes, json_encode escapes non-ASCII characters. If you decode it you'll get your original result:

$string="こんにちは";
echo "ENCODING: " . mb_detect_encoding($string) . "\n";
$encoded = json_encode($string);
echo "ENCODED JSON: $encoded\n";
$decoded = json_decode($encoded);
echo "DECODED JSON: $decoded\n";

Output:

ENCODING: UTF-8
ENCODED JSON: "\u3053\u3093\u306b\u3061\u306f"
DECODED JSON: こんにちは

EDIT: It's worth noting that:

JSON uses Unicode exclusively.

The self-documenting format that describes structure and field names as well as specific values;

Source: http://www.json.org/fatfree.html

It uses Unicode, not UTF-8. This FAQ explains the difference between UTF-8 and Unicode:

http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8

When you use JSON, your non-ASCII characters get escaped as Unicode code points. For example, こ = code point U+3053.
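
To make the distinction between a code point and its UTF-8 bytes concrete, here is a small sketch (assuming a UTF-8 source file, PHP 7.2+ for mb_ord, and the mbstring extension). It prints the code point of こ, its UTF-8 byte sequence, and the escape json_encode() produces.

```php
<?php
$char = 'こ';

// The Unicode code point is an abstract number assigned to the character.
$codePoint = mb_ord($char, 'UTF-8');
printf("code point: U+%04X\n", $codePoint); // code point: U+3053

// UTF-8 is one way to serialise that number as bytes.
$utf8Hex = bin2hex($char);
echo "UTF-8 bytes: ", $utf8Hex, "\n"; // UTF-8 bytes: e38193

// The \u escape carries the code point, not the UTF-8 bytes.
$json = json_encode($char);
echo $json, "\n"; // "\u3053"
```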
