This question already has an answer here:
- UTF-8 all the way through 13 answers
For a few days now I've been looking for a way to display UTF-8 on my webpage. The character currently causing trouble is į (Unicode: \u012f, decimal: 303); however, there are over 10,000 records in my database and I cannot guarantee that all the others display correctly. So I'm looking for a solution that covers all characters.
The į is displaying as a ? in the HTML.
My setup is an HTML page that uses AJAX to send a request to a PHP file. The PHP then queries a MySQL database to find a specific entry, takes a Lithuanian word from that entry, and echoes it as the response to the AJAX request. Back in the JavaScript, the response is set as the innerHTML of an HTML element. This setup does not use jQuery.
Below is my progress on attempting to fix the issue.
First, I verified that all the files I was working with are encoded as UTF-8 without a BOM.
Then I opened the MySQL database in phpMyAdmin to view the entries. Seeing characters replaced with ? in the entries, I did some research and found that the database had the wrong collation. Changing the collation to utf8_general_ci for the database and table changed nothing, so I looked into it further and found that changing it for the individual columns of the table was another solution. This worked, and phpMyAdmin now displays the characters correctly.
Next, the character š (Unicode: \u0161, decimal: 353) would not display on my webpage. I fixed this by using the following PHP code, which I found on *.
function encode_string($string){
    $encoded = "";
    for ($n=0;$n<strlen($string);$n++){
        $check = htmlentities($string[$n],ENT_QUOTES);
        $string[$n] == $check ? $encoded .= "&#".ord($string[$n]).";" : $encoded .= $check;
    }
    return $encoded;
}
I can't say I completely understand this code, but it caused the character š to display correctly when it reached my HTML. However, it did not work for the character į.
I have also tried $conn->set_charset('utf8'); to set the connection to use utf8; however, this resulted in į being displayed as Ä¯ instead. I got the same result with $conn->query("SET NAMES UTF8;");
I have found that hardcoding the į into the JavaScript or PHP allows it to be sent back and displayed correctly; for example, echo "į"; works. So I believe the issue lies in the database or in the PHP before the echo. However, I don't have the knowledge to identify the problem.
Here is my PHP code:
<?php
header('Content-Type: text/html charset=utf-8');
//Connection to database is made. Referred to as $conn
$sql = "SELECT * FROM Words";
$result = $conn->query($sql);
if ($result->num_rows > 0) {
    //Loop through the results to find a word with the status of 1
    while($row = $result->fetch_assoc()) {
        $status = $row["status"];
        if($status == 1){
            //respond to AJAX with the word
            $ltword = trim($row["lt"]);
            echo utf8_encode(encode_string($ltword));
            //Has also been tested as
            //echo encode_string($ltword);
            //with no noticeable difference.
            break;
        }
    }
}
function encode_string($string){
    $encoded = "";
    for ($n=0;$n<strlen($string);$n++){
        $check = htmlentities($string[$n],ENT_QUOTES);
        $string[$n] == $check ? $encoded .= "&#".ord($string[$n]).";" : $encoded .= $check;
    }
    return $encoded;
}
?>
At its core, my question is: given my current setup, how do I get a UTF-8 character from my database to display correctly on my webpage?
EDIT: PHP's mb_check_encoding() function verifies that the data received from the database is valid UTF-8.
php.ini is using UTF-8 as its default charset.
Calling $conn->character_set_name(); returns latin1. Calling $conn->set_charset("utf8"); first makes it return utf8; however, į is then displayed as Ä¯, which is still incorrect.
3 Answers
#1
0
In your case the problem was the collation, which you changed later. As a good practice, set the table collation and the column collation to the same value, e.g. utf8_unicode_ci (general is faster, but unicode sorts and compares much more accurately).
Coming back to the problem: it lies with the data that was already added, which was stored incorrectly because of the wrong collation. You will need to inspect that data and repair it, since you can't be sure it was stored properly.
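One way to inspect the existing rows is sketched below. It rests on assumptions: the helper name diagnoseStoredText is hypothetical, the value is assumed to have been fetched over a UTF-8 connection, and the mbstring extension is assumed to be loaded. It only guesses. A literal ? can be a genuine question mark, and the double-encoding check models ISO-8859-1 rather than MySQL's cp1252-based latin1.

```php
<?php
// Hypothetical helper (not from the original answer): guess how a value
// fetched over a UTF-8 connection was stored in the database.
function diagnoseStoredText(string $value): string
{
    if (!mb_check_encoding($value, 'UTF-8')) {
        return 'not valid UTF-8';
    }
    if (!preg_match('/[\x80-\xFF]/', $value)) {
        // Pure ASCII. A "?" may mark a character that was lost at insert time.
        return strpos($value, '?') !== false ? 'ASCII, possibly lossy' : 'ASCII';
    }
    // Try to undo one round of "UTF-8 bytes read as ISO-8859-1".
    $undone = mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');
    $roundTrip = mb_convert_encoding($undone, 'UTF-8', 'ISO-8859-1');
    if ($roundTrip === $value && mb_check_encoding($undone, 'UTF-8')) {
        return 'probably double-encoded';
    }
    return 'UTF-8';
}

echo diagnoseStoredText("\u{012F}"), "\n";         // a correctly stored į
echo diagnoseStoredText("\u{00C4}\u{00AF}"), "\n"; // į stored as "Ä¯"
```

Rows reported as possibly lossy cannot be repaired from the database alone and have to be re-entered from the original source.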
#2
0
If you're using mysqli, you can call set_charset():
$mysqli->set_charset('utf8mb4'); // object oriented style
mysqli_set_charset($link, 'utf8mb4'); // procedural style
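A minimal sketch of why the connection charset matters, followed by one way to wire it up. The first part uses mbstring (assumed to be loaded) to imitate what a latin1 connection does to UTF-8 column data, which would explain why š could be rescued and į arrived as ?. The helper connectUtf8, the constant CONTENT_TYPE_HEADER and the credentials are hypothetical names for illustration; utf8mb4 assumes MySQL 5.5.3 or later.

```php
<?php
// MySQL's latin1 is cp1252-based; characters with no cp1252 equivalent
// are replaced by "?" when the server converts for a latin1 connection.
mb_substitute_character(0x3F); // make mbstring use "?" as its fallback too
$sCaron  = mb_convert_encoding("\u{0161}", 'Windows-1252', 'UTF-8'); // š
$iOgonek = mb_convert_encoding("\u{012F}", 'Windows-1252', 'UTF-8'); // į
echo bin2hex($sCaron), "\n"; // 9a: š survives as a single cp1252 byte
echo $iOgonek, "\n";         // ?: į has no cp1252 equivalent

// Corrected header value: note the semicolon before "charset".
const CONTENT_TYPE_HEADER = 'Content-Type: text/html; charset=utf-8';

// Hypothetical helper: open a mysqli connection that talks UTF-8.
function connectUtf8(string $host, string $user, string $pass, string $db): mysqli
{
    // Throw exceptions instead of returning false on errors.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
    $conn = new mysqli($host, $user, $pass, $db);
    $conn->set_charset('utf8mb4');
    if ($conn->character_set_name() !== 'utf8mb4') {
        throw new RuntimeException('Connection charset is not utf8mb4');
    }
    return $conn;
}
```

In the script from the question this would mean calling header(CONTENT_TYPE_HEADER) before any output and obtaining $conn from connectUtf8() with your own credentials.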
#3
0
If you have UTF-8 end to end (database > connection > PHP), you should not have to call utf8_encode before echoing. Just echo the variable and it should display correctly.
Most likely, the character is messed up in the database because it's still in the original encoding. Try updating the contents of the database with native UTF-8 characters now that the collation has been fixed, and it should work.
So most likely you will need $conn->set_charset('utf8') too.
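To illustrate why the extra utf8_encode call hurts once the connection already delivers UTF-8, here is a small sketch. It assumes the mbstring extension is loaded and uses mb_convert_encoding from ISO-8859-1 as a stand-in for utf8_encode, which performs the same conversion and is deprecated as of PHP 8.2. The variable names are illustrative only.

```php
<?php
$word = "\u{012F}"; // į as it arrives over a UTF-8 connection (bytes C4 AF)

// utf8_encode() assumes ISO-8859-1 input, so each byte of the existing
// UTF-8 sequence is encoded again as a separate character.
$doubleEncoded = mb_convert_encoding($word, 'UTF-8', 'ISO-8859-1');

echo bin2hex($word), "\n";          // c4af
echo bin2hex($doubleEncoded), "\n"; // c384c2af, which a browser renders as "Ä¯"

// Escaping for HTML is enough; multibyte characters pass through untouched.
$safe = htmlspecialchars($word, ENT_QUOTES, 'UTF-8');
echo $safe === $word ? "unchanged\n" : "changed\n"; // unchanged
```

With the connection set to UTF-8, echoing htmlspecialchars($ltword, ENT_QUOTES, 'UTF-8') in place of utf8_encode(encode_string($ltword)) should be sufficient.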