How do I handle accents and strange characters in a database?

Date: 2021-07-10 00:26:29

I'm trying to save Spanish words with accents in my database, but it won't work. I have already tried:

1) changing the collation of the tables and columns to utf8_spanish_ci and utf8_unicode_ci.

2) adding a meta tag in the page header:

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />

3) adding

header("Content-Type: text/html;charset=utf-8");

in a PHP tag.

Doing this on an XAMPP server on my laptop works, but when I upload the database to a login monster server, it won't save the accents properly.

Edit: this is the connection I'm using:

    private function Connect()
    {
        //$this->settings = parse_ini_file("settings.ini.php");
        try 
        {
            # Read settings from INI file, set UTF8
            $this->pdo = new PDO('mysql:host=localhost;dbname=xxxxx;charset=utf8', 'xxxxx', 'xxxxxx', array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));

            # We can now log any exceptions on Fatal error. 
            $this->pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

            # Disable emulation of prepared statements, use REAL prepared statements instead.
            $this->pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);

            # Connection succeeded, set the boolean to true.
            $this->bConnected = true;
        }
        catch (PDOException $e) 
        {
            # Write into log
            echo $this->ExceptionLog($e->getMessage());
            die();
        }
    }

Edit:

I can't save accents; accented letters such as á show up as strange characters.

4 answers

#1


5  

Collation affects only how text is sorted and compared; it has no effect on the actual character set of the stored data.

I would recommend this configuration:

  1. Set the character set for the whole DB only, so you don't have to set it for each table separately. Character set is inherited from DB to tables to columns. Use utf8 as the character set.

  2. Set the character set for the DB connection. Execute these queries after you connect to the database:

    SET CHARACTER SET 'utf8';
    SET NAMES 'utf8';
    
  3. Set the character set for the page, using HTTP header and/or HTML meta tag. One of these is enough. Use utf-8 as the charset.

This should be enough.
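
Steps 2 and 3 above could be sketched in PHP roughly as follows. This is only a minimal sketch under assumptions, not the asker's actual code: the helper names `buildMysqlDsn`, `connectUtf8` and `sendUtf8Header` are made up for illustration, and the host, database name and credentials are placeholders you pass in yourself.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that carries the connection charset.
function buildMysqlDsn(string $host, string $dbName, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

// Hypothetical helper for step 2: opens the connection and sets its charset.
function connectUtf8(string $host, string $dbName, string $user, string $password): PDO
{
    $pdo = new PDO(buildMysqlDsn($host, $dbName), $user, $password, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
    // Same statement as in step 2; redundant with the DSN charset, but harmless.
    $pdo->exec("SET NAMES 'utf8'");
    return $pdo;
}

// Hypothetical helper for step 3: declares the page charset via the HTTP header.
function sendUtf8Header(): void
{
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=utf-8');
    }
}
```

You would call `sendUtf8Header()` before any output and `connectUtf8(...)` once when the script starts. Step 1 (the database's own character set) is configured on the server and is not covered by this sketch.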

If you want Spanish strings to sort properly, set the collation for the whole database. utf8_spanish_ci should work (ci means case-insensitive). Without a proper collation, accented Spanish characters would always be sorted last.

Note: it's possible that the data you already have in a table is broken, because your character set configuration was wrong previously. You should check it using some DB client first to rule this out. If it's broken, just re-insert your data with the right character set configuration.
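
If you would rather check from PHP than from a DB client, a rough heuristic along these lines might help. It is only a sketch: `looksDoubleEncoded` is a made-up helper (not part of MySQL or PDO), it assumes the mbstring extension is available, and it only detects the specific case of UTF-8 text that was converted once more as if it were latin1 (ISO-8859-1).

```php
<?php
// Hypothetical helper: guesses whether $value is UTF-8 text that was
// re-encoded a second time as if its bytes had been ISO-8859-1.
function looksDoubleEncoded(string $value): bool
{
    if (!mb_check_encoding($value, 'UTF-8')) {
        return false; // not valid UTF-8 at all, which is a different problem
    }

    // Undo one ISO-8859-1 -> UTF-8 conversion step.
    $undone = mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');

    // Every character must have fitted into ISO-8859-1, otherwise the
    // string cannot be the result of such a conversion.
    if (mb_convert_encoding($undone, 'UTF-8', 'ISO-8859-1') !== $value) {
        return false;
    }

    // Suspicious only if undoing changed something and the result is
    // itself valid UTF-8 again.
    return $undone !== $value && mb_check_encoding($undone, 'UTF-8');
}
```

A properly stored "á" is not flagged, while the garbled two-character form is. Being a heuristic, it can misfire on text that legitimately contains such character pairs, so treat a positive result as a hint to inspect the rows by hand.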

How character sets work in a database

  • objects have a character set attribute, which can be set explicitly or it's inherited (server > database > table > column), so the best option is to set it for the whole database

  • the client connection also has a character set attribute, which tells the database which encoding you're sending the data in

If client connection's and target object's character sets are different, the data you're sending to the database are automatically converted from the connection's character set to the object's character set.

For example, if your data is in utf8 but the client connection is set to latin1, the database will corrupt the data, because it treats the utf8 bytes as latin1 when converting them.
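
The effect can be reproduced in plain PHP without a database. This is just an illustration under assumptions: it needs the mbstring extension and uses ISO-8859-1 as a stand-in for MySQL's latin1 (which is really closer to cp1252, but the two agree for this character).

```php
<?php
// "á" (U+00E1) encoded as UTF-8 is the two bytes C3 A1.
$original = "\u{00E1}";

// Misread those two bytes as ISO-8859-1 and convert them to UTF-8,
// which is what happens when the connection charset is declared wrongly.
$garbled = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

echo bin2hex($original), "\n"; // c3a1
echo bin2hex($garbled), "\n";  // c383c2a1, i.e. the two characters "Ã¡"

// Reversing the same step recovers the original bytes.
$repaired = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');
```

Each byte of the original becomes a separate character, which is why one accented letter turns into two odd-looking ones.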

#2


2  

Here is my checklist for storing UTF-8 characters. First, though, make sure the failure really happens at the point where the strings are stored in the database, meaning that the string to store is still exactly as the user entered it.

First. Make sure the character set of the table being used is utf8 or, better yet, utf8mb4 for full Unicode support (though it has its drawbacks too). It doesn't matter which charset has been set for the entire database; the table definition overrides it, if specified. The DDL for creating such a table would look like this:

CREATE TABLE table_name (
    id INT AUTO_INCREMENT NOT NULL,
    name VARCHAR(190) NOT NULL,
    date_created DATETIME NOT NULL,
    PRIMARY KEY(id)
)
DEFAULT CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci
ENGINE = InnoDB;

Second. Use the matching charset (utf8, or utf8mb4 as here) for the database connection.

// This should be enough
new PDO(
    'mysql:host=localhost;dbname=xxxxx;charset=utf8mb4;',
    'username',
    'password'
);

#3


0  

For MySQL, run this code after opening the database connection:

$set_utf=$dbh->exec("SET NAMES UTF8"); 

#4


0  

I had to store a lot of accented letters from different languages (including French and Spanish), and the only safe way I have found so far is to store everything as utf8 with the utf8_bin collation in MySQL, and to display pages in charset utf-8 like you do. No further processing is needed, either in MySQL or in PHP.

Also, make sure your IDE saves your files as UTF-8.
