I need to store special characters and symbols in a MySQL database. I can either store them as they are, like 'ü', or convert them to HTML entities such as '&uuml;'.
I am not sure which would be better.
I also have symbols like '♥' and '„'.
Which one is better? Is there any alternative method?
Thanks.
3 solutions
#1
5
HTML entities were introduced years ago to transport character information over the wire when the transport was not binary-safe, and for cases where the user agent (browser) did not support the charset encoding of the transport layer or server.
As an HTML entity contains only very basic characters (&, ;, a-z and 0-9) and those characters have the same binary encoding in most character sets, this is and was very safe from those side-effects.
However, when you store something in the database, you don't have these issues, because you're normally in control and you know what text you can store in the database and how.
For example, if you allow Unicode for text inside the database, you can store all characters; none is actually special. Note that you need to know your database here, as there are some technical details you can run into. For instance, if you don't know the charset encoding of your database connection, you can't tell your database exactly which text you want to store in there. But generally, you just store the text and retrieve it later. Nothing special to deal with.
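To illustrate the connection-charset pitfall, here is a minimal Python sketch. It does not talk to MySQL at all; it only shows what happens when the bytes of 'ü' are sent as UTF-8 but read under a different assumed charset, which is the kind of mismatch a wrongly configured connection produces. The variable names are made up for the example.

```python
# The same bytes mean different things depending on the charset they are read with.
text = "ü"
utf8_bytes = text.encode("utf-8")  # two bytes: 0xC3 0xBC

# Receiver wrongly assumes Latin-1: each byte becomes its own character.
misread = utf8_bytes.decode("latin-1")

# Receiver correctly assumes UTF-8: the text round-trips unchanged.
correct = utf8_bytes.decode("utf-8")

print(repr(misread), repr(correct))
```

In MySQL the usual way to avoid this is to use the same Unicode charset for the column and the connection. utf8mb4 is the MySQL charset that covers all of Unicode.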
In fact there are downsides when you use HTML entities instead of the plain character:
- HTML entities consume more space: &uuml; is much larger than ü in LATIN-1, UTF-8, UTF-16 or UTF-32.
- HTML entities need further processing. They need to be created, and when read, they need to be parsed. Imagine you need to search for a specific text in your database: that, like any other action, would need additional handling. That's just overhead.
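The space claim is easy to check. A small Python sketch that counts the bytes of the raw character and of its entity in each encoding (the helper name encoded_size is made up here, and the little-endian variants are used so no byte order mark is counted):

```python
def encoded_size(text: str, encoding: str) -> int:
    """Number of bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))

ENCODINGS = ["latin-1", "utf-8", "utf-16-le", "utf-32-le"]

# Map each encoding to (bytes for the raw character, bytes for the entity).
sizes = {
    enc: (encoded_size("ü", enc), encoded_size("&uuml;", enc))
    for enc in ENCODINGS
}

for enc, (raw, entity) in sizes.items():
    print(f"{enc:10} raw={raw} entity={entity}")
```

The entity is 3 to 6 times the size of the raw character in every one of these encodings.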
The real fun starts when you mix both concepts. You come to a place you really don't want to go into. So just don't do it because you ain't gonna need it.
#2
5
Leave your data raw in the database. Don't use HTML entities for these until you need them for HTML. You never know when you may want to use your data elsewhere, not on a web page.
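As a rough sketch of what "convert only when you need it" can look like, assuming Python on the output side: the stored value stays raw, and each output format applies its own escaping at render time. The function names and the sample string are invented for the example; html.escape and json.dumps are from the standard library.

```python
import html
import json

# Raw text, as it would come back from the database.
stored = "Müller & Söhne ♥"

def render_html(text: str) -> str:
    """Escape only the characters that are markup-significant in HTML."""
    return "<p>" + html.escape(text) + "</p>"

def render_html_ascii(text: str) -> str:
    """For an ASCII-only channel: also turn non-ASCII into numeric references."""
    return html.escape(text).encode("ascii", "xmlcharrefreplace").decode("ascii")

def render_json(text: str) -> str:
    """A non-HTML consumer (e.g. a mobile API) gets the raw characters."""
    return json.dumps({"name": text}, ensure_ascii=False)

print(render_html(stored))
print(render_html_ascii(stored))
print(render_json(stored))
```

If entities had been stored instead, the JSON output would first have to undo them.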
#3
1
My suggestion mirrors that of the other contributors: don't convert special characters to entities when saving them to your database.
Some reasons against conversion:
- K.I.S.S. principle (my biggest reason not to do it)
- most entities will end up consuming more space than before being converted
- you lose the ability to search for the characters directly: ü in a word would be [word]+ü+[/word], and you would have to do a string comparison against the HTML equivalent of ü => [word]+&uuml;+[/word].
- your output may change from HTML to, say, an API for mobile, etc., which makes conversion very unnecessary.
- you need to convert on input of data, and on output (again if your output changes from plain HTML to something else).
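The search point can be tried out quickly. The sketch below uses Python's built-in sqlite3 with an in-memory database purely as a stand-in for MySQL, so that it runs anywhere. The table names, the sample row, and the search helper are made up for the example.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names_raw (name TEXT)")
conn.execute("CREATE TABLE names_entity (name TEXT)")
conn.execute("INSERT INTO names_raw VALUES (?)", ("Müller",))
conn.execute("INSERT INTO names_entity VALUES (?)", ("M&uuml;ller",))

def search(table: str, term: str) -> list:
    """Substring search; the table name comes only from the fixed names above."""
    rows = conn.execute(
        f"SELECT name FROM {table} WHERE name LIKE ?", (f"%{term}%",)
    )
    return [row[0] for row in rows]

raw_hits = search("names_raw", "ü")               # plain search just works
entity_hits = search("names_entity", "ü")         # finds nothing
converted_hits = search("names_entity", "&uuml;") # term must be converted first

# And every read of the entity table needs decoding before non-HTML use.
decoded = [html.unescape(name) for name in converted_hits]

print(raw_hits, entity_hits, converted_hits, decoded)
```

With entity storage, every query has to convert the search term first and every read has to decode the result, which is the double conversion the last bullet describes.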