Why are HTML character entities necessary?

Date: 2021-08-13 17:07:03

Why are HTML character entities necessary? What good are they? I don't see the point.

6 answers

#1


22  

Two main things.

  1. They let you use characters that are not defined in the current charset. E.g., you can legally use ASCII as the charset and still include arbitrary Unicode characters through entities.
  2. They let you quote characters that HTML gives special meaning to, as Simon noted.
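
To make the first point concrete, here is a minimal Python sketch. It assumes the page is served as plain ASCII; the sample string is made up for illustration. Python's built-in `xmlcharrefreplace` error handler swaps every non-ASCII character for a numeric character reference, and `html.unescape` turns the references back into characters.

```python
import html

text = "café ☃"  # hypothetical sample text with two non-ASCII characters

# Encode to pure ASCII; characters outside ASCII become numeric
# character references such as &#233;.
ascii_bytes = text.encode("ascii", errors="xmlcharrefreplace")
print(ascii_bytes)  # b'caf&#233; &#9731;'

# An HTML-aware consumer can turn the references back into characters.
restored = html.unescape(ascii_bytes.decode("ascii"))
print(restored == text)  # True
```

The bytes on the wire never leave ASCII, yet the reader still sees the accented letter and the snowman.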

#2


14  

"1 &lt; 2" lets you put "1 < 2" in your page.

Long answer:

Since HTML uses '<' to open tags, you can't just type '<' if you want that as text. Therefore, you have to have a way to say "I want the text < in my page". Whoever designed HTML (or, actually, SGML, HTML's predecessor) decided to use '&something;', so you can also insert things like a non-breaking space, '&nbsp;' (a space that is not collapsed and does not allow a line break). Of course, now you need a way to say '&', so you get '&amp;'...
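
If you generate pages from code, you rarely type these by hand. As a rough sketch (the sample strings are made up), Python's standard `html.escape` does the replacement, and it also shows why '&' needs its own entity: escaping text that already looks like an entity has to stay reversible.

```python
import html

raw = "1 < 2 && x > 0"  # hypothetical text that should appear literally on the page

# quote=False leaves quotation marks alone; only &, < and > are replaced.
escaped = html.escape(raw, quote=False)
print(escaped)  # 1 &lt; 2 &amp;&amp; x &gt; 0

# Text that already looks like an entity survives a round trip,
# because the ampersand itself is escaped.
literal = html.escape("&lt;", quote=False)
print(literal)  # &amp;lt;

assert html.unescape(escaped) == raw
assert html.unescape(literal) == "&lt;"
```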

#3


7  

They aren't, apart from &amp;, &lt;, &gt;, &quot; and probably &nbsp;. For all other characters, just use UTF-8.

#4


4  

In SGML and XML they aren't just for characters. They are a generic inclusion mechanism, and their use for special characters is just one of many cases.

<!ENTITY signature "<hr/><p>Regards, <i>&myname;</i></p>">
<!ENTITY myname "John Doe">

This kind of entity is not useful for websites, because it works only in XML mode, and you can't use an external DTD file without enabling "validating" parsing mode in the browser configuration.
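
Outside the browser you can watch the inclusion mechanism work. The sketch below assumes Python's standard `xml.etree.ElementTree` (backed by expat), which expands entities declared in the internal DTD subset; the `note` wrapper element is made up for the example.

```python
import xml.etree.ElementTree as ET

# The two entity declarations from above, placed in an internal DTD subset.
# The <note> root element is a hypothetical wrapper.
doc = """<!DOCTYPE note [
  <!ENTITY signature "<hr/><p>Regards, <i>&myname;</i></p>">
  <!ENTITY myname "John Doe">
]>
<note>&signature;</note>"""

root = ET.fromstring(doc)

# The single reference &signature; has become real elements,
# and the nested &myname; reference has been resolved too.
print([child.tag for child in root])  # ['hr', 'p']
print(root.find("p/i").text)          # John Doe
```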


Entities can be expanded recursively. This allows XML to be used for a denial-of-service attack called the "Billion Laughs" attack.
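
To get a feel for the numbers, here is a small sketch that computes how long a nested entity would be after expansion, without actually expanding it. The entity names `lol0` to `lol9` and the helper `expanded_length` are made up for illustration; this is not how any particular parser does it.

```python
import re

def expanded_length(name, entities, cache=None):
    """Length of an entity's fully expanded text, computed without building it.

    Assumes the definitions contain no cycles (XML forbids them anyway).
    """
    if cache is None:
        cache = {}
    if name not in cache:
        total = 0
        # re.split with a capture group alternates literal text (even
        # indices) and referenced entity names (odd indices).
        parts = re.split(r"&(\w+);", entities[name])
        for i, part in enumerate(parts):
            if i % 2 == 0:
                total += len(part)
            else:
                total += expanded_length(part, entities, cache)
        cache[name] = total
    return cache[name]

# lol0 is plain text; each later level references the previous one ten times.
entities = {"lol0": "lol"}
for level in range(1, 10):
    entities[f"lol{level}"] = f"&lol{level - 1};" * 10

print(expanded_length("lol9", entities))  # 3000000000
```

Ten short declarations are enough to ask a naive parser for three billion characters, which is where the name comes from.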


Firefox uses entities internally (in XUL and such) for internationalization and brand-independent messages (to make life easier for Flock and IceWeasel):

<!ENTITY hidemac.label "Hide &brandShortName;">
<!ENTITY hidewin.label "Hide - &brandShortName;">

In HTML, you only need &lt;, &amp; and &quot; to avoid ambiguity between text and markup.
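
The &quot; case matters mostly inside attribute values. A short sketch with Python's `html.escape` (the `title` string and the `img` tag are made up for the example):

```python
import html

title = 'Tom & "Jerry" <3'  # hypothetical text destined for an attribute

# quote=True also escapes quotation marks, so the value cannot
# terminate the surrounding attribute early.
tag = f'<img alt="{html.escape(title, quote=True)}">'
print(tag)  # <img alt="Tom &amp; &quot;Jerry&quot; &lt;3">
```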

All other entities are basically obsoleted by Unicode encodings and remain only as a convenience (but a good text editor should have macros/snippets that can replace them).


In XHTML all entities except the basic few are problematic, because they won't work with stand-alone XML parsers (e.g. &nbsp; won't work).

To parse all XHTML entities you need a validating XML parser (the option is usually called "resolve externals"), which is slower and needs a DTD catalog set up. If you ignore or screw up your DTD catalog, you'll be participating in a DDoS of the W3C servers.
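
The &nbsp; problem is easy to reproduce with a stand-alone parser. This sketch uses Python's `xml.etree.ElementTree`, which does not load external DTDs; the helper name `parses` is made up.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the text is accepted by the stand-alone XML parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(parses("<p>a&nbsp;b</p>"))  # False: &nbsp; is not defined in plain XML
print(parses("<p>a&#160;b</p>"))  # True: the numeric reference needs no DTD
print(parses("<p>a &lt; b</p>"))  # True: &lt; is one of XML's built-in entities
```

Writing the numeric reference (or the raw UTF-8 character) sidesteps the DTD lookup entirely.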

#5


3  

Character entities are used to represent characters that are reserved for writing HTML, e.g. <, >, /, &, etc. If you want these characters to appear in your content, you should use character entities; this helps the parser distinguish between content and markup.

#6


1  

You use entities to help the parser distinguish between characters that should be interpreted as markup and characters you actually want to show the user, since HTML reserves a special set of characters for itself.

Typing this literally in HTML

I don't mean it like that </sarcasm>

will cause the "</sarcasm>" tag to disappear, e.g.

I don't mean it like that

as HTML does not have a tag defined as such. In this case, using entities will allow the text to display properly, e.g.

No, really! &lt;/sarcasm&gt;

gives

No, really! </sarcasm>

as desired.
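
You can see roughly the same thing without a browser. The sketch below uses Python's standard `html.parser.HTMLParser` as a stand-in for a browser's parser (an assumption; real browsers are more elaborate). The class `TextCollector` and the helper `visible_text` are made up for the example.

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Collects the text a reader would see, plus any end tags encountered."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text = []
        self.end_tags = []

    def handle_data(self, data):
        self.text.append(data)

    def handle_endtag(self, tag):
        self.end_tags.append(tag)

def visible_text(markup):
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.text), parser.end_tags

# Typed literally, "</sarcasm>" is consumed as a tag and is not part of the text.
print(visible_text("I don't mean it like that </sarcasm>"))
# ("I don't mean it like that ", ['sarcasm'])

# Written with entities, it comes through as ordinary text.
print(visible_text("No, really! &lt;/sarcasm&gt;"))
# ('No, really! </sarcasm>', [])
```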
