Do double/single quotes need to be replaced in XML body text?

Time: 2023-01-25 20:05:48

Please correct my terminology here if it's off:

The 5 character substitutions for XML are:

  • &amp; ( & )
  • &lt; ( < )
  • &gt; ( > )
  • &quot; ( " )
  • &apos; ( ' )

Do all of these substitutions need to happen in element text? Or only in attribute text? (terminology correction?)

e.g. is this valid XML?

<myelement>x && y</myelement>
<myelement>And I quote, "no"</myelement>

&gt; and &lt; seem obvious to replace in this context, but I'm not clear whether the replacement rules are global for the entire XML document, or whether they apply differently to different parts of the document (for example, CDATA sections apply different rules).

Assumption: this is invalid XML:

<myelement field="no & allowed here"/>
<myelement field="no <> allowed here"/>

Quotes are obvious delimiters of attributes, and <> are obvious delimiters of element text.

1 answer

#1


10  

In element content you only need to escape & and <; you never need to escape single or double quotes, and you need to escape > only if it appears as part of the sequence ]]> (many people replace it unconditionally, because that's simpler).
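
As a rough illustration of the element-content rule, here is a minimal Python sketch built on the standard library's `xml.sax.saxutils.escape`. The helper name `escape_element_text` is made up for this example. Note that `escape` replaces `>` unconditionally, which is the simpler approach mentioned above, and leaves quotes untouched.

```python
from xml.sax.saxutils import escape


def escape_element_text(text: str) -> str:
    """Escape text for use as XML element content (hypothetical helper).

    escape() replaces &, < and >. Quotes are left alone because they
    have no special meaning in element content.
    """
    return escape(text)


if __name__ == "__main__":
    print(escape_element_text("x && y"))
    print(escape_element_text('And I quote, "no"'))
    print(escape_element_text("a ]]> b"))
```

Escaping `>` everywhere is slightly more than the rule requires, but it guarantees the sequence `]]>` can never appear in the output.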

In attribute content you only need to escape & and < and either ' or ", depending on which one was used as the attribute delimiter.
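
For attribute values, one possible sketch in Python uses `xml.sax.saxutils.quoteattr`, which picks a delimiter and escapes the value to match. The helper `render_attribute` is a hypothetical name introduced only for this example. `quoteattr` also escapes `>` and some whitespace characters, which goes beyond the minimum described above.

```python
from xml.sax.saxutils import quoteattr


def render_attribute(name: str, value: str) -> str:
    """Render name=value with the value quoted and escaped (hypothetical helper).

    quoteattr() escapes & and <, then chooses the delimiter: double quotes
    by default, single quotes if the value contains only double quotes, and
    &quot; escaping if the value contains both kinds of quote.
    """
    return f"{name}={quoteattr(value)}"


if __name__ == "__main__":
    print(render_attribute("field", "no & allowed here"))
    print(render_attribute("field", "no <> allowed here"))
    print(render_attribute("field", 'say "hi"'))
    print(render_attribute("field", "it's"))
```

Only the quote character that matches the chosen delimiter ever needs escaping, so switching delimiters often avoids escaping quotes at all.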

Entities starting with & are not recognized in comments or CDATA sections, or in element or attribute names, so special characters must not be escaped in those contexts.
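
To see these rules in action, the following sketch parses a few snippets with Python's `xml.etree.ElementTree`. It assumes the parser's `ParseError` is a reasonable stand-in for "not well-formed"; the helper `is_well_formed` is a hypothetical name for this example, and the snippets are taken from the question.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# In element content, &amp; is recognized and decoded back to &.
escaped = ET.fromstring("<myelement>x &amp;&amp; y</myelement>")

# Inside a CDATA section, &amp; is NOT recognized and stays literal.
cdata = ET.fromstring("<myelement><![CDATA[x &amp;&amp; y]]></myelement>")

# Raw & and < are allowed inside CDATA without any escaping.
raw_cdata = ET.fromstring("<myelement><![CDATA[x && y < z]]></myelement>")

if __name__ == "__main__":
    print(escaped.text)
    print(cdata.text)
    print(raw_cdata.text)
    print(is_well_formed("<myelement>x && y</myelement>"))
    print(is_well_formed('<myelement>And I quote, "no"</myelement>'))
    print(is_well_formed('<myelement field="no & allowed here"/>'))
    print(is_well_formed('<myelement field="no <> allowed here"/>'))
```

This is why escaping inside CDATA or comments is a mistake: the escaped form would come back out of the parser as literal text instead of the intended character.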
