The difference between PCDATA and CDATA in a DTD

Date: 2022-11-07 16:29:03

What is the difference between #PCDATA and #CDATA in DTD?

6 answers

#1


65  

PCDATA - Parsed Character Data

XML parsers normally parse all the text in an XML document.

CDATA - (Unparsed) Character Data

The term CDATA is used for text data that should not be parsed by the XML parser.

Characters like "<" and "&" are illegal as literal characters in the parsed text of XML elements; they have to be written as &lt; and &amp; or placed inside a CDATA section.
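To make the point about "<" and "&" concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The choice of parser and the <note> element are my own assumptions for illustration; any conforming XML parser should behave the same way on these inputs.

```python
import xml.etree.ElementTree as ET

# A bare "&" in element content is not well-formed, so the parser rejects it.
try:
    ET.fromstring("<note>fish & chips</note>")
    rejected = False
except ET.ParseError:
    rejected = True

# The same text is accepted once it is escaped or wrapped in a CDATA section.
escaped = ET.fromstring("<note>fish &amp; chips</note>").text
wrapped = ET.fromstring("<note><![CDATA[fish & chips]]></note>").text

print(rejected)  # True
print(escaped)   # fish & chips
print(wrapped)   # fish & chips
```

Both accepted spellings hand the application the same string; the difference is only in how the document is written.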

#2


62  

  • PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as markup and entities will be expanded.
  • CDATA is text that will not be parsed by a parser. Tags inside the text will not be treated as markup and entities will not be expanded.

By default, everything is PCDATA. In the following example, ignoring the root, <bar> will be parsed, and it will have no text content of its own, only one child element.

<?xml version="1.0"?>
<foo>
<bar><test>content!</test></bar>
</foo>

When we want to specify that an element will contain only text, and no child elements, we use the keyword #PCDATA, because this keyword specifies that the element must contain parsable character data. That is any text in which the less-than (<) and ampersand (&) characters are escaped as &lt; and &amp;. The greater-than sign (>), the apostrophe (') and the double quote (") may appear literally in element content; the only restriction is that ">" must be escaped when it directly follows "]]".
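As a rough illustration of a text-only element, the sketch below declares <bar> as (#PCDATA) in an internal DTD subset and parses it with Python's xml.etree.ElementTree. The sample text is made up, and ElementTree is an assumption on my part: it reads the DTD but does not validate against it, so this only shows which characters need escaping.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <bar> is declared as text-only with (#PCDATA).
doc = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ELEMENT foo (bar)>
  <!ELEMENT bar (#PCDATA)>
]>
<foo><bar>1 &lt; 2 &amp; 3 > 2, 'single' and "double"</bar></foo>"""

bar = ET.fromstring(doc).find("bar")

print(bar.text)  # 1 < 2 & 3 > 2, 'single' and "double"
print(len(bar))  # 0, because <bar> has no child elements
```

Only "<" and "&" had to be escaped here; the ">" and both kinds of quotes went through untouched.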

In the next example, <bar> contains a CDATA section. Its content will not be parsed as markup, so the text of <bar> is the literal string <test>content!</test>.

<?xml version="1.0"?>
<foo>
<bar><![CDATA[<test>content!</test>]]></bar>
</foo>
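The difference between the two documents can be checked with a DOM parser. This is a sketch using Python's xml.dom.minidom (my choice, not part of the answer); the inner element is renamed to item here, which is an arbitrary change for the example.

```python
from xml.dom import minidom, Node

plain = minidom.parseString("<foo><bar><item>content!</item></bar></foo>")
cdata = minidom.parseString(
    "<foo><bar><![CDATA[<item>content!</item>]]></bar></foo>"
)

plain_child = plain.getElementsByTagName("bar")[0].firstChild
cdata_child = cdata.getElementsByTagName("bar")[0].firstChild

# Without CDATA, the child of <bar> is an element node named "item".
print(plain_child.nodeType == Node.ELEMENT_NODE, plain_child.nodeName)

# With CDATA, the child is a CDATA section node holding the literal characters.
print(cdata_child.nodeType == Node.CDATA_SECTION_NODE, cdata_child.data)

# No <item> element exists in the second document at all.
print(len(cdata.getElementsByTagName("item")))  # 0
```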

There are several content models in SGML. The #PCDATA content model says that an element may contain plain text. The "parsed" part means that markup in it (including PIs, comments and SGML directives) is parsed instead of displayed as raw text. It also means that entity references are replaced.

Another content model that allows plain text is CDATA. XML does not let a DTD declare an element's content as CDATA, but in SGML it means that markup and entity references are ignored in the contents of the element. In attributes of CDATA type, however, entity references are replaced.

In XML, #PCDATA is the only plain-text content model. You use it whenever you want to allow text content in an element. CDATA behaviour can be requested explicitly through a CDATA section inside #PCDATA content, but element content cannot be declared as CDATA by default.

In a DTD, an attribute that holds arbitrary text is declared with the type CDATA. The CDATA keyword in an attribute declaration has a different meaning than a CDATA section in an XML document. In a CDATA section all characters are legal (including <, >, &, ' and "), except the sequence "]]>", which ends the section.
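The two meanings of CDATA can be seen side by side in a small sketch. It assumes Python's xml.etree.ElementTree and a made-up <link> element with an href attribute declared as CDATA; entity references are expanded in the attribute value but left alone inside the CDATA section.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: href is an attribute of type CDATA,
# while the element content is wrapped in a CDATA section.
doc = """<!DOCTYPE link [
  <!ELEMENT link (#PCDATA)>
  <!ATTLIST link href CDATA #REQUIRED>
]>
<link href="search?q=a&amp;b"><![CDATA[q=a&amp;b]]></link>"""

link = ET.fromstring(doc)

print(link.get("href"))  # search?q=a&b   (entity reference replaced)
print(link.text)         # q=a&amp;b      (left exactly as written)
```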

#PCDATA is not appropriate for the type of an attribute. It is used for the type of "leaf" text.

#PCDATA is prefixed with a hash for historical reasons: the convention comes from SGML, where the hash marks a reserved name and keeps the keyword from being read as an element named PCDATA.

#3


10  

From here (Google is your friend):

In a DTD, PCDATA and CDATA are used to assert something about the allowable content of elements and attributes, respectively. In an element's content model, #PCDATA says that the element contains (may contain) "any old text." (With exceptions as noted below.) In an attribute's declaration, CDATA is one sort of constraint you can put on the attribute's allowable values (other sorts, all mutually exclusive, include ID, IDREF, and NMTOKEN). An attribute whose allowable values are CDATA can (like PCDATA in an element) contain "any old text."
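For a feel of how these attribute types appear in a DTD, here is a small sketch that asks Python's xml.parsers.expat to report each attribute declaration it reads. The <note> element and its three attributes are invented for the example, and expat only reports the declarations; it does not validate the document against them.

```python
from xml.parsers import expat

# Hypothetical DTD: one text-only element with three attribute types.
doc = """<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ATTLIST note
    lang CDATA   #IMPLIED
    id   ID      #REQUIRED
    kind NMTOKEN #IMPLIED>
]>
<note id="n1" lang="en US" kind="memo">any old text</note>"""

attr_types = {}

def on_attlist(elname, attname, type_, default, required):
    # Called once per attribute definition found in the DTD.
    attr_types[attname] = type_

parser = expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist
parser.Parse(doc, True)

print(attr_types)
```

Here lang, being CDATA, may hold a value with a space in it, which would not be allowed for an NMTOKEN or ID attribute.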

A potentially really confusing issue is that there's another "CDATA," also referred to as marked sections. A marked section is a portion of element (#PCDATA) content delimited with special strings: <![CDATA[ to open it and ]]> to close it. If you remember that PCDATA is "parsed character data," a CDATA section is literally the same thing, without the "parsed." Parsers transmit the content of a marked section to downstream applications without hiccupping every time they encounter special characters like < and &. This is useful when you're coding a document that contains lots of those special characters (like scripts and code fragments); it's easier on data entry, and easier on reading, than the corresponding entity references.

So you can infer that the exception to the "any old text" rule is that PCDATA cannot include any of these unescaped special characters, UNLESS they fall within the scope of a CDATA marked section.
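The trade-off between a marked section and entity references can be sketched like this. The helper to_cdata and the sample snippet are hypothetical names of mine; escape comes from Python's standard xml.sax.saxutils module. The helper also deals with the one thing a CDATA section cannot contain, the "]]>" sequence, by splitting it across two sections.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def to_cdata(text):
    """Wrap text in a CDATA section (hypothetical helper)."""
    # "]]>" would end the section early, so split it across two sections.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

snippet = "if (a < b && rows[idx[0]]> 0) { run(); }"

as_cdata = "<code>" + to_cdata(snippet) + "</code>"
as_entities = "<code>" + escape(snippet) + "</code>"

print(as_cdata)
print(as_entities)

# Both spellings carry the same characters once parsed.
print(ET.fromstring(as_cdata).text == snippet)     # True
print(ET.fromstring(as_entities).text == snippet)  # True
```

The CDATA form stays readable as code, while the escaped form needs no special handling for "]]>"; which one is easier depends on the content.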

#4


7  

PCDATA - parsed character data. The parser parses all of this data in an XML document.

Example:

<family>
    <mother>mom</mother>
    <father>dad</father>
</family>

Here, the family element contains two more elements, "mother" and "father". So the parser parses further to get the text of mother and father, giving the value of family as "mom dad".

CDATA - unparsed character data. This is data that should not be parsed further in an XML document.

<family>
    <![CDATA[ 
       <mother>mom</mother>
       <father>dad</father>
    ]]>
</family>

Here, the value of family will be the literal text <mother>mom</mother><father>dad</father> (plus the surrounding whitespace); no child elements are created.
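The two family documents can be compared with a few lines of Python. This is only a sketch with xml.etree.ElementTree (my assumption as the parser), and the whitespace from the examples is left out to keep the printed values short.

```python
import xml.etree.ElementTree as ET

parsed = ET.fromstring(
    "<family><mother>mom</mother><father>dad</father></family>"
)
literal = ET.fromstring(
    "<family><![CDATA[<mother>mom</mother><father>dad</father>]]></family>"
)

# Parsed content: two child elements, each with its own text.
print(len(parsed), [child.text for child in parsed])  # 2 ['mom', 'dad']

# CDATA content: no children, just one string that happens to look like tags.
print(len(literal), literal.text)
```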

#5


3  

The main difference between PCDATA and CDATA is:

PCDATA - basically used for elements, while

CDATA - used for attributes of XML, i.e. in ATTLIST declarations

#6


0  

CDATA (Character DATA): It is similar to a comment, but it is part of the document. That is, CDATA is data that belongs to the document, but it is not parsed as XML.
Note: an XML comment is normally dropped while parsing, but CDATA comes through as it is.

PCDATA (Parsed Character DATA): By default, everything is PCDATA. PCDATA is data that is parsed as XML.
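The comment comparison can be tried out with a short sketch. It assumes Python's xml.etree.ElementTree with its default settings, which discard comments; other parsers or configurations may keep them. The <msg> element is made up.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<msg><!-- a comment --><![CDATA[<b>kept as text</b>]]></msg>"
)

# The comment is gone; the CDATA content arrives unchanged as plain text.
print(root.text)  # <b>kept as text</b>
print(len(root))  # 0, the "<b>" inside CDATA never became an element
```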
