I'm generating XML in a view with CakePHP's Xml core library:
$xml = Xml::build($data, array('return' => 'domdocument'));
echo $xml->saveXML();
The view is fed from the controller with an array:
$this->set(
    array(
        'data' => array(
            'root' => array(
                array(
                    '@id' => 'A & B: OK',
                    'name' => 'C & D: OK',
                    'sub1' => array(
                        '@id' => 'E & F: OK',
                        'name' => 'G & H: OK',
                        'sub2' => array(
                            array(
                                '@id' => 'I & J: OK',
                                'name' => 'K & L: OK',
                                'sub3' => array(
                                    '@id' => 'M & N: OK',
                                    'name' => 'O & P: OK',
                                    'sub4' => array(
                                        '@id' => 'Q & R: OK',
                                        '@' => 'S & T: ERROR',
                                    ),
                                ),
                            ),
                        ),
                    ),
                ),
            ),
        ),
    )
);
For whatever reason, CakePHP issues an internal call like this:
$dom = new DOMDocument;
$key = 'sub4';
$childValue = 'S & T: ERROR';
$dom->createElement($key, $childValue);
... which triggers a PHP warning:
Warning (2): DOMDocument::createElement(): unterminated entity reference T [CORE\Cake\Utility\Xml.php, line 292]
... because (as documented) DOMDocument::createElement does not escape values. However, CakePHP only passes the value through unescaped for certain nodes, as the test case illustrates.
Am I doing something wrong or I just hit a bug in CakePHP?
4 answers
#1
15
This might be a bug in PHP's DOMDocument::createElement() method. You can avoid it: create the text node separately and append it to the element node.
$dom = new DOMDocument;
$dom
    ->appendChild($dom->createElement('element'))
    ->appendChild($dom->createTextNode('S & T: ERROR'));
var_dump($dom->saveXml());
Output: https://eval.in/134277
string(58) "<?xml version="1.0"?>
<element>S &amp; T: ERROR</element>
"
This is the intended way to add text nodes to a DOM. You always create a node (element, text, CDATA, ...) and append it to its parent node. You can add more than one node, and different kinds of nodes, to one parent, as in the following example:
$dom = new DOMDocument;
$p = $dom->appendChild($dom->createElement('p'));
$p->appendChild($dom->createTextNode('Hello '));
$b = $p->appendChild($dom->createElement('b'));
$b->appendChild($dom->createTextNode('World!'));
echo $dom->saveXml();
Output:
<?xml version="1.0"?>
<p>Hello <b>World!</b></p>
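If you build many elements this way, the two calls can be wrapped once. The following is only a sketch: appendTextElement() is a hypothetical helper name, not something PHP or CakePHP provides, and it assumes the parent is either the document itself or a node that already belongs to one.

```php
<?php
// Hypothetical helper (not part of PHP or CakePHP): create <$name> under
// $parent and add $text as a text node, so the serializer escapes it.
function appendTextElement(DOMNode $parent, $name, $text)
{
    // A DOMDocument has no ownerDocument; every other node exposes one.
    $doc = $parent instanceof DOMDocument ? $parent : $parent->ownerDocument;
    $element = $parent->appendChild($doc->createElement($name));
    $element->appendChild($doc->createTextNode($text));
    return $element;
}

$dom = new DOMDocument;
$root = $dom->appendChild($dom->createElement('root'));
appendTextElement($root, 'sub4', 'S & T: ERROR');
$xml = $dom->saveXML();
echo $xml;
```

Because the value never goes through the second argument of createElement(), the ampersand should reach the output as &amp; without any manual escaping.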
#2
4
This is in fact because the DOMDocument methods want the value to contain characters that are valid in the output markup; that is, characters such as & will break the content and generate an "unterminated entity reference" error.
Just pass the value through htmlentities() before using it to create elements:
$dom = new DOMDocument;
$key = 'sub4';
$childValue = htmlentities('S & T: ERROR');
$dom->createElement($key, $childValue);
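One caveat: htmlentities() also emits HTML-only named entities such as &eacute;, which XML does not define on its own. A minimal sketch of the same idea with htmlspecialchars() and the ENT_XML1 flag, assuming PHP 5.4 or later; the sample value and variable names are made up for illustration:

```php
<?php
$value = 'S & T: <ERROR>';

// ENT_XML1 restricts the output to entities that XML itself defines.
$escaped = htmlspecialchars($value, ENT_XML1, 'UTF-8');

$dom = new DOMDocument;
$element = $dom->appendChild($dom->createElement('sub4', $escaped));

// createElement() parses the entities, so the node holds the original
// text again and saveXML() escapes it on output.
$roundTrip = $element->textContent;
$xml = $dom->saveXML();
echo $xml;
```

Here $roundTrip should equal the unescaped input, which is a quick way to check that the value was escaped exactly once.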
#3
0
It is because of this character: &
You need to replace it with the relevant HTML entity, &amp;.
To perform the translation, you can use the htmlspecialchars function. You have to escape the value when writing to the nodeValue property. As quoted from a 2005 bug report:
ampersands ARE properly encoded when setting the property textContent. Unfortunately they are not encoded when the text string is passed as the optional second argument to DOMElement::createElement. You must create a text node, set the textContent, then append the text node to the new element.
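What the quoted report describes can be sketched as follows, assuming a PHP version in which textContent is writable (older releases expose it as read-only). The element name is taken from the question; the rest is illustrative:

```php
<?php
$dom = new DOMDocument;
$element = $dom->appendChild($dom->createElement('sub4'));

// Assigning textContent stores the string as literal text; entities in
// it are not parsed, so the raw ampersand is safe here.
$element->textContent = 'S & T: ERROR';

$xml = $dom->saveXML();
echo $xml;
```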
htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
This is the translation table:
'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
"'" (single quote) becomes '&#039;' (or '&apos;') only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'
This script will do the translations recursively:
<?php
// Recursively escape every scalar value in a nested array.
function clean($type) {
    if (is_array($type)) {
        foreach ($type as $key => $value) {
            $type[$key] = clean($value);
        }
        return $type;
    } else {
        $string = htmlspecialchars($type, ENT_QUOTES, 'UTF-8');
        return $string;
    }
}

$data = array(
    'data' => array(
        'root' => array(
            array(
                '@id' => 'A & B: OK',
                'name' => 'C & D: OK',
                'sub1' => array(
                    '@id' => 'E & F: OK',
                    'name' => 'G & H: OK',
                    'sub2' => array(
                        array(
                            '@id' => 'I & J: OK',
                            'name' => 'K & L: OK',
                            'sub3' => array(
                                '@id' => 'M & N: OK',
                                'name' => 'O & P: OK',
                                'sub4' => array(
                                    '@id' => 'Q & R: OK',
                                    '@' => 'S & T: ERROR',
                                ),
                            ),
                        ),
                    ),
                ),
            ),
        ),
    ),
);
$data = clean($data);
print_r($data); // display the escaped array
Output
Array
(
    [data] => Array
        (
            [root] => Array
                (
                    [0] => Array
                        (
                            [@id] => A &amp; B: OK
                            [name] => C &amp; D: OK
                            [sub1] => Array
                                (
                                    [@id] => E &amp; F: OK
                                    [name] => G &amp; H: OK
                                    [sub2] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [@id] => I &amp; J: OK
                                                    [name] => K &amp; L: OK
                                                    [sub3] => Array
                                                        (
                                                            [@id] => M &amp; N: OK
                                                            [name] => O &amp; P: OK
                                                            [sub4] => Array
                                                                (
                                                                    [@id] => Q &amp; R: OK
                                                                    [@] => S &amp; T: ERROR
                                                                )
                                                        )
                                                )
                                        )
                                )
                        )
                )
        )
)
#4
-1
The problem seems to be in nodes that have both attributes and values and thus need to use the @ syntax:
'@id' => 'A & B: OK',    // <-- Handled as plain text
'name' => 'C & D: OK',   // <-- Handled as plain text
'@' => 'S & T: ERROR',   // <-- Handled as raw XML
I've written a little helper function:
protected function escapeXmlValue($value)
{
    return is_null($value) ? null : htmlspecialchars($value, ENT_XML1, 'UTF-8');
}
... and take care of calling it manually when I create the array:
'@id' => 'A & B: OK',
'name' => 'C & D: OK',
'@' => $this->escapeXmlValue('S & T: NOW WORKS FINE'),
It's hard to say whether it's a bug or a feature, since the documentation doesn't mention it.
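For larger arrays the manual calls could be replaced by a recursive pass. This is only a sketch under the assumption made in this answer, namely that '@' values are the only ones passed through as raw XML; escapeRawValues() is a hypothetical name, and escaping the other keys as well would presumably double-escape them.

```php
<?php
// Hypothetical helper: walk the nested array and escape only the values
// stored under the '@' key, leaving attributes and plain children alone.
function escapeRawValues(array $data)
{
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            $data[$key] = escapeRawValues($value);
        } elseif ($key === '@' && $value !== null) {
            $data[$key] = htmlspecialchars((string) $value, ENT_XML1, 'UTF-8');
        }
    }
    return $data;
}

$escaped = escapeRawValues(array(
    'root' => array(
        '@id' => 'A & B: OK',
        'name' => 'C & D: OK',
        'sub4' => array(
            '@id' => 'Q & R: OK',
            '@' => 'S & T: ERROR',
        ),
    ),
));
print_r($escaped);
```

The strict comparison against '@' matters: numeric list keys such as 0 must not be treated as the text key.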