I have HTML data that I'm converting into a Dom4J document.
I ran into this error:
org.dom4j.DocumentException: Error on line 1 of document : Reference is not allowed in prolog. Nested exception: Reference is not allowed in prolog.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)
at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
at MonTest.main(MonTest.java:21)
Nested exception:
org.xml.sax.SAXParseException: Reference is not allowed in prolog.
The cause was an "&" character, which I needed to escape as &amp; in order to build the document.
In XML, it seems that five characters need escaping: >, <, ", & and ' (gt, lt, quot, amp, apos).
However, how can I escape these characters in the text content without also escaping the markup of the elements themselves? For example:
<div id="test" class='toto'>A&A<A"A</div>
should give:
<div id="test" class='toto'>A&amp;A&lt;A&quot;A</div>
and not
&lt;div id=&quot;test&quot; class=&apos;toto&apos;&gt;A&amp;A&lt;A&quot;A&lt;/div&gt;
Thank you,
Answers
#1
7
Escape strings before adding them to the XML document, for example with the StringEscapeUtils.escapeXml method from Apache Commons Lang. Alternatively, use a library to build the XML, e.g. http://code.google.com/p/joox/.
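
As a rough illustration of "escape before adding", here is a minimal sketch that uses only the JDK. The class EscapeTextDemo and its methods escapeText, buildDiv and parseRootText are hypothetical names made up for this example; escapeText is a hand-written stand-in for StringEscapeUtils.escapeXml, and the JDK parser stands in for Dom4J, so treat it as one possible approach rather than the definitive one. The idea is to escape only the text content and leave the surrounding markup untouched.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: escapes text content only, never the tags.
final class EscapeTextDemo {

    // Replaces the five XML special characters with their entities.
    // Hand-written stand-in for a library escaper.
    static String escapeText(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Wraps the escaped text in the div markup from the question.
    // The markup itself is written as-is and is not escaped.
    static String buildDiv(String rawText) {
        return "<div id=\"test\" class='toto'>" + escapeText(rawText) + "</div>";
    }

    // Parses the XML with the JDK parser and returns the root element's text,
    // which confirms the string is well-formed and the entities decode back.
    static String parseRootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML is not well-formed", e);
        }
    }

    public static void main(String[] args) {
        String xml = buildDiv("A&A<A\"A");
        System.out.println(xml);                // escaped text inside intact markup
        System.out.println(parseRootText(xml)); // original text after parsing
    }
}
```

The string returned by buildDiv should also be acceptable to DocumentHelper.parseText from the question, since the "&" is no longer a bare reference. Note that this only works when the text is escaped before it is merged with the markup; once raw text and tags are mixed in a single string, there is no reliable way to tell which "<" belongs to a tag.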