My code writes an XML file with the LSSerializer class:
DOMImplementation impl = doc.getImplementation();
DOMImplementationLS implLS = (DOMImplementationLS) impl.getFeature("LS","3.0");
LSSerializer ser = implLS.createLSSerializer();
String str = ser.writeToString(doc);
System.out.println(str);
String file = racine+"/"+p.getNom()+".xml";
OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream(file),"UTF-8");
out.write(str);
out.close();
The XML is well-formed, but when I parse it, I get an error.
Parse code:
File f = new File(racine+"/"+filename);
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(f);
XPathFactory xpfactory = XPathFactory.newInstance();
XPath xp = xpfactory.newXPath();
String expression;
expression = "root/nom";
String nom = xp.evaluate(expression, doc);
The error:
[Fatal Error] Terray.xml:1:40: Content is not allowed in prolog.
9 août 2011 19:42:58 controller.MakaluController activatePatient
GRAVE: null
org.xml.sax.SAXParseException: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:208)
at model.MakaluModel.setPatientActif(MakaluModel.java:147)
at controller.MakaluController.activatePatient(MakaluController.java:59)
at view.ListePatientsPanel.jButtonOKActionPerformed(ListePatientsPanel.java:92)
...
Now, after some research, I found that this error is due to a "hidden" character at the very beginning of the XML.
In fact, I can fix the bug by creating the XML file manually.
But where is the error in the XML writing? (When I try to println the string, there is no space before the
Solution: change the serializer
I used the UTF-16 encoding solution for a while, but it was not very stable. So I found a new solution: change the serializer of the XML document, so that the encoding declared in the XML header is consistent with the file encoding:
DOMSource domSource = new DOMSource(doc);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
String file = racine+"/"+p.getNom()+".xml";
OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream(file),"UTF-8");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.INDENT,"yes");
transformer.transform(domSource, new StreamResult(out));
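An alternative that keeps LSSerializer may also be possible: instead of calling writeToString and re-encoding the string by hand, the serializer can write to an LSOutput whose encoding is set explicitly, so the XML declaration and the bytes on disk should agree. The following is only a sketch under that assumption; the class name LsOutputSketch and its helper methods are hypothetical, and an in-memory buffer stands in for the real file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

public class LsOutputSketch {

    // Hypothetical helper: builds a one-element document to serialize.
    static Document newSampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("root"));
        return doc;
    }

    // Serializes doc as UTF-8 bytes; the serializer picks the declared
    // encoding from the LSOutput instead of defaulting to UTF-16.
    static void writeUtf8(Document doc, OutputStream target) {
        DOMImplementationLS implLS =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer ser = implLS.createLSSerializer();
        LSOutput output = implLS.createLSOutput();
        output.setEncoding("UTF-8");
        output.setByteStream(target);
        ser.write(doc, output);
    }

    // Parses the serialized bytes back, as the reading code would do with the file.
    static Document parse(byte[] xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeUtf8(newSampleDocument(), buffer);
        Document reparsed = parse(buffer.toByteArray());
        System.out.println(reparsed.getDocumentElement().getNodeName());
    }
}
```

In real code the ByteArrayOutputStream would be replaced by a FileOutputStream on the target file, closed after the write.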
4 solutions
#1
4
But where is the error in the XML writing?
Looks like the error is not in the writing but in the parsing. As you have already discovered, there is a blank character at the beginning of the file, which causes the error in the parse call in your stack trace:
Document doc = builder.parse(f);
The reason you do not see the space when you print it out may be simply the encoding you are using. Try changing this line:
OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream(file),"UTF-8");
to use 'UTF-16' or 'US-ASCII'.
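One way to check where the mismatch comes from is to look at the XML declaration that writeToString produces. The sketch below serializes a tiny document the same way as the question and prints only the declaration; it assumes the JDK's built-in DOM Level 3 LS implementation, and the class and method names are made up for illustration. If the declaration names UTF-16 while the file is written through a UTF-8 writer, that mismatch alone could explain an error reported right after the declaration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class DeclaredEncodingDemo {

    // Builds a one-element document and serializes it with writeToString,
    // mirroring the writing code from the question.
    static String serializeToString() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("root"));
        DOMImplementationLS implLS =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer ser = implLS.createLSSerializer();
        return ser.writeToString(doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = serializeToString();
        // Print only the XML declaration so the declared encoding is easy to spot.
        System.out.println(xml.substring(0, xml.indexOf("?>") + 2));
    }
}
```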
#2
4
I think that it is probably linked to a BOM (Byte Order Mark). See Wikipedia
You can verify with Notepad++, for example: open your file and check the "Encoding" menu to see whether it is "UTF8 without BOM" or "UTF8 with BOM".
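The same check can be done without an editor by looking at the first three bytes of the file, since a UTF-8 BOM is the byte sequence EF BB BF. This is a minimal sketch; the class name BomCheck and the command-line argument are assumptions for illustration. Note that Java's OutputStreamWriter with "UTF-8" does not write a BOM by itself, so a negative result here would point back to the declared encoding rather than a BOM.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class BomCheck {

    // Returns true if the data begins with the UTF-8 byte order mark EF BB BF.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BomCheck <file.xml>");
            return;
        }
        // Reads the whole file; fine for small XML files like the one in the question.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(startsWithUtf8Bom(data) ? "UTF-8 with BOM" : "no UTF-8 BOM");
    }
}
```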
#3
1
Using UTF-16 is the way to go:
OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream(fileName),"UTF-16");
With this, the file can be read with no issues.
#4
0
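Try this code:
InputStream is = new FileInputStream(file);
InputSource source = new InputSource(is); // org.xml.sax.InputSource
source.setEncoding("UTF-8"); // parse(InputStream, String) takes a system ID, not an encoding
Document doc = builder.parse(source);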