Similar to this question, but unfortunately that one didn't help.
I am trying to parse a String to XML in Java and keep getting the error:
[Fatal Error] output.txt:1:1: Content is not allowed in prolog.
I know it must have something to do with my XML string, because I ran a test with very basic XML and the error disappeared.
XML
<?xml version="1.0" encoding="UTF-8"?>
<?xfa generator="ff99v250_01" APIVersion="1.4.3139.0"?>
<jfxpf:XPF xmlns:jfxpf="http://www.xfa.com/schema/xml-package">
<jfxpf:Package>
<jfxpf:Resource Location="GenReq">
<jfxpf:Link ContentType="application/x-jetform-cft" />
</jfxpf:Resource>
<jfxpf:Resource Location="default.xml">
<jfxpf:Content ContentType="text/xml" Location="default.xml">
<xfa:Data xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
<xfa:DataGroup>
<data xmlns:xfe="http://www.xfa.org/schema/xfa-events/1.0" xfe:script="$config.proto.common.template.uri='GenReq'" xfe:event="$config:load">
<?jetform ^Dat ^page 1?>
<FR_NAME>Administrator</FR_NAME>
<JFWF_DELEGATE />
<ADHOC_DLN_ACTOR />
<ADHOC_DLN_MSG />
<ADHOC_DLN_TIME />
<ADHOC_DLN_UNITS>Days</ADHOC_DLN_UNITS>
<ADHOC_RMD_MSG />
<ADHOC_RMD_TIME />
<ADHOC_RMD_UNITS>Days</ADHOC_RMD_UNITS>
<ADHOC_RPT_TIME />
<ADHOC_RPT_UNITS>Days</ADHOC_RPT_UNITS>
<CIRCULATETO />
<COMPLETION />
<FOLLOWUP />
<MSGSUBJECT />
<OTHERFIELD />
<PRIORITY>Low</PRIORITY>
<REQUEST />
<RESPONSE />
<Submit />
<ADHOC_VALIDDATA>True</ADHOC_VALIDDATA>
<JFWF_TRANID>2xxyg9sffane7pwd5j8yv9t49s.1</JFWF_TRANID>
<JFWF_INSTRUCTION>Initiate a General Request. Fill the request form, then identify the next participant.</JFWF_INSTRUCTION>
<JFWF_TRANSPORT>HTTP</JFWF_TRANSPORT>
<JFWF_STATUS>RECEIVED</JFWF_STATUS>
<JFWF_ACTION />
<JFWF_CHOICE>*Select Next Participant,Cancel</JFWF_CHOICE>
<JFWF_VERSION>6.2</JFWF_VERSION>
<JFWF_READONLY>1</JFWF_READONLY>
</data>
</xfa:DataGroup>
</xfa:Data>
</jfxpf:Content>
</jfxpf:Resource>
</jfxpf:Package>
</jfxpf:XPF>
However, I am having trouble finding the text that is causing this issue. My Java code is below:
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
.parse(new InputSource(new StringReader(xml)));
EDIT: Removing the Data node works, so the error is somewhere deep in the XML. This does not throw an error:
<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<?xfa generator=\"ff99v250_01\" APIVersion=\"1.4.3139.0\"?>
<jfxpf:XPF xmlns:jfxpf=\"http://www.xfa.com/schema/xml-package\">
<jfxpf:Package>
<jfxpf:Resource Location=\"GenReq\">
<jfxpf:Link ContentType=\"application/x-jetform-cft\"/>
</jfxpf:Resource>
<jfxpf:Resource Location=\"default.xml\">
<jfxpf:Content ContentType=\"text/xml\" Location=\"default.xml\">
</jfxpf:Content>
</jfxpf:Resource>
</jfxpf:Package>
</jfxpf:XPF>
My Imports
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.swing.JFileChooser;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
2 Solutions
#1
1
The document and sample code you provided work fine in Java 1.8u25:
import static org.junit.Assert.*;
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.junit.Test;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
public class FatalErrorTest
{
@Test
public void as_given() throws SAXException, IOException, ParserConfigurationException
{
String xml ="<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<?xfa generator=\"ff99v250_01\" APIVersion=\"1.4.3139.0\"?>\r\n<jfxpf:XPF xmlns:jfxpf=\"http://www.xfa.com/schema/xml-package\">\r\n <jfxpf:Package>\r\n <jfxpf:Resource Location=\"GenReq\">\r\n <jfxpf:Link ContentType=\"application/x-jetform-cft\" />\r\n </jfxpf:Resource>\r\n <jfxpf:Resource Location=\"default.xml\">\r\n <jfxpf:Content ContentType=\"text/xml\" Location=\"default.xml\">\r\n <xfa:Data xmlns:xfa=\"http://www.xfa.org/schema/xfa-data/1.0/\">\r\n <xfa:DataGroup>\r\n <data xmlns:xfe=\"http://www.xfa.org/schema/xfa-events/1.0\" xfe:script=\"$config.proto.common.template.uri='GenReq'\" xfe:event=\"$config:load\">\r\n <?jetform ^Dat ^page 1?>\r\n <FR_NAME>Administrator</FR_NAME>\r\n <JFWF_DELEGATE />\r\n <ADHOC_DLN_ACTOR />\r\n <ADHOC_DLN_MSG />\r\n <ADHOC_DLN_TIME />\r\n <ADHOC_DLN_UNITS>Days</ADHOC_DLN_UNITS>\r\n <ADHOC_RMD_MSG />\r\n <ADHOC_RMD_TIME />\r\n <ADHOC_RMD_UNITS>Days</ADHOC_RMD_UNITS>\r\n <ADHOC_RPT_TIME />\r\n <ADHOC_RPT_UNITS>Days</ADHOC_RPT_UNITS>\r\n <CIRCULATETO />\r\n <COMPLETION />\r\n <FOLLOWUP />\r\n <MSGSUBJECT />\r\n <OTHERFIELD />\r\n <PRIORITY>Low</PRIORITY>\r\n <REQUEST />\r\n <RESPONSE />\r\n <Submit />\r\n <ADHOC_VALIDDATA>True</ADHOC_VALIDDATA>\r\n <JFWF_TRANID>2xxyg9sffane7pwd5j8yv9t49s.1</JFWF_TRANID>\r\n <JFWF_INSTRUCTION>Initiate a General Request. Fill the request form, then identify the next participant.</JFWF_INSTRUCTION>\r\n <JFWF_TRANSPORT>HTTP</JFWF_TRANSPORT>\r\n <JFWF_STATUS>RECEIVED</JFWF_STATUS>\r\n <JFWF_ACTION />\r\n <JFWF_CHOICE>*Select Next Participant,Cancel</JFWF_CHOICE>\r\n <JFWF_VERSION>6.2</JFWF_VERSION>\r\n <JFWF_READONLY>1</JFWF_READONLY>\r\n </data>\r\n </xfa:DataGroup>\r\n </xfa:Data>\r\n </jfxpf:Content>\r\n </jfxpf:Resource>\r\n </jfxpf:Package>\r\n</jfxpf:XPF>";
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
.parse(new InputSource(new StringReader(xml)));
assertNotNull(doc);
}
}
#2
4
My guess is that the file starts with a BOM character, U+FEFF, which would explain why the error is reported at line 1, column 1. This is a zero-width no-break space that is sometimes used to mark a file as being in some Unicode encoding: UTF-8, UTF-16LE, or UTF-16BE.
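One way to test this guess on the in-memory string is to look at its first character and to compare how the parser treats the same document with and without a leading U+FEFF. The sketch below is only an illustration: the class name `BomProbe` and its method names are made up, and the tiny `<root/>` document stands in for the real XML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper class; the names are illustrative only.
final class BomProbe {

    /** Returns true if the text begins with the byte order mark U+FEFF. */
    static boolean startsWithBom(String text) {
        return !text.isEmpty() && text.charAt(0) == '\uFEFF';
    }

    /** Returns true if the JAXP DOM parser accepts the text as XML. */
    static boolean parses(String xml) throws Exception {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXParseException e) {
            // The exception carries the position of the offending character.
            System.err.println(e.getLineNumber() + ":" + e.getColumnNumber()
                    + " " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String clean = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>";
        String withBom = "\uFEFF" + clean;
        System.out.println("with BOM:    starts=" + startsWithBom(withBom)
                + " parses=" + parses(withBom));
        System.out.println("without BOM: starts=" + startsWithBom(clean)
                + " parses=" + parses(clean));
    }
}
```

If `startsWithBom` returns true for the string being parsed, the BOM survived the step that read the file into a String, because a Reader hands the character to the parser unchanged.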
The BOM character can be removed. Check the file size, and then look at what options your editor offers: save as UTF-8 without BOM, or delete the character.
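Before changing anything, it may help to confirm that the file really carries a BOM. In UTF-8 the BOM is encoded as the three bytes EF BB BF, so a small check of the first three bytes is enough. This is a sketch under that assumption (it does not look for the UTF-16 forms), and the class name `BomBytes` is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names are illustrative only.
final class BomBytes {

    /** Returns true if the file starts with the UTF-8 encoded BOM (EF BB BF). */
    static boolean hasUtf8Bom(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            // read() returns 0..255 for a byte, or -1 at end of file,
            // so a file shorter than three bytes simply fails the comparison.
            return in.read() == 0xEF && in.read() == 0xBB && in.read() == 0xBF;
        }
    }
}
```

The check only reads the file, so it is safe to run on the original before deciding whether to rewrite it.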
In Java (should the editor be stubborn):
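import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Read the whole file and decode it as UTF-8.
Path path = Paths.get(".... .xml");
byte[] content = Files.readAllBytes(path);
String s = new String(content, StandardCharsets.UTF_8);
// Drop a leading BOM (U+FEFF) if there is one.
s = s.replaceFirst("^\uFEFF", "");
byte[] content2 = s.getBytes(StandardCharsets.UTF_8);
// Rewrite the file only if something was actually removed.
if (content2.length != content.length) {
    Files.write(path, content2);
}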
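The snippet above rewrites the file on disk. Since the question parses a String through a StringReader, another option is to leave the file alone and strip a leading U+FEFF from the String just before parsing. A minimal sketch, assuming the BOM is the only stray character; the class name `BomSafeParser` and its methods are hypothetical:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
final class BomSafeParser {

    /** Returns the text without a leading U+FEFF, or unchanged if there is none. */
    static String stripBom(String text) {
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    /** Parses the XML text after removing a leading BOM, if present. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(stripBom(xml))));
    }
}
```

With this in place, the call in the question would become `Document doc = BomSafeParser.parse(xml);`. If the error persists after that, the stray content is something other than a BOM, and printing the first few characters of the string as code points should show what it is.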