How can I pretty-print a well-formed but invalid XML fragment in Java?

Date: 2021-12-29 21:47:19

I've tried

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(false);
factory.setValidating(false);
XMLReader reader = factory.newSAXParser().getXMLReader();
Source xmlInput = new SAXSource(reader, new InputSource(new StringReader(xml)));
StringWriter stringWriter = new StringWriter();
StreamResult xmlPretty = new StreamResult(stringWriter);
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
transformer.transform(xmlInput, xmlPretty);
return xmlPretty.getWriter().toString();

but as soon as the input contains "ignorable whitespace", the indentation stops. I've searched a lot but found nothing about ignorable whitespace in SAX parsers, except in handlers. So I tried adding a handler of my own:

class MyHandler extends DefaultHandler {
  @Override
  public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
    System.out.println("foo");
  }
}
...
reader.setContentHandler(new MyHandler());

but it never prints "foo".

Update: Here is an example of input:

<n:a>  <b>foo </b>  </n:a>

So it is well-formed but invalid (the prefix n is not declared). I want the function to output something like:

<n:a>
  <b>foo </b>
</n:a>

The program above does output this if I provide it with:

<n:a><b>foo </b></n:a>

But not with <n:a> <b>foo </b> </n:a>.

1 answer

#1

I don't think the undeclared namespace makes any difference, but the additional whitespace does. I tried your code and, although I'm still trying to understand why, if you add this line

transformer.setOutputProperty(OutputKeys.METHOD, "html");

you should get the desired output. Could you confirm this and check for any possible side effects?
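
If switching the output method to "html" turns out to have side effects, one alternative worth trying is to remove the whitespace-only text nodes before serializing. The sketch below is not the code from the question: it swaps the SAX source for a DOM tree, and the class name XmlPrettyPrinter and the helper stripWhitespaceNodes are hypothetical names chosen for illustration. It assumes the JDK's built-in transformer, which is the one that understands the indent-amount property used in the question, and it keeps the parser namespace-unaware so that the undeclared prefix n is accepted.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name and structure are not from the question.
class XmlPrettyPrinter {

    // Recursively removes text nodes that contain only whitespace,
    // so the serializer is free to insert its own indentation.
    static void stripWhitespaceNodes(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getTextContent().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripWhitespaceNodes(child);
            }
            child = next;
        }
    }

    static String prettyPrint(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(false); // tolerate the undeclared prefix "n"
        DocumentBuilder builder = dbf.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        stripWhitespaceNodes(doc.getDocumentElement());

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific property, same as in the question.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(prettyPrint("<n:a>  <b>foo </b>  </n:a>"));
    }
}
```

With the sample input from the question this should print something close to the desired three-line layout, although the exact line breaks can vary between JDK versions. Note that the trailing space inside <b>foo </b> is kept, because only text nodes made up entirely of whitespace are removed.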
