Correcting errors in Doxygen XML output

I am currently writing a parser for the doxygen XML output, partly for academic reasons and partly because the code in doxygen/addons/doxmlparser is ancient.

I am using QXmlStreamReader to parse the XML, and it raises errors on some attributes. For example, doxygen generates the following XML:

...
<listofallmembers>
...
<member refid="qset_1operator&" prot="public" virt="non-virtual"><scope>libDatabase::Set</scope><name>operator&amp;</name></member>
...
</listofallmembers>

This refid="qset_1operator&" is of course a problem:

XmlStreamReaderError: Expected '#' or '[a-zA-Z]', but got '"'.

Other errors include unescaped < and > characters (and others) in XML attribute values.

I know that these characters have to be replaced by their &lt;, &gt;, etc. counterparts.
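
As a minimal sketch of that replacement, a regular-expression pre-pass can escape the offending characters before the file reaches the XML parser. It assumes the damage is confined to double-quoted attribute values that never contain a literal double quote. Python is used only for brevity; repair_attributes and escape_attr_value are hypothetical helper names, not part of doxygen or Qt.

```python
import re

# name="value" pairs; assumes double-quoted values with no '"' inside them.
ATTR = re.compile(r'(\s[\w:.-]+\s*=\s*")([^"]*)(")')
# An '&' that does not already start a named or numeric entity reference.
BARE_AMP = re.compile(r'&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)')


def escape_attr_value(value):
    """Escape bare '&', then '<' and '>', leaving existing entities alone."""
    value = BARE_AMP.sub('&amp;', value)
    return value.replace('<', '&lt;').replace('>', '&gt;')


def repair_attributes(xml_text):
    """Rewrite every attribute value in xml_text with its escaped form."""
    return ATTR.sub(
        lambda m: m.group(1) + escape_attr_value(m.group(2)) + m.group(3),
        xml_text,
    )


sample = '<member refid="qset_1operator&" prot="public"/>'
print(repair_attributes(sample))
# prints: <member refid="qset_1operator&amp;" prot="public"/>
```

The pass leaves already-escaped references such as &amp; untouched, so running it twice gives the same result. Strictly speaking, only & and < are forbidden inside attribute values; escaping > as well is harmless.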

How can I easily (and automatically, of course) correct the XML when I cannot even use Qt's classes to look at it?

1 Answer

#1

One possibility would be to work around the errors and fix them manually as they appear, iterating over the XML until it is well-formed. See this Stack Overflow question: Ignoring a Invalid XML-Tag using Qdom?
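
The check-fix-recheck loop can be sketched as follows, assuming a Python script is acceptable as a pre-processing step. The helper names first_error and report are hypothetical; they only locate the first problem and leave the actual fix to you.

```python
import xml.etree.ElementTree as ET


def first_error(xml_text):
    """Return None if xml_text is well-formed, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # line is 1-based, as reported by expat
        return line, column, str(err)
    return None


def report(xml_text):
    """Print the first offending line, or confirm that the text parses."""
    problem = first_error(xml_text)
    if problem is None:
        print("well-formed")
        return
    line, column, message = problem
    print(message)
    lines = xml_text.splitlines()
    if 0 < line <= len(lines):  # guard against errors reported past the end
        print(lines[line - 1])
```

Calling report, fixing the line it prints, and calling it again repeats until the document parses. In Qt, the same information should be available from QXmlStreamReader through lineNumber(), columnNumber() and errorString().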

You could also use the tidy library to repair the input before processing.
