Java / SAX - Losing characters when using the XML parser

I'm using a SAX parser to parse the XML of RSS feeds in an Android app, and sometimes the pubDate of an item is parsed incompletely (characters are missing).

Example:

Actual pubDate: Thu, 02 Apr 2015 12:23:41 +0000

Parsed pubDate: Thu,

Here is the code that I'm using in the parser handler:

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        if ("item".equalsIgnoreCase(localName)) {
            currentItem = new RssItem(url);
        } else if ("title".equalsIgnoreCase(localName)) {
            parsingTitle = true;
        } else if ("link".equalsIgnoreCase(localName)) {
            parsingLink = true;
        } else if ("pubDate".equalsIgnoreCase(localName)) {
            parsingDate = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("item".equalsIgnoreCase(localName)) {
            rssItems.add(currentItem);
            currentItem = null;
        } else if ("title".equalsIgnoreCase(localName)) {
            parsingTitle = false;
        } else if ("link".equalsIgnoreCase(localName)) {
            parsingLink = false;
        } else if ("pubDate".equalsIgnoreCase(localName)) {
            parsingDate = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (parsingTitle) {
            if (currentItem != null) {
                currentItem.setTitle(new String(ch, start, length));
                parsingTitle = false;
            }
        } else if (parsingLink) {
            if (currentItem != null) {
                currentItem.setLink(new String(ch, start, length));
                parsingLink = false;
            }
        } else if (parsingDate) {
            if (currentItem != null) {
                currentItem.setDate(new String(ch, start, length));
                parsingDate = false;
            }
        }
    }

The loss of characters is pretty random; it happens in different XML items every time I run the app.

1 answer

#1

You are assuming that there is exactly one characters() call per element. That is not a safe assumption. Build up your string over 1+ calls to characters(), then apply it in endElement().
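
A minimal sketch of that approach, not a drop-in replacement for the handler in the question: the class name `PubDateHandler` and the helpers `getDates()` and `parseDates()` are made up for illustration, and it collects only pubDate values. It matches on `qName` because it assumes a parser that is not namespace-aware; keep `localName` if that is what your parser populates. The point is that `characters()` only appends to a buffer, and the value is read once in `endElement()`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration; it collects only the pubDate of each item.
public class PubDateHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final List<String> dates = new ArrayList<>();
    private boolean inItem = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("item".equalsIgnoreCase(qName)) {
            inItem = true;
        }
        // Every element starts with an empty buffer.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text node in several chunks, so append instead of assigning.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("item".equalsIgnoreCase(qName)) {
            inItem = false;
        } else if (inItem && "pubDate".equalsIgnoreCase(qName)) {
            // Only here is the element's text known to be complete.
            dates.add(text.toString().trim());
        }
    }

    public List<String> getDates() {
        return dates;
    }

    // Convenience wrapper (an assumption of this sketch): parse an XML string and return the dates.
    public static List<String> parseDates(String xml) {
        try {
            PubDateHandler handler = new PubDateHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getDates();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse feed", e);
        }
    }
}
```

Clearing the buffer in `startElement()` is enough for leaf elements such as title, link and pubDate, which contain only text. It would drop text in mixed content, which these RSS fields do not have.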

Or, better yet, use any one of a number of existing RSS parser libraries.
