RSS reader loses characters from titles when using a SAX parser

Time: 2021-10-12 03:58:31

I'm trying to use a SAX parser to read the contents of an RSS feed from a URL (http://pitchfork.com/rss/news/), but characters are often lost when the title is displayed: it shows only partial text, or just a closing ">".

How can I modify my handler class to prevent this? I think I should probably use StringBuilder or StringBuffer, but I'm not sure how to implement it.

ParseHandler.java

public class RssParseHandler extends DefaultHandler {
//Parsed items
private List<RssItem> rssItems;
private RssItem currentItem;
private boolean parsingTitle;
private boolean parsingLink;
private boolean parsing_id;
private boolean parsingDescription;

public RssParseHandler() {
    rssItems = new ArrayList<RssItem>();
}

public List<RssItem> getItems() {
    return rssItems;
}

//Creates empty RssItem object during the process of an item start tag
//Indicators are set to true when particular tag is being processed
@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {

    if ("item".equals(qName)) {
        currentItem = new RssItem();

    } else if ("title".equals(qName)) {
        parsingTitle = true;


    } else if ("link".equals(qName)) {
        parsingLink = true;


    } else if ("_id".equals(qName)) {
        parsing_id = true;


    } else if ("description".equals(qName)) {
        parsingDescription = true;

    }
}

//Current RssItem is added to the list following process of end tag
@Override
public void endElement(String uri, String localName, String qName) throws SAXException {

    if ("item".equals(qName)) {
        rssItems.add(currentItem);
        currentItem = null;

    } else if ("title".equals(qName)) {
        parsingTitle = false;

    } else if ("link".equals(qName)) {
        parsingLink = false;

    } else if ("_id".equals(qName)) {
        parsing_id = false;

    } else if ("description".equals(qName)) {
        parsingDescription = false;
    }
}

@Override
public void characters(char[] ch, int start, int length) throws SAXException {

    if (parsingTitle) {
        if (currentItem != null)
            currentItem.setTitle(new String(ch, start, length));

    } else if (parsingLink) {
        if (currentItem != null) {
            currentItem.setLink(new String(ch, start, length));
            parsingLink = false;
        }

    } else if (parsing_id) {
        if (currentItem != null) {
            currentItem.set_id(new String(ch, start, length));
            parsing_id = false;
        }

    } else if (parsingDescription) {
        if (currentItem != null) {
            currentItem.setDescription(new String(ch, start, length));
            parsingDescription = false;
        }

    }
}}//rssHandlerClass

1 Solution

#1



Use a StringBuilder to accumulate the element's text, rather than creating a new String on each call to characters(). As the documentation says:

The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.

And @CommonWares says exactly this in his post here.

Build the element's text with a StringBuilder as it arrives, since the text can come in several chunks rather than as the entire string at once (this explains the incomplete titles!). You may or may not need the isBuilding flag, but I don't know your entire implementation, so I added it just in case.

   StringBuilder mSb;
   boolean isBuilding;

   @Override
   public void startElement(String uri, String localName, String qName,
         Attributes attributes) throws SAXException {

        mSb = new StringBuilder();
        isBuilding = true;

        if(qName.equals("title")){
            parsingTitle = true;
        }
        ...
        ...
    }

    @Override
    public void characters (char ch[], int start, int length) {
        if (mSb !=null && isBuilding) {
            // Append the whole chunk; characters() may be called several times per element
            mSb.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
        throws SAXException {

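        if(parsingTitle){
            // The channel's own <title> sits outside any <item>, so currentItem may be null
            if (currentItem != null) {
                currentItem.setTitle(mSb.toString().trim());
            }
            parsingTitle = false;
            isBuilding = false;
        }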
    }
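
For reference, here is a minimal, self-contained sketch of the same idea that can be run without the rest of the app. The class name RssTitleDemo, the nested TitleHandler, the parseTitles helper, and the sample feed string are all hypothetical and not part of the original code. It collects titles into a plain list instead of RssItem objects, and assumes the default (non-namespace-aware) JDK SAX parser. The title containing &amp; is included because entity references are a common place for a parser to split the text into several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; none of these names come from the original post.
class RssTitleDemo {

    // Collects the full text of every <title> that appears inside an <item>.
    static class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private boolean inItem;
        private boolean inTitle;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if ("item".equals(qName)) {
                inItem = true;
            } else if (inItem && "title".equals(qName)) {
                inTitle = true;
                text.setLength(0); // start a fresh buffer for this title
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one element, so append instead of overwriting.
            if (inTitle) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName) && inTitle) {
                // Only now is the element's text known to be complete.
                titles.add(text.toString().trim());
                inTitle = false;
            } else if ("item".equals(qName)) {
                inItem = false;
            }
        }
    }

    // Parses an RSS document held in a string and returns the item titles in order.
    static List<String> parseTitles(String xml) throws Exception {
        TitleHandler handler = new TitleHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<rss><channel><title>Feed</title>"
                + "<item><title>Foo &amp; Bar</title></item>"
                + "<item><title>Second item</title></item>"
                + "</channel></rss>";
        System.out.println(parseTitles(xml));
    }
}
```

The key design choice is that the value is read only in endElement, once all chunks have arrived. Resetting the buffer in startElement keeps text from one element from leaking into the next.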
