I'm using Scrapy 0.22. I have a spider that uses an item loader to extract an item. When I run the spider from scrapy shell, I see only debug messages containing None instead of the item from my item loader.
2014-01-26 20:33:08+0100 [ChatroomSpider] DEBUG: Scraped from <200 http://somedomain.com/?a=chat_rooms> None
However, if I uncomment the #print chatroom line, I can see the item printed to stdout as expected.
Spider:
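# Scrapy 0.22 import paths; ChatroomLoader is the loader class shown under "Loader:"
from scrapy.spider import BaseSpider
from scrapy.selector import Selector

class ChatroomSpider(BaseSpider):
    name = 'ChatroomSpider'
    allowed_domains = ['somedomain.com']
    start_urls = ['http://somedomain.com/?a=chat_rooms']

    def parse(self, response):
        selector = Selector(response)
        # One loader per chatroom <div>; each yields one item
        for chatroom_div in selector.xpath(r'id("body")/div[count(div) = 4 and div/div]'):
            loader = ChatroomLoader(chatroom_div)
            chatroom = loader.load_item()
            #print chatroom
            yield chatroom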
Loader:
# Scrapy 0.22 import paths. Encode, StripAndEncode, StrpTime and ChatRoomItem
# are project-specific (definitions not shown).
from scrapy.contrib.loader import XPathItemLoader
from scrapy.contrib.loader.processor import TakeFirst

class ChatroomLoader(XPathItemLoader):
    default_item_class = ChatRoomItem
    name_in = Encode(encoding='utf-8')
    name_out = TakeFirst()
    description_in = StripAndEncode(encoding='utf-8')
    description_out = TakeFirst()
    datetime_in = StrpTime('%d.%m.%Y %H:%M:%S')
    datetime_out = TakeFirst()

    def __init__(self, room_selector):
        super(ChatroomLoader, self).__init__(selector=room_selector)
        # All XPaths below are relative to the chatroom <div> passed in
        self.add_xpath('name', r'div[1]/div/font/b/a/text()')
        self.add_xpath('description', r'div[2]/div/text()')
        self.add_xpath('users', r'div[3]/div/a/font/text()')
        self.add_xpath('datetime', r'id("copyright")/text()[4]', re=r'[0-3]?[0-9].[0-2][0-9].201[3-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9]')
1 Answer
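If you are using your own item pipeline, make sure its process_item method returns the item. A process_item that ends without a return statement implicitly returns None, and Scrapy logs whatever the pipeline returned in the "Scraped from" line, which is why None appears there even though the spider yields a populated item.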
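A minimal sketch of the difference, assuming a pipeline that only strips whitespace from the name field. The class names BrokenPipeline and FixedPipeline are hypothetical, and a plain dict stands in for the item so the snippet runs without Scrapy installed:

```python
# Hypothetical pipelines for illustration; neither appears in the question.

class BrokenPipeline(object):
    def process_item(self, item, spider):
        item['name'] = item['name'].strip()
        # No return statement: Python returns None implicitly,
        # so the next stage (and the log line) receives None.


class FixedPipeline(object):
    def process_item(self, item, spider):
        item['name'] = item['name'].strip()
        # Hand the item on to the next pipeline stage.
        return item


# A plain dict stands in for ChatRoomItem; spider is unused here.
broken_result = BrokenPipeline().process_item({'name': ' lobby '}, None)
fixed_result = FixedPipeline().process_item({'name': ' lobby '}, None)
```

Here broken_result is None while fixed_result is the cleaned item. The same check applies to every pipeline enabled in ITEM_PIPELINES, since each one receives whatever the previous one returned.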
For more information about pipelines, see http://doc.scrapy.org/en/latest/topics/item-pipeline.html