I'm not sure how I end up with 3 arguments here, and in any case, how do I call DmozItem from items.py? This seems like a simple inheritance issue I'm missing. This code is copied directly from the Scrapy tutorial website.
-- Shell error --
SyntaxError: invalid syntax
PS C:\Users\Steve\tutorial> scrapy crawl dmoz
Traceback (most recent call last):
  File "c:\python27\scripts\scrapy-script.py", line 9, in <module>
    load_entry_point('scrapy==1.0.3', 'console_scripts', 'scrapy')()
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\cmdline.py", line 142, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\crawler.py", line 209, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\crawler.py", line 115, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\crawler.py", line 296, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\spiderloader.py", line 30, in from_settings
    return cls(settings)
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\spiderloader.py", line 21, in __init__
    for module in walk_modules(name):
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\utils\misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "C:\Python27\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
  File "C:\Users\Steve\tutorial\tutorial\spiders\dmoz_spider.py", line 3, in <module>
    from tutorial.items import DmozItem
  File "C:\Users\Steve\tutorial\tutorial\items.py", line 11, in <module>
    class DmozItem(scrapy.item):
TypeError: Error when calling the metaclass bases
    module.__init__() takes at most 2 arguments (3 given)
-- items.py -- my items list for parsing
import scrapy

class DmozItem(scrapy.item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()
-- dmoz_spider.py -- this is the spider
import scrapy

from tutorial.items import DmozItem

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "https://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "https://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            item = DmozItem()
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('text()').extract()
            yield item
1 Answer
#1
You have mistyped the scrapy.Item class name.
In items.py, change scrapy.item to scrapy.Item.
Because lowercase scrapy.item refers to the scrapy.item module rather than the Item class, Python tries to use a module as a base class and calls the module's type with the three class-creation arguments (name, bases, dict); module.__init__() accepts at most two, which is where the "(3 given)" in the error comes from.
It should look like this:
import scrapy

class DmozItem(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()
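As a quick sanity check (a minimal sketch, not part of the tutorial; the field values below are made-up placeholders), you can instantiate the corrected item directly in a Python shell to confirm the metaclass error is gone:

import scrapy

class DmozItem(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()

# A scrapy.Item behaves like a dict restricted to its declared Fields
item = DmozItem()
item['title'] = ['An Example Python Book']    # placeholder value
item['link'] = ['https://example.org/book']   # placeholder value
item['desc'] = ['placeholder description']    # placeholder value
print(item)

With DmozItem defined this way in tutorial/items.py, the existing "from tutorial.items import DmozItem" in dmoz_spider.py works unchanged, and scrapy crawl dmoz should get past the TypeError.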