Scrapy crawler: configuring a proxy IP

Posted: 2021-08-29 16:58:28

Steps to set a proxy IP in Scrapy:

1. Create a new "middlewares.py" in the Scrapy project directory:

```python
import base64

# Start your middleware class
class ProxyMiddleware(object):
    # Overwrite process_request
    def process_request(self, request, spider):
        # Set the location of the proxy
        request.meta['proxy'] = "http://YOUR_PROXY_IP:PORT"

        # Use the following lines if your proxy requires authentication
        proxy_user_pass = "USERNAME:PASSWORD"
        # Set up basic authentication for the proxy.
        # base64.b64encode replaces base64.encodestring, which was
        # deprecated and removed in Python 3 (and appended a newline).
        encoded_user_pass = base64.b64encode(proxy_user_pass.encode()).decode()
        request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass
```
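The `Proxy-Authorization` value can be checked in isolation. Note that `base64.encodestring` was removed in Python 3; `base64.b64encode` is the replacement, and unlike the old function it does not append a trailing newline. A minimal sketch (the helper name `proxy_auth_header` is ours, not part of Scrapy):

```python
import base64

def proxy_auth_header(user_pass: str) -> str:
    # Build the Basic auth value for the Proxy-Authorization header
    # from a "USERNAME:PASSWORD" string.
    encoded = base64.b64encode(user_pass.encode("utf-8")).decode("ascii")
    return "Basic " + encoded

print(proxy_auth_header("user:pass"))  # → Basic dXNlcjpwYXNz
```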


2. Add the following to the project's settings file, settings.py:

```python
DOWNLOADER_MIDDLEWARES = {
    # In Scrapy 1.0+ this path is
    # 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware'
    'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
    'pythontab.middlewares.ProxyMiddleware': 100,
}
```
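The lower priority number (100) means `ProxyMiddleware.process_request` runs before the built-in `HttpProxyMiddleware` (110), so `request.meta['proxy']` is already set when the built-in middleware sees the request. The middleware from step 1 can be exercised without a running crawl; `FakeRequest` below is a minimal stand-in for `scrapy.http.Request`, for illustration only:

```python
import base64

class FakeRequest:
    # Minimal stand-in for scrapy.http.Request: just the two
    # attributes the middleware touches.
    def __init__(self):
        self.meta = {}
        self.headers = {}

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        request.meta['proxy'] = "http://YOUR_PROXY_IP:PORT"
        proxy_user_pass = "USERNAME:PASSWORD"
        encoded = base64.b64encode(proxy_user_pass.encode()).decode()
        request.headers['Proxy-Authorization'] = 'Basic ' + encoded

req = FakeRequest()
ProxyMiddleware().process_request(req, spider=None)
print(req.meta['proxy'])
print(req.headers['Proxy-Authorization'])
```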
Reposted from: http://my.oschina.net/jhao104/blog/639745