Crawler error: Caused by SSLError(SSLError("bad handshake: Error

Time: 2024-04-18 13:03:49
While running a crawler, I hit the following error:

Traceback (most recent call last):
  File "C:/Users/xuchunlin/PycharmProjects/A9_25/haiwai__guanwang/11__Gorringes/2__gorringes__no__detail_info.py", line 88, in <module>
    spider()
  File "C:/Users/xuchunlin/PycharmProjects/A9_25/haiwai__guanwang/11__Gorringes/2__gorringes__no__detail_info.py", line 77, in spider
    result = session.get(url=url, headers=headers, params=data).text
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 521, in get
    return self.request('GET', url, **kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "C:\Python27\lib\site-packages\requests\adapters.py", line 506, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='auction.gorringes.co.uk', port=443): Max retries exceeded with url: /asp/searchresults.asp?ps=25&pg=1&sale_no=181217&st=D (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),))

The code that raised this error looked like this:

    try:
        result = session.get(url=url, headers=headers, params=data).text
    except Exception:
        # Retry once if the first request raises.
        result = session.get(url=url, headers=headers, params=data).text
    # Retry if the response contains the 'javascript">setTimeout' marker.
    if 'javascript">setTimeout' in result:
        result = session.get(url=url, headers=headers, params=data).text

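Because the request goes over HTTPS, requests validates the server's certificate during the TLS handshake, and here that validation failed ("certificate verify failed"). The fix is to disable certificate verification on the request by passing verify=False.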
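As a rough sketch of what the `verify` setting can look like when applied to a whole session rather than to each call: requests accepts `verify=False` to skip certificate checks, or a string path to a CA bundle file to verify against instead. The helper name `make_session` is hypothetical and not part of the original script.

```python
import requests
import urllib3


def make_session(verify=False):
    """Build a Session whose requests share one certificate-verification setting.

    verify=False skips certificate checks; a string is treated by requests
    as the path to a CA bundle file to verify against.
    (make_session is a hypothetical helper name used for illustration.)
    """
    session = requests.Session()
    session.verify = verify
    if verify is False:
        # With verification off, urllib3 emits an InsecureRequestWarning for
        # each HTTPS request; silence it so the crawler output stays readable.
        urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
    return session
```

For example, `session = make_session()` would give a session that skips verification on every request, while `make_session("/path/to/ca-bundle.pem")` (a placeholder path) would verify against that bundle instead, which keeps the certificate check in place if the site's CA certificate is available.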

The working code is:

    try:
        result = session.get(url=url, headers=headers, params=data, verify=False).text
    except Exception:
        # Retry once if the first request raises.
        result = session.get(url=url, headers=headers, params=data, verify=False).text
    # Retry if the response contains the 'javascript">setTimeout' marker.
    if 'javascript">setTimeout' in result:
        result = session.get(url=url, headers=headers, params=data, verify=False).text
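The three near-identical calls above could also be folded into one small helper. The following is only a sketch under some assumptions: the name `fetch_text`, the `max_attempts` parameter, and the reading of the `javascript">setTimeout` marker as a placeholder page that should be re-requested are mine, not something the original script states.

```python
import requests

# Marker the original code checks for. Treating a page that contains it as a
# placeholder worth re-requesting is an assumption about the site's behavior.
JS_STUB_MARKER = 'javascript">setTimeout'


def fetch_text(session, url, headers=None, params=None, max_attempts=3):
    """Fetch url with certificate verification disabled, retrying when the
    request raises or when the response contains JS_STUB_MARKER.

    (fetch_text and max_attempts are hypothetical names for illustration.)
    """
    text = None
    for attempt in range(1, max_attempts + 1):
        try:
            text = session.get(
                url=url, headers=headers, params=params, verify=False
            ).text
        except requests.exceptions.RequestException:
            # Give up and re-raise only once the last attempt has failed.
            if attempt == max_attempts:
                raise
            continue
        if JS_STUB_MARKER not in text:
            break
    # May still be a stub page if every attempt returned one.
    return text
```

Usage would look like `result = fetch_text(session, url, headers=headers, params=data)`. Catching `requests.exceptions.RequestException` instead of everything keeps unrelated bugs, such as a typo in a variable name, from being silently retried.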