[Python Crawlers] A Summary of Anti-Anti-Scraping Techniques

Posted: 2024-07-05 20:54:03

1. Retry when a request fails

import requests

def get_url(url):
    try:
        response = requests.get(url, timeout=10)  # 10-second timeout
    except requests.RequestException:
        response = None
        for _ in range(10):  # retry up to 10 times
            try:
                # proxies as defined in tip 3 below
                response = requests.get(url, proxies=proxies, timeout=20)
                if response.status_code == 200:
                    break
            except requests.RequestException:
                continue
    return response
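The snippet above retries at a fixed cadence. Retrying with an exponentially growing delay is gentler on the target site and more likely to outlast a temporary block. Below is a minimal, library-free sketch of that idea; `fetch_with_retry` is a hypothetical helper name, and the callable you pass in would wrap your actual `requests.get` call:

```python
import time

def fetch_with_retry(fetch, retries=3, delay=1.0):
    """Call fetch() until it succeeds or retries are exhausted,
    doubling the wait after each failure (exponential backoff)."""
    last_exc = None
    for attempt in range(retries):
        try:
            return fetch()
        except Exception as exc:
            last_exc = exc
            time.sleep(delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise last_exc  # all attempts failed
```

Usage would look like `fetch_with_retry(lambda: requests.get(url, timeout=10))`. The `requests` library can also do this for you via `urllib3.util.retry.Retry` mounted on a `Session`, if you prefer a built-in mechanism.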

2. Add a reasonable sleep between requests

import time
from scipy.stats import norm

mu, sigma = 5, 0.2  # normal distribution: mean 5 s, std dev 0.2 s
time.sleep(norm.rvs(mu, sigma))  # sleep for a randomly drawn interval
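If you would rather not pull in SciPy just for one random draw, the standard library's `random.gauss` samples from the same normal distribution. A small sketch, with `polite_sleep` as a hypothetical helper name and a floor added so a negative sample can never be passed to `time.sleep`:

```python
import random
import time

def polite_sleep(mu=5.0, sigma=0.2, floor=0.5):
    """Sleep for a normally distributed interval, clamped to a minimum.
    Returns the actual delay used, which is handy for logging."""
    delay = max(floor, random.gauss(mu, sigma))
    time.sleep(delay)
    return delay
```

Randomizing the interval matters because a perfectly regular request cadence is itself a bot signature that some sites detect.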

3. Use a proxy IP

import requests

proxies = {
    'http': 'http://127.0.0.1:1212',
    'https': 'http://127.0.0.1:1212',
}
response = requests.get(url, proxies=proxies, timeout=20)
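A single proxy gets banned just as easily as your own IP, so crawlers usually rotate through a pool. A minimal sketch of that rotation, assuming you already have a list of proxy URLs (`proxy_cycle` is a hypothetical helper name, and the addresses below are placeholders):

```python
import itertools

def proxy_cycle(proxy_urls):
    """Yield requests-style proxies dicts, cycling endlessly
    through the given pool of proxy URLs."""
    for addr in itertools.cycle(proxy_urls):
        yield {'http': addr, 'https': addr}
```

In a crawl loop you would call `next()` on the generator before each request and pass the result as `proxies=`; on repeated failures you simply advance to the next proxy.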