Python Web Scraping — Using the requests Library

Date: 2025-03-06 17:53:44

Summary: usage of GET, POST, and the other request methods…

import requests

# The original post's URLs lost their domain; Baidu is assumed here.
url = "https://www.baidu.com/?tn=15007414_8_dg"
req = requests.get(url)
req = requests.post(url)
req = requests.put(url)
req = requests.delete(url)
req = requests.head(url)
req = requests.options(url)
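A quick offline way to see that these helpers map one-to-one onto HTTP verbs is to build prepared requests without sending them (httpbin.org is used below only as a stand-in URL):

```python
import requests

# Prepare (but do not send) one request per HTTP verb.
for method in ["GET", "POST", "PUT", "DELETE", "HEAD", "OPTIONS"]:
    prepared = requests.Request(method, "https://httpbin.org/anything").prepare()
    print(prepared.method, prepared.url)
```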

GET request

import requests

# Domain lost in the original; Baidu search is assumed.
url = "https://www.baidu.com/s"
params = {"wd": "篮球"}  # query parameters as a dict ("篮球" = "basketball"); requests handles the URL encoding
response = requests.get(url, params=params)
print(response.url)
response.encoding = "utf-8"
html = response.text
print(html)

POST request: the parameters are passed as a dict; JSON parameters can also be sent (via the `json=` argument)

import requests
from fake_useragent import UserAgent

# Domain lost in the original; only the path survives.
url = "/hsjw/cas/"
headers = {
    "User-Agent": UserAgent().firefox
}
formdata = {
    "user": "******",      # credentials redacted in the original
    "password": "******"
}
response = requests.post(url, data=formdata, headers=headers)
response.encoding = "utf-8"
html = response.text
print(html)
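The difference between `data=` (form-encoded) and `json=` (JSON-encoded) can be seen offline by inspecting prepared requests; httpbin.org below is only a stand-in URL:

```python
import json
import requests

payload = {"user": "alice"}

# data= produces a form-encoded body with the matching Content-Type.
form = requests.Request("POST", "https://httpbin.org/post", data=payload).prepare()
# json= serializes the dict to JSON and sets Content-Type accordingly.
js = requests.Request("POST", "https://httpbin.org/post", json=payload).prepare()

print(form.headers["Content-Type"], form.body)
print(js.headers["Content-Type"], json.loads(js.body))
```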

Customizing the request headers: faking the request header is a common scraping technique, and we can use it to disguise the client.

import requests
from fake_useragent import UserAgent

headers = {"User-Agent": UserAgent().firefox}
# URL lost in the original; fill in the target site here
r = requests.get("", headers=headers)
print(r.request.headers["User-Agent"])

Setting a timeout

import requests

# Domain lost in the original; Baidu assumed. timeout is in seconds,
# so 0.001 will almost certainly raise requests.exceptions.Timeout.
requests.get("https://www.baidu.com/?tn=monline_3_dg", timeout=0.001)
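In practice you usually catch the timeout instead of letting it crash the script; a minimal sketch (Baidu URL assumed, as above):

```python
import requests

error = None
try:
    # 1 ms is far too short for a real connection, so this raises
    # requests.exceptions.ConnectTimeout (a subclass of Timeout);
    # when fully offline a ConnectionError is raised instead.
    # Both are subclasses of RequestException.
    requests.get("https://www.baidu.com/?tn=monline_3_dg", timeout=0.001)
except requests.exceptions.RequestException as exc:
    error = exc
    print("request failed:", type(exc).__name__)
```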

Accessing through a proxy

import requests

# Example proxy addresses; substitute working proxies of your own.
proxies = {
    "http": "http://122.9.101.6:8888",
    "https": "https://61.157.206.174:37259",
    # If the proxy needs credentials, embed them in the URL:
    # "http": "http://user:password@122.9.101.6:8888"
}

# Domain lost in the original; Baidu assumed.
requests.get("https://www.baidu.com/", proxies=proxies)

Session automatically saves cookies. A session keeps one conversation alive, e.g. continuing to operate after logging in (identity information is remembered), whereas plain one-off requests calls do not carry identity information between requests.

import requests

s = requests.Session()
# Issue a GET with the session object to set a cookie.
# Domain lost in the original; the path matches httpbin.org's API.
print(s.get("https://httpbin.org/cookies/set/sessioncookie/123456789"))
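To see the cookie jar itself without touching the network, you can set and read cookies on the `Session` object directly; cookies stored there are sent with every request the session makes:

```python
import requests

s = requests.Session()
# Put a cookie straight into the session's cookie jar.
s.cookies.set("sessioncookie", "123456789")
# Read it back from the jar.
print(s.cookies.get("sessioncookie"))  # → 123456789
```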

SSL verification: disabling insecure-request warnings

import requests
from fake_useragent import UserAgent

headers = {
    "User-Agent": UserAgent().firefox
}

# Domain lost in the original; Baidu assumed.
url = "https://www.baidu.com/?tn=monline_3_dg"
requests.packages.urllib3.disable_warnings()  # silence the InsecureRequestWarning
response = requests.get(url, verify=False, headers=headers)
print(response)

Getting response information (the attribute names were stripped from the original; the standard requests attributes matching each description are restored here):

resp.json() — response body parsed as JSON
resp.text — response body as a string
resp.content — response body as bytes
resp.headers — response headers
resp.url — the URL that was accessed
resp.encoding — the page encoding
resp.request.headers — the request headers that were sent
resp.cookies — the cookies
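As an offline illustration, a `requests.models.Response` can be constructed by hand (rather than obtained from a real request) to show how the attributes above behave:

```python
import requests
from requests.models import Response

# Build a Response manually so the example runs without a network connection.
resp = Response()
resp.status_code = 200
resp._content = b'{"sport": "basketball"}'
resp.encoding = "utf-8"
resp.headers["Content-Type"] = "application/json"

print(resp.json())    # body parsed as JSON -> a dict
print(resp.text)      # body as a string
print(resp.content)   # body as bytes
print(resp.headers)   # response headers
print(resp.encoding)  # page encoding
```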