The code is as follows:
# A wrapper function that downloads the page at a given URL
import requests


def download_page(url, user_Agent=None, referer=None):
    print("Downloading:", url)
    headers = {
        "Referer": referer,
        "User-Agent": user_Agent
    }
    try:
        # The request itself belongs inside the try block:
        # connection errors are raised here, not when reading .text
        response = requests.get(url=url, headers=headers)
        html = response.text
    except requests.RequestException as e:
        print("Download error:", e)
        html = None
    return html


if __name__ == '__main__':
    u = "http://192.168.1.19:8080/edu/"
    u_a = "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36"
    print(download_page(url=u, user_Agent=u_a))
Result:
The page is downloaded, but the text comes out garbled.
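Analysis:
The garbled text appears when the body is read through response.text. Compression is unlikely to be the cause, because requests decompresses gzip and deflate responses automatically. The more likely cause is the character encoding: response.text decodes the body with response.encoding, which requests guesses from the HTTP headers, and when a text/* Content-Type carries no charset it falls back to ISO-8859-1. The fix is to decode the raw bytes explicitly:
response.content.decode("utf-8") decodes the raw response bytes as UTF-8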
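To see why the choice of codec matters, here is a minimal sketch that decodes the same bytes two ways. The sample string is a made-up stand-in for the page body (an assumption, not the real page), and the sketch assumes the server sends UTF-8 bytes while the client decodes them as ISO-8859-1.

```python
# Hypothetical sample body; the real page content is not shown in this post.
sample_text = "中文页面"

# What travels over the wire: UTF-8 encoded bytes.
raw = sample_text.encode("utf-8")

# Decoding with the wrong codec never fails for ISO-8859-1,
# it just maps every byte to an unrelated character (mojibake).
garbled = raw.decode("iso-8859-1")

# Decoding with the codec the bytes were written in restores the text.
correct = raw.decode("utf-8")

if __name__ == "__main__":
    print("wrong codec:", repr(garbled))
    print("right codec:", correct)
```

The bytes are identical in both cases; only the codec used to interpret them differs, which is why switching from response.text to an explicit decode can fix the display.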
The modified code:
import requests


def download_page(url, user_Agent=None, referer=None):
    print("Downloading:", url)
    headers = {
        "Referer": referer,
        "User-Agent": user_Agent
    }
    try:
        response = requests.get(url=url, headers=headers)
        # Decode the raw bytes as UTF-8 instead of relying on response.text
        html = response.content.decode("utf-8")
    except (requests.RequestException, UnicodeDecodeError) as e:
        print("Download error:", e)
        html = None
    return html


if __name__ == '__main__':
    u = "http://192.168.1.19:8080/edu/"
    u_a = "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36"
    print(download_page(url=u, user_Agent=u_a))
Result after the change:
The page displays correctly.
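Hard-coding "utf-8" works for this page, but a page in another encoding would raise UnicodeDecodeError. One possible way to soften that is sketched below. The helper decode_body and its declared parameter are hypothetical names introduced here for illustration, not part of requests or of the code above; the sketch tries a declared charset first, then UTF-8, and only then falls back to lossy decoding.

```python
from typing import Optional


def decode_body(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode a response body: declared charset first, then UTF-8.

    Hypothetical helper for illustration only.
    """
    for encoding in (declared, "utf-8"):
        if not encoding:
            continue
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown codec: try the next candidate.
            continue
    # Last resort: decode as UTF-8, replacing undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_body("正常显示".encode("utf-8")))
    print(decode_body("正常显示".encode("gbk"), declared="gbk"))
```

In download_page, response.content could be passed as raw, with the charset taken from wherever the page declares it; that wiring is left out here because it depends on the server.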