Simulating a website login with Python 3

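I have been learning Python recently. Because I often log in to the housing provident fund website to check my contributions and loan repayments, I found some scripts online and modified one so that the data on the page is easier to retrieve.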

Use Chrome's developer tools (F12) to inspect what the login request contains:

1. Request headers: the fields it needs, such as User-Agent and Referer.

2. The POST body.
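As a rough illustration of those two pieces, the sketch below builds (but does not send) a POST request with urllib.request.Request. The URL and form field names are placeholders rather than the real site's, and the User-Agent string is only an example.

```python
import urllib.parse
import urllib.request

# Placeholder URL and field names, for illustration only.
login_url = 'http://example.com/login'
headers = {
    'User-Agent': 'Mozilla/5.0',
    'Referer': 'http://example.com/',
}
form = {'username': 'alice', 'password': 'secret'}

# The POST body must be bytes, so urlencode the form and then encode it.
body = urllib.parse.urlencode(form).encode('ascii')

# Passing data makes this a POST request; nothing is sent until urlopen().
req = urllib.request.Request(login_url, data=body, headers=headers)

print(req.get_method())            # POST
print(req.get_header('Referer'))   # http://example.com/
print(req.data)                    # b'username=alice&password=secret'
```

Comparing these values with what the browser's network panel shows for a real login is a quick way to spot a missing header or form field.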

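Python 3.x merged Python 2's urllib and urllib2 libraries into a single urllib package (with submodules such as urllib.request and urllib.parse).

urllib2.urlopen() became urllib.request.urlopen()

urllib2.Request() became urllib.request.Request()

The cookielib module became http.cookiejar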
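To show the renamed modules side by side, here is a minimal sketch of a cookie-aware opener written with the Python 3 names, with the Python 2 equivalents noted in the comments. The URL is a placeholder and no network request is made.

```python
import http.cookiejar      # Python 2: import cookielib
import urllib.request      # Python 2: import urllib2

# Python 2: cookielib.CookieJar()
jar = http.cookiejar.CookieJar()

# Python 2: urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# Python 2: urllib2.Request(url)
req = urllib.request.Request('http://example.com/')  # placeholder URL

print(type(opener).__name__)   # OpenerDirector
print(len(jar))                # 0: the jar stays empty until a response sets cookies
```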

#!/usr/bin/env python3
# -*- coding:gb2312 -*-
# __author__="zhaowei"
'''
    Python 3.4
    Simulates logging in to the Zhengzhou housing provident fund website
    and looks up the month through which contributions have been paid.
'''
import http.cookiejar
import re
import urllib.parse
import urllib.request

hosturl = 'http://www.zzgjj.com/index.asp'
posturl = 'http://www.zzgjj.com/user/login.asp'

# Install a global opener with a cookie jar, so that cookies set by the
# site are stored and sent back on later requests.
cj = http.cookiejar.CookieJar()
cookie_support = urllib.request.HTTPCookieProcessor(cj)
opener = urllib.request.build_opener(cookie_support, urllib.request.HTTPHandler)
urllib.request.install_opener(opener)

# Open the home page first so the cookie jar picks up any cookies it sets.
h = urllib.request.urlopen(hosturl)

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0',
           'Referer': 'http://www.zzgjj.com/index.asp'}

postData = {'selectlb': '',      # login mode: 2 = ID card number, 1 = account number
            'username': '',      # provident fund account number
            'radename': '赵威',  # name
            'mm': '',            # password
            'submit322': '确认'  # fixed value
            }

# The form contains Chinese text, so the fields are first percent-encoded
# using gb2312; the resulting string is then encoded to bytes, because the
# POST body must be bytes.
postData = urllib.parse.urlencode(postData, encoding='gb2312').encode('gb2312')

request = urllib.request.Request(posturl, postData, headers)
response = urllib.request.urlopen(request)
text = response.read()
html = text.decode('gb2312')

# Use a regular expression to match the "缴至月份" (paid-through month) cell.
hgjj_last_data = re.findall(r'<td><p>缴至月份:</p>(\s*)</td>(\s*)<td>(.*?)</td>', html)
print(hgjj_last_data[0][2])
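The encoding step can be tried on its own. This sketch takes a single field from the script, compares the percent-encoding produced with gb2312 against the default UTF-8, and round-trips the result with parse_qs. It assumes, as the script does, that the server expects GB2312-encoded form data.

```python
import urllib.parse

form = {'submit322': '确认'}

# Percent-encode the Chinese value using GB2312 (2 bytes per character here).
gb_query = urllib.parse.urlencode(form, encoding='gb2312')
# The default is UTF-8 (3 bytes per character for these characters).
utf8_query = urllib.parse.urlencode(form)

# The query string is now pure ASCII, so it can be turned into bytes for POST.
body = gb_query.encode('ascii')

print(gb_query)
print(utf8_query)

# Decoding with the same encoding recovers the original value.
decoded = urllib.parse.parse_qs(gb_query, encoding='gb2312')
print(decoded)
```

Because the urlencoded string contains only ASCII characters, the final .encode() call yields the same bytes whether it is given 'ascii' or 'gb2312'. The encoding that matters is the one passed to urlencode.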

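The extraction step can also be checked offline. The HTML fragment below is made up to fit the pattern (the real page's markup is not shown in this post), and the month value is a placeholder.

```python
import re

# Hypothetical fragment shaped to match the script's pattern.
sample_html = '''
<tr>
<td><p>缴至月份:</p>
</td>
<td>2015-06</td>
</tr>
'''

pattern = r'<td><p>缴至月份:</p>(\s*)</td>(\s*)<td>(.*?)</td>'
matches = re.findall(pattern, sample_html)

# Each match is a tuple of the three groups: two whitespace runs, then the value.
if matches:
    print(matches[0][2])
else:
    print('pattern not found')
```

Checking that the list is non-empty before indexing avoids an IndexError when the login fails or the page layout changes.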
Reference: http://www.blogjava.net/hongqiang/archive/2012/08/01/384552.html
