Looking through the page source turns up the URL that increments the view count:
http://blog.51cto.com/js/header.php?uid=822334&tid=xxxx
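
The URL appears to take two query parameters: `uid`, which matches the blog's numeric ID, and `tid`, which stands for an article ID. As a minimal sketch, the URL for one article could be assembled like this. The helper name `counter_url` is hypothetical, and the sample `tid` is taken from this post's own address on the assumption that the trailing number is the article ID.

```python
from urllib.parse import urlencode

BASE = "http://blog.51cto.com/js/header.php"

def counter_url(uid, tid):
    """Build the view-counter URL for one article (hypothetical helper)."""
    # urlencode keeps the dict's insertion order, so uid comes before tid.
    return BASE + "?" + urlencode({"uid": uid, "tid": tid})

print(counter_url("822334", "1195390"))
```
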
Then hit it from a Python script:
import re
import threading
import time
import urllib.request

l = []  # view-counter URLs, one per article


def main():
    # Scrape the article IDs from the blog's listing pages.
    list1 = []
    pr = rb'<a href="/822334/(\d+)'
    rr = re.compile(pr)
    yeurl = ['http://gaoming.blog.51cto.com/all/822334/page/1',
             'http://gaoming.blog.51cto.com/all/822334/page/2',
             'http://gaoming.blog.51cto.com/all/822334/page/3']
    for i in yeurl:
        req = urllib.request.Request(i)
        req.add_header("User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)")
        cn = urllib.request.urlopen(req)
        f = cn.read()  # bytes, so the pattern above is a bytes pattern
        list1 = list1 + rr.findall(f)
        cn.close()
    # Turn each article ID into its view-counter URL.
    for o in list1:
        url = 'http://blog.51cto.com/js/header.php?uid=822334&tid=' + o.decode('ascii')
        l.append(url)


def su(url):
    # Request the counter URL once, then hold the thread slot for 10 seconds.
    try:
        c = urllib.request.urlopen(url)
        print(c.read())
        c.close()
        time.sleep(10)
    finally:
        sem.release()  # free the slot even if the request fails


if __name__ == "__main__":
    main()
    maxThread = 5
    sem = threading.BoundedSemaphore(maxThread)
    # Loop over the URLs forever, with at most maxThread requests in flight.
    while 1:
        for i in l:
            sem.acquire()
            T = threading.Thread(target=su, args=(i,))
            T.start()
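
The script above relies on two mechanisms: a regular expression that pulls article IDs out of the listing pages, and a bounded semaphore that caps the number of requests running at once. The following is an offline sketch of both, not the author's code. `SAMPLE_HTML`, `extract_ids`, `run_throttled`, and `fake_fetch` are hypothetical names, the second article ID is made up, and the real listing markup may differ. The network call is replaced by a short sleep so the example touches no server.

```python
import re
import threading
import time

# Hypothetical listing-page snippet; the real pages may use different markup.
SAMPLE_HTML = '''
<a href="/822334/1195390">post one</a>
<a href="/822334/1195391">post two</a>
<a href="/other/42">unrelated link</a>
'''


def extract_ids(html, uid="822334"):
    """Return the article IDs linked as /<uid>/<digits> in html."""
    return re.findall(r'<a href="/%s/(\d+)' % re.escape(uid), html)


def run_throttled(items, worker, max_threads=2):
    """Run worker(item) in threads, at most max_threads at a time.

    Returns the highest number of workers observed running at once.
    """
    sem = threading.BoundedSemaphore(max_threads)
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def wrapped(item):
        try:
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            worker(item)
        finally:
            with lock:
                state["running"] -= 1
            sem.release()  # free the slot even if worker raised

    threads = []
    for item in items:
        sem.acquire()  # blocks while max_threads workers are active
        t = threading.Thread(target=wrapped, args=(item,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return state["peak"]


def fake_fetch(tid):
    # Stand-in for the HTTP request: sleeps instead of using the network.
    time.sleep(0.01)


ids = extract_ids(SAMPLE_HTML)
peak = run_throttled(ids * 3, fake_fetch, max_threads=2)
print(ids, peak)
```

Releasing the semaphore in a `finally` block is a deliberate choice here: if a worker raises, its slot is still returned, so the dispatching loop cannot stall waiting on a slot that will never be freed.
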
Written purely out of boredom.
This article comes from the “高明” blog; please be sure to keep this source link: http://gaoming.blog.51cto.com/822334/1195390