Python Script Tools, Part 1: Building a Crawler to Download Images from a Web Page

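Reference: http://www.cnblogs.com/fnng/p/3576154.html

This post is based on 虫师's blog post “python实现简单爬虫功能” (implementing a simple crawler in Python). After working through and analyzing it, the script here scrapes images from another site and saves them locally.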

URL of the page the images are scraped from: http://www.cnblogs.com/fnng/p/3576154.html
Regular expression used: reg = r'src="(.+?\.png)"'
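
To see what this pattern captures, here is a minimal sketch that runs it against a made-up HTML fragment. The `sample_html` string and its URLs are hypothetical, and the stricter pattern is an assumption added for comparison, not part of the original script. Because `.` also matches quote characters, a match that begins at a non-PNG `src` can run on to the next `.png"` on the same line. Restricting the capture to `[^"]` avoids that.

```python
import re

# Hypothetical one-line HTML fragment, for illustration only.
sample_html = (
    '<img src="http://example.com/a.png">'
    '<img src="http://example.com/b.jpg">'
    '<img src="http://example.com/c.png">'
)

# Pattern from the article: "." also matches quotes, so a match that starts
# at the .jpg src keeps going until the next '.png"' on the same line.
article_matches = re.findall(r'src="(.+?\.png)"', sample_html)

# Stricter variant (an assumption, not from the article): [^"] keeps each
# match inside a single attribute value.
strict_matches = re.findall(r'src="([^"]+\.png)"', sample_html)

print(article_matches)
print(strict_matches)
```

When each `<img>` tag sits on its own line the two patterns should behave the same, since `.` does not match a newline by default.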

Source code:

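    #!/usr/bin/python
    # coding: utf-8

    # Import the urllib and re modules (this script targets Python 2).
    import urllib
    import re


    # Fetch the page at the given URL and return its HTML source.
    def getHtml(url):
        page = urllib.urlopen(url)
        html = page.read()
        return html


    # Compile the image regex into a pattern object, find every match in the
    # page, then loop over the matches and download each image locally with
    # urllib.urlretrieve().
    def getImg(html):
        reg = r'src="(.+?\.png)"'
        imgre = re.compile(reg)
        imglist = re.findall(imgre, html)
        x = 0
        for imgurl in imglist:
            # Files are saved as 0.png, 1.png, ... in the current directory.
            urllib.urlretrieve(imgurl, '%s.png' % x)
            x += 1


    html = getHtml("http://www.cnblogs.com/fnng/p/3576154.html")
    # Download the images; without this call the script only fetches the page.
    getImg(html)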
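
The script above relies on `urllib.urlopen` and `urllib.urlretrieve`, which exist only in Python 2. As a rough sketch of how the same steps might look on Python 3, the version below uses `urllib.request` instead. The function names `get_html`, `find_images` and `download_images`, the UTF-8 decoding, and the `urljoin` step for relative paths are all assumptions made for illustration. They are not part of the original script.

```python
import os
import re
import urllib.request
from urllib.parse import urljoin


def get_html(url):
    """Fetch a page and return its body decoded as text."""
    with urllib.request.urlopen(url) as page:
        # Assumes the page is UTF-8; undecodable bytes are replaced.
        return page.read().decode("utf-8", errors="replace")


def find_images(html, base_url):
    """Return absolute URLs for every .png found in a src attribute."""
    # Same pattern as the article's script.
    urls = re.findall(r'src="(.+?\.png)"', html)
    # Resolve relative paths against the page URL (an added step).
    return [urljoin(base_url, u) for u in urls]


def download_images(urls, dest_dir="."):
    """Save each URL as 0.png, 1.png, ... in dest_dir; return the paths."""
    paths = []
    for index, img_url in enumerate(urls):
        path = os.path.join(dest_dir, "%d.png" % index)
        urllib.request.urlretrieve(img_url, path)
        paths.append(path)
    return paths
```

With `url` set to the page address, a call such as `download_images(find_images(get_html(url), url))` would mirror what the original script does. The decode step is needed because `read()` returns bytes on Python 3, and a text pattern cannot be matched against bytes.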

2. The downloaded images as seen in the terminal

spdbmadeMacBook-Pro:crawler spdbma$ ls
0.png        2.png        4.png        6.png
1.png        3.png        5.png        getjpg.py
