Latest Books Purchased by the Library

Come visit my new blog~~
http://blog.xieldy.cn

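Last week I wrote a small crawler for practice. It automatically fetches the newest book purchases from the Xidian University library. The program is simple, so I'll just paste the code: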

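#encoding=utf8
# Python 3
import urllib.request

def getHtml(url):
    # Download the page and return its source as text.
    # Assumption: the page is UTF-8 encoded; switch the codec (for example
    # to "gbk") if the titles come out garbled.
    page = urllib.request.urlopen(url)
    html = page.read().decode("utf-8", errors="replace")
    page.close()
    return html

def content(html):
    # Every book title sits between ',t:"' and '"}' in the page source.
    titles = []
    nextpart = html
    str1 = ',t:"'
    str2 = '"}'
    while True:
        # Skip to the text right after the next opening marker.
        nextpart = nextpart.partition(str1)[2]
        title, sep, rest = nextpart.partition(str2)
        if sep != str2:
            # No complete title is left, so stop without appending
            # an empty or truncated entry.
            break
        titles.append(title)
    return titles

def main():
    html = getHtml("http://al.lib.xidian.edu.cn/cgi-bin/newbook.cgi?base=ALL&cls=ALL&date=180")
    a = content(html)
    print("Latest books purchased by the library:")
    for i in a:
        print(i)

if __name__ == "__main__":
    main()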
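
The same extraction step can also be written with a regular expression instead of a partition loop. Below is a minimal Python 3 sketch of that idea, not the code I actually ran. `SAMPLE`, `TITLE_RE` and `extract_titles` are names made up for illustration, and the sample string only imitates the `,t:"` and `"}` markers seen in the code above; the real page may be laid out differently.

```python
import re

# Hypothetical sample that imitates the ',t:"..."}' pattern the crawler
# searches for; the real page source may look different.
SAMPLE = '[{id:1,t:"Book A"},{id:2,t:"Book B"}]'

# Non-greedy capture of everything between ',t:"' and the next '"}'.
# re.DOTALL lets a title span line breaks, as str.partition would.
TITLE_RE = re.compile(r',t:"(.*?)"\}', re.DOTALL)

def extract_titles(html):
    """Return every string found between ',t:"' and '"}'."""
    return TITLE_RE.findall(html)

if __name__ == "__main__":
    for title in extract_titles(SAMPLE):
        print(title)
```

Since `findall` only returns complete matches, an opening marker with no closing `"}` after it is simply skipped, so no empty entry ends up at the tail of the list.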
