File name: Faculty-App-Scraper: screen-scraping application
File size: 368KB
File format: ZIP
Last updated: 2024-07-25 15:53:16
Python
Faculty-App-Scraper

A quick, cheap way to scrape PDF documents from an old website.

## Purpose: download graduate applications in PDF format from GCAST

Tools:
- python 2.7
- selenium: http://www.seleniumhq.org/
- beautiful soup: http://www.crummy.com/software/BeautifulSoup/

Process:
- Use Selenium to drive Firefox and open the GCAST page, pausing for entry of the HUID
- Navigate to the list of applications
- "Click" on each application to download the PDF file
- The PDF file
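
The process above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the URL `GCAST_URL`, the helper names `extract_pdf_links` and `fetch_application_list_html`, and the assumption that each application appears as a plain link ending in `.pdf` are all hypothetical. It is written for Python 3 rather than the Python 2.7 listed above. It uses the standard-library `html.parser` in place of Beautiful Soup, and it collects PDF links from the page source instead of clicking each application.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical placeholder: the real GCAST address is not given in the source.
GCAST_URL = "https://gcast.example.edu/applications"


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a> tags whose href ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: application PDFs are ordinary links ending in .pdf.
        if href and href.lower().endswith(".pdf"):
            self.links.append(urljoin(self.base_url, href))


def extract_pdf_links(html, base_url=GCAST_URL):
    """Return the PDF links found in an application-list page."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def fetch_application_list_html(url=GCAST_URL):
    """Drive Firefox to the page, pausing so the HUID can be entered by hand."""
    # Imported lazily so the parsing helpers work without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        input("Enter your HUID in the browser, open the application list, "
              "then press Enter here...")
        return driver.page_source
    finally:
        driver.quit()
```

A typical call would be `extract_pdf_links(fetch_application_list_html())`. The parsing step is kept separate from the browser step so that it can be checked against saved HTML without launching Firefox.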
[File preview]:
Faculty-App-Scraper-master
----docs()
--------click_counts.xlsx(32KB)
--------aries_review_2011-1110.docx(112KB)
--------aries_review_2011-1110.pdf(128KB)
--------faculty_recommendations.docx(106KB)
----gcast_scraping()
--------old()
--------scraper()
----.gitignore(675B)
----README.md(878B)
----faculty_scraper()
--------__init__.py(0B)
--------settings.py(4KB)
--------common()
--------manage.py(503B)
--------urls.py(585B)
--------candidate()