File name: python-scraping
File size: 38KB
File format: ZIP
Last updated: 2021-05-14 10:40:59
python
Code samples from the book Web Scraping with Python: http://shop.oreilly.com/product/0636920034391.do
[File preview]:
chapter6
----3-readingCsv.py(330B)
----4-readingCsvDict.py(319B)
----1-getText.py(144B)
----6-readDocx.py(769B)
----2-getUtf8Text.py(341B)
----5-readPdf.py(699B)
chapter12
----2-seleniumCookies.py(545B)
----1-headers.py(520B)
----3-honeypotDetection.py(552B)
chapter4
----6-wikiHistories.py(2KB)
----1-searchTwitter.py(276B)
----3-getTwitterStatus.py(312B)
----4-decodeJson.py(287B)
----2-updateTwitter.py(287B)
----5-jsonParsing.py(416B)
chapter7
----2-clean2grams.py(1KB)
----1-2grams.py(504B)
chapter9
----3-cookies.py(375B)
----5-BasicAuth.py(235B)
----1-simpleForm.py(170B)
----4-sessionCookies.py(390B)
----2-fileSubmission.py(199B)
read.t
chapter2
----6-regularExpressions.py(320B)
----7-lambdaExpressions.py(261B)
----4-findSiblings.py(270B)
----2-selectByAttribute.py(249B)
----1-selectByClass.py(287B)
----5-findParents.py(272B)
----3-findDescendants.py(259B)
chapter13
----6-combinedTest.py(1KB)
----1-wikiUnitTest.py(2KB)
----5-takeScreenshot.py(520B)
----4-dragAndDrop.py(749B)
----2-wikiSeleniumTest.py(323B)
----3-interactiveTest.py(1KB)
chapter8
----2-countUncommon2Grams.py(2KB)
----5-NltkTokenize.py(142B)
----7-NltkAnalysis.py(486B)
----6-NltkSearch.py(167B)
----1-count2Grams.py(1KB)
----3-markovGenerator.py(2KB)
----4-6DegreesFinder.py(1KB)
.gitignore
chapter1
----2-beautifulSoup.py(204B)
----3-exceptionHandling.py(581B)
----1-basicExample.py(164B)
chapter11
----2-cleanImage.py(566B)
----4-solveCaptcha.py(2KB)
----3-readWebImages.py(1KB)
----1-basicImage.py(189B)
chapter3
----scrapy()
--------wikiSpider()
----4-getExternalLinks.py(2KB)
----2-crawlWikipedia.py(915B)
----3-crawlSite.py(2KB)
----5-getAllExternalLinks.py(3KB)
----1-getWikiLinks.py(576B)
README.md
chapter10
----1-seleniumBasic.py(430B)
----3-javascriptRedirect.py(941B)
----2-waitForLoad.py(698B)
chapter5
----4-mysqlBasicExample.py(254B)
----5-storeWikiLinks.py(1KB)
----7-sendEmail.py(289B)
----1-getPageMedia.py(1KB)
----3-scrapeCsv.py(637B)
----6-6DegreesCrawlWiki.py(2KB)
----8-sendEmailWhenChristmas.py(742B)
----2-createCsv.py(296B)
chapter14
----2-seleniumSocks.py(276B)
----1-socks.py(204B)