WebCrawler: a web crawler implementation in Java

Time: 2021-05-26 12:24:48
[File properties]:
File name: WebCrawler: a web crawler implementation in Java
File size: 367KB
File format: ZIP
Last updated: 2021-05-26 12:24:48
Java web crawler. Contains a web crawler implementation in Java. The crawler consists of four classes: WebCrawler.java, LinksManage.java, PageLinkExtractor.java, and UrlAccessor.java. The file "designOfCrawler.png" shows the structure of the application.
Algorithm:
1. First, the seed URL is parsed.
2. It is also stored in the set of visited URLs.
3. All the links found at that URL are then stored in a list. These URLs are to be visited.
4. Then, until the required number of URLs has been visited,
    1. The first Url i
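The description above is cut off partway through step 4, so the following is only a minimal sketch of a breadth-first crawl loop consistent with steps 1 to 3, not the project's actual code. The class name CrawlSketch, the crawl method, and the extractLinks parameter are hypothetical names introduced for illustration; the body of the loop (take the first pending URL, skip it if already visited, otherwise visit it and queue its links) is an assumption about how step 4 continues. The link extractor is passed in as a function so the sketch runs without network access; in a real crawler it would fetch and parse the page.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration; not one of the four classes named in the description.
final class CrawlSketch {

    // Visits at most maxPages URLs, starting from seedUrl, in breadth-first order.
    // extractLinks is a stand-in for "fetch the page and return the links on it".
    static Set<String> crawl(String seedUrl, int maxPages,
                             Function<String, List<String>> extractLinks) {
        Set<String> visited = new LinkedHashSet<>(); // remembers visit order
        Deque<String> toVisit = new ArrayDeque<>();  // URLs waiting to be visited

        if (maxPages <= 0) {
            return visited;
        }

        // Steps 1-3: parse the seed, record it as visited, queue its links.
        visited.add(seedUrl);
        toVisit.addAll(extractLinks.apply(seedUrl));

        // Step 4 (assumed continuation): repeat until enough URLs are visited
        // or there is nothing left to visit.
        while (visited.size() < maxPages && !toVisit.isEmpty()) {
            String next = toVisit.removeFirst();
            if (!visited.add(next)) {
                continue; // already visited, skip it
            }
            toVisit.addAll(extractLinks.apply(next));
        }
        return visited;
    }

    public static void main(String[] args) {
        // A tiny in-memory "web" used in place of real HTTP requests.
        Map<String, List<String>> fakeWeb = Map.of(
                "a", List.of("b", "c"),
                "b", List.of("a", "d"),
                "c", List.of("d"),
                "d", List.of());
        Set<String> visited =
                crawl("a", 3, url -> fakeWeb.getOrDefault(url, List.of()));
        System.out.println(visited); // prints [a, b, c]
    }
}
```

Keeping the visited URLs in a set makes the "already seen" check cheap, while the queue preserves the order in which links were discovered. Since the archive ships jsoup in its lib folder, the real link extraction presumably relies on it, but that is not confirmed by the excerpt.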
[File preview]:
WebCrawler-master
----manifest.mf(82B)
----CrawlerInfo.txt(2KB)
----src()
--------webcrawler()
----lib()
--------jsoup-1.8.2.jar(308KB)
--------CopyLibs()
--------jsoup-1.8.1.jar(39KB)
--------nblibraries.properties(173B)
----build()
--------classes()
----README.md(2KB)
----designOfCrawler.png(17KB)
----build.xml(3KB)
----nbproject()
--------genfiles.properties(467B)
--------project.properties(2KB)
--------private()
--------build-impl.xml(78KB)
--------project.xml(671B)
