Heritrix Web Crawler

Date: 2014-10-04 04:59:11
[File properties]:

File name: Heritrix Web Crawler

File size: 21.72 MB

File format: ZIP

Last updated: 2014-10-04 04:59:11

web crawler, java

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project. Heritrix (sometimes spelled heretrix, or misspelled or mispronounced as heratrix/heritix/heretix/heratix) is an archaic word for heiress (a woman who inherits). Since our crawler seeks to collect and preserve the digital artifacts of our culture for the benefit of future researchers and generations, the name seemed apt.


User comments

  • It turns out to be just the source code. What a letdown; you can download it anywhere.