Java open-source web crawler project: webharvest

Date: 2013-12-20 14:53:31
[File properties]:

File name: java开源软件项目网络爬虫-webharvest

File size: 5.47 MB

File format: RAR

Last updated: 2013-12-20 14:53:31

java, open source, software, web crawler

The main goal of Web-Harvest is to make better use of existing extraction technologies. Its purpose is not to propose a new method, but to provide a way to easily use and combine the existing ones. Web-Harvest offers a set of processors for data handling and control flow. Each processor can be regarded as a function: it takes zero or more input parameters and returns a result after execution. Processors can be combined into a pipeline, forming a chain of execution. For easier manipulation and data reuse, Web-Harvest provides a variable context in which named variables are stored.

[Diagram: one pipeline]
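The processor, pipeline, and variable-context ideas in the description can be illustrated with a minimal Java sketch. This is not Web-Harvest's actual API: the names `Processor`, `Pipeline`, `Context`, `stripTags`, `trim`, and `storeAs` are all hypothetical and exist only to show how function-like steps might be chained and how a result might be saved as a named variable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PipelineSketch {

    /** Variable context: named variables shared by every processor in a run. */
    static class Context {
        private final Map<String, String> vars = new HashMap<>();

        void set(String name, String value) { vars.put(name, value); }

        String get(String name) { return vars.get(name); }
    }

    /** A processor acts like a function: it takes an input and returns a result. */
    interface Processor {
        String execute(String input, Context ctx);
    }

    /** A pipeline feeds each processor's result into the next processor. */
    static class Pipeline implements Processor {
        private final List<Processor> steps = new ArrayList<>();

        Pipeline then(Processor step) {
            steps.add(step);
            return this;
        }

        @Override
        public String execute(String input, Context ctx) {
            String current = input;
            for (Processor step : steps) {
                current = step.execute(current, ctx);
            }
            return current;
        }
    }

    // Hypothetical processors, for illustration only.

    /** Removes anything that looks like a markup tag (a crude regex, not a real HTML parser). */
    static Processor stripTags() {
        return (in, ctx) -> in.replaceAll("<[^>]*>", "");
    }

    /** Trims leading and trailing whitespace. */
    static Processor trim() {
        return (in, ctx) -> in.trim();
    }

    /** Stores the current value in the context under the given name and passes it on unchanged. */
    static Processor storeAs(String name) {
        return (in, ctx) -> {
            ctx.set(name, in);
            return in;
        };
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        Pipeline pipeline = new Pipeline()
                .then(stripTags())
                .then(trim())
                .then(storeAs("title"));

        String result = pipeline.execute("<h1> Web-Harvest </h1>", ctx);
        System.out.println(result);            // prints "Web-Harvest"
        System.out.println(ctx.get("title"));  // prints "Web-Harvest"
    }
}
```

Because `Pipeline` itself implements `Processor`, a pipeline can be nested as a step inside another pipeline, which is one plausible way to model the "combine the existing ones" idea.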


User comments

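  • Not bad, not bad~
  • Doesn't work very well; there are some problems.
  • It's OK, but still not flexible enough to use.
  • There's no documentation; I can't understand it.
  • No documentation.
  • It runs, but there's no documentation, only a few examples, and none of the examples have comments... Slowly figuring it out.
  • The lack of documentation is really frustrating, and it won't even compile...
  • No documentation; rather messy.
  • No documentation, but I'll still give it a thumbs-up.
  • It's all in English and I can't follow it. It's a local Java Swing program; after it scrapes the target URL, all you get is a single XML file?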