boilerpipe-api: Extract the main article text from HTML pages (download)

[File properties]:

File name: boilerpipe-api: Extract the main article text from HTML pages

File size: 1.26MB

File format: ZIP

Last updated: 2024-06-07 23:22:26

Scala

This API wraps a Java library (boilerpipe) in an HTTP API that extracts the raw article text from HTML pages.

Usage

There are two ways to use the API. You can pass either a url or raw html:

    curl -X POST http://localhost:3000/extract -H "Content-Type: application/json" -d '{ "url": "http://techcrunch.com/2014/07/07/matterport-16m-dcm/" }'

    curl -X POST http://localhost:3000/extract -H "Content-Type: application/json" -d '{ "html": "YOUR HTML CODE HERE" }'

Running

The easiest way to run the API is with Docker. It is available at blik…
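The two request shapes described above can also be built from a small client. The following is a minimal Python sketch, not part of the project: the function names `build_payload`, `build_request`, and `extract` are hypothetical, the endpoint is taken from the curl examples, and the rule that exactly one of `url` or `html` is sent is an assumption, since the description does not say what happens when both are supplied. The response format is not documented here, so the sketch returns the body as plain text without parsing it.

```python
import json
import urllib.request

# Endpoint copied from the curl examples; adjust host and port to your deployment.
EXTRACT_ENDPOINT = "http://localhost:3000/extract"


def build_payload(url=None, html=None):
    """Return the JSON request body as bytes.

    Assumption: exactly one of `url` or `html` is sent per request.
    """
    if (url is None) == (html is None):
        raise ValueError("pass exactly one of url or html")
    body = {"url": url} if url is not None else {"html": html}
    return json.dumps(body).encode("utf-8")


def build_request(payload, endpoint=EXTRACT_ENDPOINT):
    """Wrap the payload in a POST request with a JSON content type."""
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract(url=None, html=None, endpoint=EXTRACT_ENDPOINT, timeout=30):
    """Send the request and return the raw response body as text.

    The response schema is not described in the source, so it is not parsed.
    """
    request = build_request(build_payload(url=url, html=html), endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the service running locally, `extract(url="http://techcrunch.com/2014/07/07/matterport-16m-dcm/")` would mirror the first curl call, and `extract(html="YOUR HTML CODE HERE")` the second.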


[File preview]:
boilerpipe-api-master
----project()
--------build.properties(19B)
--------plugins.sbt(67B)
----circle.yml(525B)
----src()
--------main()
----Dockerfile(361B)
----lib()
--------nekohtml-1.9.13.jar(119KB)
--------boilerpipe-1.2.0.jar(105KB)
--------xerces-2.9.1.jar(1.17MB)
----build.sbt(597B)
----README.md(681B)
----.gitignore(10B)
