webscraping_indexing: Indexing web-scraped data with Lucene (download)

[File properties]:

File name: webscraping_indexing: Indexing web-scraped data with Lucene

File size: 149 KB

File format: ZIP

Last updated: 2024-04-07 05:34:43

Java

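webscraping_indexing

Installing

To run the code, first clone the repository:

    git clone https://github.com/abdelrahim-hentabli/webscraping_indexing.git

Compile and run

To compile on a Linux system, go to the main directory and run the compile.sh file. You may need to give it execute permission first:

    chmod +x compile.sh

You need to export the PATH_TO_LUCENE variable for the compilation to work correctly:

    export PATH_TO_LUCENE=

This has to be done every time you reopen a terminal; alternatively, put it in your .bashrc / .zshrc.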
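Because the export has to be repeated in every new terminal, a forgotten variable is an easy mistake to make. The following is a minimal sketch of a guard that could be run before compiling. The function name `check_lucene_path` is hypothetical and not part of the repository, and the sketch assumes the variable is named PATH_TO_LUCENE as the text states; it only checks that the variable is non-empty, not that its value is valid.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): confirm that
# PATH_TO_LUCENE has been exported before the build is started.
# Assumption: the variable name is PATH_TO_LUCENE, as stated in the text.
check_lucene_path() {
    if [ -z "${PATH_TO_LUCENE:-}" ]; then
        # Report the problem on stderr and signal failure to the caller.
        echo "PATH_TO_LUCENE is not set; export it before compiling" >&2
        return 1
    fi
    return 0
}
```

For example, `check_lucene_path && ./compile.sh` would start the build only when the variable is set in the current shell.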


[File preview]:
webscraping_indexing-main
----.gitignore(348B)
----src()
--------HadoopQuery.java(2KB)
--------ArrayListTextWritable.java(2KB)
--------server()
--------CSVNLineInputFormat.java(5KB)
--------HadoopIndex.java(7KB)
--------LuceneQuery.java(4KB)
--------LuceneIndex.java(3KB)
--------CSVLineRecordReader.java(9KB)
--------Pair.java(142B)
----Phase1 Project Report.pdf(113KB)
----README.md(597B)
----main()
--------hadoop_index.sh(91B)
--------tweepy_scraping()
--------compile.sh(336B)
--------query.sh(260B)
--------lucene_index.sh(245B)
--------index.lucene()
