ilovescience: text mining of arxiv.org articles (download)

[File properties]:

File name: ilovescience: text mining of arxiv.org articles

File size: 233KB

File format: ZIP

Last updated: 2024-05-30 09:51:43

Jupyter Notebook

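ilovescience

A set of scripts for text mining of articles.

Scripts

articles_crawl.py: loads the articles. This may take a few hours.
annotations_crawl.py: loads the annotations.
lda.py: extracts topics.
terms_cn.py: counts keywords across the article collection.
cites.py: counts citations and shows the most-cited articles.
word_vec.py: builds the model.

The scripts are stored in the src path. Articles are stored in .txt format, at paths of the form arxiv////. Results are stored in the stat path.

Usage

discover.py: runs all analysis scripts.
notes.py: generates and opens a Jupyter notebook using the computed statistics.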
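
As an illustration of the keyword-counting step attributed to terms_cn.py, the sketch below counts terms across a directory of .txt articles. It is not the repository's code: the tokenization rule, the stop-word list, and the function names count_terms and count_terms_in_dir are assumptions made for this example.

```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical stop-word list; the real scripts may filter terms differently.
STOP_WORDS = {"the", "and", "for", "that", "with", "this", "are"}


def count_terms(text):
    """Count lowercase alphabetic tokens of 3+ letters, skipping stop words."""
    tokens = re.findall(r"[a-z]{3,}", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def count_terms_in_dir(root):
    """Sum term counts over every .txt file found under root (recursively)."""
    total = Counter()
    for path in sorted(Path(root).rglob("*.txt")):
        total += count_terms(path.read_text(encoding="utf-8", errors="ignore"))
    return total


if __name__ == "__main__":
    # "arxiv" is the article directory named in the description above.
    for term, n in count_terms_in_dir("arxiv").most_common(10):
        print(term, n)
```

Summing Counter objects keeps the per-file logic separate from the directory walk, so the same count_terms function can be reused on a single abstract or annotation.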

[File preview]:
ilovescience-master
----README.md(806B)
----notes.py(597B)
----arxiv()
--------.gitignore(70B)
----topics()
--------.gitignore(70B)
----stat()
--------.gitignore(70B)
----src()
--------lda.py(5KB)
--------article_crawl.py(5KB)
--------cache()
--------cites.py(6KB)
--------annotation_crawl.py(2KB)
--------word_vec.py(6KB)
--------terms_cn.py(3KB)
--------extra()
----requirements.txt(162B)
----notebooks()
--------visual.py(23KB)
--------template.json(3KB)
--------abbreviations()
--------demo()
----discover.py(786B)
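
Given this layout, a runner in the spirit of discover.py might look like the following sketch. The script order, the choice of which scripts count as analysis scripts, and the helper names build_commands and run_all are assumptions; the actual discover.py may work differently.

```python
import subprocess
import sys
from pathlib import Path

# Assumed set and order of analysis scripts, using file names from the listing.
# The crawl scripts are assumed to have been run beforehand.
ANALYSIS_SCRIPTS = ["lda.py", "terms_cn.py", "cites.py", "word_vec.py"]


def build_commands(src_dir="src", scripts=ANALYSIS_SCRIPTS, python=sys.executable):
    """Return one command line per script, in the order given."""
    return [[python, str(Path(src_dir) / name)] for name in scripts]


def run_all(commands, run=subprocess.run):
    """Run each command in turn and stop at the first non-zero exit code.

    Returns the failing command, or None if every command succeeded.
    """
    for cmd in commands:
        result = run(cmd)
        if result.returncode != 0:
            return cmd
    return None
```

Calling run_all(build_commands()) from the repository root would execute the four scripts one after another. The run parameter exists only so the loop can be exercised without launching real processes.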
