File name: Quora_query: Quora short-text similarity
File size: 192.28MB
File format: ZIP
Last updated: 2024-06-08 03:45:46
deep-learning tensorflow cnn lstm quora-question-pairs
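Quora Question Pairs (short-text topic similarity)
Uses a Siamese network architecture: takes the output of the last BLSTM unit; training accuracy is 93 and test accuracy is 83.
Remedies for overfitting: dropout and regularization, neither of which has been implemented yet. Data preprocessing is not finished either. The single-layer LSTM has a problem that can be investigated further, although the cause is largely understood.
Data (data folder):
/data/csv/train.csv : the dataset released by Quora, with labels
/data/csv/test_part_aa, /data/csv/test_part_bb : the test data (test.py) after split; the parts can be joined with cat.
/data/vocab.model : the VocabularyProcessor model (max_length = 60)
/data/lr_sentiment.model : logistic regression model, used to predict sentiment…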
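The description says the test_part_aa and test_part_bb pieces can be joined with cat. Where cat is not available, the same result can be sketched in Python as below. The output name test.csv and the helper name join_parts are assumptions made for illustration; neither is confirmed by the listing.

```python
import shutil
from pathlib import Path


def join_parts(part_paths, out_path):
    """Concatenate the given files, in order, into out_path (like `cat a b > out`)."""
    out_path = Path(out_path)
    with out_path.open("wb") as out:
        for part in part_paths:
            with Path(part).open("rb") as src:
                # Stream the bytes across so large parts are never fully loaded.
                shutil.copyfileobj(src, out)
    return out_path


if __name__ == "__main__":
    # Directory and part names follow the description;
    # the output name test.csv is an assumption.
    csv_dir = Path("data/csv")
    parts = sorted(csv_dir.glob("test_part_*"))  # aa sorts before bb
    if parts:
        joined = join_parts(parts, csv_dir / "test.csv")
        print(f"wrote {joined} from {len(parts)} parts")
```

The parts are sorted by name so that the suffixes aa, bb keep their original order, which matters because only the first part is likely to carry the CSV header.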
[File preview]:
Quora_query-master
----edit_distance.cpp(3KB)
----data()
--------csv()
--------xgb_sentiment.model(966KB)
--------lr_sentiment.model(71KB)
--------stop_words_eng.txt(6KB)
--------vocab.model(1MB)
----extral_features.py(11KB)
----PreProcess.py(18KB)
----cnn_src()
--------model()
--------test.py(2KB)
--------train.py(12KB)
--------__pycache__()
--------temp.pkl(70.81MB)
--------cnn.py(7KB)
--------png()
--------cnn_RF.py(14KB)
----rnn_src()
--------train.py(3KB)
--------siamese_network.py(9KB)
----lstm_src()
--------train.py(8KB)
--------__pycache__()
--------lstm.py(9KB)
----papers()
--------NATURAL LANGUAGE INFERENCE OVER INTERACTION SPACE.pdf(607KB)
--------Siamese with Random Forest for duplicate.pdf(205KB)
--------Scaling Quality On Quora Using Machine Learning.pdf(3.62MB)
--------Quora Question Pairs Identify if two questions have the same intent.pdf(357KB)
--------Identifying Quora question pairs having the same intent.pdf(127KB)
--------C-LSTM.pdf(119KB)
--------A Survey of Community Question Answering.pdf(1008KB)
--------Quora Question Duplication.pdf(552KB)
--------kaggle_report_shangjiao.pdf(2.13MB)
--------Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks.pdf(304KB)
--------Quora_winning_solution.pdf(1.69MB)
----论文()
--------NATURAL LANGUAGE INFERENCE OVER INTERACTION SPACE.pdf(607KB)
--------Quora Question Pairs Identify if two questions have the same intent.pdf(357KB)
--------Identifying Quora question pairs having the same intent.pdf(127KB)
--------Quora Question Duplication.pdf(552KB)
--------kaggle_report_shangjiao.pdf(2.13MB)
--------Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks.pdf(304KB)
--------Quora_winning_solution.pdf(1.66MB)
----integration()
--------integration.py(12KB)
--------train.py(11KB)
--------__pycache__()
----README.md(7KB)
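
The data folder above holds vocab.model, which the description identifies as a VocabularyProcessor model with max_length = 60. As a rough illustration of what such a fixed-length encoding does, the sketch below maps each question to exactly 60 integer ids, truncating longer questions and padding shorter ones with 0. It is a plain-Python stand-in, not the repository's code: the whitespace tokenizer, the lowercasing, and the helper names build_vocab and encode are all assumptions.

```python
MAX_LENGTH = 60  # from the description: VocabularyProcessor (max_length = 60)


def build_vocab(sentences):
    """Assign word ids starting at 1; id 0 is reserved for padding and unknown words."""
    vocab = {}
    for sentence in sentences:
        # Assumption: simple lowercase whitespace tokenization.
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def encode(sentence, vocab, max_length=MAX_LENGTH):
    """Map a sentence to exactly max_length ids (truncate, then right-pad with 0)."""
    ids = [vocab.get(token, 0) for token in sentence.lower().split()]
    ids = ids[:max_length]
    return ids + [0] * (max_length - len(ids))


if __name__ == "__main__":
    vocab = build_vocab(["How do I learn Python", "How can I learn Python fast"])
    row = encode("How do I learn Rust", vocab)
    print(len(row), row[:6])
```

Encoding both questions of a pair to the same fixed length lets the two branches of a Siamese network share one input shape.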