File name: Microsoft Research Asia Chinese Word Segmentation Corpus (icwb2-data)
File size: 50.24MB
File format: ZIP
Last updated: 2021-10-31 13:02:32
Corpus
Microsoft Research Asia Chinese word segmentation corpus, icwb2-data, natural language processing, research dataset
[File preview]:
icwb2-data
----gold()
--------as_testing_gold.utf8(920KB)
--------msr_training_words.txt(723KB)
--------as_training_words.utf8(1.33MB)
--------msr_test_gold.txt(569KB)
--------cityu_test_gold.txt(171KB)
--------pku_training_words.utf8(479KB)
--------pku_training_words.txt(339KB)
--------as_testing_gold.txt(624KB)
--------cityu_training_words.utf8(571KB)
--------cityu_training_words.txt(412KB)
--------msr_training_words.utf8(1.02MB)
--------pku_test_gold.utf8(701KB)
--------cityu_test_gold.utf8(235KB)
--------as_training_words.txt(951KB)
--------pku_test_gold.txt(539KB)
--------msr_test_gold.utf8(749KB)
----scripts()
--------score(7KB)
--------mwseg.pl(3KB)
----doc()
--------result_instructions.txt(4KB)
--------instructions.txt(7KB)
----testing()
--------as_test.txt(412KB)
--------pku_test.utf8(498KB)
--------pku_test.txt(335KB)
--------cityu_test.utf8(197KB)
--------as_test.utf8(604KB)
--------cityu_test.txt(133KB)
--------msr_test.txt(367KB)
--------msr_test.utf8(547KB)
----training()
--------as_training.utf8(38.86MB)
--------cityu_training.txt(5.94MB)
--------cityu_training.utf8(8.15MB)
--------pku_training.txt(5.63MB)
--------as_training.b5(26.36MB)
--------pku_training.utf8(7.37MB)
--------msr_training.txt(12.25MB)
--------msr_training.utf8(16.11MB)
----README(2KB)
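
The listing above does not describe the file contents. As a minimal sketch, assuming the files under training() hold one sentence per line with words separated by whitespace (the usual layout for segmented corpora, but not confirmed by this listing), a line could be turned into per-character B/M/E/S boundary tags as shown here. The function names `words_to_bmes` and `parse_segmented_line` are hypothetical, and the BMES scheme is one common choice rather than something the archive prescribes.

```python
def words_to_bmes(words):
    """Map a list of words to per-character tags:
    S = single-character word, B = begin, M = middle, E = end."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def parse_segmented_line(line):
    """Split one whitespace-segmented line (assumed format) into
    its characters and the matching BMES tags."""
    words = line.split()  # splits on any run of whitespace
    chars = [ch for word in words for ch in word]
    return chars, words_to_bmes(words)


# Made-up sample sentence, not taken from the corpus.
chars, tags = parse_segmented_line("我们  喜欢  自然语言处理  。")
print(list(zip(chars, tags)))
```

To apply this to a real file, one would likely open a `.utf8` file from the listing with `encoding="utf-8"` and call `parse_segmented_line` on each line. The encoding of the `.txt` and `.b5` variants is not stated here, so it would need to be checked first.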