A Small Example of Indexing with Lucene Using the IKAnalyzer 3.2.5 Chinese Word Segmenter

Date: 2021-08-10 05:53:43

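This article uses a small example to help you learn the indexing features of IKAnalyzer 3.2.5 and Lucene. To prepare the environment, you need two JAR files.

They are the Lucene 3.5.0 JAR and the IKAnalyzer 3.2.5 JAR.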

The code is as follows:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;
import org.wltea.analyzer.lucene.IKQueryParser;
import org.wltea.analyzer.lucene.IKSimilarity;

public class test {

    public static void main(String[] args) throws Exception {
        // Keep the index in memory; nothing is written to disk.
        RAMDirectory directory = new RAMDirectory();

        // IKAnalyzer segments the Chinese text into words for indexing.
        Analyzer analyzer = new IKAnalyzer();

        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter indexWriter = new IndexWriter(directory, iwc);

        // Index two documents, each with one stored, analyzed "contents" field.
        String str = "你好俊杰,这是成功的开始!加油!";
        Document doc = new Document();
        doc.add(new Field("contents", str, Field.Store.YES, Field.Index.ANALYZED));
        indexWriter.addDocument(doc);

        str = "希望能够实现这一个项目,俊杰你可以的!";
        doc = new Document();
        doc.add(new Field("contents", str, Field.Store.YES, Field.Index.ANALYZED));
        indexWriter.addDocument(doc);

        // Closing the writer commits the index. Without this call,
        // IndexReader.open() below fails with "no segments* file found".
        indexWriter.close();

        IndexReader reader = IndexReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(reader);
        // Use the similarity implementation that ships with IKAnalyzer.
        searcher.setSimilarity(new IKSimilarity());
        String keyWords = "俊杰";

        // IKQueryParser builds a query on the "contents" field from the keywords.
        Query query = IKQueryParser.parse("contents", keyWords);
        // Ask for at most 10 hits; totalHits still reports the full match count.
        TopDocs topDocs = searcher.search(query, 10);
        System.out.println(topDocs.totalHits);

        // Release the searcher and the reader once the search is done.
        searcher.close();
        reader.close();
    }
}
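
The example above only prints the number of hits. As a minimal sketch of how the matching documents themselves could be listed, the variant below reads the stored contents field of each hit. The class name HitLister and the method name search are made up for this illustration, the limit of 10 hits is an arbitrary choice, and the code assumes the same two JARs (Lucene 3.5.0 and IKAnalyzer 3.2.5) are on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;
import org.wltea.analyzer.lucene.IKQueryParser;

// Hypothetical helper class, not part of Lucene or IKAnalyzer.
class HitLister {

    // Indexes the two sample sentences and returns the stored text of each hit.
    static List<String> search(String keyWords) throws Exception {
        RAMDirectory directory = new RAMDirectory();
        Analyzer analyzer = new IKAnalyzer();
        IndexWriter writer = new IndexWriter(
                directory, new IndexWriterConfig(Version.LUCENE_35, analyzer));

        String[] texts = {
            "你好俊杰,这是成功的开始!加油!",
            "希望能够实现这一个项目,俊杰你可以的!"
        };
        for (String text : texts) {
            Document doc = new Document();
            doc.add(new Field("contents", text, Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc);
        }
        // Commit the index before any reader is opened on it.
        writer.close();

        IndexReader reader = IndexReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(reader);
        List<String> results = new ArrayList<String>();
        try {
            Query query = IKQueryParser.parse("contents", keyWords);
            TopDocs topDocs = searcher.search(query, 10);
            // Each ScoreDoc carries the internal id of one matching document.
            for (ScoreDoc hit : topDocs.scoreDocs) {
                Document doc = searcher.doc(hit.doc);
                results.add(doc.get("contents"));
            }
        } finally {
            searcher.close();
            reader.close();
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        for (String text : search("俊杰")) {
            System.out.println(text);
        }
    }
}
```

Because the field is added with Field.Store.YES, the original sentence can be read back from each hit; with Field.Store.NO, doc.get("contents") would return null.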

I would also like to share a problem I ran into while writing this code. The error output is as follows:

Exception in thread "main" org.apache.lucene.index.IndexNotFoundException: no segments* file found in org.apache.lucene.store.RAMDirectory@1938039 lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@c743eb: files: [_0.fdx, _0.fdt]
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:712)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:75)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:462)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:308)
at test.main(test.java:51)

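When this "no segments* file found" error appears, one possible cause is that the following line is missing. The IndexWriter only commits the index (including the segments file) when it is closed or committed, so the call has to come before IndexReader.open(). Adding it fixes the problem.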
indexWriter.close();
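
To make the effect of the missing close() visible, here is a small sketch that checks IndexReader.indexExists() before and after closing the writer. It runs under the same assumptions as the rest of this article (Lucene 3.5.0 and IKAnalyzer 3.2.5 on the classpath). The class name CloseBeforeOpen and the method name indexExistsAroundClose are hypothetical names chosen for this illustration.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;

// Hypothetical demo class, not part of Lucene or IKAnalyzer.
class CloseBeforeOpen {

    // Returns { index exists before close(), index exists after close() }.
    static boolean[] indexExistsAroundClose() throws Exception {
        RAMDirectory directory = new RAMDirectory();
        Analyzer analyzer = new IKAnalyzer();
        IndexWriter writer = new IndexWriter(
                directory, new IndexWriterConfig(Version.LUCENE_35, analyzer));

        boolean before;
        try {
            Document doc = new Document();
            doc.add(new Field("contents", "你好俊杰", Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc);
            // Nothing has been committed yet, so no segments file should exist.
            before = IndexReader.indexExists(directory);
        } finally {
            // close() commits pending changes; the finally block makes sure
            // it runs even if adding the document fails.
            writer.close();
        }
        boolean after = IndexReader.indexExists(directory);
        return new boolean[] { before, after };
    }

    public static void main(String[] args) throws Exception {
        boolean[] result = indexExistsAroundClose();
        System.out.println("before close: " + result[0]);
        System.out.println("after close:  " + result[1]);
    }
}
```

If the writer has to stay open for further documents, calling indexWriter.commit() before opening the reader should have the same effect as close() for this purpose.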