A Small Example of Indexing in Lucene with the IKAnalyzer3.2.5 Chinese Analyzer

Date: 2021-06-10 03:09:40

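This article uses a small example to help you learn the indexing features of IKAnalyzer3.2.5 and Lucene. The only preparation needed is to put two jar packages on the classpath.

They are lucene 3.5.0.jar and IKAnalyzer3.2.5.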

The code is as follows:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;
import org.wltea.analyzer.lucene.IKQueryParser;
import org.wltea.analyzer.lucene.IKSimilarity;

public class test {

	public static void main(String[] args) throws Exception {
		// Keep the index in memory. To store it on disk instead, an
		// FSDirectory pointing at a folder such as E:\temp\index could be used.
		RAMDirectory directory = new RAMDirectory();

		// IKAnalyzer performs the Chinese word segmentation.
		Analyzer analyzer = new IKAnalyzer();

		IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_35, analyzer);
		IndexWriter indexWriter = new IndexWriter(directory, iwc);

		// First document: "Hello Junjie, this is the start of success! Keep it up!"
		String str = "你好俊杰,这是成功的开始!加油!";
		Document doc = new Document();
		doc.add(new Field("contents", str, Field.Store.YES, Field.Index.ANALYZED));
		indexWriter.addDocument(doc);

		// Second document: "I hope this project can be completed. Junjie, you can do it!"
		str = "希望能够实现这一个项目,俊杰你可以的!";
		doc = new Document();
		doc.add(new Field("contents", str, Field.Store.YES, Field.Index.ANALYZED));
		indexWriter.addDocument(doc);

		// Closing the writer commits the index. Without this line the
		// reader below fails with "no segments* file found".
		indexWriter.close();

		IndexReader reader = IndexReader.open(directory);
		IndexSearcher searcher = new IndexSearcher(reader);
		searcher.setSimilarity(new IKSimilarity());
		String keyWords = "俊杰";

		// IKQueryParser segments the keywords with IK and builds the query.
		Query query = IKQueryParser.parse("contents", keyWords);
		TopDocs topDocs = searcher.search(query, Integer.MAX_VALUE);
		// Print the number of matching documents.
		System.out.println(topDocs.totalHits);

		searcher.close();
		reader.close();
	}
}
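
The example above only prints the number of hits. As a rough sketch of how the matching documents themselves could be read back, the variant below collects the stored "contents" field of each hit. The class name PrintHits, the helper method search, and the limit of 10 results are illustrative assumptions and not part of the original example; the same two jars are assumed to be on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;
import org.wltea.analyzer.lucene.IKQueryParser;
import org.wltea.analyzer.lucene.IKSimilarity;

// Hypothetical helper class, not part of the original example.
public class PrintHits {

    // Indexes the two sample sentences and returns the stored text of each hit.
    static List<String> search(String keyWords) throws Exception {
        RAMDirectory directory = new RAMDirectory();
        Analyzer analyzer = new IKAnalyzer();
        IndexWriter writer = new IndexWriter(
                directory, new IndexWriterConfig(Version.LUCENE_35, analyzer));

        String[] texts = {
            "你好俊杰,这是成功的开始!加油!",
            "希望能够实现这一个项目,俊杰你可以的!"
        };
        for (String text : texts) {
            Document doc = new Document();
            // Field.Store.YES keeps the original text so it can be read back later.
            doc.add(new Field("contents", text, Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc);
        }
        writer.close(); // commit before opening a reader

        IndexReader reader = IndexReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(reader);
        searcher.setSimilarity(new IKSimilarity());

        Query query = IKQueryParser.parse("contents", keyWords);
        TopDocs topDocs = searcher.search(query, 10); // assumed limit of 10 results

        List<String> hits = new ArrayList<String>();
        for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
            // scoreDoc.doc is the internal document id; load the stored fields with it.
            hits.add(searcher.doc(scoreDoc.doc).get("contents"));
        }

        searcher.close();
        reader.close();
        return hits;
    }

    public static void main(String[] args) throws Exception {
        for (String hit : search("俊杰")) {
            System.out.println(hit);
        }
    }
}
```

Reading the text back this way only works because the field was added with Field.Store.YES; a field that is indexed but not stored can be searched but not retrieved.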

I would also like to share a problem I ran into while writing this code. The error output was as follows:

Exception in thread "main" org.apache.lucene.index.IndexNotFoundException: no segments* file found in org.apache.lucene.store.RAMDirectory@1938039 lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@c743eb: files: [_0.fdx, _0.fdt]
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:712)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:75)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:462)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:308)
at test.main(test.java:51)

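When this "no segments* file found" error appears, one likely cause is that the following line is missing. The index is only committed when the IndexWriter is closed, so this line has to run before the IndexReader is opened.
indexWriter.close();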
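
The effect of that line can be checked with a small sketch that tries to open a reader once before and once after closing the writer. The class name CloseBeforeOpen and the helper methods canOpen and run are made up for illustration, and the sketch assumes the same Lucene 3.5.0 and IKAnalyzer3.2.5 jars as above. Under those assumptions the first attempt should fail with IndexNotFoundException and the second should succeed.

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexNotFoundException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;

// Hypothetical demo class, not part of the original example.
public class CloseBeforeOpen {

    // Returns true if a reader can be opened, i.e. a segments file exists.
    static boolean canOpen(Directory directory) throws IOException {
        try {
            IndexReader reader = IndexReader.open(directory);
            reader.close();
            return true;
        } catch (IndexNotFoundException e) {
            // The error discussed above: nothing has been committed yet.
            return false;
        }
    }

    // Returns { openable before close, openable after close }.
    static boolean[] run() throws IOException {
        RAMDirectory directory = new RAMDirectory();
        Analyzer analyzer = new IKAnalyzer();
        IndexWriter indexWriter = new IndexWriter(
                directory, new IndexWriterConfig(Version.LUCENE_35, analyzer));

        boolean before;
        try {
            Document doc = new Document();
            doc.add(new Field("contents", "你好俊杰,这是成功的开始!加油!",
                    Field.Store.YES, Field.Index.ANALYZED));
            indexWriter.addDocument(doc);
            before = canOpen(directory);
        } finally {
            // The finally block makes sure the writer is closed even if indexing fails.
            indexWriter.close();
        }
        boolean after = canOpen(directory);
        return new boolean[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = run();
        System.out.println("openable before close: " + result[0]);
        System.out.println("openable after close: " + result[1]);
    }
}
```

If the writer has to stay open for further updates, calling indexWriter.commit() before opening the reader is another option worth trying in place of close().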