Optimizing Fuzzy Queries on Large Data Sets

----------------------------------------------------------------------------------------------
[Copyright notice: this article is the author's original work; please credit the source when reposting]
Source: http://blog.csdn.net/sdksdk0/article/details/52589761
Author: 朱培 ID: sdksdk0

--------------------------------------------------------------------------------------------

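When work orders are searched with a LIKE fuzzy query, the database cannot use its index (a pattern with a leading wildcard such as '%keyword%' defeats an ordinary index) and has to compare the search text against every row, which is slow. To improve fuzzy-query performance on large data sets, we can use Lucene, a full-text indexing library. The example in this article runs inside an SSH (Struts + Spring + Hibernate) project and uses Hibernate Search, which integrates Hibernate with Lucene, to implement the work-order search feature.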
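To see why a full-text index helps, here is a minimal Java sketch. It is not Lucene code: the class InvertedIndexSketch and its methods are made up for illustration, and it assumes terms are simply separated by whitespace, whereas a real analyzer does much more. It contrasts a row-by-row scan, which is roughly what LIKE '%keyword%' amounts to, with a lookup in an inverted index that maps each term to the ids of the records containing it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration: linear scan vs. inverted-index lookup. */
class InvertedIndexSketch {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> docs = new ArrayList<>();

    /** Stores a document and indexes each whitespace-separated term. */
    int add(String text) {
        int id = docs.size();
        docs.add(text);
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    /** Roughly what LIKE '%keyword%' does: every record is inspected. */
    List<Integer> scan(String keyword) {
        String needle = keyword.toLowerCase();
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            if (docs.get(id).toLowerCase().contains(needle)) {
                hits.add(id);
            }
        }
        return hits;
    }

    /** Index lookup: a single map access, however many records are stored. */
    List<Integer> lookup(String term) {
        Set<Integer> ids = postings.getOrDefault(term.toLowerCase(), Collections.<Integer>emptySet());
        return new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("Beijing fresh fruit");
        index.add("Shanghai office supplies");
        index.add("Beijing electronics");
        System.out.println(index.scan("beijing"));   // [0, 2]
        System.out.println(index.lookup("beijing")); // [0, 2]
    }
}
```

Both calls return the same records, but the scan touches every stored text while the lookup only reads one map entry. The price is that the index has to be built and kept up to date, which is the job Hibernate Search takes over in the steps below.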

1. First, add the required jar to the Maven project:

		<dependency>
			<groupId>org.hibernate</groupId>
			<artifactId>hibernate-search</artifactId>
			<version>3.4.2.Final</version>
		</dependency>

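2. Import the IKAnalyzer tokenizer. IKAnalyzer is not available in Maven, so the jar has to be downloaded by hand; it can be found on the http://mvnrepository.com/ site.

Once downloaded, install it into your own Maven repository or put it directly in the project's lib directory, then reference it. Mine, for example, is referenced like this:

		<dependency>
			<groupId>org.wltea</groupId>
			<artifactId>IKAnalyzer</artifactId>
			<version>2012_u6</version>
			<scope>system</scope>
			<systemPath>E:\myeclipse_work\BOS\src\main\webapp\WEB-INF\lib\IKAnalyzer2012_u6.jar</systemPath>
		</dependency>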

3. Create a stopword.dic file in the resource directory, with one word per line:

a
an
and
are
as
at
be
but
by
for
if
in
into
is
it
no
not
of
on
or
such
that
the
their
then
there
these
they
this
to
was
will
with
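As a rough illustration of what a stop-word list is for, the Java sketch below drops stop words from a list of tokens before they would be indexed. The class StopWordFilterSketch is hypothetical and holds only a few of the words from stopword.dic. IKAnalyzer does this filtering internally, so you never write this code yourself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of stop-word filtering; not IKAnalyzer's code. */
class StopWordFilterSketch {
    // A handful of entries from stopword.dic; the real file lists more.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "to", "in", "is"));

    /** Returns the tokens that are not stop words (comparison ignores case). */
    static List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token.toLowerCase())) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "price", "of", "tea", "in", "Beijing");
        System.out.println(filter(tokens)); // [price, tea, Beijing]
    }
}
```

Words that occur in almost every record carry little information for search, so leaving them out keeps the index smaller.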

4. Create an IKAnalyzer.cfg.xml file with the following content:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
	<comment>IK Analyzer extension configuration</comment>
	<!-- Your own extension dictionaries can be configured here
	<entry key="ext_dict">ext.dic;</entry>
	-->
	<!-- Your own extension stop-word dictionaries can be configured here -->
	<entry key="ext_stopwords">stopword.dic;</entry>
</properties>

5. Configure Spring by adding one line to the SessionFactory configuration. You also need to create the DBIndex folder on the D: drive yourself.

<!-- Index directory -->
<prop key="hibernate.search.default.indexBase">d:/DBIndex</prop>

The complete configuration is as follows:

<!-- SessionFactory configuration -->
<bean id="sessionFactory" class="org.springframework.orm.hibernate3.LocalSessionFactoryBean">
	<property name="dataSource" ref="dataSource"></property>
	<!-- Hibernate properties; see the hibernate.properties file -->
	<property name="hibernateProperties">
		<props>
			<prop key="hibernate.dialect">org.hibernate.dialect.MySQL5InnoDBDialect</prop>
			<prop key="hibernate.show_sql">true</prop>
			<prop key="hibernate.format_sql">true</prop>
			<prop key="hibernate.hbm2ddl.auto">update</prop>
			<!-- Index directory -->
			<prop key="hibernate.search.default.indexBase">d:/DBIndex</prop>
		</props>
	</property>
	<!-- hbm mappings -->
	<property name="mappingDirectoryLocations" value="classpath:cn/tf/bos/domain"></property>
</bean>

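6. Add annotations to the domain class you want to search. Put @Field on every field that should be searchable. Note that the analyzer to import is IKAnalyzer's, not the one that ships with hibernate-search.

import org.hibernate.search.annotations.Analyzer;
import org.hibernate.search.annotations.DocumentId;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.wltea.analyzer.lucene.IKAnalyzer;

@Indexed
@Analyzer(impl = IKAnalyzer.class)
public class WorkOrderManage implements java.io.Serializable {
	// Fields
	@DocumentId
	private String id;
	@Field
	private String arrivecity;  // destination city
	@Field
	private String product;
	// ... (the rest of the class is not shown in this excerpt)
}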

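The tokenization result can be checked as follows.

Use the Luke tool to inspect the contents of the index files. Running java -jar lukeall-3.5.0.jar in cmd opens Luke, where the detailed index information can be viewed.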
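To give a feel for what a dictionary-based tokenizer puts into the index, here is a small Java sketch of forward maximum matching. This is a deliberately simple technique chosen for illustration, not IKAnalyzer's actual algorithm, and the class name, dictionary, and sample text are all made up. To keep the example ASCII-only, unspaced Latin text stands in for Chinese text, which also has no spaces between words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of dictionary-based segmentation; not IKAnalyzer's algorithm. */
class ForwardMaxMatchSketch {
    /**
     * At each position, takes the longest dictionary word (up to maxWordLength
     * characters); if none matches, emits a single character and moves on.
     */
    static List<String> segment(String text, Set<String> dictionary, int maxWordLength) {
        List<String> terms = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(text.length(), pos + maxWordLength);
            // Shrink the candidate until it is a dictionary word or one character long.
            while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
                end--;
            }
            terms.add(text.substring(pos, end));
            pos = end;
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("beijing", "fruit", "shop"));
        System.out.println(segment("beijingfruitshop", dictionary, 7)); // [beijing, fruit, shop]
    }
}
```

The terms produced at index time are what a query is later matched against, which is why inspecting them in Luke is a useful sanity check.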


7. Add a search box to the page. I use easyui here:

<div data-options="region:'north'">
	<!-- Search box -->
	<!--
		prompt    default placeholder text
		menu      drop-down list of search conditions
		searcher  name of the JS function run when the search button is clicked
	-->
	<input id="ss" class="easyui-searchbox" style="width:300px"
		data-options="prompt:'Enter your search text',menu:'#nm',searcher:doSearch"/>

	<div id="nm">
		<div data-options="name:'arrivecity'">Search by destination</div>
		<div data-options="name:'product'">Search by product name</div>
	</div>
</div>

8. Write the doSearch JavaScript function:

	function doSearch(value, name) {
		// Store the search condition as datagrid query parameters and reload the grid
		$('#grid').datagrid('load', {
			conditionName: name,
			conditionValue: value
		});
	}

9. In the action, receive the name and value passed from the page and process them:

public String findByPage() {
	if (conditionName != null && conditionName.trim().length() > 0
			&& conditionValue != null && conditionValue.trim().length() > 0) {
		// A search condition was supplied: query through Lucene
		PageResponseBean pageResponseBean = workordermanagerService.findByLucene(conditionName, conditionValue, page, rows);
		ActionContext.getContext().put("pageResponseBean", pageResponseBean);
	} else {
		// No search condition: ordinary paged query
		DetachedCriteria detachedCriteria = DetachedCriteria.forClass(WorkOrderManage.class);
		PageRequestBean pageRequestBean = initPageRequestBean(detachedCriteria);

		PageResponseBean pageResponseBean = workordermanagerService.findByPage(pageRequestBean);

		ActionContext.getContext().put("pageResponseBean", pageResponseBean);
	}
	return "findByPage";
}

private String conditionName;
private String conditionValue;

public void setConditionName(String conditionName) {
	this.conditionName = conditionName;
}

public void setConditionValue(String conditionValue) {
	this.conditionValue = conditionValue;
}

How the return value is handled afterwards is not covered here.

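10. Handle the request in the service layer. The call passes through the service interface and serviceImpl and ends up in the DAO, so the actual work can be done in the DAO.

import java.util.List;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.WildcardQuery;
import org.hibernate.Session;
import org.hibernate.search.FullTextQuery;
import org.hibernate.search.FullTextSession;
import org.hibernate.search.Search;

	// Lucene query
	@Override
	public PageResponseBean findByLucene(String conditionName,
			String conditionValue, int page, int rows) {
		Session session = this.getSession();
		// Search.getFullTextSession is the documented way to wrap a Session
		// (the original code called new FullTextSessionImpl(session) directly)
		FullTextSession fullTextSession = Search.getFullTextSession(session);

		// The pattern is not analyzed; it is compared against the terms
		// that IKAnalyzer produced at index time
		Query query = new WildcardQuery(new Term(conditionName, "*" + conditionValue + "*"));

		// Obtain the full-text query
		FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(query);
		PageResponseBean pageResponseBean = new PageResponseBean();
		pageResponseBean.setTotal(fullTextQuery.getResultSize());

		// Data for the current page
		int firstResult = (page - 1) * rows;
		int maxResults = rows;
		List list = fullTextQuery.setFirstResult(firstResult).setMaxResults(maxResults).list();
		pageResponseBean.setRows(list);

		return pageResponseBean;
	}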
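The paging arithmetic in the DAO, firstResult = (page - 1) * rows with at most rows results, can be tried out in isolation. The Java sketch below is a standalone illustration with made-up names (PageWindowSketch, slice, totalPages); it applies the same window to an in-memory list instead of a FullTextQuery and assumes 1-based page numbers, as the DAO code does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of 1-based paging; not part of the project code. */
class PageWindowSketch {
    /** Offset of the first record on the given 1-based page. */
    static int firstResult(int page, int rows) {
        if (page < 1 || rows < 1) {
            throw new IllegalArgumentException("page and rows must be >= 1");
        }
        return (page - 1) * rows;
    }

    /** Number of pages needed to show total records, rows per page. */
    static int totalPages(int total, int rows) {
        if (total < 0 || rows < 1) {
            throw new IllegalArgumentException("total must be >= 0 and rows >= 1");
        }
        return (total + rows - 1) / rows;
    }

    /** Returns the records of one page; an out-of-range page yields an empty list. */
    static <T> List<T> slice(List<T> all, int page, int rows) {
        int from = Math.min(firstResult(page, rows), all.size());
        int to = Math.min(from + rows, all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(slice(hits, 2, 2));               // [c, d]
        System.out.println(totalPages(hits.size(), 2));      // 3
    }
}
```

In the DAO the slicing is done by setFirstResult and setMaxResults, so only the requested page is loaded, while getResultSize supplies the total that the datagrid needs to render its pager.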

11. Check the search results on the page.


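That completes the whole development flow. Using Lucene for fuzzy queries over large data sets is very practical. Lucene is only suited to in-site search, but its support for fuzzy queries is very good.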
