Integrating Coreseek + Sphinx + MySQL + ThinkPHP

Date: 2022-12-18 19:24:37

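1. Installing Coreseek

  1.1 First upgrade or install the system dependencies

    The package names below are the RPM equivalents of the Debian names in the Coreseek install notes (yum does not know names such as libmysqlclient15-dev); adjust them for your release if needed.

yum install make gcc gcc-c++ automake libtool mysql mysql-devel libxml2-devel expat-devel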
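
Before starting the build it can help to confirm that the tools are actually on the PATH. The following is a minimal sketch, not part of the original procedure: `missing_tools` is a hypothetical helper name, and the tool list is simply taken from the install step above. It only checks for commands, so missing development headers (MySQL, libxml2, expat) would still show up later at configure or compile time.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Build tools named in the dependency step (assumed list; extend as needed).
missing=$(missing_tools make gcc g++ automake libtool)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "All build tools found"
fi
```

If anything is reported as missing, install the corresponding package before running ./configure.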

  

  1.2 Download Coreseek 3.2.14

    Unpack: tar zxvf coreseek-3.2.14.tar.gz

    cd coreseek-3.2.14

    cd mmseg-3.2.14 # install the mmseg plugin first, before Coreseek itself

    ./configure --prefix=/Data/apps/mmseg3

    If configure fails with the following error:

    

    config.status: creating Makefile

    config.status: creating src/Makefile

    config.status: error: cannot find input file: src/Makefile.in

  

    To fix this, install autoconf and automake, then regenerate the build files:

    yum -y install autoconf automake

    aclocal

    If aclocal prints the following warning, libtool is missing as well:

    configure.in:26: warning: macro `AM_PROG_LIBTOOL' not found in library

    yum -y install libtool

    aclocal

    libtoolize --force

    libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `config'.
    libtoolize: linking file `config/ltmain.sh'
    libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.in and
    libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.
    libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.

    automake --add-missing

    autoconf

    autoheader

    make clean

  

    Once that is done, rerun configure:

    ./configure --prefix=/Data/apps/mmseg3

    ------------------------------------------------------------------------
    Configuration:

    Source code location: .
    Compiler: gcc
    Compiler flags: -g -O2
    Host System Type: x86_64-redhat-linux-gnu
    Install path: /Data/apps/mmseg3

    See config.h for further configuration information.
    ------------------------------------------------------------------------

  

    Build and install mmseg:

    

    make && make install

    test -z "/Data/apps/mmseg3/etc" || /bin/mkdir -p "/Data/apps/mmseg3/etc"
    /usr/bin/install -c data/unigram.txt data/uni.lib data/mmseg.ini     '/Data/apps/mmseg3/etc'
    make[2]: Leaving directory `/Data/tgz/coreseek-3.2.14/mmseg-3.2.14'
    make[1]: Leaving directory `/Data/tgz/coreseek-3.2.14/mmseg-3.2.14'

  

  1.3 With the mmseg Chinese word-segmentation plugin installed, Coreseek 3.2.14 itself can now be installed

    

ln -s /Data/apps/mmseg3/bin/mmseg /bin/mmseg

    cd ..

    cd csft-3.2.14/

    ./configure --prefix=/Data/apps/coreseek --without-unixodbc --with-mmseg-includes=/Data/apps/mmseg3/include/mmseg/ --with-mmseg-libs=/Data/apps/mmseg3/lib/ --with-mysql

    make && make install

  

    After these steps, mmseg and Coreseek 3.2.14 are both installed. Use ls to check the installed directories and files:

    

[root@localhost csft-3.2.14]# ls /Data/apps/coreseek/

    bin etc var

  

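2. Testing Coreseek Chinese word segmentation

  1. cd /Data/tgz/coreseek-3.2.14/testpack/

  2. /Data/apps/mmseg3/bin/mmseg -d /Data/apps/mmseg3/etc var/test/test.xml

  3. /Data/apps/coreseek/bin/indexer -c etc/csft.conf --all

  If this fails with "Unigram dictionary load Error", edit /Data/tgz/coreseek-3.2.14/testpack/etc/csft.conf and change charset_dictpath = var/mmseg3/etc/ to charset_dictpath = /Data/apps/mmseg3/etc/

  4. /Data/apps/coreseek/bin/search -c etc/csft.conf 网络搜索

  This may fail with:

  index 'xml': search error: failed to open var/data/xml.sph: No such file or directory.

  The error means the xml index was never built. The most likely cause is that Coreseek was compiled without the expat development headers, which the XML data source of the test index needs. To fix it:

  yum install expat-devel*

  Then rebuild and reinstall sphinx/coreseek, rerun the indexer from step 3, and run the search again:

  /Data/apps/coreseek/bin/search -c etc/csft.conf 网络搜索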

  

Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]
  Copyright (c) 2007-2011,
  Beijing Choice Software Technologies Inc (http://www.coreseek.com)

  using config file 'etc/csft.conf'...
  index 'xml': query '网络搜索 ': returned 1 matches of 1 total in 0.010 sec

  displaying matches:
  1. document=1, weight=1, published=Thu Apr 1 15:20:07 2010, author_id=1

  words:
  1. '网络': 1 documents, 1 hits
  2. '搜索': 2 documents, 5 hits
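
The charset_dictpath change from step 3 can also be scripted instead of edited by hand. This is a sketch under assumptions: `fix_dictpath` is a hypothetical helper, and the demonstration runs on a throwaway file rather than the real testpack config. It rewrites only uncommented charset_dictpath lines and leaves commented-out alternatives alone.

```shell
#!/bin/sh
# Hypothetical helper: point charset_dictpath in a csft config at the given
# dictionary directory. Usage: fix_dictpath CONFIG_FILE DICT_DIR
fix_dictpath() {
    conf=$1
    dict_dir=$2
    tmp=$(mktemp) || return 1
    # Replace the value of uncommented charset_dictpath lines only;
    # every other line is copied through unchanged.
    sed "s|^\([[:space:]]*charset_dictpath[[:space:]]*=\).*|\1 ${dict_dir}|" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Demonstration on a temporary file that mimics the relevant config lines.
demo_conf=$(mktemp)
printf '%s\n' \
    'index xml' \
    '{' \
    '    charset_dictpath = var/mmseg3/etc/' \
    '    charset_type = zh_cn.utf-8' \
    '}' > "$demo_conf"

fix_dictpath "$demo_conf" /Data/apps/mmseg3/etc/
grep charset_dictpath "$demo_conf"
```

To apply it for real, pass the actual config path, for example fix_dictpath etc/csft.conf /Data/apps/mmseg3/etc/ from the testpack directory. The directory argument must not contain the | or & characters, since they are special in the sed expression.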

  

3. Configuring and using Coreseek

  3.1 Copy the sample config into the etc directory of the Coreseek installation

  

cp /Data/tgz/coreseek-3.2.14/testpack/etc/csft_mysql.conf /Data/apps/coreseek/etc/csft_mysql.conf

  3.2 Edit the config file

  

vim /Data/apps/coreseek/etc/csft_mysql.conf

  

source mysql
{
type = mysql
sql_host = localhost
sql_user = test
sql_pass = root
sql_db = kp_account
sql_port = 3306
sql_query_pre = SET NAMES utf8
sql_query = SELECT goods_id, goods_id as gid, goods_name, add_time FROM fc_goods
#the first column of sql_query (the document id) must be an integer
#goods_name is a string/text field and is full-text indexed
sql_attr_uint = gid #the value read from SQL must be an integer
sql_attr_timestamp = add_time #the value read from SQL must be an integer; used as a timestamp attribute
sql_query_info_pre = SET NAMES utf8 #sets the correct charset for command-line queries
sql_query_info = SELECT * FROM fc_goods WHERE goods_id=$id #for command-line queries: fetches the original row from the database
}

#index definition
index mysql
{
source = mysql #name of the source defined above
path = /Data/apps/coreseek/var/data/mysql #change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
docinfo = extern
mlock = 0
morphology = none
min_word_len = 1
html_strip = 0

#Chinese word segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
#charset_dictpath = /usr/local/mmseg3/etc/ #for BSD and Linux; must end with /
charset_dictpath = /Data/apps/mmseg3/etc/
#charset_dictpath = etc/ #for Windows; must end with /; an absolute path is best, e.g. C:/usr/local/coreseek/etc/...
charset_type = zh_cn.utf-8
}

#global indexer settings
indexer
{
mem_limit = 128M
}

#searchd service settings
searchd
{
listen = 9312
read_timeout = 5
max_children = 30
max_matches = 1000
seamless_rotate = 0
preopen_indexes = 0
unlink_old = 1
pid_file = /Data/apps/coreseek/var/log/searchd_mysql.pid #change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
log = /Data/apps/coreseek/var/log/searchd_mysql.log #change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
query_log = /Data/apps/coreseek/var/log/query_mysql.log #change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
}

  Save the file.

  3.3 Build the index

  

/Data/apps/coreseek/bin/indexer -c /Data/apps/coreseek/etc/csft_mysql.conf --all

  On success the indexer prints:

Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)

using config file '/Data/apps/coreseek/etc/csft_mysql.conf'...
indexing index 'mysql'...
collected 811 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 811 docs, 15729 bytes
total 0.072 sec, 217260 bytes/sec, 11202.12 docs/sec
total 2 reads, 0.000 sec, 14.2 kb/call avg, 0.0 msec/call avg
total 7 writes, 0.000 sec, 10.3 kb/call avg, 0.0 msec/call avg

  3.4 Start the search daemon

/Data/apps/coreseek/bin/searchd -c /Data/apps/coreseek/etc/csft_mysql.conf
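
After starting searchd it is worth confirming that the daemon stayed up before pointing PHP at it. The sketch below is one possible check, assuming the pid_file path from the csft_mysql.conf shown earlier; `pid_alive` is a hypothetical helper, not a Coreseek command. It reads the PID from the pid file and probes the process with kill -0, which sends no signal.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the PID stored in a pid file
# belongs to a process that is currently alive.
pid_alive() {
    pid_file=$1
    [ -r "$pid_file" ] || return 1          # no pid file: not running
    pid=$(head -n 1 "$pid_file" | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*) return 1 ;;            # empty or non-numeric content
    esac
    kill -0 "$pid" 2>/dev/null              # probe only, no signal is delivered
}

# pid_file value from the searchd section of csft_mysql.conf (assumed unchanged).
PID_FILE=/Data/apps/coreseek/var/log/searchd_mysql.pid

if pid_alive "$PID_FILE"; then
    echo "searchd is running"
else
    echo "searchd is not running"
fi
```

Run the check as root or as the user that started searchd; kill -0 also fails when the process belongs to another user, which this sketch would report as not running. If the daemon is down, the log file configured in the searchd section is the first place to look.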

  

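4. There are two ways to use Sphinx from PHP: install the PHP sphinx extension, or use the API client that ships with Sphinx (sphinxapi.php). This walkthrough uses the API client directly.

  4.1 Copy sphinxapi.php into the project directory (run this from a source directory that contains api/)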

  

cp api/sphinxapi.php /Data/apps/nginx/html/sphinx/

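  4.2 Call the Sphinx API from PHP to run a search

<?php
require("sphinxapi.php");
$sphinx = new SphinxClient();
// Host and port of searchd (listen = 9312 in csft_mysql.conf)
$sphinx->setServer("127.0.0.1", 9312);
// Fall back to a default keyword when none is passed in the query string
$keyword = !empty($_GET['keyword']) ? $_GET['keyword'] : '轮胎';
// 'mysql' is the index name defined in csft_mysql.conf
$res = $sphinx->query($keyword, 'mysql');
if ($res === false) {
    die("Search failed: " . $sphinx->GetLastError());
}
print_r($res);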

  Output (from a request with keyword=汽车, as the words entry at the end shows):

Array
(
    [error] =>
    [warning] =>
    [status] => 0
    [fields] => Array ( [0] => goods_name )
    [attrs] => Array ( [gid] => 1 [add_time] => 2 )
    [matches] => Array
        (
            [1] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 1 [add_time] => 1411609439 ) )
            [2] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 2 [add_time] => 1411610728 ) )
            [3] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 3 [add_time] => 1411610941 ) )
            [4] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 4 [add_time] => 1411611452 ) )
            [5] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 5 [add_time] => 1411612148 ) )
            [6] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 6 [add_time] => 1411670735 ) )
            [7] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 7 [add_time] => 1411670863 ) )
            [8] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 8 [add_time] => 1411671000 ) )
            [9] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 9 [add_time] => 1411671203 ) )
            [10] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 10 [add_time] => 1411671953 ) )
            [11] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 11 [add_time] => 1411672386 ) )
            [12] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 12 [add_time] => 1411672846 ) )
            [13] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 13 [add_time] => 1411673187 ) )
            [15] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 15 [add_time] => 1411675395 ) )
            [16] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 16 [add_time] => 1411675806 ) )
            [17] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 17 [add_time] => 1411675957 ) )
            [18] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 18 [add_time] => 1411676124 ) )
            [19] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 19 [add_time] => 1411676262 ) )
            [21] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 21 [add_time] => 1411676661 ) )
            [22] => Array ( [weight] => 1 [attrs] => Array ( [gid] => 22 [add_time] => 1411676864 ) )
        )
    [total] => 49
    [total_found] => 49
    [time] => 0.017
    [words] => Array ( [汽车] => Array ( [docs] => 49 [hits] => 57 ) )
)

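  Summary: at this point Coreseek is essentially integrated with PHP. The keys of the matches array are the document ids (goods_id), which can be used to load the full rows from MySQL. Only 20 of the 49 matches are returned because that is the API's default limit; SetLimits changes it. The next step is to integrate this into ThinkPHP.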
