ElasticSearch Tokenizer: analysis-ik

Time: 2021-04-19 00:59:13

Download and Install

Extract the ik plugin into the mounted /mydata/elasticsearch/plugins directory, then restart ES.
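The download/extract step can be sketched as below. The plugin version (7.4.2 here) and the ES container name are assumptions; the plugin version must exactly match your ES version, and the archive comes from the medcl/elasticsearch-analysis-ik GitHub releases page.

```shell
# Sketch only: 7.4.2 and the container name "elasticsearch" are assumptions.
cd /mydata/elasticsearch/plugins
mkdir ik && cd ik
# Fetch the release archive matching your ES version.
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.4.2/elasticsearch-analysis-ik-7.4.2.zip
unzip elasticsearch-analysis-ik-7.4.2.zip && rm elasticsearch-analysis-ik-7.4.2.zip
# Restart the ES container so it loads the plugin.
docker restart elasticsearch
```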

Test

POST _analyze
{
  "text":"*",
  "analyzer":"ik_smart"
}

{
  "tokens": [
    {
      "token": "中华",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "人民",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "*",
      "start_offset": 4,
      "end_offset": 7,
      "type": "CN_WORD",
      "position": 2
    }
  ]
}
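The request above can also be issued with curl. A minimal sketch, assuming ES listens on 192.168.56.10:9200 and using the two words from the response above as sample text:

```shell
# POST an _analyze request with the ik_smart analyzer.
# Host, port, and sample text are assumptions -- adjust to your setup.
curl -s -H 'Content-Type: application/json' \
  -X POST 'http://192.168.56.10:9200/_analyze' \
  -d '{"analyzer":"ik_smart","text":"中华人民"}'
```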

ElasticSearch Tokenization: Custom Extension Dictionary

Set up nginx first

cd /mydata
mkdir nginx
Start nginx (Docker pulls the image if it is not present):
docker run -p 80:80 --name nginx -d nginx:1.10
Copy the nginx configuration out of the container:
docker container cp nginx:/etc/nginx .
Rename the copied directory to conf:
mv nginx conf


Stop the container: docker stop nginx
Remove it: docker rm nginx
Recreate it with the host directories mounted:
docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf:/etc/nginx \
-d nginx:1.10

If the test page (here showing "wqg Mall") displays, nginx is working.
Create es/fenci.txt under /mydata/nginx/html.
Visit http://192.168.56.10/es/fenci.txt to confirm it is served.
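The dictionary-file step above can be sketched as follows. fenci.txt holds one custom word per line; the two entries here are placeholders, not words from the original post.

```shell
# /mydata/nginx/html is the html directory mounted into nginx above.
mkdir -p /mydata/nginx/html/es
# One custom word per line; these two entries are placeholders.
printf '%s\n' 词条一 词条二 > /mydata/nginx/html/es/fenci.txt
```

Afterwards, fetching http://192.168.56.10/es/fenci.txt should return the word list.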

Configure the custom dictionary

Modify IKAnalyzer.cfg.xml under /mydata/elasticsearch/plugins/analysis-ik/config:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
	<comment>IK Analyzer extension configuration</comment>
	<!-- Users can configure their own local extension dictionary here -->
	<entry key="ext_dict"></entry>
	<!-- Users can configure their own local extension stop-word dictionary here -->
	<entry key="ext_stopwords"></entry>
	<!-- Users can configure a remote extension dictionary here -->
	<entry key="remote_ext_dict">http://192.168.56.10/es/fenci.txt</entry>
	<!-- Users can configure a remote extension stop-word dictionary here -->
	<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>

Restart ES so the config change takes effect.
To add custom words later, append them to fenci.txt (one per line); IK periodically re-fetches the remote dictionary, so after the one-time restart new words are picked up automatically.
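To check that the remote dictionary is in effect, a word from fenci.txt should now come back as a single token. A sketch, where the host, port, and the word "词条一" are assumptions (use a word you actually put in fenci.txt):

```shell
# Tokenize a custom word; if the remote dictionary loaded, the
# response contains it as one token instead of per-character splits.
curl -s -H 'Content-Type: application/json' \
  -X POST 'http://192.168.56.10:9200/_analyze' \
  -d '{"analyzer":"ik_smart","text":"词条一"}'
```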