Installing the Elasticsearch Chinese Analyzer (elasticsearch-analysis-ik)

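Because elasticsearch is built on lucene, it inherits support for lucene's many Chinese analyzers: IK, Paoding, MMSEG4J and other lucene Chinese analyzers can in principle all be used in elasticsearch, provided an elasticsearch plugin exists for them. For how to develop such a plugin, see this article:
http://log.medcl.net/item/2011/07/diving-into-elasticsearch-3-custom-analysis-plugin/
I have not had time to read it yet and will study it carefully later. This post only records the steps I followed to integrate the IK analysis plugin provided by medcl.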

Installation steps:

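1. Download the source code from GitHub: https://github.com/medcl/elasticsearch-analysis-ik

Near the bottom of the right-hand side there is a "Download ZIP" button. Click it to download the source archive elasticsearch-analysis-ik-master.zip.

2. Go to the download directory and unpack elasticsearch-analysis-ik-master.zip with the command:
unzip elasticsearch-analysis-ik-master.zip
3. Since this is source code, it has to be packaged with maven. Enter the unpacked folder and run:
mvn clean package
4. The build produces elasticsearch-analysis-ik-1.9.4.zip under target/releases. Copy it into the plugins/analysis-ik directory under the ES installation directory.

5. Unzip elasticsearch-analysis-ik-1.9.4.zip inside the plugins/analysis-ik directory.

6. Add the ik configuration to the ES configuration file elasticsearch.yml by appending this line at the end:
index.analysis.analyzer.ik.type: "ik"
7. Restart the elasticsearch service. That completes the configuration. To test it, enter the command:
curl -XPOST "http://localhost:9200/_analyze?analyzer=ik&pretty=true&text=helloworld,中华人民共和国"

The test result is as follows: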
```
{
  "tokens" : [ {
    "token" : "helloworld",
    "start_offset" : 0,
    "end_offset" : 10,
    "type" : "ENGLISH",
    "position" : 0
  }, {
    "token" : "中华人民共和国",
    "start_offset" : 11,
    "end_offset" : 18,
    "type" : "CN_WORD",
    "position" : 1
  }, {
    "token" : "中华人民",
    "start_offset" : 11,
    "end_offset" : 15,
    "type" : "CN_WORD",
    "position" : 2
  }, {
    "token" : "中华",
    "start_offset" : 11,
    "end_offset" : 13,
    "type" : "CN_WORD",
    "position" : 3
  }, {
    "token" : "华人",
    "start_offset" : 12,
    "end_offset" : 14,
    "type" : "CN_WORD",
    "position" : 4
  }, {
    "token" : "人民共和国",
    "start_offset" : 13,
    "end_offset" : 18,
    "type" : "CN_WORD",
    "position" : 5
  }, {
    "token" : "人民",
    "start_offset" : 13,
    "end_offset" : 15,
    "type" : "CN_WORD",
    "position" : 6
  }, {
    "token" : "共和国",
    "start_offset" : 15,
    "end_offset" : 18,
    "type" : "CN_WORD",
    "position" : 7
  }, {
    "token" : "共和",
    "start_offset" : 15,
    "end_offset" : 17,
    "type" : "CN_WORD",
    "position" : 8
  }, {
    "token" : "国",
    "start_offset" : 17,
    "end_offset" : 18,
    "type" : "CN_CHAR",
    "position" : 9
  } ]
}
```
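Steps 2 through 5 and the test request from step 7 can be collected into two small shell functions. This is only a sketch under assumptions: `ES_HOME`, `ES_URL`, `IK_VERSION`, `install_ik` and `analyze_ik` are names made up for this example, the default `ES_HOME` is a guess at the install location, and the source archive is assumed to unpack into a folder named elasticsearch-analysis-ik-master. The test request lets curl URL-encode the Chinese text and sends it as GET rather than POST, which assumes the node accepts GET on `_analyze`.

```shell
#!/bin/sh
# Sketch of steps 2-5 and the step 7 check.
# All variable and function names here are illustrative, not part of the plugin.
ES_HOME="${ES_HOME:-/usr/share/elasticsearch}"   # assumed install location
ES_URL="${ES_URL:-http://localhost:9200}"
IK_VERSION="${IK_VERSION:-1.9.4}"

# install_ik <path to elasticsearch-analysis-ik-master.zip>
install_ik() {
  src_zip="$1"
  work_dir="$(dirname "$src_zip")"
  src_dir="$work_dir/elasticsearch-analysis-ik-master"   # assumed folder name
  plugin_dir="$ES_HOME/plugins/analysis-ik"
  release_zip="elasticsearch-analysis-ik-$IK_VERSION.zip"

  # Step 2: unpack the source archive next to the download.
  unzip -q -o "$src_zip" -d "$work_dir" || return 1
  # Step 3: build with maven (the subshell keeps the caller's working directory).
  (cd "$src_dir" && mvn clean package) || return 1
  # Steps 4 and 5: copy the release zip into the plugin directory, unpack it there.
  mkdir -p "$plugin_dir" || return 1
  cp "$src_dir/target/releases/$release_zip" "$plugin_dir/" || return 1
  unzip -q -o "$plugin_dir/$release_zip" -d "$plugin_dir"
}

# analyze_ik <text>: run only after step 6 and the restart in step 7.
analyze_ik() {
  # -G moves the --data-urlencode pairs into the query string, so curl
  # percent-encodes the text; the request is sent as GET.
  curl -s -G "$ES_URL/_analyze" \
    --data-urlencode "analyzer=ik" \
    --data-urlencode "pretty=true" \
    --data-urlencode "text=$1"
}
```

A typical run would be `install_ik ./elasticsearch-analysis-ik-master.zip`, then the elasticsearch.yml change and restart, then `analyze_ik "helloworld,中华人民共和国"`.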
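Notes:

I took many detours, and much of what is posted online did not work for me. In summary:

1. Be sure to build with maven yourself. The elasticsearch and ik versions differ from release to release, and so do the files each build produces, so anyone who wants to use the elasticsearch-rtm package instead must pay close attention to the version differences.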

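2. I installed elasticsearch from the rpm package. In practice, the dictionary config directory can live under the plugins directory, alongside the unzipped plugin files.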
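The version matching from the first note can be checked before restarting. The sketch below assumes the unpacked plugin contains a plugin-descriptor.properties file with an `elasticsearch.version=` line, and that the node's root endpoint returns JSON with a `"number"` field; the function names and the path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: compare the ES version the plugin was built for with the running node.
# Function names are illustrative; the descriptor file layout is an assumption.

# plugin_es_version <descriptor file>
# Prints the value of the elasticsearch.version line in the given file.
plugin_es_version() {
  sed -n 's/^elasticsearch\.version=//p' "$1"
}

# node_es_version: reads the root endpoint's JSON on stdin and prints the
# value of its "number" field.
node_es_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (the plugin path is an example, adjust to your install):
#   want="$(plugin_es_version /path/to/plugins/analysis-ik/plugin-descriptor.properties)"
#   have="$(curl -s http://localhost:9200 | node_es_version)"
#   [ "$want" = "$have" ] || echo "plugin built for $want, node runs $have"
```

If the two values differ, rebuilding the plugin from a matching source version is the safer route.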

Original post: https://www.cnblogs.com/Hai--D/p/5751403.html
