• ES Series, Part 1: Installing ES 6.3.1 on CentOS 7 and Integrating the IK Analyzer


    1. Download Elasticsearch 6.3.1:

    wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.1.tar.gz

    2. Install and configure

    1. Copy

    Copy the archive to the server and extract it: tar -xvzf elasticsearch-6.3.1.tar.gz. The extracted path is /home/elasticsearch-6.3.1.

    3. Create a user

    Create the user, create the esdata directories, and hand ownership to the new user:

    [root@bogon home]# adduser esuser
    [root@bogon home]# cd /home
    [root@bogon home]# mkdir -p esdata/data
    [root@bogon home]# mkdir -p esdata/log
    [root@bogon home]# chown -R esuser elasticsearch-6.3.1 
    [root@bogon home]# chown -R esuser esdata

    4.配置es节点

    [root@bogon esdata]# cat /home/elasticsearch-6.3.1/config/elasticsearch.yml
    # ======================== Elasticsearch Configuration =========================
    #
    # NOTE: Elasticsearch comes with reasonable defaults for most settings.
    #       Before you set out to tweak and tune the configuration, make sure you
    #       understand what are you trying to accomplish and the consequences.
    #
    # The primary way of configuring a node is via this file. This template lists
    # the most important settings you may want to configure for a production cluster.
    #
    # Please consult the documentation for further information on configuration options:
    # https://www.elastic.co/guide/en/elasticsearch/reference/index.html
    #
    # ---------------------------------- Cluster -----------------------------------
    #
    # Use a descriptive name for your cluster:
    #
    cluster.name: my-application
    #
    # ------------------------------------ Node ------------------------------------
    #
    # Use a descriptive name for the node:
    #
    node.name: node-1
    #
    # Add custom attributes to the node:
    #
    node.attr.rack: r1
    #
    # ----------------------------------- Paths ------------------------------------
    #
    # Path to directory where to store the data (separate multiple locations by comma):
    #
    path.data: /home/esdata/data
    #
    # Path to log files:
    #
    path.logs: /home/esdata/log
    #
    # ----------------------------------- Memory -----------------------------------
    #
    # Lock the memory on startup:
    #
    bootstrap.memory_lock: true
    #
    # Make sure that the heap size is set to about half the memory available
    # on the system and that the owner of the process is allowed to use this
    # limit.
    #
    # Elasticsearch performs poorly when the system is swapping the memory.
    #
    # ---------------------------------- Network -----------------------------------
    #
    # Set the bind address to a specific IP (IPv4 or IPv6):
    # Bind address; 0.0.0.0 listens on all interfaces, so any IP can connect
    network.host: 0.0.0.0
    #
    # Set a custom port for HTTP:
    # Port exposed to clients
    http.port: 9200
    #
    # For more information, consult the network module documentation.
    #
    # --------------------------------- Discovery ----------------------------------
    #
    # Pass an initial list of hosts to perform discovery when new node is started:
    # The default list of hosts is ["127.0.0.1", "[::1]"]
    # IPs of the other cluster nodes; for a single-node cluster, use this machine's IP
    discovery.zen.ping.unicast.hosts: ["host1", "host2"]
    #
    # Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
    #
    #discovery.zen.minimum_master_nodes:
    #
    # For more information, consult the zen discovery module documentation.
    #
    # ---------------------------------- Gateway -----------------------------------
    #
    # Block initial recovery after a full cluster restart until N nodes are started:
    # Number of nodes in the cluster
    gateway.recover_after_nodes: 1
    #
    # For more information, consult the gateway module documentation.
    #
    # ---------------------------------- Various -----------------------------------
    #
    # Require explicit names when deleting indices:
    #
    action.destructive_requires_name: true
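
The split-brain comment in the configuration above gives the rule for discovery.zen.minimum_master_nodes: total number of master-eligible nodes / 2 + 1, with integer division. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and exists only for illustration.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum size from the config comment: (master-eligible nodes // 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# Example: a three-node cluster needs a quorum of two.
print(minimum_master_nodes(3))
```

For the single-node setup in this article the result is 1.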

    3. Configure system parameters

    [root@bogon bin]# vim /etc/security/limits.conf    (append the following at the end of the file)
    esuser hard nofile 65536
    esuser soft nofile 65536
    esuser soft memlock unlimited
    esuser hard memlock unlimited

    The settings above resolve these errors:

    max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
    memory locking requested for elasticsearch process but memory is not locked

    Temporary setting: sysctl -w vm.max_map_count=262144
    Permanent change: edit /etc/sysctl.conf, add the line vm.max_map_count=262144, and then run: sysctl -p

    The setting above resolves this error:

    max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

    [root@bogon logs]# visudo
    ........
    ## Allow root to run any commands anywhere
    root    ALL=(ALL)       ALL
    esuser  ALL=(ALL)       ALL
    ........

    Adding esuser to sudoers resolves read/write failures that occur in some cases.

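    ulimit -n shows the maximum number of open files, and ulimit -u shows the maximum number of processes.

    1. Append to the end of /etc/security/limits.d/90-nproc.conf:

    * soft nproc 204800
    * hard nproc 204800

    2. Append to the end of /etc/security/limits.d/def.conf:

    * soft nofile 204800
    * hard nofile 204800

    Settings in these two files override the earlier ones. They take effect after a reboot.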

    The settings above resolve this error: max number of threads [3895] for user [esuser] is too low, increase to at least [4096]
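
The bootstrap errors quoted in this section each name a minimum: 65536 file descriptors, 4096 threads, and 262144 for vm.max_map_count. Below is a minimal Python sketch, assuming Linux and Python 3, that compares a set of values against those minimums. The function names are hypothetical, and RLIMIT_NPROC is assumed to be the limit behind the "max number of threads" message. Run it as esuser, since these limits are per user.

```python
import resource

# Minimums taken from the bootstrap check messages quoted above.
MIN_NOFILE = 65536
MIN_NPROC = 4096
MIN_MAX_MAP_COUNT = 262144


def find_problems(nofile: int, nproc: int, max_map_count: int) -> list:
    """Return a description of every value that is below its minimum.

    A limit equal to resource.RLIM_INFINITY means "unlimited" and always passes.
    """
    def too_low(value: int, minimum: int) -> bool:
        return value != resource.RLIM_INFINITY and value < minimum

    problems = []
    if too_low(nofile, MIN_NOFILE):
        problems.append(f"max file descriptors [{nofile}] < {MIN_NOFILE}")
    if too_low(nproc, MIN_NPROC):
        problems.append(f"max number of threads [{nproc}] < {MIN_NPROC}")
    if max_map_count < MIN_MAX_MAP_COUNT:
        problems.append(f"vm.max_map_count [{max_map_count}] < {MIN_MAX_MAP_COUNT}")
    return problems


def read_current_values() -> tuple:
    """Read this process's soft limits and the kernel's vm.max_map_count (Linux only)."""
    nofile, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    nproc, _ = resource.getrlimit(resource.RLIMIT_NPROC)
    with open("/proc/sys/vm/max_map_count") as f:
        max_map_count = int(f.read().strip())
    return nofile, nproc, max_map_count


# On the ES host: print(find_problems(*read_current_values()))
```

An empty list means none of the three checks sketched here would fail; it does not cover the memory-lock check.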
    

    Problem 1: a warning

    [2016-11-06T16:27:21,712][WARN ][o.e.b.JNANatives ] unable to install syscall filter: 

    java.lang.UnsupportedOperationException: seccomp unavailable: requires kernel 3.5+ with CONFIG_SECCOMP and CONFIG_SECCOMP_FILTER compiled in
    at org.elasticsearch.bootstrap.Seccomp.linuxImpl(Seccomp.java:349) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.bootstrap.Seccomp.init(Seccomp.java:630) ~[elasticsearch-5.0.0.jar:5.0.0]

    This prints a long stack trace, but it is only a warning.

    Fix: use a newer CentOS release; CentOS 7 does not have this problem.

    Problem 2: an error

    Error:
    ERROR: bootstrap checks failed
    system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk

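    Cause:
    CentOS 6 does not support seccomp, while ES 5.2.0 sets bootstrap.system_call_filter to true by default and checks it at startup. The check fails, and ES refuses to start.

    Fix:
    Set bootstrap.system_call_filter to false in elasticsearch.yml, placing it under the Memory section: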
    bootstrap.memory_lock: false
    bootstrap.system_call_filter: false

    4. Start

    [root@bogon ~]# cd /home/elasticsearch-6.3.1/bin/
    [root@bogon bin]# su esuser
    [esuser@bogon bin]$ ./elasticsearch
    [2018-07-17T10:17:30,139][INFO ][o.e.n.Node               ] [node-1] initializing ...
    [2018-07-17T10:17:30,234][INFO ][o.e.e.NodeEnvironment    ] [node-1] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [22.1gb], net total_space [27.6gb], types [rootfs]
    [2018-07-17T10:17:30,234][INFO ][o.e.e.NodeEnvironment    ] [node-1] heap size [1007.3mb], compressed ordinary object pointers [true]
    [2018-07-17T10:17:30,236][INFO ][o.e.n.Node               ] [node-1] node name [node-1], node ID [cb69e4JjSBKeHJ9y-q-hNA]
    [2018-07-17T10:17:30,236][INFO ][o.e.n.Node               ] [node-1] version[6.3.1], pid[26327], build[default/tar/eb782d0/2018-06-29T21:59:26.107521Z], OS[Linux/3.10.0-514.6.1.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_92/25.92-b14]
    [2018-07-17T10:17:30,236][INFO ][o.e.n.Node               ] [node-1] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.F1Jh0AOB, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.path.home=/home/elasticsearch-6.3.1, -Des.path.conf=/home/elasticsearch-6.3.1/config, -Des.distribution.flavor=default, -Des.distribution.type=tar]
    [2018-07-17T10:17:33,136][INFO ][o.e.p.PluginsService     ] [node-1] loaded module [aggs-matrix-stats]
    [2018-07-17T10:17:33,136][INFO ][o.e.p.PluginsService     ] [node-1] loaded module [analysis-common]
    [2018-07-17T10:17:33,137][INFO ][o.e.p.PluginsService     ] [node-1] loaded module [ingest-common]
    ......

    5. Verify

    Open http://192.168.20.115:9200/ in a browser (192.168.20.115 is the IP of the ES server; also make sure port 9200 is reachable from outside). It returns:

    {
      "name" : "node-1",
      "cluster_name" : "my-application",
      "cluster_uuid" : "_na_",
      "version" : {
        "number" : "6.3.1",
        "build_flavor" : "default",
        "build_type" : "tar",
        "build_hash" : "eb782d0",
        "build_date" : "2018-06-29T21:59:26.107521Z",
        "build_snapshot" : false,
        "lucene_version" : "7.3.1",
        "minimum_wire_compatibility_version" : "5.6.0",
        "minimum_index_compatibility_version" : "5.0.0"
      },
      "tagline" : "You Know, for Search"
    }
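
The same check can be scripted instead of done in a browser. Below is a minimal Python sketch using only the standard library. It assumes the root endpoint returns a JSON document with a version.number field, as in the response shown above; the function names are hypothetical.

```python
import json
import urllib.request


def parse_version(body: str) -> str:
    """Extract version.number from the JSON document returned by the root endpoint."""
    return json.loads(body)["version"]["number"]


def fetch_version(base_url: str, timeout: float = 5.0) -> str:
    """GET the node's root endpoint and return the version it reports."""
    with urllib.request.urlopen(base_url, timeout=timeout) as response:
        return parse_version(response.read().decode("utf-8"))


# Against the example server used in this article (requires a reachable node):
#   print(fetch_version("http://192.168.20.115:9200/"))
```

If the call times out, the node is either not running or port 9200 is blocked between the client and the server.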

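    The most convenient way to install is still the Docker image. Official guide: https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html  Steps:

    1) Pull the image: docker pull docker.elastic.co/elasticsearch/elasticsearch:6.3.1

    2) Run the container: docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:6.3.1

    6. Install Elasticsearch Head

    The official examples send requests with curl from the console, which is not very intuitive. You can install the head plugin in Chrome and use it to send requests instead. For the head plugin, see: Installing ES head 6.3.1 on CentOS 7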

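    7. Integrate the IK Analyzer

    1. Get the ES IK Analyzer plugin

    The plugin version must match the ES version (6.3.1).

    URL: https://github.com/medcl/elasticsearch-analysis-ik/releases

    2. Install the plugin

    Extract the ik archive into the plugins/ directory of the ES installation (preferably renaming the extracted directory, so it does not clash with other plugins installed later), then restart ES.

    3. Extend the dictionary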
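
The extraction step can be sketched in Python with the standard zipfile module. This is an illustration only: the function name, the default directory name "ik", and the assumption that the release is a .zip file are not taken from the article.

```python
import zipfile
from pathlib import Path


def install_plugin_zip(zip_path: str, es_home: str, plugin_dir_name: str = "ik") -> Path:
    """Extract a plugin archive into <es_home>/plugins/<plugin_dir_name>.

    Refuses to overwrite an existing directory, so two plugins cannot collide.
    """
    target = Path(es_home) / "plugins" / plugin_dir_name
    if target.exists():
        raise FileExistsError(f"{target} already exists; pick another directory name")
    target.mkdir(parents=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


# Hypothetical call; the archive file name is an assumption:
#   install_plugin_zip("elasticsearch-analysis-ik-6.3.1.zip", "/home/elasticsearch-6.3.1")
```

ES still has to be restarted afterwards, as described above.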

    To add extension dictionaries, edit the configuration file config/IKAnalyzer.cfg.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
    <properties>
        <comment>IK Analyzer extension configuration</comment>
        <!-- Configure your own extension dictionaries here -->
        <entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic</entry>
        <!-- Configure your own extension stop-word dictionaries here -->
        <entry key="ext_stopwords">custom/ext_stopword.dic</entry>
        <!-- Configure a remote extension dictionary here; a remote dictionary can be hot-updated and maintained in one place -->
        <!-- <entry key="remote_ext_dict">words_location</entry> -->
        <!-- Configure a remote extension stop-word dictionary here -->
        <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
    </properties>

    4. Test IK

    1. Create an index

    PUT http://start.com:9200/iktest
    {
        "mappings": {
            "_doc": {
                "properties": {
                    "content": {
                        "type": "text",
                        "analyzer": "ik_max_word",
                        "search_analyzer": "ik_max_word"
                    }
                }
            }
        }
    }
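
The index-creation request can also be built programmatically. Below is a minimal Python sketch using the standard library. It builds the PUT request with the same mapping but does not send it; the function name is hypothetical. The Content-Type header is set explicitly because ES 6.x rejects request bodies without one.

```python
import json
import urllib.request


def build_create_index_request(base_url: str, index: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that creates an index with an IK-analyzed field."""
    mapping = {
        "mappings": {
            "_doc": {
                "properties": {
                    "content": {
                        "type": "text",
                        "analyzer": "ik_max_word",
                        "search_analyzer": "ik_max_word",
                    }
                }
            }
        }
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}",
        data=json.dumps(mapping).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# To send it against a running node:
#   request = build_create_index_request("http://start.com:9200", "iktest")
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```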

    2. Analysis test

    POST http://start.com:9200/_analyze
    {
      "analyzer":"ik_smart",
      "text":"天团S.H.E昨在两厅院艺文广场举办17万人露天音乐会,3人献唱多首经典好歌,让现场粉丝听得如痴如醉"
    }

    Result:

    {
        "tokens": [
            {
                "token": "天",
                "start_offset": 0,
                "end_offset": 1,
                "type": "CN_CHAR",
                "position": 0
            },
            {
                "token": "团",
                "start_offset": 1,
                "end_offset": 2,
                "type": "CN_CHAR",
                "position": 1
            },
            {
                "token": "s.h.e",
                "start_offset": 2,
                "end_offset": 7,
                "type": "LETTER",
                "position": 2
            },
            {
                "token": "昨在",
                "start_offset": 7,
                "end_offset": 9,
                "type": "CN_WORD",
                "position": 3
            },
            {
                "token": "两厅",
                "start_offset": 9,
                "end_offset": 11,
                "type": "CN_WORD",
                "position": 4
            },
            {
                "token": "院",
                "start_offset": 11,
                "end_offset": 12,
                "type": "CN_CHAR",
                "position": 5
            },
            {
                "token": "艺文",
                "start_offset": 12,
                "end_offset": 14,
                "type": "CN_WORD",
                "position": 6
            },
            {
                "token": "广场",
                "start_offset": 14,
                "end_offset": 16,
                "type": "CN_WORD",
                "position": 7
            },
            {
                "token": "举办",
                "start_offset": 16,
                "end_offset": 18,
                "type": "CN_WORD",
                "position": 8
            },
            {
                "token": "17",
                "start_offset": 18,
                "end_offset": 20,
                "type": "ARABIC",
                "position": 9
            },
            {
                "token": "万人",
                "start_offset": 20,
                "end_offset": 22,
                "type": "CN_WORD",
                "position": 10
            },
            {
                "token": "露天",
                "start_offset": 22,
                "end_offset": 24,
                "type": "CN_WORD",
                "position": 11
            },
            {
                "token": "音乐会",
                "start_offset": 24,
                "end_offset": 27,
                "type": "CN_WORD",
                "position": 12
            },
            {
                "token": "3人",
                "start_offset": 28,
                "end_offset": 30,
                "type": "TYPE_CQUAN",
                "position": 13
            },
            {
                "token": "献",
                "start_offset": 30,
                "end_offset": 31,
                "type": "CN_CHAR",
                "position": 14
            },
            {
                "token": "唱",
                "start_offset": 31,
                "end_offset": 32,
                "type": "CN_CHAR",
                "position": 15
            },
            {
                "token": "多首",
                "start_offset": 32,
                "end_offset": 34,
                "type": "CN_WORD",
                "position": 16
            },
            {
                "token": "经典",
                "start_offset": 34,
                "end_offset": 36,
                "type": "CN_WORD",
                "position": 17
            },
            {
                "token": "好歌",
                "start_offset": 36,
                "end_offset": 38,
                "type": "CN_WORD",
                "position": 18
            },
            {
                "token": "让",
                "start_offset": 39,
                "end_offset": 40,
                "type": "CN_CHAR",
                "position": 19
            },
            {
                "token": "现场",
                "start_offset": 40,
                "end_offset": 42,
                "type": "CN_WORD",
                "position": 20
            },
            {
                "token": "粉丝",
                "start_offset": 42,
                "end_offset": 44,
                "type": "CN_WORD",
                "position": 21
            },
            {
                "token": "听得",
                "start_offset": 44,
                "end_offset": 46,
                "type": "CN_WORD",
                "position": 22
            },
            {
                "token": "如痴如醉",
                "start_offset": 46,
                "end_offset": 50,
                "type": "CN_WORD",
                "position": 23
            }
        ]
    }

    Compare with the standard analyzer:

    POST http://start.com:9200/_analyze
    {
      "analyzer":"standard",
      "text":"天团S.H.E昨在两厅院艺文广场 举办17万人露 天音乐会,3人献唱多首 经典好歌,让现场 粉丝听得如痴如醉"
    }

    Result:

    {
        "tokens": [
            {
                "token": "天",
                "start_offset": 0,
                "end_offset": 1,
                "type": "<IDEOGRAPHIC>",
                "position": 0
            },
            {
                "token": "团",
                "start_offset": 1,
                "end_offset": 2,
                "type": "<IDEOGRAPHIC>",
                "position": 1
            },
            {
                "token": "s.h.e",
                "start_offset": 2,
                "end_offset": 7,
                "type": "<ALPHANUM>",
                "position": 2
            },
            {
                "token": "昨",
                "start_offset": 7,
                "end_offset": 8,
                "type": "<IDEOGRAPHIC>",
                "position": 3
            },
            {
                "token": "在",
                "start_offset": 8,
                "end_offset": 9,
                "type": "<IDEOGRAPHIC>",
                "position": 4
            },
            {
                "token": "两",
                "start_offset": 9,
                "end_offset": 10,
                "type": "<IDEOGRAPHIC>",
                "position": 5
            },
            {
                "token": "厅",
                "start_offset": 10,
                "end_offset": 11,
                "type": "<IDEOGRAPHIC>",
                "position": 6
            },
            {
                "token": "院",
                "start_offset": 11,
                "end_offset": 12,
                "type": "<IDEOGRAPHIC>",
                "position": 7
            },
            {
                "token": "艺",
                "start_offset": 12,
                "end_offset": 13,
                "type": "<IDEOGRAPHIC>",
                "position": 8
            },
            {
                "token": "文",
                "start_offset": 13,
                "end_offset": 14,
                "type": "<IDEOGRAPHIC>",
                "position": 9
            }
          ...
        ]
    }

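    The standard analyzer splits Chinese text into individual characters, whereas the IK analyzer splits it into a mix of single characters and words.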
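
One way to put a number on that difference is the share of single-character tokens in each _analyze response. Below is a minimal Python sketch that works on already-parsed responses of the shape shown above; the function names are hypothetical.

```python
def token_texts(analyze_response: dict) -> list:
    """Return just the token strings from an _analyze response, in order."""
    return [t["token"] for t in analyze_response.get("tokens", [])]


def single_char_ratio(analyze_response: dict) -> float:
    """Fraction of tokens that are exactly one character long (0.0 if there are no tokens)."""
    tokens = token_texts(analyze_response)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if len(t) == 1) / len(tokens)
```

Fed the two results above, the IK response would show a low ratio. The standard response would sit close to 1.0 for this text, with only multi-character tokens such as "s.h.e" lowering it.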

  • Original article: https://www.cnblogs.com/wangzhuxing/p/9351245.html
Copyright © 2020-2023  润新知