• Elasticsearch 5.6.1 cluster installation


    Download ES 5.6.1:
        wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.1.tar.gz
    Extract it into the current directory:
        tar -xzvf elasticsearch-5.6.1.tar.gz

    Edit the sysctl configuration on every machine (sudo vim /etc/sysctl.conf) and add this line:
        vm.max_map_count=655360
    Save and exit, then apply the change:
    sudo sysctl -p
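
    To confirm the kernel setting took effect on a node, a minimal sketch along these lines may help. It assumes a Linux host that exposes the value under /proc/sys/vm/max_map_count; the function names are illustrative and not part of Elasticsearch. The threshold 262144 is the minimum that the Elasticsearch bootstrap check asks for, so the 655360 used above is comfortably higher.

```python
from pathlib import Path

# Minimum that the Elasticsearch bootstrap check requires for vm.max_map_count.
REQUIRED_MAX_MAP_COUNT = 262144


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Return the current vm.max_map_count as an int (Linux only)."""
    return int(Path(path).read_text().strip())


def max_map_count_ok(value, required=REQUIRED_MAX_MAP_COUNT):
    """Return True when the kernel setting is high enough for Elasticsearch."""
    return value >= required
```

    On each node, max_map_count_ok(read_max_map_count()) should return True after running sudo sysctl -p.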

    cd to /home/hadoop/elasticsearch-5.6.1/config and open elasticsearch.yml:
        vim elasticsearch.yml



    # ---------------------------------- Cluster -----------------------------------
    #
    # Use a descriptive name for your cluster:
    # Cluster name
    cluster.name: es-app
    #
    # ------------------------------------ Node ------------------------------------
    #
    # Use a descriptive name for the node:
    # Node name
    node.name: master

    # ----------------------------------- Memory -----------------------------------
    #
    # Lock the memory on startup:
    # Memory: these two settings depend on the OS; on an old system, startup fails with errors saying the system version is too low
    bootstrap.memory_lock: false
    bootstrap.system_call_filter: false
    # ---------------------------------- Network -----------------------------------
    #
    # Set the bind address to a specific IP (IPv4 or IPv6):
    # Bind address
    network.host: 192.168.93.140
    #
    # Set a custom port for HTTP:
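    # http.port: the HTTP port that external clients use to request data; transport.tcp.port: the TCP port. Both must be set explicitly when several nodes run on one host.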
    http.port: 9200
    transport.tcp.port: 9300
    # --------------------------------- Discovery ----------------------------------
    #
    # Pass an initial list of hosts to perform discovery when new node is started:
    # The default list of hosts is ["127.0.0.1", "[::1]"]
    # IP:port of each node; 9300 is the default port. The port must be included when several nodes run on one machine.
    discovery.zen.ping.unicast.hosts: ["192.168.93.140:9300", "192.168.93.141:9300","192.168.93.142:9300"]
    #
    # Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
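    # Prevent split brain. With the three master-eligible nodes used here, the formula above gives 3 / 2 + 1 = 2;
    # the value 1 below is what triggers the minimum_master_nodes WARN in the startup log.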
    discovery.zen.minimum_master_nodes: 1
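
    The quorum formula quoted in the config comment (total number of master-eligible nodes / 2 + 1, using integer division) can be sketched as follows. The function name is hypothetical and only illustrates the arithmetic.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum of master-eligible nodes: n / 2 + 1 with integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1
```

    For the three nodes in this walkthrough (master, slaver1, slaver2), the function returns 2.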

    Once the configuration is done, copy the directory to the other nodes, then change node.name and the IP (network.host) on each of them.
    scp -r  /home/hadoop/elasticsearch-5.6.1 hadoop@slaver1:~

    scp -r  /home/hadoop/elasticsearch-5.6.1 hadoop@slaver2:~


    To start, cd to /home/hadoop/elasticsearch-5.6.1/bin and run:
    ./elasticsearch
    Note: every node must be started.
    hadoop@master:~/elasticsearch-5.6.1/bin$ ./elasticsearch
    [2017-09-24T19:02:08,979][INFO ][o.e.n.Node               ] [master] initializing ...
    [2017-09-24T19:02:09,245][INFO ][o.e.e.NodeEnvironment    ] [master] using [1] data paths, mounts [[/ (/dev/sda1)]], net usable_space [21.5gb], net total_space [41.2gb], spins? [possibly], types [ext4]
    [2017-09-24T19:02:09,246][INFO ][o.e.e.NodeEnvironment    ] [master] heap size [1.9gb], compressed ordinary object pointers [true]
    [2017-09-24T19:02:09,248][INFO ][o.e.n.Node               ] [master] node name [master], node ID [h1_nDt8nSiCPysC_YvCiCQ]
    [2017-09-24T19:02:09,249][INFO ][o.e.n.Node               ] [master] version[5.6.1], pid[81870], build[667b497/2017-09-14T19:22:05.189Z], OS[Linux/4.10.0-35-generic/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_131/25.131-b11]
    [2017-09-24T19:02:09,249][INFO ][o.e.n.Node               ] [master] JVM arguments [-Xms2g, -Xmx2g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/home/hadoop/elasticsearch-5.6.1]
    [2017-09-24T19:02:11,242][INFO ][o.e.p.PluginsService     ] [master] loaded module [aggs-matrix-stats]
    [2017-09-24T19:02:11,242][INFO ][o.e.p.PluginsService     ] [master] loaded module [ingest-common]
    [2017-09-24T19:02:11,243][INFO ][o.e.p.PluginsService     ] [master] loaded module [lang-expression]
    [2017-09-24T19:02:11,243][INFO ][o.e.p.PluginsService     ] [master] loaded module [lang-groovy]
    [2017-09-24T19:02:11,243][INFO ][o.e.p.PluginsService     ] [master] loaded module [lang-mustache]
    [2017-09-24T19:02:11,244][INFO ][o.e.p.PluginsService     ] [master] loaded module [lang-painless]
    [2017-09-24T19:02:11,244][INFO ][o.e.p.PluginsService     ] [master] loaded module [parent-join]
    [2017-09-24T19:02:11,244][INFO ][o.e.p.PluginsService     ] [master] loaded module [percolator]
    [2017-09-24T19:02:11,245][INFO ][o.e.p.PluginsService     ] [master] loaded module [reindex]
    [2017-09-24T19:02:11,245][INFO ][o.e.p.PluginsService     ] [master] loaded module [transport-netty3]
    [2017-09-24T19:02:11,246][INFO ][o.e.p.PluginsService     ] [master] loaded module [transport-netty4]
    [2017-09-24T19:02:11,247][INFO ][o.e.p.PluginsService     ] [master] no plugins loaded
    [2017-09-24T19:02:14,304][INFO ][o.e.d.DiscoveryModule    ] [master] using discovery type [zen]
    [2017-09-24T19:02:15,462][INFO ][o.e.n.Node               ] [master] initialized
    [2017-09-24T19:02:15,463][INFO ][o.e.n.Node               ] [master] starting ...
    [2017-09-24T19:02:15,793][INFO ][o.e.t.TransportService   ] [master] publish_address {192.168.93.140:9300}, bound_addresses {192.168.93.140:9300}
    [2017-09-24T19:02:15,815][INFO ][o.e.b.BootstrapChecks    ] [master] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
    [2017-09-24T19:02:18,924][INFO ][o.e.c.s.ClusterService   ] [master] new_master {master}{h1_nDt8nSiCPysC_YvCiCQ}{efcDdMrKSmSObgecX79mEw}{192.168.93.140}{192.168.93.140:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
    [2017-09-24T19:02:18,971][INFO ][o.e.h.n.Netty4HttpServerTransport] [master] publish_address {192.168.93.140:9200}, bound_addresses {192.168.93.140:9200}
    [2017-09-24T19:02:18,972][INFO ][o.e.n.Node               ] [master] started
    [2017-09-24T19:02:18,988][INFO ][o.e.g.GatewayService     ] [master] recovered [0] indices into cluster_state
    [2017-09-24T19:03:19,541][INFO ][o.e.c.s.ClusterService   ] [master] added {{slaver1}{xyb705aPSPq9iH-z8WHBpg}{C6yz5DQtSje3hHqmEwiPSw}{192.168.93.141}{192.168.93.141:9300},}, reason: zen-disco-node-join[{slaver1}{xyb705aPSPq9iH-z8WHBpg}{C6yz5DQtSje3hHqmEwiPSw}{192.168.93.141}{192.168.93.141:9300}]
    [2017-09-24T19:03:20,157][WARN ][o.e.d.z.ElectMasterService] [master] value for setting "discovery.zen.minimum_master_nodes" is too low. This can result in data loss! Please set it to at least a quorum of master-eligible nodes (current value: [1], total number of master-eligible nodes used for publishing in this round: [2])
    [2017-09-24T19:05:25,236][INFO ][o.e.c.s.ClusterService   ] [master] added {{slaver2}{_klsi3jPQP2hiPULnjsvyA}{Mu7pypT3R8CtuPh4ar9mLw}{192.168.93.142}{192.168.93.142:9300},}, reason: zen-disco-node-join[{slaver2}{_klsi3jPQP2hiPULnjsvyA}{Mu7pypT3R8CtuPh4ar9mLw}{192.168.93.142}{192.168.93.142:9300}]

    Now check the logs on slaver1 and slaver2:
    slaver1:
    [2017-09-24T19:03:20,144][INFO ][o.e.c.s.ClusterService   ] [slaver1] detected_master {master}{h1_nDt8nSiCPysC_YvCiCQ}{efcDdMrKSmSObgecX79mEw}{192.168.93.140}{192.168.93.140:9300}, added {{master}{h1_nDt8nSiCPysC_YvCiCQ}{efcDdMrKSmSObgecX79mEw}{192.168.93.140}{192.168.93.140:9300},}, reason: zen-disco-receive(from master [master {master}{h1_nDt8nSiCPysC_YvCiCQ}{efcDdMrKSmSObgecX79mEw}{192.168.93.140}{192.168.93.140:9300} committed version [3]])
    [2017-09-24T19:03:20,187][INFO ][o.e.h.n.Netty4HttpServerTransport] [slaver1] publish_address {192.168.93.141:9200}, bound_addresses {192.168.93.141:9200}
    [2017-09-24T19:03:20,188][INFO ][o.e.n.Node               ] [slaver1] started
    [2017-09-24T19:05:25,309][INFO ][o.e.c.s.ClusterService   ] [slaver1] added {{slaver2}{_klsi3jPQP2hiPULnjsvyA}{Mu7pypT3R8CtuPh4ar9mLw}{192.168.93.142}{192.168.93.142:9300},}, reason: zen-disco-receive(from master [master {master}{h1_nDt8nSiCPysC_YvCiCQ}{efcDdMrKSmSObgecX79mEw}{192.168.93.140}{192.168.93.140:9300} committed version [4]])

    slaver2:

    [2017-09-24T19:05:25,706][INFO ][o.e.c.s.ClusterService   ] [slaver2] detected_master {master}{h1_nDt8nSiCPysC_YvCiCQ}{efcDdMrKSmSObgecX79mEw}{192.168.93.140}{192.168.93.140:9300}, added {{slaver1}{xyb705aPSPq9iH-z8WHBpg}{C6yz5DQtSje3hHqmEwiPSw}{192.168.93.141}{192.168.93.141:9300},{master}{h1_nDt8nSiCPysC_YvCiCQ}{efcDdMrKSmSObgecX79mEw}{192.168.93.140}{192.168.93.140:9300},}, reason: zen-disco-receive(from master [master {master}{h1_nDt8nSiCPysC_YvCiCQ}{efcDdMrKSmSObgecX79mEw}{192.168.93.140}{192.168.93.140:9300} committed version [4]])
    [2017-09-24T19:05:28,167][INFO ][o.e.h.n.Netty4HttpServerTransport] [slaver2] publish_address {192.168.93.142:9200}, bound_addresses {192.168.93.142:9200}
    [2017-09-24T19:05:28,168][INFO ][o.e.n.Node               ] [slaver2] started

    The logs above show that the nodes have discovered each other.

    Cluster health:
    hadoop@master:/opt/Hadoop/zookeeper-3.4.10/bin$ curl 'http://192.168.93.140:9200/_cluster/health?pretty=true'
    (or open http://192.168.93.140:9200/_cluster/health?pretty=true in a browser)
    {
      "cluster_name" : "es-app",
      "status" : "green",
      "timed_out" : false,
      "number_of_nodes" : 3,
      "number_of_data_nodes" : 3,
      "active_primary_shards" : 0,
      "active_shards" : 0,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 0,
      "delayed_unassigned_shards" : 0,
      "number_of_pending_tasks" : 0,
      "number_of_in_flight_fetch" : 0,
      "task_max_waiting_in_queue_millis" : 0,
      "active_shards_percent_as_number" : 100.0
    }
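
    The same health check can be scripted, for example to wait until all three nodes have joined. This is a rough sketch using only the Python standard library; the function names, retry count, and delay are assumptions for illustration.

```python
import json
import time
from urllib.request import urlopen


def fetch_health(base_url):
    """GET /_cluster/health and return the parsed JSON document."""
    with urlopen(base_url + "/_cluster/health") as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_ready(health, expected_nodes):
    """True when the cluster is green and has the expected number of nodes."""
    return (
        health.get("status") == "green"
        and health.get("number_of_nodes") == expected_nodes
        and not health.get("timed_out", False)
    )


def wait_until_ready(base_url, expected_nodes, attempts=30, delay=2.0):
    """Poll the health endpoint until the cluster is ready or attempts run out."""
    for _ in range(attempts):
        try:
            if is_ready(fetch_health(base_url), expected_nodes):
                return True
        except OSError:
            pass  # node not reachable yet; retry after the delay
        time.sleep(delay)
    return False
```

    With the addresses used in this article, the call would be wait_until_ready("http://192.168.93.140:9200", 3).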

    Cluster state:

    hadoop@master:/opt/Hadoop/zookeeper-3.4.10/bin$ curl http://192.168.93.140:9200/_cluster/state
    (or open http://192.168.93.140:9200/_cluster/state in a browser)

    {"cluster_name":"es-app","version":4,"state_uuid":"nNfQrzNOSs6yrU7VNHNlwg","master_node":"h1_nDt8nSiCPysC_YvCiCQ","blocks":{},"nodes":{"_klsi3jPQP2hiPULnjsvyA":{"name":"slaver2","ephemeral_id":"Mu7pypT3R8CtuPh4ar9mLw","transport_address":"192.168.93.142:9300","attributes":{}},"xyb705aPSPq9iH-z8WHBpg":{"name":"slaver1","ephemeral_id":"C6yz5DQtSje3hHqmEwiPSw","transport_address":"192.168.93.141:9300","attributes":{}},"h1_nDt8nSiCPysC_YvCiCQ":{"name":"master","ephemeral_id":"efcDdMrKSmSObgecX79mEw","transport_address":"192.168.93.140:9300","attributes":{}}},"metadata":{"cluster_uuid":"o1DzbTt6RY-bgh4ilZ47Yw","templates":{},"indices":{},"index-graveyard":{"tombstones":[]}},"routing_table":{"indices":{}},"routing_nodes":{"unassigned":[],"nodes":{"_klsi3jPQP2hiPULnjsvyA":[],"xyb705aPSPq9iH-z8WHBpg":[],"h1_nDt8nSiCPysC_YvCiCQ":[]}}}

    Cluster stats:

    hadoop@master:/opt/Hadoop/zookeeper-3.4.10/bin$ curl http://192.168.93.140:9200/_cluster/stats
    {"_nodes":{"total":3,"successful":3,"failed":0},"cluster_name":"es-app","timestamp":1506308656817,"status":"green","indices":{"count":0,"shards":{},"docs":{"count":0,"deleted":0},"store":{"size_in_bytes":0,"throttle_time_in_millis":0},"fielddata":{"memory_size_in_bytes":0,"evictions":0},"query_cache":{"memory_size_in_bytes":0,"total_count":0,"hit_count":0,"miss_count":0,"cache_size":0,"cache_count":0,"evictions":0},"completion":{"size_in_bytes":0},"segments":{"count":0,"memory_in_bytes":0,"terms_memory_in_bytes":0,"stored_fields_memory_in_bytes":0,"term_vectors_memory_in_bytes":0,"norms_memory_in_bytes":0,"points_memory_in_bytes":0,"doc_values_memory_in_bytes":0,"index_writer_memory_in_bytes":0,"version_map_memory_in_bytes":0,"fixed_bit_set_memory_in_bytes":0,"max_unsafe_auto_id_timestamp":-9223372036854775808,"file_sizes":{}}},"nodes":{"count":{"total":3,"data":3,"coordinating_only":0,"master":3,"ingest":3},"versions":["5.6.1"],"os":{"available_processors":48,"allocated_processors":48,"names":[{"name":"Linux","count":3}],"mem":{"total_in_bytes":25049698304,"free_in_bytes":1368895488,"used_in_bytes":23680802816,"free_percent":5,"used_percent":95}},"process":{"cpu":{"percent":0},"open_file_descriptors":{"min":446,"max":447,"avg":446}},"jvm":{"max_uptime_in_millis":3758298,"versions":[{"version":"1.8.0_131","vm_name":"OpenJDK 64-Bit Server VM","vm_version":"25.131-b11","vm_vendor":"Oracle Corporation","count":3}],"mem":{"heap_used_in_bytes":1444532368,"heap_max_in_bytes":6227755008},"threads":229},"fs":{"total_in_bytes":132766040064,"free_in_bytes":77017890816,"available_in_bytes":70202990592,"spins":"true"},"plugins":[],"network_types":{"transport_types":{"netty4":3},"http_types":{"netty4":3}}}}

    Test with Python:
        sudo pip install elasticsearch

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from elasticsearch import Elasticsearch
    from datetime import datetime
    # Create the connection
    es = Elasticsearch(hosts='192.168.93.140')
    # Index test documents with ids 1..99999, one request per document
    for i in range(1,100000):
        es.index(index='els_student', doc_type='test-type', id=i, body={"name": "student" + str(i), "age": (i % 100), "timestamp": datetime.now()})
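
    Sending one request per document is slow for this many documents. A possible alternative, sketched here under the assumption that a 5.x elasticsearch-py client is installed, is the client's helpers.bulk function. The names generate_actions and bulk_index are illustrative and do not appear in the original script.

```python
from datetime import datetime


def generate_actions(count, index="els_student", doc_type="test-type"):
    """Yield one bulk action per student, with ids 1..count."""
    for i in range(1, count + 1):
        yield {
            "_index": index,
            "_type": doc_type,
            "_id": i,
            "_source": {
                "name": "student" + str(i),
                "age": i % 100,
                "timestamp": datetime.now(),
            },
        }


def bulk_index(host, count):
    """Index `count` test documents through the bulk API."""
    # Imported here so generate_actions works without the client installed.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch(hosts=[host])
    # helpers.bulk returns a tuple: (number of successes, list of errors).
    return helpers.bulk(es, generate_actions(count))
```

    Calling bulk_index("192.168.93.140", 99999) would index the same ids as the loop above, in batched requests.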


    curl -XPOST '192.168.93.140:9200/els_student/_search?pretty' -d '
    {
      "query": { "match_all": {} }
     
    }'

    curl -XPOST '192.168.93.140:9200/els_student/_search?pretty' -d '
    {
      "query": { "match": { "name": "student41" } }
     
    }'


    curl -XPUT http://192.168.93.140:9200/index


    curl -XPOST http://192.168.93.140:9200/index/fulltext/_mapping -d'
    {
            "properties": {
                "content": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_max_word"
                }
            }
        
    }'

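    Install the IK analyzer plugin (use the release that matches your ES version; see https://github.com/medcl/elasticsearch-analysis-ik). Plugins have to be installed on every node, and the ik_max_word mapping above is only accepted once the plugin is installed.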

    hadoop@master:~/elasticsearch-5.6.1$ ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.6.1/elasticsearch-analysis-ik-5.6.1.zip
    -> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.6.1/elasticsearch-analysis-ik-5.6.1.zip
    [=================================================] 100%
    -> Installed analysis-ik
    hadoop@master:~/elasticsearch-5.6.1$ cd bin/

    Restart ES after the installation.

    curl -XPOST http://192.168.93.140:9200/index/fulltext/1 -d'
    {"content":"美国留给伊拉克的是个烂摊子吗"}
    '

    curl -XPOST http://192.168.93.140:9200/index/fulltext/2 -d'
    {"content":"公安部:各地校车将享最高路权"}
    '

    curl -XPOST http://192.168.93.140:9200/index/fulltext/3 -d'
    {"content":"中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"}
    '

    curl -XPOST http://192.168.93.140:9200/index/fulltext/4 -d'
    {"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}
    '

    curl -XPOST http://192.168.93.140:9200/index/fulltext/_search  -d'
    {
        "query" : { "match" : { "content" : "中国" }},
        "highlight" : {
            "pre_tags" : ["<tag1>", "<tag2>"],
            "post_tags" : ["</tag1>", "</tag2>"],
            "fields" : {
                "content" : {}
            }
        }
    }
    '


    Result:

    {
        "took": 169,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 2,
            "max_score": 0.6099695,
            "hits": [
                {
                    "_index": "index",
                    "_type": "fulltext",
                    "_id": "4",
                    "_score": 0.6099695,
                    "_source": {
                        "content": "中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"
                    },
                    "highlight": {
                        "content": [
                            "<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"
                        ]
                    }
                },
                {
                    "_index": "index",
                    "_type": "fulltext",
                    "_id": "3",
                    "_score": 0.27179778,
                    "_source": {
                        "content": "中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"
                    },
                    "highlight": {
                        "content": [
                            "中韩渔警冲突调查:韩警平均每天扣1艘<tag1>中国</tag1>渔船"
                        ]
                    }
                }
            ]
        }
    }
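
    When the search is run from a script, the highlighted fragments can be pulled out of a response shaped like the one above. This is a small sketch; summarize_hits is a hypothetical helper, and it assumes the highlighted field is named content as in this example.

```python
def summarize_hits(response, field="content"):
    """Return (id, score, highlight fragments) for each hit in a search response."""
    rows = []
    for hit in response["hits"]["hits"]:
        # Hits without a highlight section yield an empty fragment list.
        fragments = hit.get("highlight", {}).get(field, [])
        rows.append((hit["_id"], hit["_score"], fragments))
    return rows
```

    The response argument is the parsed JSON body, for example the dict returned by the Python client's es.search call.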

    Dictionary Configuration

    IKAnalyzer.cfg.xml can be located at {conf}/analysis-ik/config/IKAnalyzer.cfg.xml or {plugins}/elasticsearch-analysis-ik-*/config/IKAnalyzer.cfg.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
    <properties>
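        <comment>IK Analyzer extension configuration</comment>
        <!-- Users can configure their own extension dictionaries here -->
        <entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic</entry>
         <!-- Users can configure their own extension stopword dictionaries here -->
        <entry key="ext_stopwords">custom/ext_stopword.dic</entry>
         <!-- Users can configure a remote extension dictionary here -->
        <entry key="remote_ext_dict">location</entry>
         <!-- Users can configure a remote extension stopword dictionary here -->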
        <entry key="remote_ext_stopwords">http://xxx.com/xxx.dic</entry>
    </properties>

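    Hot-updating the IK dictionary

    The plugin currently supports hot updates of the IK dictionary through the following entries in the IK configuration file shown above:

         <!-- Users can configure a remote extension dictionary here -->
        <entry key="remote_ext_dict">location</entry>
         <!-- Users can configure a remote extension stopword dictionary here -->
        <entry key="remote_ext_stopwords">location</entry>

    Here location is a URL, for example http://yoursite.com/getCustomDict. The request only has to meet the following two conditions for the hot update to work.

        The HTTP response must carry two headers, Last-Modified and ETag. Both are strings; whenever either one changes, the plugin fetches the new word list and updates its dictionary.

        The response body must contain one word per line, with \n as the line separator.

    Once both conditions are met, the dictionary is hot-updated without restarting the ES instance.

    You can put the hot words to be updated automatically in a UTF-8 encoded .txt file and serve it from nginx or another simple HTTP server. When the .txt file is modified, the HTTP server automatically returns the corresponding Last-Modified and ETag when a client requests the file. A separate tool can extract relevant terms from the business system and update this .txt file.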
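
    For a quick local test without nginx, a tiny HTTP server that returns both headers can be sketched with the Python standard library. Everything here is an assumption for illustration: the function names, the port, and the choice of an MD5 digest of the file content as the ETag. It answers both GET and HEAD, since a client may use either to check for changes.

```python
import hashlib
import os
from email.utils import formatdate
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_headers(body, mtime):
    """Build response headers for a dictionary file (bytes) last modified at mtime."""
    return {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Length": str(len(body)),
        # HTTP date in GMT derived from the file modification time.
        "Last-Modified": formatdate(mtime, usegmt=True),
        # The ETag changes whenever the file content changes.
        "ETag": '"%s"' % hashlib.md5(body).hexdigest(),
    }


def make_handler(dict_path):
    """Create a request handler class that serves the file at dict_path."""

    class DictHandler(BaseHTTPRequestHandler):
        def _send(self, include_body):
            with open(dict_path, "rb") as f:
                body = f.read()
            self.send_response(200)
            for name, value in build_headers(body, os.path.getmtime(dict_path)).items():
                self.send_header(name, value)
            self.end_headers()
            if include_body:
                self.wfile.write(body)

        def do_GET(self):
            self._send(include_body=True)

        def do_HEAD(self):
            self._send(include_body=False)

    return DictHandler


def serve(dict_path, port=8000):
    """Serve the dictionary file on every path until interrupted."""
    HTTPServer(("", port), make_handler(dict_path)).serve_forever()
```

    Running serve("hotwords.txt") and pointing remote_ext_dict at http://<host>:8000/ would be one way to try the hot update; hotwords.txt is a placeholder file name with one word per line in UTF-8.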

    have fun.
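    FAQ

    1. Why does my custom dictionary not take effect?

    Make sure the extension dictionary file is encoded as UTF-8.

    2. How do I install the plugin manually?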

    git clone https://github.com/medcl/elasticsearch-analysis-ik
    cd elasticsearch-analysis-ik
    git checkout tags/{version}
    mvn clean
    mvn compile
    mvn package

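    Copy the release file #{project_path}/elasticsearch-analysis-ik/target/releases/elasticsearch-analysis-ik-*.zip into your elasticsearch plugins directory (for example plugins/ik), unzip it there, and restart elasticsearch.

    3. The analysis test fails

    Call the analyze API under a specific index rather than calling it directly, for example: http://localhost:9200/your_index/_analyze?text=中华人民共和国MN&tokenizer=my_ik

    4. What is the difference between ik_max_word and ik_smart?

    ik_max_word: splits the text at the finest granularity. For example, “中华人民共和国国歌” is split into “中华人民共和国,中华人民,中华,华人,人民共和国,人民,人,民,共和国,共和,和,国国,国歌”, exhausting every possible combination.

    ik_smart: splits the text at the coarsest granularity. For example, “中华人民共和国国歌” is split into “中华人民共和国,国歌”.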
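
    To compare the two analyzers on your own text, the _analyze API can be called under an index with a JSON body naming the analyzer. The sketch below uses only the standard library; the helper names are hypothetical, and it assumes the IK plugin is installed and the index exists.

```python
import json
from urllib.request import Request, urlopen


def build_analyze_request(base_url, index, analyzer, text):
    """Build a POST request for <base_url>/<index>/_analyze."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return Request(
        "%s/%s/_analyze" % (base_url, index),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(base_url, index, analyzer, text):
    """Return the list of tokens produced by the given analyzer."""
    with urlopen(build_analyze_request(base_url, index, analyzer, text)) as resp:
        result = json.loads(resp.read().decode("utf-8"))
    return [token["token"] for token in result["tokens"]]
```

    Calling analyze("http://192.168.93.140:9200", "index", "ik_max_word", "中华人民共和国国歌") and then the same with "ik_smart" should show the fine-grained and coarse-grained splits described above.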


    curl -XPOST '192.168.93.140:9200/parsetext-index/_delete_by_query?pretty' -d '{
        "query": {
            "match_all": {}
        }
    }'

  • Original article: https://www.cnblogs.com/herosoft/p/8134134.html