• Part 1: Elasticsearch Basics


    I. Installation

    Elasticsearch port: 9200
    Kibana port: 5601

    brew install elasticsearch
    brew install kibana
    brew services start elasticsearch
    brew services start kibana
    
    

    II. Interacting with Elasticsearch: the basics

    1. Cluster information

    The REST pattern for accessing data:

    <HTTP Verb> /<Index>/<Type>/<ID>
    

    Check cluster health

    $ curl -X GET "localhost:9200/_cat/health?v"
    epoch      timestamp cluster           status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
    1551424260 07:11:00  elasticsearch_yjn green           1         1      0   0    0    0        0             0                  -                100.0%
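
When the health check has to be read by a script rather than by eye, the tabular output above can be turned into a dictionary. The following is a minimal sketch, assuming every column has a value (true for the health output shown, but not for `_cat/nodes`, where some load columns are empty); `parse_cat_table` is a hypothetical helper name, not part of any Elasticsearch client.

```python
def parse_cat_table(text):
    """Parse whitespace-aligned _cat output (requested with ?v) into dicts.

    Assumes no column is empty, so a plain whitespace split lines up
    with the header. This holds for the _cat/health sample below.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Output copied from the health check above.
sample = """\
epoch      timestamp cluster           status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1551424260 07:11:00  elasticsearch_yjn green           1         1      0   0    0    0        0             0                  -                100.0%
"""

health = parse_cat_table(sample)[0]
print(health["cluster"], health["status"])
```

The cat APIs also accept `format=json` (for example `_cat/health?format=json`), which is likely the more robust choice for scripts because it avoids column parsing altogether.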
    

    List the nodes

    $ curl -X GET "localhost:9200/_cat/nodes?v"
    ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
    127.0.0.1           24          99  23    2.89                  mdi       *      d-0YrG2
    

    List all indices

    $ curl -X GET "localhost:9200/_cat/indices?v"
    health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .kibana_1 oRuiVW1wSbWyFK4Wu0k6MA   1   0          1            0      3.6kb          3.6kb
    

    2. Indices

    Create an index

    PUT /customer?pretty
    GET /_cat/indices?v
    health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .kibana_1 oRuiVW1wSbWyFK4Wu0k6MA   1   0          1            0      3.6kb          3.6kb
    yellow open   customer  Fe4Y2hU2Rcek0kl7SYZoKQ   5   1          0            0      1.1kb          1.1kb
    
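    • The cluster currently has only one node, and by default one replica is configured for this index. A replica cannot be allocated to the same node as its primary, so the replicas stay unassigned and the index health is yellow.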
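
The yellow status can be reasoned about with a small model. This is a deliberately simplified sketch, assuming the only allocation rule is that copies of the same shard never share a node; real allocation also depends on disk watermarks, allocation filters, and other settings. `index_health` is a hypothetical function written for illustration.

```python
def index_health(primaries, replicas, data_nodes):
    """Return (status, unassigned_shards) under a simplified allocation model.

    Each shard needs 1 + replicas distinct nodes, because a replica is
    never placed on the node that already holds another copy of that shard.
    """
    if data_nodes < 1:
        # No node can hold even the primaries.
        return "red", primaries * (1 + replicas)
    assignable_replicas = min(replicas, data_nodes - 1)
    unassigned = primaries * (replicas - assignable_replicas)
    status = "green" if unassigned == 0 else "yellow"
    return status, unassigned


# The customer index above: 5 primaries, 1 replica, a single node.
print(index_health(primaries=5, replicas=1, data_nodes=1))
```

Under this model, adding a second data node or setting the replica count to 0 would both turn the index green.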

    Delete an index

    DELETE /customer?pretty
    

    3. Documents

    Index a document

    PUT /customer/_doc/1?pretty
    {
      "name": "John Doe"
    }
    

    Result:

    {
      "_index" : "customer",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 0,
      "_primary_term" : 1
    }
    

    Retrieve the document

    GET /customer/_doc/1?pretty
    {
      "_index" : "customer",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "_seq_no" : 0,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "John Doe"
      }
    }
    

    Replace or add a document

    POST /customer/_doc?pretty
    {
      "name": "Jane Doe"
    }
    
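    • If no ID is given (POST), Elasticsearch generates a random ID and creates a new document. Indexing with an ID that already exists, for example PUT /customer/_doc/1, replaces the existing document.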

    Update a document

    POST /customer/_doc/1/_update?pretty
    {
      "doc": { "name": "Jane Doe", "age": 20 }
    }
    
    • Result:
    GET /customer/_doc/1?pretty
    {
      "_index" : "customer",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 3,
      "_seq_no" : 2,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "Jane Doe",
        "age" : 20
      }
    }
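
Conceptually, an update with a "doc" body merges the partial document into the existing _source. The sketch below models that merge in plain Python, assuming nested objects are merged recursively and any other value is overwritten; `apply_partial_update` is a hypothetical name, and the "noop" outcome mirrors what the update API reports when nothing changes.

```python
import copy


def _merge(target, patch):
    """Merge patch into target in place; nested dicts are merged recursively."""
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            _merge(target[key], value)
        else:
            target[key] = copy.deepcopy(value)


def apply_partial_update(source, doc):
    """Return (new_source, result), where result is 'updated' or 'noop'."""
    merged = copy.deepcopy(source)
    _merge(merged, doc)
    result = "noop" if merged == source else "updated"
    return merged, result


source = {"name": "Jane Doe"}
updated, result = apply_partial_update(source, {"name": "Jane Doe", "age": 20})
print(updated, result)
```

Applying the same partial document a second time leaves the source unchanged, which is the situation reported as "noop".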
    

    Update a document with a script:

    POST /customer/_doc/1/_update?pretty
    {
      "script" : "ctx._source.age += 5"
    }
    
    • Result:
    GET /customer/_doc/1?pretty
    {
      "_index" : "customer",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 4,
      "_seq_no" : 3,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "Jane Doe",
        "age" : 25
      }
    }
    

    Delete a document:

    DELETE /customer/_doc/2?pretty
    

    Batch processing

    POST /customer/_doc/_bulk?pretty
    {"index":{"_id":"1"}}
    {"name": "John Doe" }
    {"index":{"_id":"2"}}
    {"name": "Jane Doe" }
    
    POST /customer/_doc/_bulk?pretty
    {"update":{"_id":"1"}}
    {"doc": { "name": "John Doe becomes Jane Doe" } }
    {"delete":{"_id":"2"}}
    
    GET /customer/_doc/_search
    
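    • The Bulk API does not fail as a whole when one of its actions fails. If a single action fails for any reason, the remaining actions are still processed.
      When the Bulk API returns, it provides a status for each action (in the order the actions were sent), so you can check whether a specific action failed.

    • Status: the "result" field of each item is one of "created", "updated", "deleted", "not_found", or "noop".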
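
A bulk body is newline-delimited JSON: one action line, optionally followed by one payload line, with a trailing newline at the end. The sketch below builds such a body and picks out failed items from a response; `build_bulk_body` and `failed_items` are hypothetical helper names, and the code assumes a failed item carries an "error" object, as bulk responses do.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, payload) tuples as newline-delimited JSON.

    payload is None for actions that take no body, such as delete.
    """
    lines = []
    for action, metadata, payload in operations:
        lines.append(json.dumps({action: metadata}))
        if payload is not None:
            lines.append(json.dumps(payload))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def failed_items(response):
    """Return (position, action, status) for every item that reports an error."""
    failures = []
    for position, item in enumerate(response["items"]):
        (action, detail), = item.items()
        if "error" in detail:
            failures.append((position, action, detail["status"]))
    return failures


# The second bulk request above, expressed as data.
body = build_bulk_body([
    ("update", {"_id": "1"}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ("delete", {"_id": "2"}, None),
])
print(body, end="")
```

Because items come back in the order they were sent, the position returned by `failed_items` identifies which of the submitted operations to retry or inspect.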

    4. Example: working with a data set

    Load the data set

    curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@accounts.json"
    curl "localhost:9200/_cat/indices?v"
    

    Search

    GET /bank/_search?q=*&sort=account_number:asc&pretty
    
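    • q=* matches all documents in the index
    • sort=account_number:asc sorts the results by account_number in ascending order
    • pretty pretty-prints the JSON response

    The same search expressed with the query DSL in the request body returns the same result: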
    GET /bank/_search
    {
      "query": { "match_all": {} },
      "sort": [
        { "account_number": "asc" }
      ]
    }
    
    

    Analyzing the result:

    {
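      "took" : 8,   # time Elasticsearch took to execute the search, in ms
      "timed_out" : false,  # whether the search timed out
      "_shards" : {
        "total" : 5,    # number of shards searched
        "successful" : 5,   # number of shards that succeeded
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {    # search results
        "total" : 1000, # total number of matching documents
        "max_score" : null,
        "hits" : [ {    # the actual array of results (the first 10 documents by default)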
          "_index" : "bank",
          "_type" : "_doc",
          "_id" : "0",
          "sort": [0],
          "_score" : null,
          "_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"bradshawmckenzie@euron.com","city":"Hobucken","state":"CO"}
        }, {
          "_index" : "bank",
          "_type" : "_doc",
          "_id" : "1",
          "sort": [1],
          "_score" : null,
          "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
        }, ...
        ]
      }
    }
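
Once the response is parsed as JSON, the fields described above can be read directly. The following sketch uses hypothetical helper names (`total_hits`, `sources`) and a trimmed copy of the response above. It also accepts the object form of hits.total ({"value": ..., "relation": ...}) that Elasticsearch 7 and later return, whereas the 6.x output shown here uses a plain integer.

```python
def total_hits(response):
    """Return the number of matching documents from a search response."""
    total = response["hits"]["total"]
    # 6.x returns an integer; 7.x and later return {"value": n, "relation": ...}.
    return total["value"] if isinstance(total, dict) else total


def sources(response):
    """Return the _source of every hit on the current page."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


# Trimmed version of the response above (most _source fields omitted).
response = {
    "took": 8,
    "timed_out": False,
    "hits": {
        "total": 1000,
        "max_score": None,
        "hits": [
            {"_id": "0", "_source": {"account_number": 0, "balance": 16623}},
            {"_id": "1", "_source": {"account_number": 1, "balance": 39225}},
        ],
    },
}

print(total_hits(response), [doc["account_number"] for doc in sources(response)])
```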
    

    5. Queries

    Query language

    GET /bank/_search
    {
      "query": { "match_all": {} },
      "from": 10,
      "size": 10
    }
    
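    • from: the zero-based offset of the first document to return; defaults to 0 when omitted
    • size: the number of documents to return, starting at from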
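
Paging through results comes down to computing from out of a page number. A minimal sketch, assuming 1-based page numbers; `page_query` is a hypothetical helper, not part of any client library.

```python
def page_query(page, size=10):
    """Build a match_all request body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * size,  # zero-based offset of the first hit
        "size": size,
    }


# Page 2 with 10 hits per page reproduces the request above.
print(page_query(2))
```

By default from + size may not exceed index.max_result_window (10000), so this style of paging is only suitable for relatively shallow result sets.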
    GET /bank/_search
    {
      "query": { "match_all": {} },
      "sort": { "balance": { "order": "desc" } }
    }
    
    • Sorts the results by balance in descending order

    A more complex query

    GET /_all/tweet/_search?q=name:(mary john) +date:>2014-09-10 +(aggregations geo)
    
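    • _all: search across all indices
    • tweet: restrict the search to the tweet type
    • the name field contains mary or john
    • the date value is later than 2014-09-10
    • the _all field contains aggregations or geo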
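
The query string above is written in readable form; sent over HTTP, characters such as +, :, > and spaces have to be percent-encoded. A small sketch using Python's standard library shows one way to do that (keeping parentheses unescaped is a readability choice made here, not a requirement):

```python
from urllib.parse import quote_plus

# Readable form of the query, copied from the request above.
query = "name:(mary john) +date:>2014-09-10 +(aggregations geo)"

# quote_plus turns spaces into '+' and escapes a literal '+' as %2B.
encoded = quote_plus(query, safe="()")
path = "/_all/tweet/_search?q=" + encoded
print(path)
```

The encoded form is noticeably harder to read, which is one reason the request-body query DSL is usually preferred for anything beyond quick ad hoc searches.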

    6. Aggregations

    GET /bank/_search
    {
      "size": 0,
      "aggs": {
        "group_by_state": {
          "terms": {
            "field": "state.keyword",
            "order": {
              "average_balance": "desc"
            }
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
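
The aggregation above groups accounts by state, computes the average balance per group, and orders the groups by that average, highest first. The same logic can be sketched in plain Python to make the semantics concrete. The first two accounts come from the sample output earlier; the third is a hypothetical record added so that one bucket has more than one document. The bucket layout is simplified compared with the real response, which nests the average under "value".

```python
from collections import defaultdict


def average_balance_by_state(accounts, size=10):
    """Group by state, average the balance, sort by the average descending."""
    balances = defaultdict(list)
    for account in accounts:
        balances[account["state"]].append(account["balance"])
    buckets = [
        {"key": state, "doc_count": len(values), "average_balance": sum(values) / len(values)}
        for state, values in balances.items()
    ]
    buckets.sort(key=lambda bucket: bucket["average_balance"], reverse=True)
    # A terms aggregation returns the top 10 buckets unless size says otherwise.
    return buckets[:size]


accounts = [
    {"state": "CO", "balance": 16623},
    {"state": "IL", "balance": 39225},
    {"state": "CO", "balance": 20000},  # hypothetical record for illustration
]

for bucket in average_balance_by_state(accounts):
    print(bucket)
```

Setting "size": 0 in the request suppresses the individual hits, so the response contains only the aggregation buckets.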
    

    III. Analyzers

    Testing an analyzer

    Use the analyze API to see how a piece of text is analyzed:

    GET /_analyze
    {
      "analyzer": "standard",
      "text": "Text to analyze"
    }
    

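    Result:
    token is the term actually stored in the index. position is the term's position in the sequence of terms produced from the original text. start_offset and end_offset are the character offsets of the term in the original string.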
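
To make the token fields concrete, here is a rough approximation of the analysis in plain Python. It is only a sketch: the real standard analyzer tokenizes according to Unicode text segmentation rules, while this version assumes that splitting on runs of word characters and lowercasing is close enough for simple English text. `simple_analyze` is a hypothetical function name.

```python
import re


def simple_analyze(text):
    """Approximate the standard analyzer: split on word characters, lowercase."""
    tokens = []
    for position, match in enumerate(re.finditer(r"\w+", text)):
        tokens.append({
            "token": match.group().lower(),
            "start_offset": match.start(),  # index of the first character
            "end_offset": match.end(),      # index just past the last character
            "position": position,
        })
    return tokens


for token in simple_analyze("Text to analyze"):
    print(token)
```

The offsets are what make highlighting possible, and the positions are what phrase queries rely on to check that terms appear next to each other.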

    Mappings

    View the mapping of type _doc in the bank index

    GET /bank/_mapping/_doc
    

    IV. Request body search and V. Sorting and relevance

    For details, see
    (https://www.cnblogs.com/yangjianan/p/10525925.html)

  • Original article: https://www.cnblogs.com/yangjianan/p/10522403.html