• Elasticsearch: creating a basic index and common queries


    Creating an index

    • Create a basic index:
    PUT /my_index
    {
      "settings": {
          "index": {
            "number_of_shards": "5",
            "number_of_replicas": "1"
          }
        }
    }
    
    • View the index settings
    GET /my_index
    
    Result:
    {
      "my_index": {
        "aliases": {},
        "mappings": {},
        "settings": {
          "index": {
            "creation_date": "1599903519568",
            "number_of_shards": "5",    // primary shards
            "number_of_replicas": "1",  // replicas of each primary shard
            "uuid": "2WW-BXNxTFafswb0oURYjQ",
            "version": {
              "created": "5060999"
            },
            "provided_name": "my_index"
          }
        }
      }
    }
    
    • Create a type (define its mapping)
    PUT /my_index/my_type/_mapping
    {
      "properties": {
        "id":{
          "type": "integer"
        },
        "name":{
          "type": "text"
        },
        "age":{
          "type": "integer"
        },
        "productID":{
          "type": "text"
        },
        "createtime":{
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        }
      }
    }
    
    • View the type mapping
    GET /my_index/my_type/_mapping
    
    Result:
    {
      "my_index": {
        "mappings": {
          "my_type": {
            "properties": {
              "age": {
                "type": "integer"
              },
              "createtime": {
                "type": "date",
                "format": "yyyy-MM-dd HH:mm:ss"
              },
              "id": {
                "type": "integer"
              },
              "name": {
                "type": "text"
              },
              "productID": {
                "type": "text"
              }
            }
          }
        }
      }
    }
    
    • Add documents
    PUT /my_index/my_type/_bulk
    { "index": { "_id":1}}
    { "id":1,"name": "张三","age":18,"createtime":"2020-09-01 16:16:16","productID":"XHDK-A-1293-#fJ3"}
    { "index": { "_id": 2}}
    { "id":2,"name": "张四","age":20,"createtime":"2020-08-01 16:16:16","productID":"KDKE-B-9947-#kL5"}
    { "index": { "_id": 3}}
    {"id":3, "name": "李四","age":22,"createtime":"2020-09-02 16:16:16","productID":"JODL-X-1937-#pV7"}
    
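    -- Note: the mapping above declares name and productID as text, so their values are tokenized by an analyzer at index time. (Without an explicit mapping, Elasticsearch would create one dynamically and would also analyze string values.)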
    

    Queries

    Empty search (not recommended)

    Search all documents in every index:
    GET _search
    
    Search all documents in the my_index index:
    GET /my_index/_search
    
    Search all documents of type my_type in my_index:
    GET /my_index/my_type/_search
    

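    Exact-value queries

    When looking up exact values, use filters. Filters matter because they are very fast: they do not compute relevance (the whole scoring phase is skipped) and their results are easily cached. For now, just remember: use filters wherever you can.

    term query:

    • Elasticsearch does not analyze the search term, so the query suits exact matches such as IDs and numeric values.
    • It works on numbers, Booleans, dates, and text that is not analyzed (keyword).

    Querying a numeric value:

    • Wrapping the term query in constant_score runs it in non-scoring mode and assigns a uniform score of 1, so every hit in the result has a score of 1.
    • In other words, constant_score turns the term query into a filter.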
    GET /my_index/my_type/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "term":{
              "age": 20
            }
          }
        }
      }
    }
    
    Result:
    {
      "took": 0,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 1,
        "hits": [
          {
            "_index": "my_index",
            "_type": "my_type",
            "_id": "2",
            "_score": 1,
            "_source": {
              "id": 2,
              "name": "张四",
              "age": 20,
              "createtime": "2020-08-01 16:16:16",
              "productID": "KDKE-B-9947-#kL5"
            }
          }
        ]
      }
    }
    

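    Querying text

    How is text analyzed?

    • Uppercase letters are lowercased
    • Plurals are reduced to the singular only by stemming analyzers such as english; the default standard analyzer does not do this
    • Special characters are dropped, and the text is split into separate tokens at those points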
    GET /my_index/my_type/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "term":{
              "productID": "KDKE-B-9947-#kL5"
            }
          }
        }
      }
    }
    
    Query result:
    {
      "took": 0,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 0,
        "max_score": null,
        "hits": []
      }
    }
    
    
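    Why are there no results?

    term is an exact-match query, but the text may already have been analyzed at index time, while the search value passed to term is not analyzed. Two strings that look identical can therefore fail to match. The _analyze API shows how this productID value is tokenized: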

    GET /my_index/_analyze
    {
      "field": "productID",
      "text": "KDKE-B-9947-#kL5"
    }
    
    Query result:
    {
      "tokens": [
        {
          "token": "kdke",
          "start_offset": 0,
          "end_offset": 4,
          "type": "<ALPHANUM>",
          "position": 0
        },
        {
          "token": "b",
          "start_offset": 5,
          "end_offset": 6,
          "type": "<ALPHANUM>",
          "position": 1
        },
        {
          "token": "9947",
          "start_offset": 7,
          "end_offset": 11,
          "type": "<NUM>",
          "position": 2
        },
        {
          "token": "kl5",
          "start_offset": 13,
          "end_offset": 16,
          "type": "<ALPHANUM>",
          "position": 3
        }
      ]
    }
    
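    The result shows that:
    1. The special characters - and # were dropped, and the value was split into four tokens
    2. All uppercase letters were lowercased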
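
    The mismatch can be reproduced outside Elasticsearch. The sketch below is only a rough stand-in for the standard analyzer (the regular expression and the helper names analyze, term_matches, and match_matches are assumptions made for illustration, not Elasticsearch APIs). It shows why a verbatim term lookup misses while an analyzed lookup hits:

```python
import re

# Rough approximation of the standard analyzer (an assumption, not the real
# tokenizer): lowercase the text, keep runs of ASCII letters/digits, and emit
# each CJK ideograph as its own token.
TOKEN_RE = re.compile(r"[a-z0-9]+|[\u4e00-\u9fff]")


def analyze(text):
    """Return the list of tokens that would be stored in the inverted index."""
    return TOKEN_RE.findall(text.lower())


def term_matches(indexed_tokens, query_value):
    """term: the query value is compared verbatim with the indexed tokens."""
    return query_value in indexed_tokens


def match_matches(indexed_tokens, query_text):
    """match (default OR operator): analyze the query, any token may hit."""
    return any(token in indexed_tokens for token in analyze(query_text))


indexed = analyze("KDKE-B-9947-#kL5")
print(indexed)
print(term_matches(indexed, "KDKE-B-9947-#kL5"))   # verbatim value is not a token
print(term_matches(indexed, "kdke"))               # a single token is
print(match_matches(indexed, "KDKE-B-9947-#kL5"))  # analyzed query overlaps
```

    The whole original string is never stored as a token, so only a query value that equals one of the stored tokens can succeed with term.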
    
    Solution:

    To match text exactly with a term query, the text must not be analyzed, so the index mapping has to define a keyword field explicitly, as follows:

    Delete the index:
    DELETE my_index
    
    Re-create the index:
    PUT /my_index
    {
      "settings": {
          "index": {
            "number_of_shards": "5",
            "number_of_replicas": "1"
          }
        }
    }
    
    PUT /my_index/my_type/_mapping
    {
      "properties": {
        "id":{
          "type": "integer"
        },
        "name":{
          "type": "text"
        },
        "age":{
          "type": "integer"
        },
        "productID":{                  // remapped with a keyword sub-field; keyword values are not analyzed
          "type": "text",
          "fields": {
            "keyword":{
              "type": "keyword"
            }
          }
        },
        "createtime":{
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        }
      }
    }
    
    
    After the documents are indexed again, the exact match succeeds:
    
    GET /my_index/my_type/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "term":{
              "productID.keyword": "KDKE-B-9947-#kL5"    
            }
          }
        }
      }
    }
    

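    terms query

    • To match any of several values, for example documents whose age is 18, 20, or 22, use terms with a list of values.
    • It is an exact-value query; the search values are not analyzed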
    GET /my_index/my_type/_search
    {
      "query": {
        "terms": {
          "age": [
            18,
            20,
            22
          ]
        }
      }
    }
    
    
    Limiting the number of documents (from, size)
    • To return only the first two matching documents, set from and size as follows:
    GET /my_index/my_type/_search
    {
      "from": 0,  // start at the first document
      "size": 2,  // return two documents
     "query": {
       "terms": {
         "age": [
            18,
            20,
            22
          ]
       }
     } 
    } 
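
    Paging through results comes down to computing from out of a page number. The following is a minimal sketch; the helper name page_request and its parameters are hypothetical, and it only builds the request body without sending it anywhere:

```python
def page_request(query, page, page_size):
    """Build a search body for a 1-based page number (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
        "query": query,
    }


ages_query = {"terms": {"age": [18, 20, 22]}}
print(page_request(ages_query, page=1, page_size=2))
print(page_request(ages_query, page=3, page_size=2))
```

    Keep in mind that from + size is limited by the index.max_result_window setting (10000 by default), so this style of paging is meant for shallow pages.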
    
    Returning selected fields with _source
    • A search returns every field by default; use _source to list the fields to return
    GET /my_index/my_type/_search
    {
      "from": 0,
      "size": 2, 
      "_source": ["id","name","age"], 
     "query": {
       "terms": {
         "age": [
            18,
            20,
            22
          ]
       }
     } 
    }
    
    Excluding fields with excludes
    GET /my_index/my_type/_search
    {
      "from": 0,
      "size": 2, 
      "_source": {
          "includes": ["id","name","age"],  // fields to return
          "excludes":["productID"]          // fields to leave out
        }, 
     "query": {
       "terms": {
         "age": [
            18,
            20,
            22
          ]
       }
     } 
    } 
    

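    match query

    • Unlike term, the match query is aware of the analyzer: it analyzes the search text before matching.
    • The earlier term query on productID found nothing because term is unaware of the analyzer; the same value is found with match, as shown below: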
    GET /my_index/my_type/_search
    {
      "query": {
        "match": {
          "productID": "KDKE-B-9947-#kL5"
        }
      }
    }
    
    Query result:
    {
      "took": 0,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.2876821,
        "hits": [
          {
            "_index": "my_index",
            "_type": "my_type",
            "_id": "2",
            "_score": 0.2876821,
            "_source": {
              "id": 2,
              "name": "张四",
              "age": 20,
              "createtime": "2020-08-01 16:16:16",
              "productID": "KDKE-B-9947-#kL5"
            }
          }
        ]
      }
    }
    
    • For example, search for documents whose name is 张三
    GET /my_index/my_type/_search
    {
      "query": {
        "match": {
          "name": "张三"  // the text is analyzed first, then searched
        }
      }
    }
    
    Query result:
    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 0.51623213,
        "hits": [
          {
            "_index": "my_index",
            "_type": "my_type",
            "_id": "1",
            "_score": 0.51623213,
            "_source": {
              "id": 1,
              "name": "张三",
              "age": 18,
              "createtime": "2020-09-01 16:16:16",
              "productID": "XHDK-A-1293-#fJ3"
            }
          },
          {
            "_index": "my_index",
            "_type": "my_type",
            "_id": "2",
            "_score": 0.25811607,
            "_source": {
              "id": 2,
              "name": "张四",
              "age": 20,
              "createtime": "2020-08-01 16:16:16",
              "productID": "KDKE-B-9947-#kL5"
            }
          }
        ]
      }
    }
    
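    Analysis: the match query first runs the query text through the analyzer (the standard analyzer here) and then matches the resulting tokens against the index. 张三 is split into 张 and 三, which is why 张四 also matches (on 张), with a lower score.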
    GET /my_index/_analyze
    {
      "text": "张三"
    }
    
    Analysis result:
    {
      "tokens": [
        {
          "token": "张",
          "start_offset": 0,
          "end_offset": 1,
          "type": "<IDEOGRAPHIC>",
          "position": 0
        },
        {
          "token": "三",
          "start_offset": 1,
          "end_offset": 2,
          "type": "<IDEOGRAPHIC>",
          "position": 1
        }
      ]
    }
    

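    match_phrase (phrase matching)

    • Like the match query, match_phrase first analyzes the query string into a list of terms and then searches for them, but it keeps only documents that contain all of the terms in the same positions relative to one another as in the query. For example, a phrase search for quick fox may match no documents at all, because no document contains quick immediately followed by fox.
    • The terms must appear in the same order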
    GET /my_index/my_type/_search
    {
      "query": {
        "match_phrase": {
          "name": "张三"
        }
      }
    }
    
    Query result:
    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.51623213,
        "hits": [
          {
            "_index": "my_index",
            "_type": "my_type",
            "_id": "1",
            "_score": 0.51623213,
            "_source": {
              "id": 1,
              "name": "张三",
              "age": 18,
              "createtime": "2020-09-01 16:16:16",
              "productID": "XHDK-A-1293-#fJ3"
            }
          }
        ]
      }
    }
    
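    • If exact phrase matching is too strict, the slop parameter sets how far apart the terms are allowed to be.

    Example:

    First add documents named 张啊三 and 张家口测试三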
    
    PUT /my_index/my_type/_bulk
    { "index": { "_id":4}}
    { "id":4,"name": "张啊三","age":26,"createtime":"2020-10-01 16:16:16","productID":"XHDK-B-1293-#fJ2"}
    { "index": { "_id":5}}
    { "id":5,"name": "张家口测试三","age":26,"createtime":"2020-10-01 16:16:16","productID":"XHDK-B-1293-#fJ2"}
    
    Query:
    GET /my_index/my_type/_search
    {
      "query": {
        "match_phrase": {
          "name":{
            "query": "张三",
            "slop":1          // number of positions the terms may be apart
          }
        }
      }
    }
    
    Query result (张啊三 matches because only one position separates 张 and 三; 张家口测试三 would need a slop of 4):
    {
      "took": 0,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 0.51623213,
        "hits": [
          {
            "_index": "my_index",
            "_type": "my_type",
            "_id": "1",
            "_score": 0.51623213,
            "_source": {
              "id": 1,
              "name": "张三",
              "age": 18,
              "createtime": "2020-09-01 16:16:16",
              "productID": "XHDK-A-1293-#fJ3"
            }
          },
          {
            "_index": "my_index",
            "_type": "my_type",
            "_id": "4",
            "_score": 0.42991763,
            "_source": {
              "id": 4,
              "name": "张啊三",
              "age": 26,
              "createtime": "2020-10-01 16:16:16",
              "productID": "XHDK-B-1293-#fJ2"
            }
          }
        ]
      }
    }
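
    To see where these slop values come from, the sketch below uses a simplified model of position-based phrase matching. It assumes that every phrase term occurs exactly once in the document and that each character is one token, as the standard analyzer produces for these names. The function name min_slop is hypothetical, and this is not the Lucene implementation:

```python
def min_slop(doc_tokens, phrase_tokens):
    """Smallest slop at which the phrase would match, or None if a term is missing.

    Simplified model: each phrase term is assumed to occur once in the document.
    """
    shifted = []
    for offset, term in enumerate(phrase_tokens):
        if term not in doc_tokens:
            return None
        # Position where the phrase would have to start for this term to be in place.
        shifted.append(doc_tokens.index(term) - offset)
    # If all terms agree on the start position, the phrase matches exactly (slop 0).
    return max(shifted) - min(shifted)


phrase = list("张三")  # one token per character
for name in ["张三", "张啊三", "张家口测试三", "张四"]:
    print(name, min_slop(list(name), phrase))
```

    With slop set to 1, only names whose required slop is 0 or 1 qualify, which is consistent with the two hits returned above.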
    

    Sorting

    • Use sort to order the results
    GET /my_index/my_type/_search
    {
      "query": {
        "match_all": {}
      },
      "sort": [
        {
          "createtime": {
            "order": "desc"
          }
        },
        {
          "age": {
            "order": "desc"
          }
        }
      ]
    }
    
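    • Sorting on text is a special case. A string field that has been analyzed cannot be sorted on directly: the analyzer splits the string into many tokens, like a bag of words, so Elasticsearch does not know which token to sort by. The usual approach is to search on the analyzed field (text) and sort on the not-analyzed one (keyword, here the productID.keyword sub-field). Relying on a not-analyzed field for sorting is not very flexible, though; a custom analyzer can also be used for sorting.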
    GET /my_index/my_type/_search
    {
      "query": {
        "match_all": {}
      },
      "sort": [
        {
          "productID.keyword": {
            "order": "desc"
          }
        }
      ]
    } 
    
    

    range queries

    • gt : > greater than
    • lt : < less than
    • gte : >= greater than or equal to
    • lte : <= less than or equal to
    GET /my_index/my_type/_search
    {
      "query": {
        "range": {
          "createtime": {
            "lte": "now"    // up to and including the current time
          }
        }
      }
    }
    
    GET /my_index/my_type/_search
    {
      "query": {
        "range": {
          "createtime": {
            "lte": "now-1M"  // up to and including the current time minus one month
          }
        }
      }
    }
    
    Date math units: y = year, M = month, d = day, h = hour, m = minute, s = second
    
    GET /my_index/my_type/_search
    {
      "query": {
        "range": {
          "createtime": {
            "gte": "2020-10-01 16:16:16",   // bounds can be given down to the second
            "lte": "2020-10-01 16:16:16"
          }
        }
      }
    }
    
    GET /my_index/my_type/_search
    {
      "query": {
        "range": {
          "age": {
            "gte": 18,    // numeric field
            "lte": 20
          }
        }
      }
    }
    
    

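    fuzzy queries

    • fuzzy is a term-level query, so it performs no analysis on the search value. It takes a single term and finds all terms in the term dictionary that lie within the specified fuzziness. fuzziness defaults to AUTO.
    • The fuzziness parameter sets the maximum edit distance; the largest value allowed is 2. A value of 1 usually gives better results and better performance.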
    GET /my_index/my_type/_search
    {
      "query": {
        "fuzzy": {
          "productID": {
            "value": "xhdl",   // uppercase XHDL would find nothing, because the search value is not analyzed
            "fuzziness": 1
          }
        }
      }
    }
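
    The edit distance behind fuzziness can be illustrated with a plain Levenshtein implementation. This is a sketch for intuition only: the function name edit_distance is an assumption, and Elasticsearch by default also counts a transposition of two adjacent characters as a single edit, which this version does not.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, and substitutions cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


# Tokens the sketch assumes are indexed for productID "XHDK-A-1293-#fJ3".
indexed_tokens = ["xhdk", "a", "1293", "fj3"]
print([t for t in indexed_tokens if edit_distance("xhdl", t) <= 1])
print([t for t in indexed_tokens if edit_distance("XHDL", t) <= 1])
```

    The lowercase value is one substitution away from the indexed token xhdk, while the uppercase value differs in every character, which is why it finds nothing.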
    

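    Querying null values

    • exists matches documents that have a value for a field. Inside must it selects documents whose field is not null; inside must_not it selects documents whose field is null or missing, as shown below
    First add a document that has no order number (productID):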
    
    PUT /my_index/my_type/_bulk
    { "index": { "_id":6}}
    { "id":6,"name": "赵六","age":22,"createtime":"2020-10-01 16:16:16"}
    
    Find documents whose productID is null:
    
    GET my_index/my_type/_search
    {
      "query": {
        "bool": {
          "must_not":{
            "exists":{
              "field":"productID"
            }
          }
        }
      }
    }
    
    Find documents whose productID is not null:
    
    GET my_index/my_type/_search
    {
      "query": {
        "bool": {
          "must":{
            "exists":{
              "field":"productID"
            }
          }
        }
      }
    }
    

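    filter queries

    • Filter results can be cached and no relevance score is computed, so filters are faster than scoring queries

    A simple filter

    • Using post_filter (it is applied to the hits after any aggregations have been computed)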
    GET /my_index/my_type/_search
    {
      "post_filter": {
        "term": {
          "age": 20
        }
      }
    }
    

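    Combining filters with bool

    • must : every clause must match; equivalent to AND.
    • must_not : no clause may match; equivalent to NOT.
    • should : at least one clause must match; equivalent to OR. (In a scoring query that also has must clauses, should clauses are optional by default and only affect the score.)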
    GET /my_index/my_type/_search
    {
      "query": {
        "bool": {
          "must_not": [
            {}
          ],
          "must": [
            {}
          ],
          "should": [
            {}
          ]
        }
      }
    }
    
    -- Choose the clauses according to your business requirements.
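
    The way the three clause types combine can be sketched with plain predicates over in-memory documents. This is an illustration of the boolean logic only: scoring is ignored, the names bool_matches, age_between_20_and_30, and product_id_has_kdke are hypothetical, and nothing here calls Elasticsearch.

```python
def bool_matches(doc, must=(), must_not=(), should=()):
    """Evaluate bool-style clauses, each clause being a predicate on the document."""
    if not all(clause(doc) for clause in must):
        return False  # every must clause has to match (AND)
    if any(clause(doc) for clause in must_not):
        return False  # no must_not clause may match (NOT)
    if should and not must:
        # Without must clauses, at least one should clause has to match (OR).
        return any(clause(doc) for clause in should)
    return True  # with must clauses present, should is optional


docs = [
    {"id": 1, "name": "张三", "age": 18, "productID": "XHDK-A-1293-#fJ3"},
    {"id": 2, "name": "张四", "age": 20, "productID": "KDKE-B-9947-#kL5"},
    {"id": 4, "name": "张啊三", "age": 26, "productID": "XHDK-B-1293-#fJ2"},
]


def age_between_20_and_30(doc):
    return 20 <= doc["age"] <= 30


def product_id_has_kdke(doc):
    return "kdke" in doc["productID"].lower()


matched = [d["id"] for d in docs
           if bool_matches(d, must=[age_between_20_and_30],
                           must_not=[product_id_has_kdke])]
either = [d["id"] for d in docs
          if bool_matches(d, should=[lambda d: d["age"] == 18,
                                     lambda d: d["age"] == 26])]
print(matched, either)
```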
    
    Example: match documents named 张三 whose age is 18.
    
    GET /my_index/my_type/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "name": "张三"
              }
            },
            {
              "term": {
                "age": {
                  "value": 18
                }
              }
            }
          ]
        }
      }
    }
    
    Match documents named 张三 whose age is between 20 and 30 and whose order number (productID) does not contain the token kdke.
    
    GET /my_index/my_type/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "name": "张三"
              }
            },
            {
              "range": {
                "age": {
                  "gte": 20,
                  "lte": 30
                }
              }
            }
          ],
          "must_not": [
            {
              "term": {
                "productID": "kdke"
              }
            }
          ]
        }
      }
    }
    

    Nesting bool queries

    GET /my_index/my_type/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "name": "张三"
              }
            },
            {
              "range": {
                "age": {
                  "gte": 20,
                  "lte": 30
                }
              }
            },
            {
              "bool": {
                "should": [
                  {
                    "match_phrase":{
                      "name": "测试"
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
    

    Aggregations

    • SQL has many aggregate functions, and Elasticsearch offers counterparts such as sum, avg, and count.
    
    count: number of values
    
    GET my_index/my_type/_search
    {
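      "size": 0,    // a search returns 10 hits by default; set size to 0 when only the aggregation result is needed
      "aggs": {
        "count_age": {    // user-defined name for the result
          "value_count": {  // value_count counts the values of the field
            "field": "age"    // field to aggregate on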
          }
        }
      }
    }
    
    avg: average
    
    GET my_index/my_type/_search
    {
      "size": 0, 
      "aggs": {
        "avg_age": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
    
    max: maximum
    
    GET my_index/my_type/_search
    {
      "size": 0, 
      "aggs": {
        "max_age": {
          "max": {
            "field": "age"
          }
        }
      }
    }
    
    min: minimum
    
    GET my_index/my_type/_search
    {
      "size": 0, 
      "aggs": {
        "min_age": {
          "min": {
            "field": "age"
          }
        }
      }
    }
    
    sum: total
    
    GET my_index/my_type/_search
    {
      "size": 0, 
      "aggs": {
        "sum_age": {
          "sum": {
            "field": "age"
          }
        }
      }
    }
    
    stats: a statistics aggregation that computes several statistics (min, max, sum, count, avg) over a field's values in the matching documents.
    
    GET my_index/my_type/_search
    {
      "size": 0, 
      "aggs": {
        "stats_age": {
          "stats": {
            "field": "age"
          }
        }
      }
    }
    
    cardinality: the number of distinct values of the field
    
    GET my_index/my_type/_search
    {
      "size": 0, 
      "aggs": {
        "cardinality_age": {
          "cardinality": {
            "field": "age"
          }
        }
      }
    }
    

    Grouping (group by) with the terms aggregation

    Add more documents:
    
    PUT /my_index/my_type/_bulk
    { "index": { "_id":7}}
    { "id":7,"name": "鲜橙多","age":15,"createtime":"2020-07-01 16:16:16","productID":"XHDK-C-1293-#fJ3"}
    { "index": { "_id":8}}
    { "id":8,"name": "果粒橙","age":20,"createtime":"2020-12-01 16:16:16","productID":"KDKH-B-9947-#kL5"}
    { "index": { "_id": 9}}
    {"id":9, "name": "可口可乐","age":25,"createtime":"2020-09-02 16:16:16","productID":"JODL-X-1937-#pV7"}
    { "index": { "_id":10}}
    { "id":10,"name": "红牛","age":18,"createtime":"2020-09-10 16:16:16","productID":"XHDF-A-1293-#fJ3"}
    { "index": { "_id":11}}
    { "id":11,"name": "体制能量","age":20,"createtime":"2020-08-01 16:16:16","productID":"KDKE-B-9947-#kL5"}
    { "index": { "_id": 12}}
    {"id":12, "name": "芬达","age":22,"createtime":"2020-09-02 16:16:16","productID":"JODL-X-1937-#pV7"}
    
    
    GET my_index/my_type/_search
    {
      "size": 0,         // number of hits to return (default 10)
      "aggs": {
        "age_group": {    // user-defined name for the buckets
          "terms": {
            "field": "age",      // field to group by
            "size":10           // number of buckets to return (default 10)
          }
        }
      }
    }
    
    
    Query result:
    {
      "took": 4,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 12,
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "age_group": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": 20,          // bucket key
              "doc_count": 3      // number of documents in the bucket
            },
            {
              "key": 22,
              "doc_count": 3
            },
            {
              "key": 18,
              "doc_count": 2
            },
            {
              "key": 26,
              "doc_count": 2
            },
            {
              "key": 15,
              "doc_count": 1
            },
            {
              "key": 25,
              "doc_count": 1
            }
          ]
        }
      }
    }
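
    The bucket counts above can be checked by hand. The sketch below groups the ages of documents 1 to 12 with collections.Counter; the function name terms_buckets is hypothetical, and breaking ties by ascending key is an assumption that happens to reproduce the bucket order shown in the result:

```python
from collections import Counter

# Ages of documents 1 to 12, in _id order, as indexed in the bulk requests above.
ages = [18, 20, 22, 26, 26, 22, 15, 20, 25, 18, 20, 22]


def terms_buckets(values, size=10):
    """Group values into buckets ordered by doc_count desc, then key asc."""
    counts = Counter(values)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered[:size]]


for bucket in terms_buckets(ages):
    print(bucket)
```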
    
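    • Find users aged 18 to 22 and group them by creation time. Note: the createtime.keyword sub-field exists only when createtime was indexed as text with a keyword sub-field (for example through dynamic mapping); with the date mapping defined earlier, aggregate on createtime instead.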
    GET /my_index/my_type/_search
    {
      "size": 0, 
      "query": {
        "range": {
          "age": {
            "gte": 18,
            "lte": 22
          }
        }
      },
      "aggs": {
        "group_createtime": {
          "terms": {
            "field": "createtime.keyword",
            "size": 10
          }
        }
      }
    }
    
    Query result:
    
    {
      "took": 4,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 8,
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "group_createtime": {        
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "2020-08-01 16:16:16",  
              "doc_count": 2
            },
            {
              "key": "2020-09-02 16:16:16",
              "doc_count": 2
            },
            {
              "key": "2020-09-01 16:16:16",
              "doc_count": 1
            },
            {
              "key": "2020-09-10 16:16:16",
              "doc_count": 1
            },
            {
              "key": "2020-10-01 16:16:16",
              "doc_count": 1
            },
            {
              "key": "2020-12-01 16:16:16",
              "doc_count": 1
            }
          ]
        }
      }
    }
    
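    • For users aged 18 to 22, group by creation time and sort the buckets by key in ascending order (_term is the name of this sort key in 5.x; from 6.0 on it is called _key)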
    GET /my_index/my_type/_search
    {
      "size": 0, 
      "query": {
        "range": {
          "age": {
            "gte": 18,
            "lte": 22
          }
        }
      },
      "aggs": {
        "group_createtime": {
          "terms": {
            "field": "createtime.keyword",
            "size": 10,
            "order": {
              "_term": "asc"
            }
          }
        }
      }
    }
    
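    • For users aged 18 to 22, group by creation time, sub-group each bucket by age with the age buckets sorted by key in descending order, and compute the overall average age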
    GET /my_index/my_type/_search
    {
      "size": 0, 
      "query": {
        "range": {
          "age": {
            "gte": 18,
            "lte": 22
          }
        }
      },
      "aggs": {
        "group_createtime": {
          "terms": {
            "field": "createtime.keyword",
            "size": 10
          },
          "aggs": {
            "group_age": {
              "terms": {
                "field": "age",
                "size": 10,
                "order": {
                  "_term": "desc"
                }
              }
            }
          }
        },
        "avg_age":{
          "avg": {
            "field": "age"
          }
        }
      }
    }
    
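    • For users aged 18 to 22, group by creation time and sub-group by age; order the creation-time buckets by the number of age values in each bucket, descending, and compute the overall average age.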
    GET /my_index/my_type/_search
    {
      "size": 0, 
      "query": {
        "range": {
          "age": {
            "gte": 18,
            "lte": 22
          }
        }
      },
      "aggs": {
        "group_createtime": {
          "terms": {
            "field": "createtime.keyword",
            "size": 10,
            "order": {
              "terms_age.count": "desc"
            }
          },
          "aggs": {
            "terms_age": {
              "extended_stats": {    // a metric aggregation; buckets can be ordered by its values
                "field": "age"
              }
            },
            "group_age": {
              "terms": {
                "field": "age",
                "size": 10
              }
            }
          }
        },
        "avg_age":{
          "avg": {
            "field": "age"
          }
        }
      }
    }
    
    
    • Distinct counts with aggregations

    Count the distinct order numbers (productID)

    GET /my_index/my_type/_search
    {
      "size": 0, 
      "aggs": {
        "cardinality_productID": {
          "cardinality": {
            "field": "productID.keyword"
          }
        }
      }
    }
    
    Result:
    {
      "took": 4,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 12,       // 12 documents in total
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "cardinality_productID": {
          "value": 7            // 7 distinct order numbers
        }
      }
    }
    

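    date_histogram (aggregating by time interval)

    • Count the orders completed in each month (documents and distinct order numbers per monthly bucket)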
    GET /my_index/my_type/_search
    {
      "size": 0, 
      "aggs": {
        "date_month": {
          "date_histogram": {
            "field": "createtime",
            "interval": "month"
          },
          "aggs": {
            "cardinality_productID": {
              "cardinality": {
                "field": "productID.keyword"
              }
            }
          }
        }
      }
    }
    
    Result:
    {
      "took": 12,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 12,
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "date_month": {
          "buckets": [
            {
              "key_as_string": "2020-07-01 00:00:00",
              "key": 1593561600000,
              "doc_count": 1,
              "cardinality_productID": {
                "value": 1
              }
            },
            {
              "key_as_string": "2020-08-01 00:00:00",
              "key": 1596240000000,
              "doc_count": 2,
              "cardinality_productID": {
                "value": 1
              }
            },
            {
              "key_as_string": "2020-09-01 00:00:00",
              "key": 1598918400000,
              "doc_count": 5,
              "cardinality_productID": {
                "value": 3
              }
            },
            {
              "key_as_string": "2020-10-01 00:00:00",
              "key": 1601510400000,
              "doc_count": 3,
              "cardinality_productID": {
                "value": 1
              }
            },
            {
              "key_as_string": "2020-11-01 00:00:00",
              "key": 1604188800000,
              "doc_count": 0,
              "cardinality_productID": {
                "value": 0
              }
            },
            {
              "key_as_string": "2020-12-01 00:00:00",
              "key": 1606780800000,
              "doc_count": 1,
              "cardinality_productID": {
                "value": 1
              }
            }
          ]
        }
      }
    }
    
    • Now suppose we want the order count for every month of 2020, with 0 for months that have no orders
    GET /my_index/my_type/_search
    {
      "size": 0, 
      "aggs": {
        "date_month": {
          "date_histogram": {
            "field": "createtime",
            "interval": "month",
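            "format":"yyyy-MM",   // format of the bucket keys
            "min_doc_count": 0,    // keep empty buckets (0 is already the default for date_histogram)
            "extended_bounds":{   // extend the buckets to this range even where no documents exist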
              "min":"2020-01",   
              "max":"2020-12"
            }
          },
          "aggs": {
            "cardinality_productID": {
              "cardinality": {
                "field": "productID.keyword"
              }
            }
          }
        }
      }
    }
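
    What extended_bounds adds is the zero-filled months outside the range covered by the data. A minimal sketch of that idea over the createtime values of documents 1 to 12 follows; the function name monthly_counts is hypothetical, and it counts documents only, not distinct order numbers:

```python
from collections import Counter

# createtime of documents 1 to 12, in _id order, from the bulk requests above.
createtimes = [
    "2020-09-01 16:16:16", "2020-08-01 16:16:16", "2020-09-02 16:16:16",
    "2020-10-01 16:16:16", "2020-10-01 16:16:16", "2020-10-01 16:16:16",
    "2020-07-01 16:16:16", "2020-12-01 16:16:16", "2020-09-02 16:16:16",
    "2020-09-10 16:16:16", "2020-08-01 16:16:16", "2020-09-02 16:16:16",
]


def monthly_counts(timestamps, year):
    """Count timestamps per month of `year`, keeping months with no documents."""
    counts = Counter(ts[:7] for ts in timestamps)  # "yyyy-MM" prefix as bucket key
    months = [f"{year}-{month:02d}" for month in range(1, 13)]
    return {month: counts.get(month, 0) for month in months}


for month, doc_count in monthly_counts(createtimes, 2020).items():
    print(month, doc_count)
```

    The non-empty months agree with the doc_count values in the date_histogram result shown earlier; January to June appear only because the bounds were extended.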
    
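    • Get the number of orders and the order numbers for every month of 2020, sorted by the number of distinct order numbers in descending order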
    GET /my_index/my_type/_search
    {
      "size": 0, 
      "aggs": {
        "date_month": {
          "date_histogram": {
            "field": "createtime",
            "interval": "month",
            "format":"yyyy-MM",
            "min_doc_count": 0,
            "extended_bounds":{
              "min":"2020-01",
              "max":"2020-12"
            },
            "order": {
              "cardinality_productID": "desc"
            }
          },
          "aggs": {
            "name_terms":{
              "terms": {
                "field": "productID.keyword",
                "size": 10
              }
            },
            "cardinality_productID": {
              "cardinality": {
                "field": "productID.keyword"
              }
            }
          }
        }
      }
    }
    
  • Original article: https://www.cnblogs.com/dashuaiguo/p/13683945.html