• Getting Started with Elasticsearch


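    Elasticsearch Study Notes

    1. Installing Elasticsearch

    When installing or replacing the IK analyzer plugin, its version must match the Elasticsearch version; a mismatch causes an error.

    A Java JDK must be installed and configured.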

     

    2. Basic CRUD in Elasticsearch

    1> Create: index -> type -> document

    Define the type of a field:

    PUT /schools/_mapping/school

    {

        "properties":{

            "TimeFormat":{

                "type":"date",

                "format":"yyyy-MM-dd HH:mm:ss"

            }

        }

    }
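    The format string in this mapping determines which date strings are accepted for TimeFormat. As a rough local check before indexing, a value can be tested in Python. This is only a sketch: the helper name and the strptime pattern are assumptions chosen to mirror yyyy-MM-dd HH:mm:ss, and Python's parser is somewhat more lenient than a strict pattern match.

```python
from datetime import datetime

# Assumed Python counterpart of the mapping format "yyyy-MM-dd HH:mm:ss".
PY_FORMAT = "%Y-%m-%d %H:%M:%S"


def matches_mapping_format(value: str) -> bool:
    """Return True if value parses with the assumed equivalent pattern."""
    try:
        datetime.strptime(value, PY_FORMAT)
        return True
    except ValueError:
        return False


print(matches_mapping_format("2016-09-08 15:20:30"))  # True
print(matches_mapping_format("2016-09-08"))           # False, time part missing
```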

     

    Create the index student with the type article, whose field subject has type text and uses the ik_max_word analyzer:

    PUT /student/?pretty

    {

        "settings" : {

            "analysis" : {

                "analyzer" : {

                    "ik" : {

                        "tokenizer" : "ik_max_word"

                    }

                }

            }

        },

        "mappings" : {

            "article" : {

                "dynamic" : true,

                "properties" : {

                    "subject" : {

                        "type" : "text",

                        "analyzer" : "ik_max_word"

                    }

                }

            }

        }

    }

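    Unless it is specified explicitly, IK is not used as the default analyzer, and the request above sets the analyzer only for an individual field of the document.

    The following request makes IK the default analyzer for the whole index: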

    PUT /students

    {

        "settings" : {

            "index" : {

                "analysis.analyzer.default.type": "ik_max_word"

            }

        }

    }
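    To keep the two variants apart, the request bodies can be generated from small helpers. A minimal sketch, assuming the bodies are assembled as Python dicts and sent by whatever HTTP client is at hand; the function names are made up for illustration:

```python
import json


def field_analyzer_body(type_name, field, analyzer):
    """Index-creation body that sets the analyzer for one text field only."""
    return {
        "mappings": {
            type_name: {
                "properties": {field: {"type": "text", "analyzer": analyzer}}
            }
        }
    }


def default_analyzer_body(analyzer):
    """Index-creation body that sets the default analyzer for the whole index."""
    return {"settings": {"index": {"analysis.analyzer.default.type": analyzer}}}


# Body for PUT /student (field level) and for PUT /students (index level).
print(json.dumps(field_analyzer_body("article", "subject", "ik_max_word"), indent=2))
print(json.dumps(default_analyzer_body("ik_max_word"), indent=2))
```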

    A. Inserting a single document

    PUT http://localhost:9200/movies/movie/3

    {

        "title": "To Kill a Mockingbird",

        "director": "Robert Mulligan",

        "year": 1962

    }

    PUT  url/index/type/id

    {

    field: value,

    field: value,

    field: value,

    ....

     

    }

    A request in this format creates the index, the type and the document in one step.

     

    { "_index": "movies", "_type": "movie", "_id": "3", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "created": true }

    _version is 1 and result is created.

     

    B. Bulk insert

    POST /schools/_bulk

    {"index":{"_index":"schools","_type":"school","_id":"1"}}

    {"name":"Central School","description":"CBSE Affiliation","street":"Nagan","city":"paprola","state":"HP","zip":"176115","location":[31.8955385,76.8380405],"fees":2000,"tags":["Senior Secondary","beautiful campus"],"rating":"3.5"}

    {"index":{"_index":"schools","_type":"school","_id":"2"}}

    {"name":"Saint Paul School","description":"ICSE Afiliation","street":"Dawarka","city":"Delhi","state":"Delhi","zip":"110075","location":[28.5733056,77.0122136],"fees":5000,"tags":["Good Faculty","Great Sports"],"rating":"4.5"}

    {"index":{"_index":"schools","_type":"school","_id":"3"}}

    {"name":"Crescent School","description":"State Board Affiliation","street":"Tonk Road","city":"Jaipur","state":"RJ","zip":"176114","location":[26.8535922,75.7923988],"fees":2500,"tags":["Well equipped labs"],"rating":"4.5"}

    Use _bulk to insert several documents in a single request.
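    The bulk body is newline-delimited JSON: one action line followed by one document line per entry, with a newline after the last line. A small sketch of how such a payload could be generated; the helper name and the shortened sample documents are illustrative only:

```python
import json


def bulk_index_payload(index, doc_type, docs):
    """Build an NDJSON body for _bulk from {id: document} pairs."""
    lines = []
    for doc_id, doc in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(doc))     # document line
    # The bulk body has to end with a newline character.
    return "\n".join(lines) + "\n"


payload = bulk_index_payload(
    "schools",
    "school",
    {
        "1": {"name": "Central School", "fees": 2000},
        "2": {"name": "Saint Paul School", "fees": 5000},
    },
)
print(payload)
```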

     

     

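    2> Updating a document

    Now that the index holds one movie, let's see how to update it by adding a list of genres. To do this, simply index the document again under the same ID: send exactly the same index request as before, but with the JSON object extended by the new field.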

    PUT  http://localhost:9200/movies/movie/3

    {

        "title": "To Kill a Mockingbird",

        "director": "Robert Mulligan",

        "year": 1962,

        "genres": ["Crime", "Drama", "Mystery"]

    }

     

    The response looks like this:

    { "_index": "movies", "_type": "movie", "_id": "3", "_version": 2, "result": "updated", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "created": false }

    _version is now 2 and result is updated.
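    The behaviour described here (same ID, whole document replaced, version counter incremented) can be mimicked with a toy in-memory model. This is not an Elasticsearch client, just a sketch of the bookkeeping; the class and method names are invented:

```python
class ToyIndex:
    """In-memory stand-in for one index; NOT an Elasticsearch client."""

    def __init__(self):
        self._docs = {}  # id -> (version, source)

    def put(self, doc_id, source):
        """Store source under doc_id, replacing any earlier document."""
        old_version = self._docs.get(doc_id, (0, None))[0]
        version = old_version + 1
        self._docs[doc_id] = (version, dict(source))
        result = "created" if old_version == 0 else "updated"
        return {"_id": doc_id, "_version": version, "result": result}

    def source(self, doc_id):
        return self._docs[doc_id][1]


movie = {"title": "To Kill a Mockingbird", "director": "Robert Mulligan", "year": 1962}
index = ToyIndex()
first = index.put("3", movie)
second = index.put("3", {**movie, "genres": ["Crime", "Drama", "Mystery"]})
print(first["_version"], first["result"])    # 1 created
print(second["_version"], second["result"])  # 2 updated
print(index.source("3")["genres"])
```

    Because the second request replaces the stored document, any field left out of it would be gone afterwards; that is why the full object is sent again.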

     

    Updating individual fields of matching documents with an inline script:

     

    POST schools/school/_update_by_query

    {

      "script": {

        "inline": "ctx._source.TimeFormat ='2016-09-08 15:20:30';ctx._source.zip='1766889'"

      },

      "query":{

          "term":{

              "city":"delhi"

          }

      }

      

    }

     

     

    3> Deleting a document

    To delete a single document from the index by ID, use the same URL as for indexing the document, with the HTTP method changed to DELETE.

     

    DELETE http://localhost:9200/movies/movie/3

     

    Response:

    {

       "_index": "movies",

       "_type": "movie",

       "_id": "3",

       "_version": 2,

       "result": "deleted",

       "_shards": {

          "total": 2,

          "successful": 1,

          "failed": 0

       },

       "_seq_no": 5,

       "_primary_term": 1

    }

     

    4> Retrieving a document

    To fetch a single document from the index by ID, use the same URL as for indexing the document, with the HTTP method changed to GET.

     

    GET http://localhost:9200/movies/movie/3

     

    Searching with conditions

     

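    Common queries:

    Full-text queries (for text fields):

    1. Match everything: match_all

    2. Analyzed match: match (roughly like SQL LIKE)

    3. Phrase match: match_phrase (roughly like SQL =)

    4. Multi-field match: multi_match (search several fields at once)

    5. Query-syntax search: query_string (write the keywords to match directly)

    6. Term query: term (queries one field for an exact term. Note that term does not analyze the search text. For example, if "火锅" (hot pot) is stored and the analyzer splits it into "火" / "锅", a term query for "火" finds the document, but a term query for "火锅" finds nothing, whereas match first splits the search text into "火" / "锅" and then searches.)

    7. Range query: range

    Term-level queries are meant for structured data such as numbers and dates.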

     

    Pagination:

    "from": 10,

    "size": 10

     

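    constant_score: wraps a query and gives every matching document the same fixed score.

     

    filter: a query clause scores documents by how well they match, so loosely matching documents are returned as well; a filter behaves like an = test: a document either passes or fails, with nothing in between, and filters are generally faster because no score is computed.

     

    1. Match everything: match_all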

    POST _search

    {

       "query": {

          "match_all": {}

       }

    }

     

    2. Analyzed match: match (roughly like SQL LIKE)

    POST /schools/school/_search

    {

        "query": {

            "match": {

                "name":"Saint Paul School"

              

            }

        }

    }

     

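    With match, the search text is run through the analyzer and the resulting terms are compared with the analyzed terms of the field. In the example above, the name field under /schools/school/ is searched for the terms produced from Saint Paul School; a document matches as soon as its name contains any one of those terms.

     

    3. Phrase match: match_phrase (roughly like SQL =)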

     

    POST /schools/school/_search

    {

        "query": {

            "match_phrase": {

                "name":"Saint Paul School"

              

            }

        }

    }

     

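    With match_phrase, the search text is analyzed and the resulting terms must appear in the field consecutively and in the same order. In the example above, a document matches only if its name contains the three terms Saint, Paul and School one after another. This is comparable to = in SQL, except that differences in whitespace are ignored and the field may contain further terms before or after the phrase.

     

    4. Multi-field match: multi_match (search several fields at once)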
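    The difference between the two queries can be reproduced with a toy model. The sketch below assumes a very crude analyzer (lowercase, split on whitespace), which only approximates what a real analyzer does; all function names are illustrative:

```python
def analyze(text):
    """Crude stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_text, query_text):
    """True if the field shares at least one analyzed term with the query."""
    return bool(set(analyze(field_text)) & set(analyze(query_text)))


def match_phrase(field_text, query_text):
    """True if the query terms occur consecutively and in order in the field."""
    field, query = analyze(field_text), analyze(query_text)
    n = len(query)
    return n > 0 and any(field[i:i + n] == query for i in range(len(field) - n + 1))


names = ["Central School", "Saint Paul School", "Crescent School"]
print([n for n in names if match(n, "Saint Paul School")])
# ['Central School', 'Saint Paul School', 'Crescent School']
print([n for n in names if match_phrase(n, "Saint Paul School")])
# ['Saint Paul School']
```

    All three names share the term school, so match returns every document, while only one name contains the full phrase.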

     

    POST /schools/school/_search

    {

        "query": {

            "multi_match": {

                "query":"Saint Paul School",

                "fields": [

                   "name","tags"

                ]

              

            }

        }

    }

     

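    multi_match runs an analyzed match over several fields: the text in query is analyzed and matched against each field separately, and fields lists the fields to search.

     

    5. Query-syntax search: query_string (write the keywords to match directly)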

     

    POST /schools/school/_search

    {

        "query": {

            "query_string": {

                "query":"Saint Paul School",

                "fields": [

                   "name","tags"

                ]

              

            }

        }

    }

     

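    query_string can also search several fields: the text in query is parsed as a query string, analyzed, and matched against each field separately, and fields lists the fields to search.

    6. Term query: term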

    POST /schools/school/_search

    {

        "query": {

            "term": {

                "name":"Saint Paul School"

              

            }

        }

    }

     

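    A term query does not analyze the search text. Against an analyzed text field the value therefore has to be a single token without spaces and in lowercase (for example saint); otherwise nothing is found, which is what happens with the request above.

     

    7. Range query: range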
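    Why the term query above comes back empty becomes clearer with a toy inverted index. The sketch assumes the name field was analyzed by lowercasing and splitting on whitespace, which is a simplification; the names are illustrative:

```python
def analyze(text):
    """Crude stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


docs = {1: "Central School", 2: "Saint Paul School", 3: "Crescent School"}

# Toy inverted index: analyzed term -> ids of the documents containing it.
inverted = {}
for doc_id, name in docs.items():
    for token in analyze(name):
        inverted.setdefault(token, set()).add(doc_id)


def term_query(value):
    """Look the value up exactly as given, without analyzing it."""
    return sorted(inverted.get(value, set()))


def match_query(text):
    """Analyze the text first, then collect the hits of every term."""
    hits = set()
    for token in analyze(text):
        hits |= inverted.get(token, set())
    return sorted(hits)


print(term_query("Saint Paul School"))   # []
print(term_query("Saint"))               # []
print(term_query("saint"))               # [2]
print(match_query("Saint Paul School"))  # [1, 2, 3]
```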

     

    POST /schools/school/_search

    {

        "query": {

            "range": {

               "fees": {

                  "from": 1000,

                  "to": 2500

               }

            }

           

        }

    }

     

    Combining several query clauses side by side at this level does not work; a bool query is needed for that.

     

    8. bool query

     

    POST /schools/school/_search

    {

        "query": {

            "bool": {

                "must": [

                   {

                       "range": {

                          "fees": {

                             "from": 1000,

                             "to": 3000

                          }

                       }

                   },

                   {

                       "match": {

                          "name": "School"

                       }

                   },

                   {

                       "wildcard": {

                          "zip": {

                             "value": "17*15"

                          }

                       }

                   }

                   

                ],

                "boost": 1,

                "must_not": [

                   {

                          "term": {

                             "name": {

                                "value": "to"

                             }

                          }

                   }

                ],

                "should": [

                {

                   "match": {

                      "city": "paprola"

                   }

                }

             ]

            }

     

           

        }

    }
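    How the three clause types combine can be traced by hand on the sample schools. The sketch below models each clause as a Python predicate; it ignores scoring details and assumes that should clauses are optional whenever must clauses are present. All names are illustrative:

```python
from fnmatch import fnmatchcase


def bool_match(doc, must=(), must_not=(), should=()):
    """Return (matches, should_hits) for one document."""
    if not all(clause(doc) for clause in must):
        return False, 0
    if any(clause(doc) for clause in must_not):
        return False, 0
    should_hits = sum(1 for clause in should if clause(doc))
    # Assumed rule: without must clauses, at least one should clause has to match.
    if not must and should and should_hits == 0:
        return False, 0
    return True, should_hits


def tokens(text):
    return text.lower().split()


schools = [
    {"name": "Central School", "city": "paprola", "zip": "176115", "fees": 2000},
    {"name": "Saint Paul School", "city": "Delhi", "zip": "110075", "fees": 5000},
    {"name": "Crescent School", "city": "Jaipur", "zip": "176114", "fees": 2500},
]

must = [
    lambda d: 1000 <= d["fees"] <= 3000,       # range on fees
    lambda d: "school" in tokens(d["name"]),   # match on name
    lambda d: fnmatchcase(d["zip"], "17*15"),  # wildcard on zip
]
must_not = [lambda d: "to" in tokens(d["name"])]
should = [lambda d: d["city"].lower() == "paprola"]

results = []
for doc in schools:
    ok, hits = bool_match(doc, must, must_not, should)
    if ok:
        results.append((doc["name"], hits))
print(results)  # [('Central School', 1)]
```

    Saint Paul School fails the fees range and Crescent School fails the zip wildcard, so only Central School remains; its should clause matches as well, which in a real query would only raise its score.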

    9. Highlighting

     

    POST /schools/school/_search

    {

        "query": {

            "match": {

               "name": "Saint school"

            }

        },

        "highlight": {

            "fields": {

                "name":{}

            }

        }

    }

     

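    10. Pagination: from is the offset of the first hit to return, counted from 0 (a row offset, not a page number!); size is the number of hits to return. The request below starts at the second hit and returns one document.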

    POST /schools/school/_search

    {

        "query": {

            "match": {

               "name": "Saint school"

            }

        },

        "highlight": {

            "fields": {

                "name":{}

            }

        }

        , "from": 1

        , "size": 1

    }
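    Since from is an offset rather than a page number, a page number has to be converted first. A minimal sketch, assuming pages are counted from 1; the helper names are made up:

```python
def page_to_from(page, size):
    """Convert a 1-based page number into the 0-based from offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be >= 1")
    return (page - 1) * size


def paged_body(query, page, size):
    """Search body that returns the given page of results."""
    return {"query": query, "from": page_to_from(page, size), "size": size}


print(page_to_from(2, 1))   # 1, the offset used in the request above
print(page_to_from(2, 10))  # 10
print(paged_body({"match": {"name": "Saint school"}}, page=2, size=1))
```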

    11. Filtered queries: several filter clauses, and likewise several sort clauses, are passed as arrays.

     

    POST /schools/school/_search

    {

        "query": {

            "bool": {

                "must": [

                   {

                       "match": {

                          "name": "school"

                       }

                   }

                ],

                "filter":[{

                    "exists": {

                       "field": "name"

                    }

                    

                },

            {

                    "range": {

                       "fees": {

                          "from": 10,

                          "to": 2000

                       }

                    }

                    

                }

                ]

            }

             

     

        }

        , "from": 1

        , "size": 10

        , "sort": [

           {

              "fees": {

                 "order": "desc"

              }

           }

        ]

    }
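    The order in which the parts of this request take effect (filter, then sort, then the from/size window) can be replayed on the sample data. A toy sketch that models only the exists and range filters; names are illustrative:

```python
schools = [
    {"name": "Central School", "fees": 2000},
    {"name": "Saint Paul School", "fees": 5000},
    {"name": "Crescent School", "fees": 2500},
]


def has_field(field):
    """exists filter: the field is present and not null."""
    return lambda d: d.get(field) is not None


def between(field, low, high):
    """range filter with both bounds inclusive."""
    return lambda d: low <= d[field] <= high


def search(docs, filters, sort_field, descending, from_, size):
    """Apply every filter (AND), sort, then cut out the requested window."""
    hits = [d for d in docs if all(f(d) for f in filters)]
    hits.sort(key=lambda d: d[sort_field], reverse=descending)
    return [d["name"] for d in hits[from_:from_ + size]]


wide = search(schools, [has_field("name"), between("fees", 10, 5000)], "fees", True, 0, 10)
narrow = search(schools, [has_field("name"), between("fees", 10, 2000)], "fees", True, 1, 10)
print(wide)    # ['Saint Paul School', 'Crescent School', 'Central School']
print(narrow)  # []
```

    With the three sample schools, the range 10 to 2000 leaves a single hit, so a request with from set to 1 skips it and returns nothing.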

    11.1 ids filter

    11.2 range filter

    11.3 exists filter

    11.4 term/terms filter

     

    POST /schools/school/_search

    {

        "query": {

            "bool": {

                "must": [

                   {

                       "match": {

                          "name": "school"

                       }

                   }

                ],

                "filter":[{

                    "exists": {

                       "field": "name"

                    }

                    

                },

            {

                    "range": {

                       "fees": {

                          "from": 10,

                          "to": 5000

                       }

                    }

                    

                },

                        {

                    "ids":{

                        "values":[1,2,3]

                    }

                    

                },{

                    "term":{

                        "street":"tonk"

                    }

                }

                ]

            }

             

     

        }

        , "from": 0

        , "size": 10

        , "sort": [

           {

              "fees": {

                 "order": "desc"

              }

           }

        ]

    }

     

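    12. Aggregations

    Aggregations group your data and compute statistics over it. The easiest way to think of them is as a rough equivalent of SQL GROUP BY combined with the SQL aggregate functions.

    Commonly used aggregations in ES:

    Metric (metric aggregations): mainly operate on numeric fields and require ES to do a fair amount of computation.

    Bucketing (bucket aggregations): define a set of "buckets" and assign each document to the bucket it belongs to, much like GROUP BY in SQL.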

     

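    Aggregation API format in ES:

    "aggregations" : {          // the aggregation section; "aggs" may be used instead

      "<aggregation_name>" : {  // aggregation name, any string; used as the key in the response, so the result is easy to find

        "<aggregation_type>" : {   // aggregation type, e.g. min

          <aggregation_body>    // aggregation body; each type has its own

       }

       [,"aggregations" : { [<sub_aggregation>]+ } ]? // nested sub-aggregations, zero or more

     }

     [,"<aggregation_name_2>" : { ... } ]* // further aggregations, zero or more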

    }

    1. Metric aggregations

    A. avg (average), min (minimum), max (maximum), sum (total) and stats (all of the above, plus the count, in one result)

    POST /schools/school/_search

    {

       "query": {

          "match": {

             "name": "Saint school"

          }

       },

       "highlight": {

          "fields": {

             "name": {}

          }

       },

       "aggregations":

          {

             "fees_avg": {

                "avg": {

                   "field": "fees"

                }

             },

             "fees_min": {

                "min": {

                   "field": "fees"

                }

             },

             "fees_max": {

                "max": {

                   "field": "fees"

                }

             },

             "fees_sum": {

                "sum": {

                   "field": "fees"

                }

             },

             "fees_stats": {

                "stats": {

                   "field": "fees"

                }

             }

          }

       ,

       "from": 0,

       "size": 10

    }
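    For the three sample schools (fees 2000, 5000 and 2500, all of which match the query because every name contains School), the expected metric values can be worked out locally. A small sketch; the function name is made up, and it simply reproduces the arithmetic rather than calling Elasticsearch:

```python
def stats(values):
    """Compute what avg, min, max, sum and stats report for a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


fees = [2000, 5000, 2500]
print(stats(fees))
```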

     

    2. Bucket aggregations

    Custom-interval aggregation: range. Each range includes its from value but excludes its to value.

    POST /schools/school/_search

    {

       "query": {

          "match": {

             "name": "Saint school"

          }

       },

       "highlight": {

          "fields": {

             "name": {}

          }

       },

       "aggregations": {

          "fees_range": {

             "range": {

                "field": "fees",

                "ranges": [

                   {

                      "from": 0,

                      "to": 2000

                   },

                   {

                      "from": 2000,

                      "to": 3000

                   },

                   {

                      "from": 3000,

                      "to": 5001

                   }

                ]

             }

          }

       },

       "from": 0,

       "size": 10

    }
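    The half-open ranges are easy to get wrong, which is presumably why the last range ends at 5001. A toy count over the sample fees shows where each value lands; the bucket keys used here are this sketch's own format, not the keys Elasticsearch returns:

```python
def range_buckets(values, ranges):
    """Count values per (from, to) pair; from is inclusive, to is exclusive."""
    return {
        f"{low}-{high}": sum(1 for v in values if low <= v < high)
        for low, high in ranges
    }


fees = [2000, 5000, 2500]
print(range_buckets(fees, [(0, 2000), (2000, 3000), (3000, 5001)]))
# {'0-2000': 0, '2000-3000': 2, '3000-5001': 1}
print(range_buckets(fees, [(3000, 5000)]))
# {'3000-5000': 0}
```

    A fee of exactly 2000 falls into the second bucket, not the first, and a fee of 5000 would be missed by a range that ends at 5000.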

     

    Grouping by field value: terms (a text field cannot be used here unless fielddata is enabled for it)

    POST /schools/school/_search

    {

       "query": {

          "match": {

             "name": "Saint school"

          }

       },

       "highlight": {

          "fields": {

             "name": {}

          }

       },

       "aggregations": {

          "fees_term": {

             "terms": {

                "field": "location",

                "size":3

                

             }

          }

       },

       "from": 0,

       "size": 10

    }

     

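    Date Range Aggregation

    # The date range aggregation is dedicated to date fields; its main difference from the range aggregation is that it accepts date math expressions.

    # now+10y: 10 years from now.

    # now+10M: 10 months from now.

    # 1990-01-01||+20y: 20 years after 1990-01-01, i.e. 2010-01-01.

    # now/y: now, rounded to the year.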

    POST /schools/school/_search

    {

       "query": {

          "match": {

             "name": "Saint school"

          }

       },

       "highlight": {

          "fields": {

             "name": {}

          }

       },

       "aggregations": {

          "fees_term": {

             "terms": {

                "field": "location",

                "size":3

                

             }

          },

          "time_aggs":{

              "date_range":{

                  "field":"TimeFormat",

                  "format":"yyyy-MM-dd",

                  "ranges":[

                      {

                      "from":"now/y",

                      "to":"now"

                      },

                                        {

                      "from":"now/y-1y",

                      "to":"now/y"

                      },

                                        {

                      "from":"now/y-3y",

                      "to":"now/y-1y"

                      }

                      

                      ]

              }

          }

       },

       "from": 0,

       "size": 10

    }
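    To see which boundaries the three ranges above produce, the date math can be evaluated by hand. The sketch below handles only the tiny subset used in this request (now, /y, +Ny, -Ny), applies the steps left to right, and assumes that /y rounds down to the start of the year; it is not a full date math parser:

```python
import re
from datetime import datetime

_STEP = re.compile(r"/y|[+-]\d+y")


def eval_date_math(expr, now):
    """Evaluate 'now' followed by any number of '/y', '+Ny' or '-Ny' steps."""
    if not expr.startswith("now"):
        raise ValueError("only expressions anchored at 'now' are supported")
    rest = expr[3:]
    if _STEP.sub("", rest):
        raise ValueError(f"unsupported expression: {expr!r}")
    value = now
    for step in _STEP.findall(rest):
        if step == "/y":
            # Assumed rounding: down to the first instant of the year.
            value = value.replace(month=1, day=1, hour=0, minute=0,
                                  second=0, microsecond=0)
        else:
            years = int(step[:-1])
            try:
                value = value.replace(year=value.year + years)
            except ValueError:  # 29 February in a non-leap target year
                value = value.replace(year=value.year + years, day=28)
    return value


now = datetime(2019, 11, 18, 10, 30)
for expr in ("now/y", "now/y-1y", "now/y-3y"):
    print(expr, "->", eval_date_math(expr, now).strftime("%Y-%m-%d"))
# now/y -> 2019-01-01
# now/y-1y -> 2018-01-01
# now/y-3y -> 2016-01-01
```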

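    Histogram Aggregation

    # A histogram aggregation splits a numeric field into equal-width intervals and counts the documents that fall into each one. It closely resembles the range aggregation,

    # except that range lets you choose arbitrary intervals while histogram uses intervals of a fixed width. That width is given by the interval parameter.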

     

    POST /schools/school/_search

    {

       "query": {

          "match": {

             "name": "Saint school"

          }

       },

       "highlight": {

          "fields": {

             "name": {}

          }

       },

       "aggregations": {

          "fees_aggs":{

              "histogram":{

                  "field":"fees",

                  "interval":1000

                 

              }

          },

          "time_agg":{

              "date_histogram":{

                  "field":"TimeFormat",

                  "interval":"year",

                  "format":"yyyy-MM-dd"

                 

              }

          }

       },

       "from": 0,

       "size": 10

    }
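    For the numeric histogram, each value is assigned to the bucket whose key is the value rounded down to a multiple of interval. A toy version over the sample fees, restricted to integers; it also fills the empty buckets between the smallest and largest key, which, as far as I know, matches the default behaviour (min_doc_count of 0):

```python
def histogram(values, interval):
    """Count integer values per bucket of width interval, including empty gaps."""
    if not values:
        return {}
    keys = [(v // interval) * interval for v in values]  # round down to bucket key
    counts = {}
    key = min(keys)
    while key <= max(keys):
        counts[key] = 0
        key += interval
    for k in keys:
        counts[k] += 1
    return counts


print(histogram([2000, 5000, 2500], 1000))
# {2000: 2, 3000: 0, 4000: 0, 5000: 1}
```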

     

     

  • Original source: https://www.cnblogs.com/hxz-nl/p/11880216.html