Getting Started with Elasticsearch

Elasticsearch Study Notes

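1. Installing Elasticsearch

Swapping in the IK analyzer: the plugin version must match the Elasticsearch version, otherwise an error is reported.

A Java JDK must be installed and configured.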

2. Basic CRUD in Elasticsearch

1> Creating an index, then a type, then a document

Define the type of a field:

PUT /schools/_mapping/school

{

    "properties":{

        "TimeFormat":{

            "type":"date",

            "format":"yyyy-MM-dd HH:mm:ss"

        }

    }

}
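
Before indexing, a client can check that a timestamp string fits the mapped format. The following is a minimal Python sketch, assuming the pattern yyyy-MM-dd HH:mm:ss corresponds roughly to the strptime pattern %Y-%m-%d %H:%M:%S. The helper name matches_time_format is hypothetical and not part of Elasticsearch.

```python
from datetime import datetime

# strptime counterpart of the mapping format "yyyy-MM-dd HH:mm:ss" (assumed equivalence)
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def matches_time_format(value: str) -> bool:
    """Return True if value parses with TIME_FORMAT."""
    try:
        datetime.strptime(value, TIME_FORMAT)
        return True
    except ValueError:
        return False


print(matches_time_format("2016-09-08 15:20:30"))  # same shape as the mapped format
print(matches_time_format("2016/09/08"))           # wrong separators, no time part
```

Note that strptime is somewhat more lenient than a strict pattern match (it accepts single-digit months, for instance), so this is only a rough pre-check.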

Create the index student with a type article whose subject field has type text and uses the ik_max_word analyzer:

PUT /student/?pretty

{

        "settings" : {

        "analysis" : {

            "analyzer" : {

                "ik" : {

                    "tokenizer" : "ik_max_word"

                }

            }

        }

    },

    "mappings" : {

        "article" : {

            "dynamic" : true,

            "properties" : {

                "subject" : {

                    "type" : "text",

                    "analyzer" : "ik_max_word"

                }

            }

        }

    }

}

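Unless it is specified explicitly, IK is not used as the default analyzer. The request above only sets the analyzer for individual fields of a document.

The following request sets IK as the default analyzer for the whole index: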

PUT /students

{

    "settings" : {

        "index" : {

            "analysis.analyzer.default.type": "ik_max_word"

        }

    }

}

A. Inserting a single document

PUT http://localhost:9200/movies/movie/3

{

    "title": "To Kill a Mockingbird",

    "director": "Robert Mulligan",

    "year": 1962

}

PUT url/index/type/id

{

"field": "value",

"field": "value",

"field": "value",

....

}

Use the format above to create the index, the type, and the document in one request.

{ "_index": "movies", "_type": "movie", "_id": "3", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "created": true }

_version is 1 and result is created.

B. Bulk insert

POST /schools/_bulk

{"index":{"_index":"schools","_type":"school","_id":"1"}}

{"name":"Central School","description":"CBSE Affiliation","street":"Nagan","city":"paprola","state":"HP","zip":"176115","location":[31.8955385,76.8380405],"fees":2000,"tags":["Senior Secondary","beautiful campus"],"rating":"3.5"}

{"index":{"_index":"schools","_type":"school","_id":"2"}}

{"name":"Saint Paul School","description":"ICSE Afiliation","street":"Dawarka","city":"Delhi","state":"Delhi","zip":"110075","location":[28.5733056,77.0122136],"fees":5000,"tags":["Good Faculty","Great Sports"],"rating":"4.5"}

{"index":{"_index":"schools","_type":"school","_id":"3"}}

{"name":"Crescent School","description":"State Board Affiliation","street":"Tonk Road","city":"Jaipur","state":"RJ","zip":"176114","location":[26.8535922,75.7923988],"fees":2500,"tags":["Well equipped labs"],"rating":"4.5"}

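Use the _bulk endpoint to insert several documents in one request.

2> Updating a document

Now that the index holds one movie, the next step is to update it by adding a list of genres. To do so, index the document again under the same ID: send exactly the same index request as before, with the JSON object extended by the new field.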
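
The body of a _bulk request is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. Below is a minimal Python sketch that assembles such a body. The helper name build_bulk_body is hypothetical, and the documents are shortened versions of the ones above.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build an NDJSON _bulk body: one action line plus one source line per document."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


docs = {
    "1": {"name": "Central School", "fees": 2000},
    "2": {"name": "Saint Paul School", "fees": 5000},
}
body = build_bulk_body("schools", "school", docs)
print(body)
```

The resulting string can be sent as the request body of POST /schools/_bulk with any HTTP client.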

PUT  http://localhost:9200/movies/movie/3

{

    "title": "To Kill a Mockingbird",

    "director": "Robert Mulligan",

    "year": 1962,

    "genres": ["Crime", "Drama", "Mystery"]

}

The response:

{ "_index": "movies", "_type": "movie", "_id": "3", "_version": 2, "result": "updated", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "created": false }

_version is now 2 and result is updated.

Updating individual fields of documents (inline script)

POST schools/school/_update_by_query

{

  "script": {

    "inline": "ctx._source.TimeFormat ='2016-09-08 15:20:30';ctx._source.zip='1766889'"

  },

  "query":{

      "term":{

          "city":"delhi"

      }

  }



}

3> Deleting a document

To delete a single document by ID, use the same URL as for indexing it and change the HTTP method to DELETE.

DELETE http://localhost:9200/movies/movie/3

The response:

{

   "_index": "movies",

   "_type": "movie",

   "_id": "3",

   "_version": 2,

   "result": "deleted",

   "_shards": {

      "total": 2,

      "successful": 1,

      "failed": 0

   },

   "_seq_no": 5,

   "_primary_term": 1

}

4> Querying documents

To fetch a single document by ID, use the same URL as for indexing it and change the HTTP method to GET.

GET http://localhost:9200/movies/movie/3

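Searching with conditions:

Common queries:

Full-text queries (for text fields):

1. Match everything: match_all

2. Analyzed matching: match (roughly like SQL LIKE)

3. Phrase matching: match_phrase (roughly like SQL =)

4. Multi-field matching: multi_match (searches several fields)

5. Query-syntax search: query_string (write the keywords to match directly)

6. Term query: term (queries a single field. Note that term does not analyze the search text. For example, with the default standard analyzer, "火锅" (hotpot) stored in ES is split into the tokens "火" and "锅". A term query for "火" finds the document, but a term query for "火锅" finds nothing, whereas match first splits the search text into "火" and "锅" and then looks up those tokens.)

7. Range query: range

Term-level queries: for structured data such as numbers and dates.

Pagination:

"from": 10,

"size": 10

constant_score: gives every matching document the same fixed score.

filter: a query clause ranks documents by how closely they match, while a filter behaves like the = operator: a document either passes or fails, with nothing in between, so filtering is faster.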
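
The notes mention constant_score without showing a request. As a minimal sketch, the Python function below builds a search body that wraps a term filter in constant_score, so every hit receives the same score. The function name constant_score_query and the chosen field and value are illustrative assumptions.

```python
import json


def constant_score_query(field, value, boost=1.0):
    """Wrap a term filter in constant_score so every hit gets the same score (boost)."""
    return {
        "query": {
            "constant_score": {
                "filter": {"term": {field: value}},
                "boost": boost,
            }
        },
        "from": 0,
        "size": 10,
    }


body = constant_score_query("city", "delhi")
print(json.dumps(body, indent=2))
```

The printed JSON can be posted to /schools/school/_search in the same way as the other requests in these notes.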

1. Match everything: match_all

POST _search

{

   "query": {

      "match_all": {}

   }

}

2. Analyzed matching: match (roughly like SQL LIKE)

POST /schools/school/_search

{

    "query": {

        "match": {

            "name":"Saint Paul School"



        }

    }

}

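With match, the search text is run through the analyzer and the resulting tokens are compared with the tokens of the indexed text. The example above searches the name field in /schools/school/ for the tokens of Saint Paul School; a document matches as soon as its name contains any one of those tokens.

3. Phrase matching: match_phrase (roughly like SQL =)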
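
The matching rule can be illustrated outside Elasticsearch. The Python sketch below is a simplification: analyze is a toy analyzer (lowercase, split on whitespace) standing in for the real one, and match only reports whether any token is shared, without scoring.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace (stand-in for a real analyzer)."""
    return text.lower().split()


def match(field_value, query):
    """True if the field shares at least one token with the query (OR semantics)."""
    return bool(set(analyze(field_value)) & set(analyze(query)))


names = ["Central School", "Saint Paul School", "Crescent School"]
hits = [n for n in names if match(n, "Saint Paul School")]
print(hits)  # all three names share the token "school"
```

Every name is a hit because each one contains the token school; in Elasticsearch the documents sharing more tokens would simply score higher.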

POST /schools/school/_search

{

    "query": {

        "match_phrase": {

            "name":"Saint Paul School"



        }

    }

}

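With match_phrase, the search text is analyzed and its tokens must match the indexed tokens consecutively and in order. The example above matches only documents whose name contains the tokens Saint, Paul, and School next to each other. This is similar to = in SQL, except that whitespace differences are ignored.

4. Multi-field matching: multi_match (searches several fields)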
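
Phrase matching can be sketched in a few lines of Python. This is a simplification under the assumption of a toy analyzer (lowercase, split on whitespace); the function names are hypothetical.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace (stand-in for a real analyzer)."""
    return text.lower().split()


def match_phrase(field_value, query):
    """True if the query tokens appear in the field consecutively and in order."""
    tokens, phrase = analyze(field_value), analyze(query)
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


names = ["Central School", "Saint Paul School", "Crescent School"]
hits = [n for n in names if match_phrase(n, "Saint Paul School")]
print(hits)  # ['Saint Paul School']
```

Unlike the any-token rule of match, only the one name containing all three tokens in sequence is returned.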

POST /schools/school/_search

{

    "query": {

        "multi_match": {

            "query":"Saint Paul School",

            "fields": [

               "name","tags"

            ]



        }

    }

}

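multi_match runs an analyzed search over several fields: the text in query is split into tokens that are matched individually, and fields lists the fields to search.

5. Query-syntax search: query_string (write the keywords to match directly)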

POST /schools/school/_search

{

    "query": {

        "query_string": {

            "query":"Saint Paul School",

            "fields": [

               "name","tags"

            ]



        }

    }

}

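query_string also runs an analyzed search over several fields: the text in query is split into tokens that are matched individually, and fields lists the fields to search. The query text may additionally use operators such as AND and OR.

6. Term query: term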

POST /schools/school/_search

{

    "query": {

        "term": {

            "name":"Saint Paul School"



        }

    }

}

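A term query does not analyze the search text. Because the indexed tokens of an analyzed text field are single lowercase words, the search value must contain no spaces and be all lowercase; otherwise nothing is found. A request like the one above, with spaces and capitals in the value, therefore finds nothing.

7. Range query: range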
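
The behaviour of term on an analyzed field can be sketched as follows. The Python code assumes a toy analyzer (lowercase, split on whitespace) for the indexed text and compares the search value verbatim against the indexed tokens; the function names are hypothetical.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace (stand-in for a real analyzer)."""
    return text.lower().split()


def term(field_value, value):
    """True if value, taken verbatim, equals one of the indexed tokens."""
    return value in analyze(field_value)


name = "Saint Paul School"
print(term(name, "Saint Paul School"))  # False: no single token equals the whole string
print(term(name, "Saint"))              # False: indexed tokens are lowercase
print(term(name, "saint"))              # True
```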

POST /schools/school/_search

{

    "query": {

        "range": {

           "fees": {

              "from": 1000,

              "to": 2500

           }

        }



    }

}

Combining several conditions in one query this way does not work; that calls for a bool query.

8. bool query

POST /schools/school/_search

{

    "query": {

        "bool": {

            "must": [

               {

                   "range": {

                      "fees": {

                         "from": 1000,

                         "to": 3000

                      }

                   }

               },

               {

                   "match": {

                      "name": "School"

                   }

               },

               {

                   "wildcard": {

                      "zip": {

                         "value": "17*15"

                      }

                   }

               }



            ],

            "boost": 1,

            "must_not": [

               {

                      "term": {

                         "name": {

                            "value": "to"

                         }

                      }

               }

            ],

            "should": [

               {

                  "match": {

                     "city": "paprola"

                  }

               }

            ]

        }



    }

}

9. Highlighting

POST /schools/school/_search

{

    "query": {

        "match": {

           "name": "Saint school"

        }

    },

    "highlight": {

        "fields": {

            "name":{}

        }

    }

}

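10. Pagination: from is the zero-based offset of the first hit to return (a row offset, not a page number), and size is the number of hits to return. The example below starts at the second hit and returns one hit.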

POST /schools/school/_search

{

    "query": {

        "match": {

           "name": "Saint school"

        }

    },

    "highlight": {

        "fields": {

            "name":{}

        }

    }

    , "from": 1

    , "size": 1

}
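
Since from is an offset rather than a page number, a client that thinks in pages has to convert. A minimal Python sketch, assuming 1-based page numbers; the helper names page_to_from and paged_body are hypothetical.

```python
def page_to_from(page, size):
    """Convert a 1-based page number to the zero-based `from` offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return (page - 1) * size


def paged_body(query, page, size):
    """Attach from/size for the requested page to a query clause."""
    return {"query": query, "from": page_to_from(page, size), "size": size}


body = paged_body({"match": {"name": "Saint school"}}, page=2, size=1)
print(body["from"], body["size"])  # 1 1, the same window as the request above
```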

11. Filtered queries: multiple filter clauses and multiple sort clauses are each written as an array.

POST /schools/school/_search

{

    "query": {

        "bool": {

            "must": [

               {

                   "match": {

                      "name": "school"

                   }

               }

            ],

            "filter":[{

                "exists": {

                   "field": "name"

                }



            },

        {

                "range": {

                   "fees": {

                      "from": 10,

                      "to": 2000

                   }

                }



            }

            ]

        }



    }

    , "from": 1

    , "size": 10

    , "sort": [

       {

          "fees": {

             "order": "desc"

          }

       }

    ]

}

11.1 ids filter

11.2 range filter

11.3 exists filter

11.4 term/terms filter

POST /schools/school/_search

{

    "query": {

        "bool": {

            "must": [

               {

                   "match": {

                      "name": "school"

                   }

               }

            ],

            "filter":[{

                "exists": {

                   "field": "name"

                }



            },

        {

                "range": {

                   "fees": {

                      "from": 10,

                      "to": 5000

                   }

                }



            },

                    {

                "ids":{

                    "values":[1,2,3]

                }



            },{

                "term":{

                    "street":"tonk"

                }

            }

            ]

        }



    }

    , "from": 0

    , "size": 10

    , "sort": [

       {

          "fees": {

             "order": "desc"

          }

       }

    ]

}

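12. Aggregations

Aggregations group your data and compute statistics over it. The simplest way to think of them is as a rough equivalent of SQL GROUP BY combined with the SQL aggregate functions.

Common aggregations in ES:

Metric aggregations: mainly for numeric fields; they require ES to do a fair amount of computation.

Bucket aggregations: define "buckets" and assign documents to them, much like GROUP BY in SQL.

Format of the aggregation API in ES:

"aggregations" : {          // the aggregation section; "aggs" can be used instead

  "<aggregation_name>" : { // aggregation name, any string; used as the key in the response so the result is easy to find

    "<aggregation_type>" : {   // aggregation type, e.g. min

      <aggregation_body>    // aggregation body; each type has its own

   }

   [,"aggregations" : { [<sub_aggregation>]+ } ]? // nested sub-aggregations, zero or more

}

[,"<aggregation_name_2>" : { ... } ]* // further aggregations, zero or more

}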

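1. Metric aggregations

A. avg (average), min (minimum), max (maximum), sum (total), and stats (all four of the above in one aggregation, plus count)

POST /schools/school/_search

{

   "query": {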

      "match": {

         "name": "Saint school"

      }

   },

   "highlight": {

      "fields": {

         "name": {}

      }

   },

   "aggregations":

      {

         "fees_avg": {

            "avg": {

               "field": "fees"

            }

         },         "fees_min": {

            "min": {

               "field": "fees"

            }

         },         "fees_max": {

            "max": {

               "field": "fees"

            }

         },         "fees_sum": {

            "sum": {

               "field": "fees"

            }

         },        "fees_stats": {

            "stats": {

               "field": "fees"

            }

         }

      }

   ,

   "from": 0,

   "size": 10

}
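
To make the metric aggregations concrete, the Python sketch below computes the same five numbers that stats reports (count, min, max, avg, sum) over the fees of the three sample schools. It is a local illustration, not a call to Elasticsearch, and the function name stats is chosen here only to mirror the aggregation.

```python
def stats(values):
    """Compute the five numbers a stats aggregation reports: count, min, max, avg, sum."""
    if not values:
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0}
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


fees = [2000, 5000, 2500]  # fees of the three schools from the bulk example
result = stats(fees)
print(result)
```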

2. Bucket aggregations

Custom range aggregation (range): each range includes from and excludes to.

POST /schools/school/_search

{

   "query": {

      "match": {

         "name": "Saint school"

      }

   },

   "highlight": {

      "fields": {

         "name": {}

      }

   },

   "aggregations": {

      "fees_range": {

         "range": {

            "field": "fees",

            "ranges": [

               {

                  "from": 0,

                  "to": 2000

               },

               {

                  "from": 2000,

                  "to": 3000

               },

               {

                  "from": 3000,

                  "to": 5001

               }

            ]

         }

      }

   },

   "from": 0,

   "size": 10

}
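
The boundary rule (from inclusive, to exclusive) decides where a value such as 2000 lands. The Python sketch below counts the three sample fees into the same ranges as the request above. The function name range_buckets and the key format are illustrative assumptions, not the exact keys Elasticsearch returns.

```python
def range_buckets(values, ranges):
    """Count values per range; `from` is inclusive and `to` is exclusive."""
    counts = {}
    for lo, hi in ranges:
        key = f"{lo}-{hi}"
        counts[key] = sum(1 for v in values if lo <= v < hi)
    return counts


fees = [2000, 5000, 2500]  # fees of the three schools from the bulk example
counts = range_buckets(fees, [(0, 2000), (2000, 3000), (3000, 5001)])
print(counts)  # {'0-2000': 0, '2000-3000': 2, '3000-5001': 1}
```

The fee of 2000 falls into the second range, not the first, and the upper bound 5001 is what lets the fee of 5000 be counted.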

Grouping by term (terms): by default a text field cannot be used as the field.

POST /schools/school/_search

{

   "query": {

      "match": {

         "name": "Saint school"

      }

   },

   "highlight": {

      "fields": {

         "name": {}

      }

   },

   "aggregations": {

      "fees_term": {

         "terms": {

            "field": "location",

            "size":3



         }

      }

   },

   "from": 0,

   "size": 10

}

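Date range aggregation

# The date range aggregation is for date fields. Its main difference from the range aggregation is that the bounds can be written as date math expressions.

# now+10y: 10 years from now.

# now+10M: 10 months from now.

# 1990-01-01||+20y: 20 years after 1990-01-01, i.e. 2010-01-01.

# now/y: now rounded to the year.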

POST /schools/school/_search

{

   "query": {

      "match": {

         "name": "Saint school"

      }

   },

   "highlight": {

      "fields": {

         "name": {}

      }

   },

   "aggregations": {

      "fees_term": {

         "terms": {

            "field": "location",

            "size":3



         }

      },

      "time_aggs":{

          "date_range":{

              "field":"TimeFormat",

              "format":"yyyy-MM-dd",

              "ranges":[

                  {

                  "from":"now/y",

                  "to":"now"

                  },

                                    {

                  "from":"now/y-1y",

                  "to":"now/y"

                  },

                                    {

                  "from":"now/y-3y",

                  "to":"now/y-1y"

                  }



                  ]

          }

      }

   },

   "from": 0,

   "size": 10

}
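
The date math expressions can be mimicked with Python's datetime to see what the bounds evaluate to. This is a rough sketch: it assumes /y rounds down to the start of the year, uses a fixed timestamp as a stand-in for now, and does not handle February 29. The helper names are hypothetical.

```python
from datetime import datetime


def add_years(dt, years):
    """Date math `+Ny` / `-Ny`: shift by N calendar years (Feb 29 not handled here)."""
    return dt.replace(year=dt.year + years)


def round_to_year(dt):
    """Date math `/y`: round down to the first instant of the year (assumed direction)."""
    return dt.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)


anchor = datetime(1990, 1, 1)
print(add_years(anchor, 20).date())       # 2010-01-01

now = datetime(2019, 11, 18, 10, 30)      # fixed stand-in for `now`
print(round_to_year(now))                 # 2019-01-01 00:00:00
print(add_years(round_to_year(now), -1))  # 2018-01-01 00:00:00, i.e. now/y-1y
```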

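Histogram aggregation

# The histogram aggregation splits a numeric field into equal-width intervals and counts the documents that fall into each one. It is very similar to the range aggregation described earlier, except that range lets you choose arbitrary intervals while histogram uses evenly spaced ones. Because the spacing is fixed, the request needs a width parameter: interval.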

POST /schools/school/_search

{

   "query": {

      "match": {

         "name": "Saint school"

      }

   },

   "highlight": {

      "fields": {

         "name": {}

      }

   },

   "aggregations": {

      "fees_aggs":{

          "histogram":{

              "field":"fees",

              "interval":1000



          }

      },

      "time_agg":{

          "date_histogram":{

              "field":"TimeFormat",

              "interval":"year",

              "format":"yyyy-MM-dd"



          }

      }

   },

   "from": 0,

   "size": 10

}
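
Each value is assigned to a histogram bucket by rounding it down to a multiple of interval: key = floor((value - offset) / interval) * interval + offset. The Python sketch below applies that rule to the three sample fees. The function name histogram is chosen for illustration, and unlike Elasticsearch the sketch omits empty buckets between the populated ones.

```python
import math
from collections import Counter


def histogram(values, interval, offset=0):
    """Count values per bucket; key = floor((value - offset) / interval) * interval + offset."""
    keys = (math.floor((v - offset) / interval) * interval + offset for v in values)
    return dict(sorted(Counter(keys).items()))


fees = [2000, 5000, 2500]  # fees of the three schools from the bulk example
buckets = histogram(fees, 1000)
print(buckets)  # {2000: 2, 5000: 1}
```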
Original article: https://www.cnblogs.com/hxz-nl/p/11880216.html
