• The difference between term and match in Elasticsearch (ES)


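    term usage

    Start with the definition: a term query is an exact match. The search string is not analyzed (split into tokens) before it is looked up.

    The examples below use the following sample documents: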

    {
        "title": "love China",
        "content": "people very love China",
        "tags": ["China", "love"]
    }
    {
        "title": "love HuBei",
        "content": "people very love HuBei",
        "tags": ["HuBei", "love"]
    }

    Now run a term query:

    {
      "query": {
        "term": {
          "title": "love"
        }
      }
    }

    Both documents above are returned:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 0.6931472,
        "hits": [
          {
            "_index": "test",
            "_type": "doc",
            "_id": "8",
            "_score": 0.6931472,
            "_source": {
              "title": "love HuBei",
              "content": "people very love HuBei",
              "tags": ["HuBei","love"]
            }
          },
          {
            "_index": "test",
            "_type": "doc",
            "_id": "7",
            "_score": 0.6931472,
            "_source": {
              "title": "love China",
              "content": "people very love China",
              "tags": ["China","love"]
            }
          }
        ]
      }
    }

    Every document whose title contains the keyword love was returned. To match only love China exactly, try the following query:

    {
      "query": {
        "term": {
          "title": "love China"
        }
      }
    }

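    This returns no hits. A term query is an exact match against a single indexed term, and the index holds no term love China, only the separate terms produced when the title was analyzed. To match against several terms with term semantics, use terms: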

    {
      "query": {
        "terms": {
          "title": ["love", "China"]
        }
      }
    }

    Result:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 0.6931472,
        "hits": [
          {
            "_index": "test",
            "_type": "doc",
            "_id": "8",
            "_score": 0.6931472,
            "_source": {
              "title": "love HuBei",
              "content": "people very love HuBei",
              "tags": ["HuBei","love"]
            }
          },
          {
            "_index": "test",
            "_type": "doc",
            "_id": "7",
            "_score": 0.6931472,
            "_source": {
              "title": "love China",
              "content": "people very love China",
              "tags": ["China","love"]
            }
          }
        ]
      }
    }

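    Both documents are returned. Why? The values inside the [ ] of a terms query are combined with OR, so a document matches if it contains any one of them. To require both terms at the same time, combine term clauses under bool must, as follows: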

    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "title": "love"
              }
            },
            {
              "term": {
                "title": "china"
              }
            }
          ]
        }
      }
    }
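
    The difference between terms (any value may match) and term clauses under bool must (every clause must match) can be sketched with plain set operations. This is only a toy model: the index dictionary below is a hypothetical stand-in for the inverted index of the title field, assuming it holds the lowercase terms that the _analyze output in this article reports, and the function names are illustrative rather than part of any Elasticsearch client.

```python
# Hypothetical inverted index for the title field: term -> ids of matching documents.
# Assumes titles were indexed as lowercase terms (love, china, hubei).
index = {"love": {"7", "8"}, "china": {"7"}, "hubei": {"8"}}


def term_query(value):
    """Look the value up exactly as given; no analysis is applied."""
    return index.get(value, set())


def terms_query(values):
    """OR semantics: a document matches if it contains any of the values."""
    result = set()
    for value in values:
        result |= term_query(value)
    return result


def bool_must(clauses):
    """AND semantics: a document matches only if every clause matched it."""
    result = None
    for hits in clauses:
        result = hits if result is None else result & hits
    return result or set()


print(sorted(terms_query(["love", "China"])))                         # ['7', '8']
print(sorted(bool_must([term_query("love"), term_query("china")])))   # ['7']
print(sorted(bool_must([term_query("love"), term_query("China")])))   # []
```

    In this model the terms query returns both documents because love alone is enough, while the must query with uppercase China returns nothing because no indexed term has that exact spelling.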
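
    Note that the bool query uses lowercase china. Searching with uppercase China instead returns nothing. Why? The title field was analyzed when it was stored, here with the default analyzer, so the index holds the analyzed terms rather than the original string. The _analyze API shows how the text is analyzed: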

    Analyzer

    GET test/_analyze
    {
      "text" : "love China"
    }

    Result:

    {
      "tokens": [
        {
          "token": "love",
          "start_offset": 0,
          "end_offset": 4,
          "type": "<ALPHANUM>",
          "position": 0
        },
        {
          "token": "china",
          "start_offset": 5,
          "end_offset": 10,
          "type": "<ALPHANUM>",
          "position": 1
        }
      ]
    }

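    The analysis produces two terms, love and china. A term query matches only against these indexed terms exactly as they are, without modifying the search string, so a query for China fails. Analyzers are covered in a dedicated section later.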
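
    The behavior described above can be reproduced with a small toy model. The analyze function below is an assumed simplification (split on non-alphanumeric characters, then lowercase) that happens to give the same tokens as the _analyze output for this input; it is not Elasticsearch's actual analyzer, and the index is an ordinary Python dictionary built only for illustration.

```python
import re


def analyze(text):
    """Toy analyzer: split into alphanumeric tokens and lowercase them."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


# Titles of the two sample documents, keyed by document id.
titles = {"7": "love China", "8": "love HuBei"}

# Build an inverted index: analyzed term -> ids of documents containing it.
index = {}
for doc_id, title in titles.items():
    for term in analyze(title):
        index.setdefault(term, set()).add(doc_id)


def term_query(value):
    """Look up the search string as-is, without analyzing it first."""
    return index.get(value, set())


print(analyze("love China"))              # ['love', 'china']
print(sorted(term_query("love")))         # ['7', '8']
print(sorted(term_query("china")))        # ['7']
print(sorted(term_query("China")))        # []
print(sorted(term_query("love China")))   # []
```

    The last two lookups come back empty for the same reason: the index only contains love, china, and hubei, and the term lookup never rewrites the search string to fit.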

    match usage

    First, match against love China.

    GET test/doc/_search
    {
      "query": {
        "match": {
          "title": "love China"
        }
      }
    }

    Result:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 1.3862944,
        "hits": [
          {
            "_index": "test",
            "_type": "doc",
            "_id": "7",
            "_score": 1.3862944,
            "_source": {
              "title": "love China",
              "content": "people very love China",
              "tags": [
                "China",
                "love"
              ]
            }
          },
          {
            "_index": "test",
            "_type": "doc",
            "_id": "8",
            "_score": 0.6931472,
            "_source": {
              "title": "love HuBei",
              "content": "people very love HuBei",
              "tags": [
                "HuBei",
                "love"
              ]
            }
          }
        ]
      }
    }
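
    Both documents are returned. Why? A match query first analyzes the search string and then matches the resulting terms. The titles of the two documents were indexed as the terms love, china, and hubei. The search string love China is analyzed into love and china, and these are combined with OR, so a document matches if it contains either term. What if love and China must both match? Use match_phrase.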
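
    As a rough illustration of why both documents match and why document 7 ranks first, the sketch below analyzes the search string and counts how many of its terms each title contains. The analyze function is an assumed simplification of the default analyzer, and the per-document count is only a crude stand-in for relevance; Elasticsearch computes its _score differently.

```python
import re


def analyze(text):
    """Toy analyzer: split into alphanumeric tokens and lowercase them."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


# Titles of the two sample documents, keyed by document id.
titles = {"7": "love China", "8": "love HuBei"}

# Inverted index: analyzed term -> ids of documents containing it.
index = {}
for doc_id, title in titles.items():
    for term in analyze(title):
        index.setdefault(term, set()).add(doc_id)


def match_query(text):
    """Analyze the search string, then OR the terms together.

    Returns a dict of doc id -> number of query terms found in that document.
    """
    matched = {}
    for term in analyze(text):
        for doc_id in index.get(term, set()):
            matched[doc_id] = matched.get(doc_id, 0) + 1
    return matched


print(sorted(match_query("love China").items()))   # [('7', 2), ('8', 1)]
```

    Document 8 is included because it contains love, and document 7 matches both terms, which is consistent with it having the higher score in the result above.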

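    match_phrase usage

    match_phrase performs a phrase search: every term of the analyzed search string must appear in the document, in the same order and at adjacent positions.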

    GET test/doc/_search
    {
      "query": {
        "match_phrase": {
          "title": "love china"
        }
      }
    }

    Result:

    {
      "took": 5,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 1.3862944,
        "hits": [
          {
            "_index": "test",
            "_type": "doc",
            "_id": "7",
            "_score": 1.3862944,
            "_source": {
              "title": "love China",
              "content": "people very love China",
              "tags": [
                "China",
                "love"
              ]
            }
          }
        ]
      }
    }

    This time the result matches what we wanted: only one document is returned.
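
    The adjacency requirement can be sketched as a search for the analyzed query terms as one contiguous run inside the analyzed title. This is a simplified model under the same assumed toy analyzer; Elasticsearch relies on term positions stored in the index rather than rescanning each document, and the function name here is illustrative only.

```python
import re


def analyze(text):
    """Toy analyzer: split into alphanumeric tokens and lowercase them."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


# Titles of the two sample documents, keyed by document id.
titles = {"7": "love China", "8": "love HuBei"}


def match_phrase(text):
    """Return ids of documents whose title contains the analyzed query
    terms in the same order and at adjacent positions."""
    query_terms = analyze(text)
    if not query_terms:
        return set()
    n = len(query_terms)
    hits = set()
    for doc_id, title in titles.items():
        doc_terms = analyze(title)
        # Slide a window of n terms over the title and compare it to the query.
        for start in range(len(doc_terms) - n + 1):
            if doc_terms[start:start + n] == query_terms:
                hits.add(doc_id)
                break
    return hits


print(sorted(match_phrase("love china")))   # ['7']
print(sorted(match_phrase("China love")))   # []
```

    Reversing the word order returns nothing in this model, because both terms are present but not in the order the phrase asks for.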

    Original link: https://www.jianshu.com/p/d5583dff4157

  • Source: https://www.cnblogs.com/chong-zuo3322/p/14031602.html