Elasticsearch 6.x Parent-Child Documents [join]: An Analysis

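Since ES 6.0 an index can have only one type, which makes parent-child structures less obvious. For Java developers, after all, the analogy index -> db, type -> table was easy to grasp.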

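According to the official explanation, when an index had several types and the same field name appeared in more than one of them, ES actually backed it with a single field in the underlying Lucene index. This was misleading and easy to misunderstand. So from 6.0 on, all fields are defined together under the index's single type, _doc [the default type]. Suppose an index holds two kinds of documents, parent and child: parent might use fields a, b, c, d while child uses fields a, e, f, each taking what it needs.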

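I am currently on ES 6.3. There, a parent-child structure is defined with a field of type join, and the parent-to-child mapping is specified in that field's relations setting.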

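An index can have only one join-type field. Defining more than one fails with: Field [_parent_join] is defined twice in [_doc]
After the index has been created, the relations set of the join field can be extended: you can add a new relation, or add a child to an existing relation, but you cannot delete an existing relation.
For example, suppose relations was originally defined as: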

"myJoin": {
  "type": "join",
  "eager_global_ordinals": true,
  "relations": {
    "parent_a": "child_a1"
  }
}

Now, through the update mapping API, a new relation parent_b is added and the existing relation gains child_a2 and child_a3:

"myJoin": {
  "type": "join",
  "eager_global_ordinals": true,
  "relations": {
    "parent_a": [
      "child_a1",
      "child_a2",
      "child_a3"
    ],
    "parent_b": "child_b"
  }
}
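
The rule described above (relations may be added, an existing parent may gain children, nothing may be removed) can be checked locally before sending a mapping update. What follows is only a minimal sketch in Python: `as_children` and `is_additive_update` are hypothetical helper names, not part of any Elasticsearch client, and the server remains the final authority on whether an update is accepted.

```python
def as_children(value):
    """Normalize a relations value, which is either one child name or a list."""
    return [value] if isinstance(value, str) else list(value)


def is_additive_update(old_relations, new_relations):
    """Return True if every existing parent -> child pair survives the update."""
    for parent, children in old_relations.items():
        if parent not in new_relations:
            return False  # an existing relation was removed
        kept = set(as_children(new_relations[parent]))
        if not set(as_children(children)) <= kept:
            return False  # an existing child was removed
    return True


old = {"parent_a": "child_a1"}
new = {
    "parent_a": ["child_a1", "child_a2", "child_a3"],
    "parent_b": "child_b",
}

print(is_additive_update(old, new))  # True
print(is_additive_update(new, old))  # False
```

The second call fails the check because going from the larger definition back to the smaller one would drop parent_b as well as child_a2 and child_a3.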

Back from a midday nap, so here is a bit more on join operations.

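Querying parent documents by their child documents

The search below matches against child_a1 documents, so the fields used in the inner query are child_a1 fields.

GET /test_index_join/_search
{
  "query": {
    "has_child": {
      "type": "child_a1",
      "score_mode": "max",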
      "query": {
        "term": {
          "salesCount": 100
        }
      }
    }
  }
}
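
When a request body like this is assembled in application code, it helps to validate score_mode up front. The sketch below builds the same body as a plain Python dict; `has_child_query` is a hypothetical helper name, and the allowed set reflects the score modes has_child accepts (none, min, max, sum, avg, with none as the default).

```python
import json

# Score modes accepted by has_child; "none" ignores the child scores.
HAS_CHILD_SCORE_MODES = {"none", "min", "max", "sum", "avg"}


def has_child_query(child_type, inner_query, score_mode="none"):
    """Build a search body that returns parents whose children match inner_query."""
    if score_mode not in HAS_CHILD_SCORE_MODES:
        raise ValueError(f"unsupported score_mode: {score_mode}")
    return {
        "query": {
            "has_child": {
                "type": child_type,
                "score_mode": score_mode,
                "query": inner_query,
            }
        }
    }


body = has_child_query("child_a1", {"term": {"salesCount": 100}}, "max")
print(json.dumps(body, indent=2))
```

The resulting JSON can be sent as the body of a search request against test_index_join with whatever HTTP or Elasticsearch client the project already uses.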

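Sorting parent documents by their child documents

Note: fields of the child documents influence the parent document's score, and the parents are then sorted by _score.

In the example below, each child's score is _score * child_a1.salesCount and the parent takes the maximum over its matching children. For has_child, score_mode can be none, min, max, sum, or avg.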

GET /test_index_join/_search
{
  "query": {
    "has_child": {
      "type": "child_a1",
      "score_mode": "max", 
      "query": {
        "function_score": {
          "script_score": {
            "script": "_score * doc['salesCount'].value"
          }
        }
      }
    }
  },
  "sort": [
    {
      "_score": {
        "order": "asc"
      }
    }
  ]
}
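
To make the scoring arithmetic concrete, here is a small simulation in Python. It is a sketch under stated assumptions, not Elasticsearch's implementation: the function_score has no inner query, so each child starts from a match_all score of 1.0; boost_mode is left at its default, multiply; and `child_score` and `parent_score` are hypothetical names. Real scores are 32-bit floats, so large values may differ slightly.

```python
# How has_child folds the child scores into one parent score.
AGGREGATORS = {
    "min": min,
    "max": max,
    "sum": sum,
    "avg": lambda scores: sum(scores) / len(scores),
}


def child_score(sales_count, query_score=1.0):
    """Score of one child under the script_score shown above."""
    script_result = query_score * sales_count  # _score * doc['salesCount'].value
    return query_score * script_result         # default boost_mode: multiply


def parent_score(children_sales_counts, score_mode="max"):
    """Aggregate the scores of a parent's matching children."""
    scores = [child_score(count) for count in children_sales_counts]
    return AGGREGATORS[score_mode](scores)


sales = [100, 250, 40]  # salesCount of three child_a1 docs under one parent
print(parent_score(sales, "max"))  # 250.0
print(parent_score(sales, "sum"))  # 390.0
print(parent_score(sales, "avg"))  # 130.0
```

With score_mode set to max, a parent is ranked by its best-selling child; sum would instead favor parents with many children.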

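You can also rely on field_value_factor to influence the parent's score. The effect is similar and it is cheaper than a script. functions accepts several field factors; the default [score_mode] for combining them is multiply, and the available modes are min, max, avg, sum, first, and multiply.

In the example below, each child's score is its salesCount, since there is no other factor. If a parent has several matching child_a1 documents, it takes the largest of their scores, because the has_child score_mode is max.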

GET /test_index_join/_search
{
  "query": {
    "has_child": {
      "type": "child_a1",
      "score_mode": "max", 
      "query": {
        "function_score": {
          "functions": [
            {
              "field_value_factor": {
                "field": "salesCount"
              }
            }
          ]
        }
      }
    }
  },
  "sort": [
    {
      "_score": {
        "order": "asc"
      }
    }
  ]
}
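
The way several field factors are combined inside function_score can be sketched as follows. This is an illustration only: the second field, rating, is an assumption made up for the example and does not appear in the index above; only a few field_value_factor modifiers are modeled; all function weights are assumed to be 1, so avg is a plain mean.

```python
import math


def field_value_factor(value, factor=1.0, modifier="none"):
    """Apply factor, then the modifier, to a field value."""
    scaled = factor * value
    if modifier == "none":
        return scaled
    if modifier == "sqrt":
        return math.sqrt(scaled)
    if modifier == "log1p":
        return math.log10(scaled + 1)
    raise ValueError(f"modifier not modeled in this sketch: {modifier}")


def combine(values, score_mode="multiply"):
    """Combine the results of several functions (function_score score_mode)."""
    if score_mode == "multiply":
        return math.prod(values)
    if score_mode == "sum":
        return sum(values)
    if score_mode == "avg":
        return sum(values) / len(values)
    if score_mode == "first":
        return values[0]
    if score_mode == "max":
        return max(values)
    if score_mode == "min":
        return min(values)
    raise ValueError(f"unknown score_mode: {score_mode}")


# One child doc with salesCount = 100 and a hypothetical rating = 2.
factors = [field_value_factor(100), field_value_factor(2)]
print(combine(factors))           # 200.0
print(combine(factors, "sum"))    # 102.0
print(combine(factors, "first"))  # 100.0
```

Note that this score_mode belongs to function_score and combines functions within one child document, whereas the score_mode on has_child combines the scores of several children into one parent score.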

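Querying child documents by their parent documents

The search below matches against parent_a documents, so the fields used in the inner query are parent_a fields.

GET /test_index_join/_search
{
  "query": {
    "has_parent": {
      "parent_type": "parent_a",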
      "query": {
        "range": {
          "price": {
            "gt": 1,
            "lte": 200
          }
        }
      }
    }
  }
}
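
The semantics of has_parent can be emulated in memory to see which documents come back. The sketch below uses made-up sample documents; the join field name myJoin follows the mapping shown earlier, and `has_parent` here is a hypothetical local function, not a client API. In a real index each child must also be indexed with routing set to its parent's ID so that both land on the same shard, which this sketch does not model.

```python
# Sample documents (hypothetical data): two parents and two children.
docs = [
    {"_id": "1", "myJoin": {"name": "parent_a"}, "price": 150},
    {"_id": "2", "myJoin": {"name": "parent_a"}, "price": 500},
    {"_id": "3", "myJoin": {"name": "child_a1", "parent": "1"}, "salesCount": 100},
    {"_id": "4", "myJoin": {"name": "child_a1", "parent": "2"}, "salesCount": 30},
]


def has_parent(docs, parent_type, predicate):
    """Return the children whose parent has the given type and matches predicate."""
    matching_parents = {
        d["_id"]
        for d in docs
        if d["myJoin"]["name"] == parent_type and predicate(d)
    }
    return [d for d in docs if d["myJoin"].get("parent") in matching_parents]


# Same condition as the range query above: 1 < price <= 200.
hits = has_parent(docs, "parent_a", lambda d: 1 < d["price"] <= 200)
print([d["_id"] for d in hits])  # ['3']
```

Only document 3 is returned: its parent (price 150) falls inside the range, while the parent of document 4 (price 500) does not.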

Original post: https://www.cnblogs.com/yucy/p/9504939.html