Elasticsearch Basics: Join Parent/Child Relationships

Preface

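Starting with Elasticsearch 6.x, each index supports only a single mapping type, so parent/child relationships can no longer be declared in the following form: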

PUT /company
{
  "mappings": {
    "branch": {},
    "employee": {
      "_parent": {
        "type": "branch"
      }
    }
  }
}

Solution:

Use the join datatype

It defines parent/child relationships between documents in the same index.

join datatype

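The join datatype is a special field that creates parent/child relations between documents of the same index. The relations section defines the set of possible relations within the documents, each relation being a parent name and a child name. A parent/child relation can be defined as follows: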

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_join_field": { ①
          "type": "join",
          "relations": {
            "question": "answer"  ②
          }
        }
      }
    }
  }
}

① The name of the field.
② Defines a single relation in which `question` is the parent of `answer`.

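To index a document with a join field, the name of the relation and, optionally, the parent of the document must be provided in the source. For example, the following creates two parent documents in the question context: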

PUT my_index/_doc/1?refresh
{
  "text": "This is a question",
  "my_join_field": {
    "name": "question" ①
  }
}

PUT my_index/_doc/2?refresh
{
  "text": "This is another question",
  "my_join_field": {
    "name": "question"
  }
}

① This document is a `question` document.

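When indexing parent documents, you can specify just the name of the relation as a shortcut instead of wrapping it in the normal object notation: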

PUT my_index/_doc/1?refresh
{
  "text": "This is a question",
  "my_join_field": "question" ①
}

PUT my_index/_doc/2?refresh
{
  "text": "This is another question",
  "my_join_field": "question"
}

① The simpler notation for a parent document uses only the relation name.

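When indexing a child document, the name of the relation and the ID of the parent document must both be provided in the _source.

For example, the following shows how to index two child documents: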

PUT my_index/_doc/3?routing=1&refresh 
{
  "text": "This is an answer",
  "my_join_field": {
    "name": "answer", 
    "parent": "1" 
  }
}

PUT my_index/_doc/4?routing=1&refresh
{
  "text": "This is another answer",
  "my_join_field": {
    "name": "answer",
    "parent": "1"
  }
}

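The routing value is mandatory because parent and child documents must be indexed on the same shard.

`answer` is the name of the join relation for this child document.

The `parent` value specifies the ID of this child document's parent, here `1`.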
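
As a rough illustration of the three points above, the following Python sketch assembles the request path and body for a child document. The helper name `build_child_request` is hypothetical and not part of any Elasticsearch client; it only builds the pieces of the HTTP request and never contacts a cluster. The index and field names are the ones used in this article.

```python
import json
from urllib.parse import urlencode


def build_child_request(index, doc_id, parent_id, relation, source,
                        join_field="my_join_field", routing=None):
    """Return (method, path, body) for indexing a child document.

    Hypothetical helper: it only assembles the request, it does not send it.
    `routing` defaults to the parent ID, which is only right for a direct
    child of a top-level parent.
    """
    if routing is None:
        routing = parent_id
    body = dict(source)
    # The join field carries both the relation name and the parent ID.
    body[join_field] = {"name": relation, "parent": str(parent_id)}
    query = urlencode({"routing": routing, "refresh": "true"})
    path = f"/{index}/_doc/{doc_id}?{query}"
    return "PUT", path, json.dumps(body)


method, path, body = build_child_request(
    "my_index", 3, "1", "answer", {"text": "This is an answer"})
print(method, path)
print(body)
```

Defaulting the routing to the parent ID keeps the common case short while still letting a caller pass an explicit routing value for deeper hierarchies.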

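Parent-join and performance

The join field should not be used like a join in a relational database. In Elasticsearch, the key to good performance is to denormalize your data into documents. Every join field and every has_child or has_parent query adds a significant cost to query performance.

The only case where a join field makes sense is when the data contains a one-to-many relationship in which one entity significantly outnumbers the other.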

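Join type restrictions

Only one join field mapping is allowed per index.
Parent and child documents must be indexed on the same shard. This means the same routing value must be provided when getting, deleting, or updating a child document.
An element can have multiple children but only one parent.
A new relation can be added to an existing join field.
A child can also be added to an existing element, but only if that element is already a parent.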
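
The same-shard restriction follows from how a routing value selects a shard. The sketch below is only an illustration of that idea: Elasticsearch uses its own hash function and routing formula, and MD5 is a stand-in chosen so that the example runs with the Python standard library. The function name `pick_shard` is hypothetical, and the shard count of 5 matches the search responses shown in this article.

```python
import hashlib


def pick_shard(routing, num_shards=5):
    """Map a routing value to a shard number.

    Illustration only: MD5 is a stand-in, not the hash Elasticsearch uses.
    """
    digest = hashlib.md5(str(routing).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


parent_shard = pick_shard("1")   # parent: routing defaults to its own _id
child_shard = pick_shard("1")    # child indexed with ?routing=1
# Without ?routing=1 the child would be placed by its own _id ("3"),
# which may or may not coincide with the parent's shard.
unrouted_child_shard = pick_shard("3")

print(parent_shard == child_shard)
```

Because the shard depends only on the routing value, reusing the parent's routing on every get, update, and delete of a child is what keeps the request on the right shard.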

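Searching with parent-join

The parent-join creates one field that indexes the name of the relation within the document (my_parent, my_child, ...).

It also creates one field per parent/child relation. The name of this field is the name of the join field, followed by # and the name of the parent in the relation. For example, for the relation my_parent ⇒ [my_child, another_child], the join field creates an additional field named my_join_field#my_parent.

If the document is a child (my_child or another_child), this field contains the _id of the parent that the document links to; if the document is a parent (my_parent), it contains the document's own _id.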
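
To make the naming rule concrete, here is a small Python model of the per-relation fields described above. It is a sketch of the documented behaviour, not Elasticsearch code; the function name `parent_id_fields` is hypothetical, and `relations` has the same shape as the `relations` object in the mapping.

```python
def parent_id_fields(doc_id, join_value, relations,
                     join_field="my_join_field"):
    """Model the extra per-relation fields for one document.

    `join_value` is either a relation name (the parent shortcut) or a dict
    with "name" and an optional "parent".
    """
    if isinstance(join_value, str):
        join_value = {"name": join_value}
    name, parent = join_value["name"], join_value.get("parent")
    fields = {}
    for parent_name, children in relations.items():
        if isinstance(children, str):
            children = [children]
        key = f"{join_field}#{parent_name}"
        if name == parent_name:
            fields[key] = str(doc_id)   # a parent stores its own _id
        elif name in children and parent is not None:
            fields[key] = str(parent)   # a child stores its parent's _id
    return fields


relations = {"question": "answer"}
print(parent_id_fields("1", "question", relations))
print(parent_id_fields("3", {"name": "answer", "parent": "1"}, relations))
```

Both calls produce the same key, `my_join_field#question`, with the value `1`: the question stores its own ID and the answer stores its parent's ID, which is what lets the two be matched.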

When searching an index that contains a join field, these two fields are always returned in the search response:

GET my_index/_search
{
  "query": {
    "match_all": {}
  },
  "sort": ["_id"]
}

This returns:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 4,
        "max_score": null,
        "hits": [
            {
                "_index": "my_index",
                "_type": "_doc",
                "_id": "1",
                "_score": null,
                "_source": {
                    "text": "This is a question",
                    "my_join_field": {
                        "name": "question" ①
                    }
                },
                "sort": [
                    "1"
                ]
            },
            {
                "_index": "my_index",
                "_type": "_doc",
                "_id": "2",
                "_score": null,
                "_source": {
                    "text": "This is another question",
                    "my_join_field": {
                        "name": "question" ②
                    }
                },
                "sort": [
                    "2"
                ]
            },
            {
                "_index": "my_index",
                "_type": "_doc",
                "_id": "3",
                "_score": null,
                "_routing": "1",
                "_source": {
                    "text": "This is an answer",
                    "my_join_field": {
                        "name": "answer", ③
                        "parent": "1" ④
                    }
                },
                "sort": [
                    "3"
                ]
            },
            {
                "_index": "my_index",
                "_type": "_doc",
                "_id": "4",
                "_score": null,
                "_routing": "1",
                "_source": {
                    "text": "This is another answer",
                    "my_join_field": {
                        "name": "answer",
                        "parent": "1"
                    }
                },
                "sort": [
                    "4"
                ]
            }
        ]
    }
}

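① This document belongs to the `question` join.
② This document belongs to the `question` join.
③ This document belongs to the `answer` join.
④ The ID of the parent that this child document links to.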

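Parent-join queries and aggregations

For details, see the has_child and has_parent queries, the children aggregation, and inner hits.

The value of the join field is accessible in aggregations and scripts, and can be queried with the parent_id query: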

GET my_index/_search
{
  "query": {
    "parent_id": { 
      "type": "answer", ①
      "id": "1"
    }
  },
  "aggs": {
    "parents": {
      "terms": {
        "field": "my_join_field#question",  ②
        "size": 10
      }
    }
  },
  "script_fields": {
    "parent": {
      "script": {
         "source": "doc['my_join_field#question']"  ③
      }
    }
  }
}

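① Queries the parent ID field (see also the `has_parent` and `has_child` queries).
② Aggregates on the parent ID field (see also the `children` aggregation).
③ Accesses the parent ID field in a script.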
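
To show what this request selects, the following Python sketch replays the parent_id query and the terms aggregation against an in-memory copy of the four sample documents. It only models the semantics; the function name `parent_id_query` is hypothetical and nothing here calls Elasticsearch.

```python
from collections import Counter

# The four sample documents: (_id, relation name, parent _id)
DOCS = [
    ("1", "question", None),
    ("2", "question", None),
    ("3", "answer", "1"),
    ("4", "answer", "1"),
]


def parent_id_query(docs, child_type, parent_id):
    """Return the _ids of `child_type` documents whose parent is `parent_id`."""
    return [doc_id for doc_id, name, parent in docs
            if name == child_type and parent == parent_id]


hits = parent_id_query(DOCS, "answer", "1")

# Emulate the terms aggregation on my_join_field#question over the hits:
# every hit is a child, so the field holds its parent's _id.
parent_of = {doc_id: parent for doc_id, _, parent in DOCS}
buckets = Counter(parent_of[doc_id] for doc_id in hits)

print(hits, dict(buckets))
```

In this model the query matches documents 3 and 4, and the aggregation yields a single bucket for parent `1` with a count of 2.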

Child-join queries and aggregations

GET my_index/_search
{
  "query": {
    "has_child": {
      "type": "answer",
      "query": {
        "match": {
          "text": "This is question"
        }
      }
    }
  }
}

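Or, in the other direction, use has_parent to find child documents through their parent:

GET my_index/_search
{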
  "query": {
    "has_parent": {
      "parent_type": "question",
      "query": {
        "term": {
          "_id": 1
        }
      },
      "inner_hits": {}
    }
  }
}

The first query (has_child) returns the matching parent document:

{
    "took": 42,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 1,
        "hits": [
            {
                "_index": "my_index",
                "_type": "_doc",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "text": "This is a question",
                    "my_join_field": {
                        "name": "question"
                    }
                }
            }
        ]
    }
}
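
The two queries look in opposite directions, which is easy to mix up. The Python sketch below models both against the sample documents: has_child returns parents that have a matching child, and has_parent returns children whose parent matches. The function names and the simple predicate arguments are hypothetical simplifications, and real text matching and scoring are not modelled.

```python
# The four sample documents: (_id, relation name, parent _id, text)
DOCS = [
    ("1", "question", None, "This is a question"),
    ("2", "question", None, "This is another question"),
    ("3", "answer", "1", "This is an answer"),
    ("4", "answer", "1", "This is another answer"),
]


def has_child(docs, child_type, child_matches):
    """Return _ids of parents with at least one matching child."""
    parent_ids = {parent for _, name, parent, text in docs
                  if name == child_type and child_matches(text)}
    return sorted(doc_id for doc_id, _, _, _ in docs if doc_id in parent_ids)


def has_parent(docs, parent_type, parent_matches):
    """Return _ids of children whose parent matches."""
    parent_ids = {doc_id for doc_id, name, _, _ in docs
                  if name == parent_type and parent_matches(doc_id)}
    return sorted(doc_id for doc_id, _, parent, _ in docs
                  if parent in parent_ids)


print(has_child(DOCS, "answer", lambda text: "This" in text))
print(has_parent(DOCS, "question", lambda doc_id: doc_id == "1"))
```

In this model has_child yields question 1, consistent with the response above, while has_parent yields answers 3 and 4.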

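Global ordinals

The join field uses global ordinals to speed up joins. Global ordinals need to be rebuilt after any change to a shard. The more parent ID values are stored in a shard, the longer it takes to rebuild the global ordinals for the join field.

By default, global ordinals are built eagerly: if the index has changed, the global ordinals for the join field are rebuilt as part of the refresh. This can add significant time to the refresh. In most cases this is still the right trade-off, because otherwise the global ordinals are rebuilt when the first parent-join query or aggregation is used. That can cause a significant latency spike for your users, and it is usually worse, because when many writes are occurring, several rebuilds of the join field's global ordinals may be attempted within a single refresh interval.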
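
As a loose mental model of why rebuilds get more expensive as parent IDs accumulate, the Python sketch below assigns one integer to each distinct value across all segments of a shard. It is a deliberate simplification with hypothetical names and made-up sample values; the actual Lucene data structures are more involved.

```python
def build_global_ordinals(segments):
    """Assign one ordinal per distinct value across all segments.

    Simplified model: each segment is a list of parent _id values. The
    shard-wide table lets values from different segments be compared as
    integers instead of strings.
    """
    distinct = sorted({value for segment in segments for value in segment})
    return {value: ordinal for ordinal, value in enumerate(distinct)}


segments = [["1", "1", "2"], ["2", "7"]]
ordinals = build_global_ordinals(segments)

# A change to the shard (here, a new segment) means building the table again,
# and the work grows with the number of distinct parent IDs.
segments.append(["3"])
rebuilt = build_global_ordinals(segments)

print(ordinals)
print(rebuilt)
```

Note that adding the value `3` shifts the ordinal of `7`, which is why the table in this model cannot simply be appended to.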

When the join field is used infrequently and writes occur frequently, it may make sense to disable eager loading:

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_join_field": {
          "type": "join",
          "relations": {
             "question": "answer"
          },
          "eager_global_ordinals": false
        }
      }
    }
  }
}

The amount of heap used by global ordinals can be checked per parent relation as follows:

# Per-index
GET _stats/fielddata?human&fields=my_join_field#question

# Per-node per-index
GET _nodes/stats/indices/fielddata?human&fields=my_join_field#question

Multiple children per parent

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_join_field": {
          "type": "join",
          "relations": {
            "question": ["answer", "comment"]  
          }
        }
      }
    }
  }
}

`question` is the parent of `answer` and `comment`.

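Multiple levels of parent join

Using multiple levels of relations to replicate a relational model is not recommended. Each level of relation adds overhead at query time in terms of memory and computation. If you care about performance, you should denormalize your data.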

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_join_field": {
          "type": "join",
          "relations": {
            "question": ["answer", "comment"],   ①
            "answer": "vote"  ②
          }
        }
      }
    }
  }
}

① `question` is the parent of `answer` and `comment`.
② `answer` is the parent of `vote`.

This mapping defines the three-generation hierarchy shown below.

   question
    /    \
   /      \
comment  answer
           |
           |
          vote

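A grandchild document must be on the same shard as its grandparent and its parent, so it is indexed with a routing value equal to the grandparent's: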

PUT my_index/_doc/3?routing=1&refresh  ①
{
  "text": "This is a vote",
  "my_join_field": {
    "name": "vote",
    "parent": "2" ②
  }
}

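① This child document must be on the same shard as its grandparent and its parent.
② The parent ID of this document (must point to an `answer` document).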
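
One way to find the right routing value for a document deep in the hierarchy is to walk up the parent chain to the top-level ancestor. The Python sketch below does that over a plain dictionary. The function name `resolve_routing` and the IDs `q1`, `a1`, and `v1` are hypothetical, and the sketch assumes the top-level parent was indexed without a custom routing value, so that its routing equals its own ID.

```python
def resolve_routing(doc_id, parent_of):
    """Return the _id of the top-level ancestor of `doc_id`.

    `parent_of` maps each document ID to its parent ID, or None for a
    top-level parent.
    """
    current = doc_id
    seen = set()
    while parent_of.get(current) is not None:
        if current in seen:
            raise ValueError(f"cycle in parent chain at {current!r}")
        seen.add(current)
        current = parent_of[current]
    return current


# question q1 <- answer a1 <- vote v1
parent_of = {"q1": None, "a1": "q1", "v1": "a1"}
print(resolve_routing("v1", parent_of))
```

The vote `v1` has `a1` as its parent, but it is routed by `q1`, the ID of the question at the top of the chain.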

Original article: https://www.cnblogs.com/gmhappy/p/11864043.html