• Elasticsearch aggregation statistics


    Preparing the data
    Creating the index
    To illustrate the bucket aggregations introduced below, we first create a new "sports" index that stores a collection of athlete documents. The index mapping contains fields such as the athlete's location, name, rating, sport, age, number of goals, and field position (for example, defender). Let's create the mapping:

    PUT sports
    {
      "mappings": {
        "properties": {
          "birthdate": {
            "type": "date",
            "format": "dateOptionalTime"
          },
          "location": {
            "type": "geo_point"
          },
          "name": {
            "type": "keyword"
          },
          "rating": {
            "type": "integer"
          },
          "sport": {
            "type": "keyword"
          },
          "age": {
            "type": "integer"
          },
          "goals": {
            "type": "integer"
          },
          "role": {
            "type": "keyword"
          },
          "score_weight": {
            "type": "float"
          }
        }
      }
    }
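    Optionally, you can verify that the mapping was applied as expected (a quick check that is not part of the original walkthrough):

    GET sports/_mapping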
    Once the mapping has been created, we can use Elasticsearch's Bulk API to import our data into the index. A single REST call is enough to load all of the documents into Elasticsearch.

    POST sports/_bulk
    {"index":{"_index":"sports"}}
    {"name":"Michael","birthdate":"1989-10-1","sport":"Football","rating":["5","4"],"location":"46.22,-68.45","age":"23","goals":"43","score_weight":"3","role":"midfielder"}
    {"index":{"_index":"sports"}}
    {"name":"Bob","birthdate":"1989-11-2","sport":"Football","rating":["3","4"],"location":"45.21,-68.35","age":"33","goals":"54","score_weight":"2","role":"forward"}
    {"index":{"_index":"sports"}}
    {"name":"Jim","birthdate":"1988-10-3","sport":"Football","rating":["3","2"],"location":"45.16,-63.58","age":"28","goals":"73","score_weight":"2","role":"forward"}
    {"index":{"_index":"sports"}}
    {"name":"Joe","birthdate":"1992-5-20","sport":"Basketball","rating":["4","3"],"location":"45.22,-68.53","age":"18","goals":"848","score_weight":"3","role":"midfielder"}
    {"index":{"_index":"sports"}}
    {"name":"Tim","birthdate":"1992-2-28","sport":"Basketball","rating":["3","3"],"location":"46.22,-68.85","age":"28","goals":"942","score_weight":"2","role":"forward"}
    {"index":{"_index":"sports"}}
    {"name":"Alfred","birthdate":"1990-9-9","sport":"Football","rating":["2","2"],"location":"45.12,-68.35","age":"25","goals":"53","score_weight":"4","role":"defender"}
    {"index":{"_index":"sports"}}
    {"name":"Jeff","birthdate":"1990-4-1","sport":"Hockey","rating":["2","3"],"location":"46.12,-68.55","age":"26","goals":"93","score_weight":"3","role":"midfielder"}
    {"index":{"_index":"sports"}}
    {"name":"Will","birthdate":"1988-3-1","sport":"Hockey","rating":["4","4"],"location":"46.25,-84.25","age":"27","goals":"124","score_weight":"2","role":"forward"}
    {"index":{"_index":"sports"}}
    {"name":"Mick","birthdate":"1989-10-1","sport":"Football","rating":["3","4"],"location":"46.22,-68.45","age":"35","goals":"56","score_weight":"3","role":"midfielder"}
    {"index":{"_index":"sports"}}
    {"name":"Pong","birthdate":"1989-11-2","sport":"Basketball","rating":["1","3"],"location":"45.21,-68.35","age":"34","goals":"1483","score_weight":"2","role":"forward"}
    {"index":{"_index":"sports"}}
    {"name":"Ray","birthdate":"1988-10-3","sport":"Football","rating":["2","2"],"location":"45.16,-63.58","age":"31","goals":"84","score_weight":"3","role":"midfielder"}
    {"index":{"_index":"sports"}}
    {"name":"Ping","birthdate":"1992-5-20","sport":"Basketball","rating":["4","3"],"location":"45.22,-68.53","age":"27","goals":"1328","score_weight":"2","role":"forward"}
    {"index":{"_index":"sports"}}
    {"name":"Duke","birthdate":"1992-2-28","sport":"Hockey","rating":["5","2"],"location":"46.22,-68.85","age":"41","goals":"218","score_weight":"2","role":"forward"}
    {"index":{"_index":"sports"}}
    {"name":"Hal","birthdate":"1990-9-9","sport":"Hockey","rating":["4","2"],"location":"45.12,-68.35","age":"18","goals":"148","score_weight":"3","role":"midfielder"}
    {"index":{"_index":"sports"}}
    {"name":"Charge","birthdate":"1990-4-1","sport":"Football","rating":["3","2"],"location":"44.19,-82.55","age":"19","goals":"34","score_weight":"4","role":"defender"}
    {"index":{"_index":"sports"}}
    {"name":"Barry","birthdate":"1988-3-1","sport":"Football","rating":["5","2"],"location":"36.45,-79.15","age":"20","goals":"48","score_weight":"4","role":"defender"}
    {"index":{"_index":"sports"}}
    {"name":"Bank","birthdate":"1988-3-1","sport":"Handball","rating":["6","4"],"location":"46.25,-54.53","age":"25","goals":"150","score_weight":"4","role":"defender"}
    {"index":{"_index":"sports"}}
    {"name":"Bingo","birthdate":"1988-3-1","sport":"Handball","rating":["10","7"],"location":"46.25,-68.55","age":"29","goals":"143","score_weight":"3","role":"midfielder"}
    {"index":{"_index":"sports"}}
    {"name":"James","birthdate":"1988-3-1","sport":"Basketball","rating":["10","8"],"location":"41.25,-69.55","age":"36","goals":"1284","score_weight":"2","role":"forward"}
    {"index":{"_index":"sports"}}
    {"name":"Wayne","birthdate":"1988-3-1","sport":"Hockey","rating":["10","10"],"location":"46.21,-68.55","age":"25","goals":"113","score_weight":"3","role":"midfielder"}
    {"index":{"_index":"sports"}}
    {"name":"Brady","birthdate":"1988-3-1","sport":"Handball","rating":["10","10"],"location":"63.24,-84.55","age":"29","goals":"443","score_weight":"2","role":"forward"}
    {"index":{"_index":"sports"}}
    {"name":"Lewis","birthdate":"1988-3-1","sport":"Football","rating":["10","10"],"location":"56.25,-74.55","age":"24","goals":"49","score_weight":"3","role":"midfielder"}
    With that, our data is now in Elasticsearch. In the sections below, we will use different bucket aggregations to analyze it.
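    As a quick sanity check (assuming the bulk request above succeeded without errors), a count query should report the 22 documents we just indexed:

    GET sports/_count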

    Filter(s) Aggregations
    Bucket aggregations support both single-filter and multi-filter aggregations.

    A single filter aggregation builds one bucket from all documents that match the query or field value specified in the filter definition. A single-filter aggregation is useful when you want to identify a set of documents that meet certain criteria.

    For example, we can use a single filter aggregation to find all athletes with the "defender" role and compute the average number of goals within that filtered bucket. The filter is configured as follows:

    POST sports/_search
    {
      "size": 0,
      "aggs": {
        "defender_filter": {
          "filter": {
            "term": {
              "role": "defender"
            }
          },
          "aggs": {
            "avg_goals": {
              "avg": {
                "field": "goals"
              }
            }
          }
        }
      }
    }
    The result we see is:

    "aggregations" : {
    "defender_filter" : {
    "doc_count" : 4,
    "avg_goals" : {
    "value" : 71.25
    }
    }
    }
    As you can see, the filter aggregation contains a "term" clause that specifies which document field to search for a particular value (in this case, "defender"). Elasticsearch goes through all documents and checks whether the "role" field contains "defender". Documents matching that value are added to the single bucket produced by the aggregation. The output shows that the defenders in our collection score 71.25 goals on average.
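    If you want to see exactly which documents end up in this bucket, a plain term query (a minimal sketch using the same condition as the filter above) lists the four defenders and their goals:

    GET sports/_search
    {
      "query": {
        "term": {
          "role": "defender"
        }
      },
      "_source": ["name", "goals"]
    }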

    That was an example of a single-filter aggregation. In Elasticsearch, however, you can also specify several filters using the filters aggregation. This is a multi-value aggregation in which each bucket corresponds to one specific filter. We can modify the example above to filter for both defenders and forwards:

    GET sports/_search
    {
      "size": 0,
      "aggs": {
        "athletes": {
          "filters": {
            "filters": {
              "defenders": {
                "term": {
                  "role": "defender"
                }
              },
              "forwards": {
                "term": {
                  "role": "forward"
                }
              }
            }
          },
          "aggs": {
            "avg_goals": {
              "avg": {
                "field": "goals"
              }
            }
          }
        }
      }
    }
    The result is:

    "aggregations" : {
    "athletes" : {
    "buckets" : {
    "defenders" : {
    "doc_count" : 4,
    "avg_goals" : {
    "value" : 71.25
    }
    },
    "forwards" : {
    "doc_count" : 9,
    "avg_goals" : {
    "value" : 661.0
    }
    }
    }
    }
    }
    As we can see, we now get two buckets, one for each of our filters. Each filter checks whether the role field contains the value defender or forward, respectively.

    We can even visualize these two buckets in Kibana. To use our data in Kibana, we first have to create an index pattern. If you are not yet familiar with this, please refer to my earlier article "Kibana: How to use the Search Bar". When setting up the index pattern for our data, we choose the birthdate field as the time-series timestamp.

    As you can see, the average sub-aggregation on the "goals" field is defined on the Y-axis. On the X-axis, we create the two filters and give them the values "defender" and "forward". Because the average metric is a sub-aggregation of the filters aggregation, Elasticsearch applies the created filters to the "goals" field, so we do not need to specify that field again explicitly.

    Terms Aggregation
    The terms aggregation should already feel somewhat familiar: we have just used term filters in the examples at the beginning.

    A terms aggregation searches for the unique values in a specified field of your documents and builds a bucket for every unique value it finds. Unlike the filter aggregation, the task of the terms aggregation is not to limit the results to particular values, but to find all unique values of a given field in your documents.

    Take a look at the example below, where we create a bucket for every unique value found in the "sport" field. As a result, we get four distinct buckets, one for each sport in the index: Football, Handball, Hockey, and Basketball. We then compute the average number of goals for each sport with an average sub-aggregation:

    GET sports/_search
    {
      "size": 0,
      "aggs": {
        "sports": {
          "terms": {
            "field": "sport"
          },
          "aggs": {
            "avg_scoring": {
              "avg": {
                "field": "goals"
              }
            }
          }
        }
      }
    }
    The returned data is:

    "aggregations" : {
    "sports" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
    {
    "key" : "Football",
    "doc_count" : 9,
    "avg_scoring" : {
    "value" : 54.888888888888886
    }
    },
    {
    "key" : "Basketball",
    "doc_count" : 5,
    "avg_scoring" : {
    "value" : 1177.0
    }
    },
    {
    "key" : "Hockey",
    "doc_count" : 5,
    "avg_scoring" : {
    "value" : 139.2
    }
    },
    {
    "key" : "Handball",
    "doc_count" : 3,
    "avg_scoring" : {
    "value" : 245.33333333333334
    }
    }
    ]
    }
    }
    As you can see, the terms aggregation constructed four buckets, one for each sport type in the index. Each bucket contains a doc_count (the number of documents belonging to the bucket) and the average sub-aggregation for that sport.
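    As a small extension that is not part of the original example, the terms aggregation can also order its buckets by a sub-aggregation, for instance to list the highest-scoring sports first:

    GET sports/_search
    {
      "size": 0,
      "aggs": {
        "sports": {
          "terms": {
            "field": "sport",
            "order": {
              "avg_scoring": "desc"
            }
          },
          "aggs": {
            "avg_scoring": {
              "avg": {
                "field": "goals"
              }
            }
          }
        }
      }
    }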

    Let's visualize these results in Kibana:

    As you can see, on the Y-axis we use an average sub-aggregation on the "goals" field, and on the X-axis we define a terms bucket aggregation on the "sport" field.


    Histogram Aggregation
    The histogram aggregation lets us construct buckets based on a specified interval. The values that fall into each interval form an interval bucket. For example, suppose we want to apply a histogram aggregation to the "age" field with an interval of 5 years. The histogram aggregation then finds the minimum and maximum age in our document set and associates each document with the corresponding interval. The "age" field of each document is rounded down to its closest interval bucket; for example, with an interval of 5, an age of 32 is rounded down to 30.

    The histogram aggregation uses the following formula:

    bucket_key = Math.floor((value - offset) / interval) * interval + offset
    Note that the interval must be a positive decimal, while the offset must be a decimal in the range [0, interval).
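    To make the formula concrete, take the basketball player Joe from our data set, who has 848 goals. With an interval of 200 and the default offset of 0, his document lands in the bucket with key 800:

    bucket_key = Math.floor((848 - 0) / 200) * 200 + 0 = 4 * 200 + 0 = 800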

    Let's use a histogram aggregation to generate buckets for basketball goals with an interval of 200.

    POST sports/_search
    {
      "size": 0,
      "aggs": {
        "baskketball_filter": {
          "filter": {
            "term": {
              "sport": "Basketball"
            }
          },
          "aggs": {
            "goals_histogram": {
              "histogram": {
                "field": "goals",
                "interval": 200
              }
            }
          }
        }
      }
    }
    The result:

    "aggregations" : {
    "baskketball_filter" : {
    "doc_count" : 5,
    "goals_histogram" : {
    "buckets" : [
    {
    "key" : 800.0,
    "doc_count" : 2
    },
    {
    "key" : 1000.0,
    "doc_count" : 0
    },
    {
    "key" : 1200.0,
    "doc_count" : 2
    },
    {
    "key" : 1400.0,
    "doc_count" : 1
    }
    ]
    }
    }
    }
    The response above shows that no goal values fall into the 0-200, 200-400, 400-600, and 600-800 intervals, so the first bucket starts at the 800-1000 interval. The document with the smallest value therefore determines the minimum bucket (the bucket with the smallest key); correspondingly, the document with the highest value determines the maximum bucket (the bucket with the highest key).

    The response also shows that zero documents fall into the [1000, 1200) range, meaning that no athlete scored between 1000 and 1200 goals. By default, Elasticsearch fills such gaps with empty buckets. You can change this behavior with the min_doc_count setting by requesting only buckets whose count is not below a minimum. For example, if we set min_doc_count to 1, the histogram only constructs buckets for intervals that contain at least one document. Let's modify the query and set min_doc_count to 1.

    POST sports/_search
    {
      "size": 0,
      "aggs": {
        "baskketball_filter": {
          "filter": {
            "term": {
              "sport": "Basketball"
            }
          },
          "aggs": {
            "goals_histogram": {
              "histogram": {
                "field": "goals",
                "interval": 200,
                "min_doc_count": 1
              }
            }
          }
        }
      }
    }
    After this modification, the result is:

    "aggregations" : {
    "baskketball_filter" : {
    "doc_count" : 5,
    "goals_histogram" : {
    "buckets" : [
    {
    "key" : 800.0,
    "doc_count" : 2
    },
    {
    "key" : 1200.0,
    "doc_count" : 2
    },
    {
    "key" : 1400.0,
    "doc_count" : 1
    }
    ]
    }
    }
    }
    We can also use the extended_bounds setting to force the histogram aggregation to start building its buckets at a specific minimum value and to keep building buckets up to a maximum value, even if there are no more documents. Using extended_bounds only makes sense when min_doc_count is set to 0 (empty buckets are never returned if min_doc_count is greater than 0). Take a look at this query:

    POST sports/_search
    {
      "size": 0,
      "aggs": {
        "baskketball_filter": {
          "filter": {
            "term": {
              "sport": "Basketball"
            }
          },
          "aggs": {
            "goals_histogram": {
              "histogram": {
                "field": "goals",
                "interval": 200,
                "min_doc_count": 0,
                "extended_bounds": {
                  "min": 0,
                  "max": 1600
                }
              }
            }
          }
        }
      }
    }
    Here we specified 0 as the minimum and 1600 as the maximum for the buckets, so the response looks as follows:

    "aggregations" : {
    "baskketball_filter" : {
    "doc_count" : 5,
    "goals_histogram" : {
    "buckets" : [
    {
    "key" : 0.0,
    "doc_count" : 0
    },
    {
    "key" : 200.0,
    "doc_count" : 0
    },
    {
    "key" : 400.0,
    "doc_count" : 0
    },
    {
    "key" : 600.0,
    "doc_count" : 0
    },
    {
    "key" : 800.0,
    "doc_count" : 2
    },
    {
    "key" : 1000.0,
    "doc_count" : 0
    },
    {
    "key" : 1200.0,
    "doc_count" : 2
    },
    {
    "key" : 1400.0,
    "doc_count" : 1
    },
    {
    "key" : 1600.0,
    "doc_count" : 0
    }
    ]
    }
    }
    }
    As you can see, all buckets from 0 up to 1600 are generated, even though the first and last buckets do not contain any values at all.
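    As a side note that goes beyond the original example, the offset from the formula above can also be set explicitly on the histogram. A minimal sketch that shifts the bucket boundaries by 100 goals:

    POST sports/_search
    {
      "size": 0,
      "aggs": {
        "goals_histogram": {
          "histogram": {
            "field": "goals",
            "interval": 200,
            "offset": 100
          }
        }
      }
    }

    With this offset, a value such as 848 falls into the bucket with key 700 instead of 800, because Math.floor((848 - 100) / 200) * 200 + 100 = 700.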

    Date histogram aggregation
    This aggregation is similar to the normal histogram, but it can only be used with date or date-range values. Because dates are represented internally in Elasticsearch as long values, a normal histogram could also be used on dates, although not accurately. The main difference between the two APIs is that with the date histogram the interval can be specified using date/time expressions. Time-based data needs this special support because time-based intervals are not always of fixed length.

    Our data contains a field called birthdate. When creating our index pattern, we can select this field as the time-series timestamp. Below is the time-series chart of our data.

    We can use a date histogram aggregation to see how our athletes' birth dates are distributed by year:

    GET sports/_search
    {
      "size": 0,
      "aggs": {
        "birthdays": {
          "date_histogram": {
            "field": "birthdate",
            "interval": "year"
          }
        }
      }
    }
    The returned result is:

    "aggregations" : {
    "birthdays" : {
    "buckets" : [
    {
    "key_as_string" : "1988-01-01T00:00:00.000Z",
    "key" : 567993600000,
    "doc_count" : 10
    },
    {
    "key_as_string" : "1989-01-01T00:00:00.000Z",
    "key" : 599616000000,
    "doc_count" : 4
    },
    {
    "key_as_string" : "1990-01-01T00:00:00.000Z",
    "key" : 631152000000,
    "doc_count" : 4
    },
    {
    "key_as_string" : "1991-01-01T00:00:00.000Z",
    "key" : 662688000000,
    "doc_count" : 0
    },
    {
    "key_as_string" : "1992-01-01T00:00:00.000Z",
    "key" : 694224000000,
    "doc_count" : 4
    }
    ]
    }
    }
    The result above shows that 10 athletes were born in 1988 (the bucket covering 1988-01-01 up to 1989-01-01), 4 in 1989, 4 in 1990, none in 1991, and 4 in 1992. We can also visualize this in Kibana.
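    As a small readability tweak that is not part of the original query, the date_histogram aggregation also accepts a format parameter, so that key_as_string shows only the year:

    GET sports/_search
    {
      "size": 0,
      "aggs": {
        "birthdays": {
          "date_histogram": {
            "field": "birthdate",
            "interval": "year",
            "format": "yyyy"
          }
        }
      }
    }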

    Of course, we can also compute the average number of goals for each of these birth-year buckets:

    GET sports/_search
    {
      "size": 0,
      "aggs": {
        "birthdays": {
          "date_histogram": {
            "field": "birthdate",
            "interval": "year"
          },
          "aggs": {
            "average_goals": {
              "avg": {
                "field": "goals"
              }
            }
          }
        }
      }
    }
    The returned result is:

    "aggregations" : {
    "birthdays" : {
    "buckets" : [
    {
    "key_as_string" : "1988-01-01T00:00:00.000Z",
    "key" : 567993600000,
    "doc_count" : 10,
    "average_goals" : {
    "value" : 251.1
    }
    },
    {
    "key_as_string" : "1989-01-01T00:00:00.000Z",
    "key" : 599616000000,
    "doc_count" : 4,
    "average_goals" : {
    "value" : 409.0
    }
    },
    {
    "key_as_string" : "1990-01-01T00:00:00.000Z",
    "key" : 631152000000,
    "doc_count" : 4,
    "average_goals" : {
    "value" : 82.0
    }
    },
    {
    "key_as_string" : "1991-01-01T00:00:00.000Z",
    "key" : 662688000000,
    "doc_count" : 0,
    "average_goals" : {
    "value" : null
    }
    },
    {
    "key_as_string" : "1992-01-01T00:00:00.000Z",
    "key" : 694224000000,
    "doc_count" : 4,
    "average_goals" : {
    "value" : 834.0
    }
    }
    ]
    }
    }
    We can see that the 1992 cohort has the highest average_goals, reaching 834. Perhaps the younger generation is one to be reckoned with! We can also visualize this in Kibana:

    Range Aggregation
    This bucket aggregation makes it easy to build buckets from user-defined ranges. Elasticsearch checks each value extracted from the numeric field you specify, compares it against the ranges, and places the value into the corresponding range. Note that this aggregation includes the from value of each range but excludes its to value.

    Let's create a range aggregation on the "age" field of the sports index:

    GET sports/_search
    {
      "size": 0,
      "aggs": {
        "goal_ranges": {
          "range": {
            "field": "age",
            "ranges": [
              {
                "to": 20
              },
              {
                "from": 20,
                "to": 30
              },
              {
                "from": 30
              }
            ]
          }
        }
      }
    }
    As you can see, we specified three ranges in the query, which means Elasticsearch creates three buckets, one corresponding to each range. The query above produces the following output:

    "aggregations" : {
    "goal_ranges" : {
    "buckets" : [
    {
    "key" : "*-20.0",
    "to" : 20.0,
    "doc_count" : 3
    },
    {
    "key" : "20.0-30.0",
    "from" : 20.0,
    "to" : 30.0,
    "doc_count" : 13
    },
    {
    "key" : "30.0-*",
    "from" : 30.0,
    "doc_count" : 6
    }
    ]
    }
    }
    As the output shows, most of the athletes in our index are between 20 and 30 years old.

    To make the ranges easier to understand, we can define a custom key name for each range, like this:

    GET sports/_search
    {
      "size": 0,
      "aggs": {
        "goal_ranges": {
          "range": {
            "field": "age",
            "ranges": [
              {
                "key": "start-of-career",
                "to": 20
              },
              {
                "key": "mid-of-career",
                "from": 20,
                "to": 30
              },
              {
                "key": "end-of-career",
                "from": 30
              }
            ]
          }
        }
      }
    }
    This produces the following result:

    "aggregations" : {
    "goal_ranges" : {
    "buckets" : [
    {
    "key" : "start-of-career",
    "to" : 20.0,
    "doc_count" : 3
    },
    {
    "key" : "mid-of-career",
    "from" : 20.0,
    "to" : 30.0,
    "doc_count" : 13
    },
    {
    "key" : "end-of-cereer",
    "from" : 30.0,
    "doc_count" : 6
    }
    ]
    }
    }
    We can add more information to the ranges by using a stats sub-aggregation, which provides the minimum, maximum, average, and sum for each range. Let's take a look:

    GET sports/_search
    {
      "size": 0,
      "aggs": {
        "goal_ranges": {
          "range": {
            "field": "age",
            "ranges": [
              {
                "key": "start-of-career",
                "to": 20
              },
              {
                "key": "mid-of-career",
                "from": 20,
                "to": 30
              },
              {
                "key": "end-of-career",
                "from": 30
              }
            ]
          },
          "aggs": {
            "age_stats": {
              "stats": {
                "field": "age"
              }
            }
          }
        }
      }
    }
    The query result is:

    "aggregations" : {
    "goal_ranges" : {
    "buckets" : [
    {
    "key" : "start-of-career",
    "to" : 20.0,
    "doc_count" : 3,
    "age_stats" : {
    "count" : 3,
    "min" : 18.0,
    "max" : 19.0,
    "avg" : 18.333333333333332,
    "sum" : 55.0
    }
    },
    {
    "key" : "mid-of-career",
    "from" : 20.0,
    "to" : 30.0,
    "doc_count" : 13,
    "age_stats" : {
    "count" : 13,
    "min" : 20.0,
    "max" : 29.0,
    "avg" : 25.846153846153847,
    "sum" : 336.0
    }
    },
    {
    "key" : "end-of-cereer",
    "from" : 30.0,
    "doc_count" : 6,
    "age_stats" : {
    "count" : 6,
    "min" : 31.0,
    "max" : 41.0,
    "avg" : 35.0,
    "sum" : 210.0
    }
    }
    ]
    }
    }
    Visualizing ranges in Kibana is very simple. We will use a pie chart for this. As shown in the figure below, the slice size is defined by the count aggregation. In the buckets section, we need to create three ranges for our data. These ranges become the slices of the pie chart.


    Geo Distance Aggregation
    With the geo distance aggregation, you define an origin point and a set of distance ranges from that point. The aggregation then evaluates the distance of each geo_point value from the origin and determines which range the document belongs to. A document is considered to belong to a bucket if the distance between its geo_point value and the origin falls within the distance range of that bucket.

    In the example below, our origin has a latitude of 46.22 and a longitude of -68.85. In the query we specify the origin in object format, {"lat": 46.22, "lon": -68.85}. Alternatively, you can use the string format "46.22,-68.85", where the first value defines the latitude and the second the longitude, or the array format [-68.85, 46.22], which follows the GeoJSON standard, where the first number is the lon and the second one is the lat.

    In addition, we create three ranges. The default distance unit is m (meters), so the ranges in the query below are interpreted as 200 m and 400 m; if you want to work in kilometers instead, you have to set "unit": "km" explicitly. Other supported distance units are mi (miles), in (inches), yd (yards), cm (centimeters), and mm (millimeters).

    POST sports/_search
    {
      "size": 0,
      "aggs": {
        "althlete_location": {
          "geo_distance": {
            "field": "location",
            "origin": {
              "lat": 46.22,
              "lon": -68.85
            },
            "ranges": [
              {
                "to": 200
              },
              {
                "from": 200, "to": 400
              },
              {
                "from": 400
              }
            ]
          }
        }
      }
    }
    The query result:

    "aggregations" : {
    "althlete_location" : {
    "buckets" : [
    {
    "key" : "*-200.0",
    "from" : 0.0,
    "to" : 200.0,
    "doc_count" : 2
    },
    {
    "key" : "200.0-400.0",
    "from" : 200.0,
    "to" : 400.0,
    "doc_count" : 0
    },
    {
    "key" : "400.0-*",
    "from" : 400.0,
    "doc_count" : 20
    }
    ]
    }
    }
    The results show that two athletes are located within 200 m of the origin, and twenty athletes are more than 400 m away from it.
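    The origin can also be written in the string format mentioned above, and the unit can be switched to kilometers. A minimal sketch of the same aggregation with those options (not part of the original article):

    POST sports/_search
    {
      "size": 0,
      "aggs": {
        "althlete_location": {
          "geo_distance": {
            "field": "location",
            "origin": "46.22,-68.85",
            "unit": "km",
            "ranges": [
              {
                "to": 200
              },
              {
                "from": 200, "to": 400
              },
              {
                "from": 400
              }
            ]
          }
        }
      }
    }

    With km as the unit, the first bucket covers a 200 km radius and naturally captures far more athletes than the 200 m bucket above.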


    IP Range Aggregation
    Elasticsearch also has built-in support for IP ranges. The IP range aggregation works in a way similar to the other range aggregations. Let's create an index mapping for IP addresses to illustrate how this aggregation works:

    PUT ips
    {
      "mappings": {
        "properties": {
          "ip_addr": {
            "type": "ip"
          }
        }
      }
    }
    Let's index a few private-network IP addresses.

    POST ips/_bulk
    {"index":{"_index":"ips"}}
    { "ip_addr": "172.16.0.0" }
    {"index":{"_index":"ips"}}
    { "ip_addr": "172.16.0.1" }
    {"index":{"_index":"ips"}}
    { "ip_addr": "172.16.0.2" }
    {"index":{"_index":"ips"}}
    { "ip_addr": "172.16.0.3" }
    {"index":{"_index":"ips"}}
    { "ip_addr": "172.16.0.4" }
    {"index":{"_index":"ips"}}
    { "ip_addr": "172.16.0.5" }
    {"index":{"_index":"ips"}}
    { "ip_addr": "172.16.0.6" }
    {"index":{"_index":"ips"}}
    { "ip_addr": "172.16.0.7" }
    {"index":{"_index":"ips"}}
    { "ip_addr": "172.16.0.8" }
    {"index":{"_index":"ips"}}
    { "ip_addr": "172.16.0.9" }
    Now that the index contains some data, let's create an IP range aggregation:

    GET ips/_search
    {
      "size": 0,
      "aggs": {
        "ip_ranges": {
          "ip_range": {
            "field": "ip_addr",
            "ranges": [
              {
                "to": "172.16.0.4"
              },
              {
                "from": "172.16.0.4"
              }
            ]
          }
        }
      }
    }
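    As a final note that goes beyond the original example, the ip_range aggregation also accepts CIDR masks instead of explicit from/to boundaries. A minimal sketch splitting the same addresses into two /30 blocks:

    GET ips/_search
    {
      "size": 0,
      "aggs": {
        "ip_ranges": {
          "ip_range": {
            "field": "ip_addr",
            "ranges": [
              {
                "mask": "172.16.0.0/30"
              },
              {
                "mask": "172.16.0.4/30"
              }
            ]
          }
        }
      }
    }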
    Reference:
    Original article: the official Elastic China Community blog (Elastic 中国社区官方博客)
