• (Repost) Elasticsearch Analytics and Aggregations


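    Elasticsearch is not only suited to full-text search; its aggregation features are also handy for analytics. The examples below walk through them one by one.

    1. Preparing the data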

    {"index":{ "_index": "books", "_type": "IT", "_id": "1" }}
    {"id":"1","title":"Java编程思想","language":"java","author":"Bruce Eckel","price":70.20,"year":2007,"description":"Java学习必读经典,殿堂级著作!赢得了全球程序员的广泛赞誉。"}
    
    {"index":{ "_index": "books", "_type": "IT", "_id": "2" }}
    {"id":"2","title":"Java程序性能优化","language":"java","author":"葛一鸣","price":46.50,"year":2012,"description":"让你的Java程序更快、更稳定。深入剖析软件设计层面、代码层面、JVM虚拟机层面的优化方法"}
    
    {"index":{ "_index": "books", "_type": "IT", "_id": "3" }}
    {"id":"3","title":"Python科学计算","language":"python","author":"张若愚","price":81.40,"year":2016,"description":"零基础学python,光盘中作者独家整合开发winPython运行环境,涵盖了Python各个扩展库"}
    
    {"index":{ "_index": "books", "_type": "IT", "_id": "4" }}
    {"id":"4","title":"Python基础教程","language":"python","author":"张若愚","price":54.50,"year": 2014,"description":"经典的Python入门教程,层次鲜明,结构严谨,内容翔实"}
    
    {"index":{ "_index": "books", "_type": "IT", "_id": "5" }}
    {"id":"5","title":"JavaScript高级程序设计","language":"javascript","author":"Nicholas C.Zakas","price":66.40,"year":2012,"description":"JavaScript技术经典名著"}

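    Save the five documents above in books.json (each JSON object on its own line, with a newline at the end of the file), then bulk-import them: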

    curl -XPOST "http://localhost:9200/_bulk?pretty" --data-binary @books.json
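
    The bulk file can also be generated rather than typed by hand. The following is a minimal sketch, assuming the same index name (books) and type (IT) as above; build_bulk_body is a hypothetical helper name, not part of any Elasticsearch client, and only two of the five books are included to keep it short. Mapping types such as IT belong to older Elasticsearch releases, so the action line may need adjusting on newer versions.

```python
import json


def build_bulk_body(docs, index="books", doc_type="IT"):
    """Return an NDJSON bulk body: one action line plus one source line per document."""
    lines = []
    for doc in docs:
        # Action/metadata line telling Elasticsearch where to index the document.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action, ensure_ascii=False))
        # Source line containing the document itself.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk API expects every line, including the last one, to end with a newline.
    return "\n".join(lines) + "\n"


sample = [
    {"id": "1", "title": "Java编程思想", "language": "java", "price": 70.20, "year": 2007},
    {"id": "5", "title": "JavaScript高级程序设计", "language": "javascript", "price": 66.40, "year": 2012},
]

body = build_bulk_body(sample)
print(body, end="")
```

    Writing body to books.json with UTF-8 encoding would produce a file that can be sent with the curl command above.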

    2. Group By: counting documents per group

    Run the following command:

    curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
      "size": 0,
      "aggs": {
        "per_count": {
          "terms": {
            "field": "language"
          }
        }
      }
    }'

    Result:

    {
      "took" : 3,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : 0.0,
        "hits" : [ ]
      },
      "aggregations" : {
        "per_count" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [ {
            "key" : "java",
            "doc_count" : 2
          }, {
            "key" : "python",
            "doc_count" : 2
          }, {
            "key" : "javascript",
            "doc_count" : 1
          } ]
        }
      }
    }

    Grouped by programming language, there are 2 Java books, 2 Python books, and 1 JavaScript book.
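
    To see what the terms aggregation counts, the same grouping can be reproduced locally. This is only an in-memory sketch over the language values of the five sample books; the real aggregation works on indexed terms across shards.

```python
from collections import Counter

# "language" values of the five sample books, in document order.
languages = ["java", "java", "python", "python", "javascript"]

# A terms aggregation creates one bucket per distinct value and counts the
# documents that fall into it; Counter does the same thing in memory.
buckets = Counter(languages).most_common()

for key, doc_count in buckets:
    print(f"{key}: {doc_count}")
```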

    3. Max: maximum value

    Run the following command to find the highest price:

    curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
      "size": 0,
      "aggs": {
        "max_price": {
          "max": {
            "field": "price"
          }
        }
      }
    }'

    Response:

    {
      "took" : 2,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : 0.0,
        "hits" : [ ]
      },
      "aggregations" : {
        "max_price" : {
          "value" : 81.4
        }
      }
    }

    4. Min: minimum value

    Find the lowest price, i.e. the price of the cheapest book:

    curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
      "size": 0,
      "aggs": {
        "min_price": {
          "min": {
            "field": "price"
          }
        }
      }
    }'

    Result:

    {
      "took" : 3,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : 0.0,
        "hits" : [ ]
      },
      "aggregations" : {
        "min_price" : {
          "value" : 46.5
        }
      }
    }
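
    The max and min aggregations return only a value, not the matching document. As a quick cross-check, both extremes and the titles they belong to can be computed locally from the sample prices; this is a sketch over the five books listed in section 1, not a query against Elasticsearch.

```python
# Title -> price for the five sample books.
prices = {
    "Java编程思想": 70.20,
    "Java程序性能优化": 46.50,
    "Python科学计算": 81.40,
    "Python基础教程": 54.50,
    "JavaScript高级程序设计": 66.40,
}

# The values that the max and min aggregations report.
max_price = max(prices.values())
min_price = min(prices.values())

# The titles behind those values, which the aggregations do not return.
most_expensive = max(prices, key=prices.get)
cheapest = min(prices, key=prices.get)

print(f"max: {max_price} ({most_expensive})")
print(f"min: {min_price} ({cheapest})")
```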

    5. Average

    Group the books by language and compute the average price within each group:

    curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
    "size": 0,
    "aggs": {
        "per_count": {
            "terms": {
                "field": "language"
            },
            "aggs": {
                "avg_price": {
                    "avg": {
                        "field": "price"
                    }
                }
            }
        }
    }
    }
    '

    Response:

    {
      "took" : 4,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : 0.0,
        "hits" : [ ]
      },
      "aggregations" : {
        "per_count" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [ {
            "key" : "java",
            "doc_count" : 2,
            "avg_price" : {
              "value" : 58.35
            }
          }, {
            "key" : "python",
            "doc_count" : 2,
            "avg_price" : {
              "value" : 67.95
            }
          }, {
            "key" : "javascript",
            "doc_count" : 1,
            "avg_price" : {
              "value" : 66.4
            }
          } ]
        }
      }
    }
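
    Nesting avg inside terms means "bucket first, then compute the metric per bucket". A minimal local sketch of that two-step logic, using the (language, price) pairs of the sample books:

```python
from collections import defaultdict

# (language, price) pairs of the five sample books.
books = [
    ("java", 70.20),
    ("java", 46.50),
    ("python", 81.40),
    ("python", 54.50),
    ("javascript", 66.40),
]

# Step 1: bucket the prices by language (the outer terms aggregation).
groups = defaultdict(list)
for language, price in books:
    groups[language].append(price)

# Step 2: average the prices inside each bucket (the nested avg aggregation).
avg_price = {language: sum(values) / len(values) for language, values in groups.items()}

for language, value in avg_price.items():
    print(f"{language}: doc_count={len(groups[language])}, avg_price={value:.2f}")
```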

    6. Sum

    Compute the total price of the five books:

    curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
    {
      "size": 0,
      "aggs": {
        "sum_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }'

    Response:

    {
      "took" : 6,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : 0.0,
        "hits" : [ ]
      },
      "aggregations" : {
        "sum_price" : {
          "value" : 319.0
        }
      }
    }

    7. Basic statistics

    The stats aggregation returns the count, minimum, maximum, average, and sum of a field in a single request:

    curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
    "size": 0,
    "aggs": {
        "grades_stats": {
            "stats": {
                "field": "price"
            }
        }
    }
    }'

    Response:

    {
      "took" : 2,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : 0.0,
        "hits" : [ ]
      },
      "aggregations" : {
        "grades_stats" : {
          "count" : 5,
          "min" : 46.5,
          "max" : 81.4,
          "avg" : 63.8,
          "sum" : 319.0
        }
      }
    }
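
    The five figures can be verified by hand from the sample prices. The sketch below computes them locally; because of floating-point rounding, the sum and average may differ from 319.0 and 63.8 in the last digits.

```python
import math

# Prices of the five sample books.
prices = [70.20, 46.50, 81.40, 54.50, 66.40]

# math.fsum limits accumulated floating-point error while summing.
total = math.fsum(prices)

stats = {
    "count": len(prices),
    "min": min(prices),
    "max": max(prices),
    "avg": total / len(prices),
    "sum": total,
}

for name, value in stats.items():
    print(f"{name}: {value}")
```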

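    8. Extended statistics

    The extended_stats aggregation additionally returns the sum of squares, variance, standard deviation, and standard deviation bounds:

    curl -XPOST "http://localhost:9200/books/_search?pretty" -d '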
    {
      "size": 0,
      "aggs": {
        "grades_stats": {
          "extended_stats": {
            "field": "price"
          }
        }
      }
    }
    '

    Result:

    {
      "took" : 3,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : 0.0,
        "hits" : [ ]
      },
      "aggregations" : {
        "grades_stats" : {
          "count" : 5,
          "min" : 46.5,
          "max" : 81.4,
          "avg" : 63.8,
          "sum" : 319.0,
          "sum_of_squares" : 21095.46,
          "variance" : 148.65199999999967,
          "std_deviation" : 12.19229264740638,
          "std_deviation_bounds" : {
            "upper" : 88.18458529481276,
            "lower" : 39.41541470518724
          }
        }
      }
    }
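
    The numbers in this response are consistent with a population variance (dividing by n, not n - 1) and with bounds placed two standard deviations either side of the mean. The sketch below recomputes them from the sample prices under that assumption; it is a cross-check, not a description of how Elasticsearch implements the aggregation.

```python
import math

# Prices of the five sample books.
prices = [70.20, 46.50, 81.40, 54.50, 66.40]

n = len(prices)
mean = math.fsum(prices) / n
sum_of_squares = math.fsum(p * p for p in prices)

# Population variance: mean of the squares minus the square of the mean.
variance = sum_of_squares / n - mean ** 2
std_deviation = math.sqrt(variance)

# Bounds at two standard deviations from the mean (assumed sigma = 2).
upper = mean + 2 * std_deviation
lower = mean - 2 * std_deviation

print(f"sum_of_squares={sum_of_squares:.2f}")
print(f"variance={variance:.3f} std_deviation={std_deviation:.5f}")
print(f"bounds=[{lower:.5f}, {upper:.5f}]")
```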

    9. Percentiles

    curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
    {
        "size": 0,
        "aggs": {
            "load_time_outlier": {
                "percentiles": {
                    "field": "year"
                }
            }
        }
    }
    '

    Response:

    {
      "took" : 3,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : 0.0,
        "hits" : [ ]
      },
      "aggregations" : {
        "load_time_outlier" : {
          "values" : {
            "1.0" : 2007.2,
            "5.0" : 2008.0000000000002,
            "25.0" : 2012.0,
            "50.0" : 2012.0,
            "75.0" : 2014.0,
            "95.0" : 2015.6000000000001,
            "99.0" : 2015.92
          }
        }
      }
    }
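
    Elasticsearch estimates percentiles with an approximation algorithm, so results on large datasets are not exact. For these five year values, however, the response above coincides with plain linear interpolation between neighbouring sorted values, which the following sketch reproduces. The percentile function is a hypothetical helper written for this illustration.

```python
def percentile(sorted_values, p):
    """Linearly interpolate the p-th percentile (0-100) of an ascending list."""
    pos = (len(sorted_values) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


# "year" values of the five sample books, sorted ascending.
years = sorted([2007, 2012, 2016, 2014, 2012])

# The default percentile levels shown in the response above.
result = {p: percentile(years, p) for p in (1, 5, 25, 50, 75, 95, 99)}

for p, value in result.items():
    print(f"{p}%: {value:.2f}")
```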

    10. Range buckets

    Count the books priced below 50, from 50 up to (but not including) 80, and 80 or above:

    curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
        "size": 0,
        "aggs": {
            "price_ranges": {
                "range": {
                    "field": "price",
                    "ranges": [{
                        "to": 50
                    }, {
                        "from": 50,
                        "to": 80
                    }, {
                        "from": 80
                    }]
                }
            }
        }
    }
    '

    Response:

    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : 0.0,
        "hits" : [ ]
      },
      "aggregations" : {
        "price_ranges" : {
          "buckets" : [ {
            "key" : "*-50.0",
            "to" : 50.0,
            "to_as_string" : "50.0",
            "doc_count" : 1
          }, {
            "key" : "50.0-80.0",
            "from" : 50.0,
            "from_as_string" : "50.0",
            "to" : 80.0,
            "to_as_string" : "80.0",
            "doc_count" : 3
          }, {
            "key" : "80.0-*",
            "from" : 80.0,
            "from_as_string" : "80.0",
            "doc_count" : 1
          } ]
        }
      }
    }
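
    In a range aggregation each bucket includes its from value and excludes its to value, so a price of exactly 50 would land in the second bucket and exactly 80 in the third. A small local sketch of that bucketing rule, with None standing in for an open-ended bound (range_counts is a hypothetical helper name):

```python
def range_counts(values, ranges):
    """Count values per (lower, upper) range; lower is inclusive, upper exclusive."""
    counts = []
    for lower, upper in ranges:
        n = sum(
            1
            for v in values
            if (lower is None or v >= lower) and (upper is None or v < upper)
        )
        counts.append(n)
    return counts


# Prices of the five sample books.
prices = [70.20, 46.50, 81.40, 54.50, 66.40]

# Same three ranges as the query above; None marks a missing bound.
ranges = [(None, 50), (50, 80), (80, None)]

counts = range_counts(prices, ranges)
print(counts)
```

    Dividing each count by the number of books would turn the counts into shares if percentages are needed.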

    Reposted from: http://blog.csdn.net/napoay/article/details/53484730

  • Original article: https://www.cnblogs.com/zhangmingcheng/p/7590682.html