• Hive高阶聚合函数 GROUPING SETS、Cube、Rollup


    -- GROUPING SETS作为GROUP BY的子句,允许开发人员在GROUP BY语句后面指定多个统计选项,可以简单理解为多条group by语句通过union all把查询结果聚合起来结合起来。
    select 
         device_id
        ,os_id
        ,app_id
        ,count(user_id) 
    from test_xinyan_reg 
    group by device_id,os_id,app_id 
    grouping sets((device_id),(os_id),(device_id,os_id),())
    -- 等价于
    SELECT device_id,null,null,count(user_id) FROM test_xinyan_reg group by device_id UNION ALL 
    SELECT null,os_id,null,count(user_id) FROM test_xinyan_reg group by os_id UNION ALL 
    SELECT device_id,os_id,null,count(user_id) FROM test_xinyan_reg group by device_id,os_id UNION ALL 
    SELECT null,null,null,count(user_id) FROM test_xinyan_reg
    ;
    
    -- cube简称数据魔方,可以实现hive多个任意维度的查询,cube(a,b,c)则首先会对(a,b,c)进行group by,然后依次是(a,b),(a,c),(a),(b,c),(b),©,最后在对全表进行group by,他会统计所选列中值的所有组合的聚合
    -- cube即为grouping sets的简化过程函数
    select device_id,os_id,app_id,client_version,from_id,count(user_id)
    from test_xinyan_reg
    group by device_id,os_id,app_id,client_version,from_id with cube;
    
    
    -- rollup可以实现从右到做递减多级的统计,显示统计某一层次结构的聚合。
    select device_id,os_id,app_id,client_version,from_id,count(user_id)
    from test_xinyan_reg
    group by device_id,os_id,app_id,client_version,from_id with rollup;
    select
         id
        ,name
        ,count(age)
    from (
        select 1 as id, 'a' as name,11 as age union all
        select 2 as id, 'b' as name,12 as age union all
        select 3 as id, 'c' as name,13 as age union all
        select 4 as id, 'd' as name,14 as age union all
        select 4 as id, 'd' as name,15 as age union all
        select 4 as id, 'd' as name,16 as age union all
        select 4 as id, 'd' as name,17 as age union all
        select 4 as id, 'd' as name,18 as age
    ) t1
    group by
         id
        ,name
    with cube
    ;
    +------------+------------+------------+
    | id         | name       | _c2        |
    +------------+------------+------------+
    | NULL       | NULL       | 8          |
    | NULL       | a          | 1          |
    | NULL       | b          | 1          |
    | NULL       | c          | 1          |
    | NULL       | d          | 5          |
    | 1          | NULL       | 1          |
    | 1          | a          | 1          |
    | 2          | NULL       | 1          |
    | 2          | b          | 1          |
    | 3          | NULL       | 1          |
    | 3          | c          | 1          |
    | 4          | NULL       | 5          |
    | 4          | d          | 5          |
    +------------+------------+------------+
    
    select
         id
        ,name
        ,count(age)
    from (
        select 1 as id, 'a' as name,11 as age union all
        select 2 as id, 'b' as name,12 as age union all
        select 3 as id, 'c' as name,13 as age union all
        select 4 as id, 'd' as name,14 as age union all
        select 4 as id, 'd' as name,15 as age union all
        select 4 as id, 'd' as name,16 as age union all
        select 4 as id, 'd' as name,17 as age union all
        select 4 as id, 'd' as name,18 as age
    ) t1
    group by
         id
        ,name
    with rollup
    ;
    +------------+------------+------------+
    | id         | name       | _c2        |
    +------------+------------+------------+
    | NULL       | NULL       | 8          |
    | 1          | NULL       | 1          |
    | 1          | a          | 1          |
    | 2          | NULL       | 1          |
    | 2          | b          | 1          |
    | 3          | NULL       | 1          |
    | 3          | c          | 1          |
    | 4          | NULL       | 5          |
    | 4          | d          | 5          |
    +------------+------------+------------+

    ref: https://blog.csdn.net/qq_31573519/article/details/89054136

  • 相关阅读:
    spring事务详解(一)初探讨
    spring事务详解(二)实例
    mysql删除重复数据只保留一条
    jdk1.8源码Synchronized及其实现原理
    jdk1.8源码Thread与Runnable区别
    动态创建类并添加属性赋值
    c# 当代热门技术
    c# 开源框架
    极限并发带来的思考(12306 抢票)
    ASP.NET Core 使用外部登陆提供程序登陆的流程,以及身份认证的流程 (Challenge)
  • 原文地址:https://www.cnblogs.com/chenzechao/p/11273980.html
Copyright © 2020-2023  润新知