• Hive面试题——累计求和


    需求:

    有如下访客访问次数统计表 t_access_times

    访客

    月份

    访问次数

    A

    2015-01

    5

    A

    2015-01

    15

    B

    2015-01

    5

    A

    2015-01

    8

    B

    2015-01

    25

    A

    2015-01

    5

    A

    2015-02

    4

    A

    2015-02

    6

    B

    2015-02

    10

    B

    2015-02

    5

    ……

    ……

    ……

    需要输出报表:t_access_times_accumulate

    访客

    月份

    月访问总计

    累计访问总计

    A

    2015-01

    33

    33

    A

    2015-02

    10

    43

    …….

    …….

    …….

    …….

    B

    2015-01

    30

    30

    B

    2015-02

    15

    45

    …….

    …….

    …….

    …….

    思路:

    1、第一步,先求个用户的月总金额

    select username,month,sum(salary) salary from t_access_times group by username,month;

    +-----------+----------+---------+--+
    | username | month | salary |
    +-----------+----------+---------+--+
    | A | 2015-01 | 33 |
    | A | 2015-02 | 10 |
    | B | 2015-01 | 30 |
    | B | 2015-02 | 15 |
    +-----------+----------+---------+--+

    2、第二步,将月总金额表 自己连接自己

    select A.*,B.*
    from 
    (select username,month,sum(salary) salary from t_access_times group by username,month) A 
    join 
    (select username,month,sum(salary) salary from t_access_times group by username,month) B
    on
    A.username=B.username;

    +-------------+----------+-----------+-------------+----------+-----------+--+
    | A.username | A.month | A.salary | B.username | B.month | B.salary |
    +-------------+----------+-----------+-------------+----------+-----------+--+
    | A | 2015-01 | 33 | A | 2015-01 | 33 |
    | A | 2015-01 | 33 | A | 2015-02 | 10 |
    | A | 2015-02 | 10 | A | 2015-01 | 33 |
    | A | 2015-02 | 10 | A | 2015-02 | 10 |
    | B | 2015-01 | 30 | B | 2015-01 | 30 |
    | B | 2015-01 | 30 | B | 2015-02 | 15 |
    | B | 2015-02 | 15 | B | 2015-01 | 30 |
    | B | 2015-02 | 15 | B | 2015-02 | 15 |
    +-------------+----------+-----------+-------------+----------+-----------+--+

    3、第三步,从上一步的结果中
    进行分组查询,分组的字段是A.username,A.month
    求月累计值: 将B.month <= A.month的所有B.salary求和即可

    select A.username,A.month,max(A.salary) salary,sum(B.salary) accumulate
    from 
    (select username,month,sum(salary) salary from t_access_times group by username,month) A 
    join 
    (select username,month,sum(salary) salary from t_access_times group by username,month) B
    on
    A.username=B.username
    where B.month <= A.month
    group by A.username,A.month
    order by A.username,A.month;
  • 相关阅读:
    数据结构与算法--绪论
    Django之模板(T)
    博客园之MD文件代码块添加隐藏/显示按钮
    博客园之背景特效
    博客园之生成侧边目录
    占位先1
    Django之视图(V)
    Django之ORM
    Django框架
    tomcat在centos下启动缓慢,耗时较长
  • 原文地址:https://www.cnblogs.com/DarrenChan/p/6786197.html
Copyright © 2020-2023  润新知