• hive的基本用法(2)


    hive 窗口分析函数
    
    0: jdbc:hive2://localhost:10000> select * from t_access;
    +----------------+---------------------------------+-----------------------+--------------+--+
    |  t_access.ip   |          t_access.url           | t_access.access_time  | t_access.dt  |
    +----------------+---------------------------------+-----------------------+--------------+--+
    | 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20   | 20170804     |
    | 192.168.33.3   | http://www.edu360.cn/teach      | 2017-08-04 15:35:20   | 20170804     |
    | 192.168.33.4   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20   | 20170804     |
    | 192.168.33.4   | http://www.edu360.cn/job        | 2017-08-04 16:30:20   | 20170804     |
    | 192.168.33.5   | http://www.edu360.cn/job        | 2017-08-04 15:40:20   | 20170804     |
    | 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-05 15:30:20   | 20170805     |
    | 192.168.44.3   | http://www.edu360.cn/teach      | 2017-08-05 15:35:20   | 20170805     |
    | 192.168.33.44  | http://www.edu360.cn/stu        | 2017-08-05 15:30:20   | 20170805     |
    | 192.168.33.46  | http://www.edu360.cn/job        | 2017-08-05 16:30:20   | 20170805     |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-05 15:40:20   | 20170805     |
    | 192.168.133.3  | http://www.edu360.cn/register   | 2017-08-06 15:30:20   | 20170806     |
    | 192.168.111.3  | http://www.edu360.cn/register   | 2017-08-06 15:35:20   | 20170806     |
    | 192.168.34.44  | http://www.edu360.cn/pay        | 2017-08-06 15:30:20   | 20170806     |
    | 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20   | 20170806     |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20   | 20170806     |
    | 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20   | 20170806     |
    | 192.168.33.25  | http://www.edu360.cn/job        | 2017-08-06 15:40:20   | 20170806     |
    | 192.168.33.36  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20   | 20170806     |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20   | 20170806     |
    +----------------+---------------------------------+-----------------------+--------------+--+
    
    ## LAG函数
    select ip,url,access_time,
    row_number() over(partition by ip order by access_time) as rn,
    lag(access_time,1,0) over(partition by ip order by access_time)as last_access_time
    from t_access;
    
    +----------------+---------------------------------+----------------------+-----+----------------------+--+
    |       ip       |               url               |     access_time      | rn  |   last_access_time   |
    +----------------+---------------------------------+----------------------+-----+----------------------+--+
    | 192.168.111.3  | http://www.edu360.cn/register   | 2017-08-06 15:35:20  | 1   | 0                    |
    | 192.168.133.3  | http://www.edu360.cn/register   | 2017-08-06 15:30:20  | 1   | 0                    |
    | 192.168.33.25  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 1   | 0                    |
    | 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | 0                    |
    | 192.168.33.3   | http://www.edu360.cn/teach      | 2017-08-04 15:35:20  | 2   | 2017-08-04 15:30:20  |
    | 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 3   | 2017-08-04 15:35:20  |
    | 192.168.33.36  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 1   | 0                    |
    | 192.168.33.4   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | 0                    |
    | 192.168.33.4   | http://www.edu360.cn/job        | 2017-08-04 16:30:20  | 2   | 2017-08-04 15:30:20  |
    | 192.168.33.44  | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 1   | 0                    |
    | 192.168.33.46  | http://www.edu360.cn/job        | 2017-08-05 16:30:20  | 1   | 0                    |
    | 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 2   | 2017-08-05 16:30:20  |
    | 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 3   | 2017-08-06 16:30:20  |
    | 192.168.33.5   | http://www.edu360.cn/job        | 2017-08-04 15:40:20  | 1   | 0                    |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-05 15:40:20  | 1   | 0                    |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 2   | 2017-08-05 15:40:20  |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 3   | 2017-08-06 15:40:20  |
    | 192.168.34.44  | http://www.edu360.cn/pay        | 2017-08-06 15:30:20  | 1   | 0                    |
    | 192.168.44.3   | http://www.edu360.cn/teach      | 2017-08-05 15:35:20  | 1   | 0                    |
    +----------------+---------------------------------+----------------------+-----+----------------------+--+
    
    
    ## LEAD函数
    select ip,url,access_time,
    row_number() over(partition by ip order by access_time) as rn,
    lead(access_time,1,0) over(partition by ip order by access_time)as last_access_time
    from t_access;
    +----------------+---------------------------------+----------------------+-----+----------------------+--+
    |       ip       |               url               |     access_time      | rn  |   last_access_time   |
    +----------------+---------------------------------+----------------------+-----+----------------------+--+
    | 192.168.111.3  | http://www.edu360.cn/register   | 2017-08-06 15:35:20  | 1   | 0                    |
    | 192.168.133.3  | http://www.edu360.cn/register   | 2017-08-06 15:30:20  | 1   | 0                    |
    | 192.168.33.25  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 1   | 0                    |
    | 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | 2017-08-04 15:35:20  |
    | 192.168.33.3   | http://www.edu360.cn/teach      | 2017-08-04 15:35:20  | 2   | 2017-08-05 15:30:20  |
    | 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 3   | 0                    |
    | 192.168.33.36  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 1   | 0                    |
    | 192.168.33.4   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | 2017-08-04 16:30:20  |
    | 192.168.33.4   | http://www.edu360.cn/job        | 2017-08-04 16:30:20  | 2   | 0                    |
    | 192.168.33.44  | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 1   | 0                    |
    | 192.168.33.46  | http://www.edu360.cn/job        | 2017-08-05 16:30:20  | 1   | 2017-08-06 16:30:20  |
    | 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 2   | 2017-08-06 16:30:20  |
    | 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 3   | 0                    |
    | 192.168.33.5   | http://www.edu360.cn/job        | 2017-08-04 15:40:20  | 1   | 0                    |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-05 15:40:20  | 1   | 2017-08-06 15:40:20  |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 2   | 2017-08-06 15:40:20  |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 3   | 0                    |
    | 192.168.34.44  | http://www.edu360.cn/pay        | 2017-08-06 15:30:20  | 1   | 0                    |
    | 192.168.44.3   | http://www.edu360.cn/teach      | 2017-08-05 15:35:20  | 1   | 0                    |
    +----------------+---------------------------------+----------------------+-----+----------------------+--+
    
    
    ## FIRST_VALUE 函数
    例:取每个用户访问的第一个页面
    select ip,url,access_time,
    row_number() over(partition by ip order by access_time) as rn,
    first_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following)as last_access_time
    from t_access;
    +----------------+---------------------------------+----------------------+-----+---------------------------------+--+
    |       ip       |               url               |     access_time      | rn  |        last_access_time         |
    +----------------+---------------------------------+----------------------+-----+---------------------------------+--+
    | 192.168.111.3  | http://www.edu360.cn/register   | 2017-08-06 15:35:20  | 1   | http://www.edu360.cn/register   |
    | 192.168.133.3  | http://www.edu360.cn/register   | 2017-08-06 15:30:20  | 1   | http://www.edu360.cn/register   |
    | 192.168.33.25  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 1   | http://www.edu360.cn/job        |
    | 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | http://www.edu360.cn/stu        |
    | 192.168.33.3   | http://www.edu360.cn/teach      | 2017-08-04 15:35:20  | 2   | http://www.edu360.cn/stu        |
    | 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 3   | http://www.edu360.cn/stu        |
    | 192.168.33.36  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 1   | http://www.edu360.cn/excersize  |
    | 192.168.33.4   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | http://www.edu360.cn/stu        |
    | 192.168.33.4   | http://www.edu360.cn/job        | 2017-08-04 16:30:20  | 2   | http://www.edu360.cn/stu        |
    | 192.168.33.44  | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 1   | http://www.edu360.cn/stu        |
    | 192.168.33.46  | http://www.edu360.cn/job        | 2017-08-05 16:30:20  | 1   | http://www.edu360.cn/job        |
    | 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 2   | http://www.edu360.cn/job        |
    | 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 3   | http://www.edu360.cn/job        |
    | 192.168.33.5   | http://www.edu360.cn/job        | 2017-08-04 15:40:20  | 1   | http://www.edu360.cn/job        |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-05 15:40:20  | 1   | http://www.edu360.cn/job        |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 2   | http://www.edu360.cn/job        |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 3   | http://www.edu360.cn/job        |
    | 192.168.34.44  | http://www.edu360.cn/pay        | 2017-08-06 15:30:20  | 1   | http://www.edu360.cn/pay        |
    | 192.168.44.3   | http://www.edu360.cn/teach      | 2017-08-05 15:35:20  | 1   | http://www.edu360.cn/teach      |
    +----------------+---------------------------------+----------------------+-----+---------------------------------+--+
    
    ## LAST_VALUE 函数
    例:取每个用户访问的最后一个页面
    select ip,url,access_time,
    row_number() over(partition by ip order by access_time) as rn,
    last_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following)as last_access_time
    from t_access;
    +----------------+---------------------------------+----------------------+-----+---------------------------------+--+
    |       ip       |               url               |     access_time      | rn  |        last_access_time         |
    +----------------+---------------------------------+----------------------+-----+---------------------------------+--+
    | 192.168.111.3  | http://www.edu360.cn/register   | 2017-08-06 15:35:20  | 1   | http://www.edu360.cn/register   |
    | 192.168.133.3  | http://www.edu360.cn/register   | 2017-08-06 15:30:20  | 1   | http://www.edu360.cn/register   |
    | 192.168.33.25  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 1   | http://www.edu360.cn/job        |
    | 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | http://www.edu360.cn/stu        |
    | 192.168.33.3   | http://www.edu360.cn/teach      | 2017-08-04 15:35:20  | 2   | http://www.edu360.cn/stu        |
    | 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 3   | http://www.edu360.cn/stu        |
    | 192.168.33.36  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 1   | http://www.edu360.cn/excersize  |
    | 192.168.33.4   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | http://www.edu360.cn/stu        |
    | 192.168.33.4   | http://www.edu360.cn/job        | 2017-08-04 16:30:20  | 2   | http://www.edu360.cn/stu        |
    | 192.168.33.44  | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 1   | http://www.edu360.cn/stu        |
    | 192.168.33.46  | http://www.edu360.cn/job        | 2017-08-05 16:30:20  | 1   | http://www.edu360.cn/job        |
    | 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 2   | http://www.edu360.cn/job        |
    | 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 3   | http://www.edu360.cn/job        |
    | 192.168.33.5   | http://www.edu360.cn/job        | 2017-08-04 15:40:20  | 1   | http://www.edu360.cn/job        |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-05 15:40:20  | 1   | http://www.edu360.cn/job        |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 2   | http://www.edu360.cn/job        |
    | 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 3   | http://www.edu360.cn/job        |
    | 192.168.34.44  | http://www.edu360.cn/pay        | 2017-08-06 15:30:20  | 1   | http://www.edu360.cn/pay        |
    | 192.168.44.3   | http://www.edu360.cn/teach      | 2017-08-05 15:35:20  | 1   | http://www.edu360.cn/teach      |
    +----------------+---------------------------------+----------------------+-----+---------------------------------+--+
    
    
    /*
        累计报表--分析函数实现版
    */
    -- sum() over() 函数
    select id
    ,month
    ,sum(amount) over(partition by id order by month rows between unbounded preceding and current row)
    from
    (select id,month,
    sum(fee) as amount
    from t_test
    group by id,month) tmp;
  • 相关阅读:
    Linux命令行常用符号
    IP地址详解
    黑帽子白帽子问题
    点香测算时间
    Linux系统中的压缩和打包
    Python冒泡排序
    Python选择排序
    linux 文件三大特殊权限(SUID SGID SBIT)
    预处理器之头文件嵌套包含最大层数
    一不小心就对未分配的内存进行访问和修改
  • 原文地址:https://www.cnblogs.com/ywqtro/p/14043306.html
Copyright © 2020-2023  润新知