• 访问记录切割新会话


    需求,获取到的访问日志,现根据某用户间隔30分钟后为新的会话。
    测试环境:postgresql
     
    author : jackical
    datetime : 2019-01-04
      
    select *,sum(f1) over(order by timestamp_) from (
    select timestamp_,case when  timestamp_ - lag(timestamp_,1) over(order by timestamp_)  > 60*30 then 1 else 0 end as f1 from (
    select EXTRACT(epoch FROM CAST(data_lastchangetime AS TIMESTAMP)) as timestamp_ from bt_apppackages where data_lastchangetime>'2019-01-01' order by data_lastchangetime
    ) tab order by timestamp_  
    ) tab order by timestamp_ limit 100;
     
    解说,由于窗口函数在over子句中有两中的边界定义方式: 1, ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW 
                                                                                             2,  RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
     
     
     
    关于边界的使用:
    SELECT cookieid,
        createtime,
        pv,
        SUM(pv) OVER(PARTITION BY cookieid ORDER BY createtime) AS pv1, -- 默认为从起点到当前行
        SUM(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS pv2, --从起点到当前行,结果同pv1
        SUM(pv) OVER(PARTITION BY cookieid) AS pv3,    --分组内所有行
        SUM(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS pv4,  --当前行+往前3行
        SUM(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING) AS pv5,  --当前行+往前3行+往后1行
        SUM(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS pv6  ---当前行+往后所有行  
        FROM lxw1234;
         
        cookieid createtime     pv      pv1     pv2     pv3     pv4     pv5      pv6
        -----------------------------------------------------------------------------
        cookie1  2015-04-10      1       1       1       26      1       6       26
        cookie1  2015-04-11      5       6       6       26      6       13      25
        cookie1  2015-04-12      7       13      13      26      13      16      20
        cookie1  2015-04-13      3       16      16      26      16      18      13
        cookie1  2015-04-14      2       18      18      26      17      21      10
        cookie1  2015-04-15      4       22      22      26      16      20      8
        cookie1  2015-04-16      4       26      26      26      13      13      4
     
     
    关于边界定义心得:1,默认边界定义,在有order by时为 起点到当前行
              2,没有在over中指定排序时,则边界为当前窗口(所有行)
               3,貌似有些资料说hive 在默认的边界定认为 range between unbounded preceding and current row,不管是range还是row 都是从起始到当前行
     
     
    这种内容也不是第一次接触了,刚刚别人问了下,再想想比较困难,索性将结果与心得记下来。下次不必再去想,拿来就用。。
     
  • 相关阅读:
    [CF1198D] Rectangle Painting 1
    [CF696B] Puzzles
    [CF540D] Bad Luck Island
    [P1654] OSU!
    [P6154] 游走
    [CF1265E] Beautiful Mirrors
    [CF920F] SUM and REPLACE
    [CF453B] Little Pony and Harmony Chest
    [CF808D] Array Division
    [CF1155D] Beautiful Array
  • 原文地址:https://www.cnblogs.com/jackicalSong/p/10221053.html
Copyright © 2020-2023  润新知