依赖数据:使用日活明细表dws_uv_detail_day作为DWS层数据
建表语句:
drop table if exists ads_continuity_uv_count; create external table ads_continuity_uv_count( `dt` string COMMENT '统计日期', `wk_dt` string COMMENT '最近7天日期', `continuity_count` bigint ) COMMENT '连续活跃设备数' row format delimited fields terminated by ' ' location '/warehouse/gmall/ads/ads_continuity_uv_count';
数据导入脚本:
①date_sub(dt,rank) date_dif,如果是连续的,则日期与排名的差值相等。
②group by mid_id,date_dif,一个设备id可能对应多种差值,所以需要按这两个字段分组。
③在最近7天中,注意要对前3天连续,后三天连续,中间一天断开的情况去重。
insert into table ads_continuity_uv_count select '2019-02-12', concat(date_add('2019-02-12',-6),'_','2019-02-12'), count(*) from ( select mid_id from ( select mid_id from ( select mid_id, date_sub(dt,rank) date_dif from ( select mid_id, dt, rank() over(partition by mid_id order by dt) rank from dws_uv_detail_day where dt>=date_add('2019-02-12',-6) and dt<='2019-02-12' )t1 )t2 group by mid_id,date_dif having count(*)>=3 )t3 group by mid_id )t4;