有50W个京东店铺,每个顾客访客访问任何一个店铺的任何一个商品时都会产生一条访问日志,访问日志存储的表名为Visit,访客的用户id为user_id,被访问的店铺名称为shop,请统计:
1)每个店铺的UV(访客数)
2)每个店铺访问次数top3的访客信息。输出店铺名称、访客id、访问次数
1 --建表 2 drop table if exists test_two; 3 create table test_two( 4 shoop_name string COMMENT '店铺名称', 5 user_id string COMMENT '用户id', 6 visit_time string COMMENT '访问时间' 7 ) 8 row format delimited fields terminated by ' ';
1 --插入数据 2 insert into table test_two values ('huawei','1001','2017-02-10'); 3 insert into table test_two values ('icbc','1001','2017-02-10'); 4 insert into table test_two values ('huawei','1001','2017-02-10'); 5 insert into table test_two values ('apple','1001','2017-02-10'); 6 insert into table test_two values ('huawei','1001','2017-02-10'); 7 insert into table test_two values ('huawei','1002','2017-02-10'); 8 insert into table test_two values ('huawei','1002','2017-02-10'); 9 insert into table test_two values ('huawei','1001','2017-02-10'); 10 insert into table test_two values ('huawei','1003','2017-02-10'); 11 insert into table test_two values ('huawei','1004','2017-02-10'); 12 insert into table test_two values ('huawei','1005','2017-02-10'); 13 insert into table test_two values ('icbc','1002','2017-02-10'); 14 insert into table test_two values ('jingdong','1006','2017-02-10'); 15 insert into table test_two values ('jingdong','1003','2017-02-10'); 16 insert into table test_two values ('jingdong','1002','2017-02-10'); 17 insert into table test_two values ('jingdong','1004','2017-02-10'); 18 insert into table test_two values ('apple','1001','2017-02-10'); 19 insert into table test_two values ('apple','1001','2017-02-10'); 20 insert into table test_two values ('apple','1001','2017-02-10'); 21 insert into table test_two values ('apple','1002','2017-02-10'); 22 insert into table test_two values ('apple','1002','2017-02-10'); 23 insert into table test_two values ('apple','1005','2017-02-10'); 24 insert into table test_two values ('apple','1005','2017-02-10'); 25 insert into table test_two values ('apple','1006','2017-02-10');
1 --1)每个店铺的UV(访客数) 2 select 3 shoop_name, 4 count(*) shoop_uv 5 from test_two 6 group by shoop_name 7 order by shoop_uv desc; 8 9 --2)每个店铺访问次数top3的访客信息。输出店铺名称、访客id、访问次数 10 select 11 shoop_name `商店名称`, 12 user_id `用户id`, 13 visit_time `访问次数`, 14 rank_vis `忠诚排名` 15 from 16 ( 17 select 18 shoop_name, 19 user_id, 20 visit_time, 21 row_number() over(partition by shoop_name order by visit_time desc) rank_vis 22 23 from 24 ( 25 select 26 shoop_name, 27 user_id, 28 count(*) visit_time 29 from test_two 30 group by shoop_name,user_id 31 ) t1 32 ) t2 33 where rank_vis<=3;