• 大数据第41天—Mysql练习题12-游戏玩法分析-杨大伟


    需求一:写一条 SQL 查询语句获取每位玩家 第一次登陆平台的日期。

    Activity表:显示了某些游戏的玩家的活动情况。

    player_iddevice_idevent_dategames_played
    1 2 2016-03-01 5
    1 2 2016-05-02 6
    2 3 2017-06-25 1
    3 1 2016-03-02 0
    3 4 2018-07-03 5

    展示效果:

    player_idfirst_login
    1 2016-03-01
    2 2017-06-25
    3 2016-03-02
    1 Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int);
    2 
    3 insert into Activity (player_id, device_id, event_date, games_played) values (1, 2, '2016-03-01', 5);
    4 insert into Activity (player_id, device_id, event_date, games_played) values (1, 2, '2016-05-02', 6);
    5 insert into Activity (player_id, device_id, event_date, games_played) values (2, 3, '2017-06-25', 1);
    6 insert into Activity (player_id, device_id, event_date, games_played) values (3, 1, '2016-03-02', 0);
    7 insert into Activity (player_id, device_id, event_date, games_played) values (3, 4, '2018-07-03', 5);

    最终SQL:

    1 select 
    2       player_id, 
    3       min(event_date) as first_login 
    4 from 
    5       Activity 
    6 group by 
    7       player_id;

    需求二:描述每一个玩家首次登陆的设备名称

    player_iddevice_id
    1 2
    2 3
    3 1

    最终SQL:

     1 select 
     2       player_id,
     3       device_id 
     4 from
     5      (select *  
     6       from 
     7           Activity
     8       where
     9           (player_id,event_date) in (select
    10                                            player_id, 
    11                                            device_id
    12                                            min(event_date)
    13                                       from
    14                                            Activity 
    15                                       group by 
    16                                            player_id
    17                                       )
    18       ) as t;

    需求三:编写一个 SQL 查询,同时报告每组玩家和日期,以及玩家到目前为止玩了多少游戏。也就是说,在此日期之前玩家所玩的游戏总数。详细情况请查看示例。

    player_idevent_dategames_played_so_far
    1 2016-03-01 5
    1 2016-05-02 11
    2 2017-06-25 1
    3 2016-03-02 0
    3 2018-07-03 5

    提示:对于 ID 为 3 的玩家,2018-07-03 共玩了 0+5=5 个游戏。

    最终SQL:

     1 //方法一
     2 SELECT C.player_id,C.event_date,C.games_played_so_far
     3 FROM (
     4       SELECT 
     5           A.player_id,
     6           A.event_date,
     7           @sum_cnt:=
     8               if(A.player_id = @pre_id AND A.event_date != @pre_date,
     9                   @sum_cnt + A.games_played,
    10                   A.games_played 
    11               )
    12               AS `games_played_so_far`,
    13           @pre_id:=A.player_id AS `player_ids`,
    14           @pre_date:=A.event_date AS `event_dates`
    15       FROM 
    16           activity AS A,
    17           (SELECT @pre_id:=NULL,@pre_date:=NULL,@sum_cnt:=0) AS B
    18       order BY 
    19           A.player_id,A.event_date
    20 ) AS C
    21 
    22 //方法二
    23 SELECT 
    24       B.player_id,
    25       B.event_date,
    26       SUM(A.games_played) AS `games_played_so_far`
    27 FROM 
    28       Activity AS A
    29 JOIN 
    30       Activity AS B 
    31 ON 
    32       A.player_id = B.player_id 
    33       AND A.event_date <= B.event_date
    34 GROUP BY 
    35       B.player_id,B.event_date;

    需求四:编写一个 SQL 查询,报告在首次登录的第二天再次登录的玩家的百分比,四舍五入到小数点后两位。换句话说,您需要计算从首次登录日期开始至少连续两天登录的玩家的数量,然后除以玩家总数。

    fraction
    0.00

    提示:对于 ID 为 1 的玩家,2016-05-02 共玩了 5+6=11 个游戏. 对于 ID 为 3 的玩家,2018-07-03 共玩了 0+5=5 个游戏。

    最终SQL:

     1 select 
     2       round(
     3             sum(case when datediff(a.event_date,b.first_date)=1 then 1 else 0 end)
     4                /
     5                (select count(distinct(player_id)) from activity)
     6             ,2 ) as fraction
     7 from 
     8       activity a,
     9      (select 
    10              player_id,
    11              min(event_date) first_date 
    12       from 
    13              activity 
    14       group by 
    15              player_id
    16      ) b
    17 where 
    18       a.player_id=b.player_id;

    需求五:编写一个 SQL 查询,报告每个安装日期、当天安装游戏的玩家数量和第一天的保留时间。

    install_dtinstallsDay1_retention
    2016-03-01 2 0.50
    2017-06-25 1 0.00

    提示:玩家 1 和 3 在 2016-03-01 安装了游戏,但只有玩家 1 在 2016-03-02 重新登录,所以 2016-03-01 的第一天保留时间是 1/2=0.50 玩家 2 在 2017-06-25 安装了游戏,但在 2017-06-26 没有重新登录,因此 2017-06-25 的第一天保留为 0/1=0.00

    最终SQL:

     1 #方法一
     2 SELECT
     3       A.install_date,
     4       COUNT(A.player_id) AS `installs`,
     5       COUNT(AA.player_id) AS `return_cnt`
     6 FROM 
     7    (SELECT 
     8            player_id,
     9            MIN(event_date) AS `install_date`
    10     FROM 
    11            Activity
    12     GROUP BY 
    13            player_id
    14     ) AS A
    15 left JOIN 
    16     Activity AS AA 
    17 ON 
    18     AA.event_date = DATE_ADD(A.install_date,INTERVAL 1 DAY) AND AA.player_id = A.player_id
    19 GROUP BY
    20     A.install_date;
    21 
    22 #方法二
    23 SELECT 
    24       A.event_date AS `install_dt`,
    25       COUNT(A.player_id) AS `installs`,
    26       round(COUNT(C.player_id)/COUNT(A.player_id),2) AS `Day1_retention`
    27 FROM
    28       Activity AS A 
    29 left JOIN 
    30       Activity AS B
    31 ON 
    32       A.player_id = B.player_id AND A.event_date > B.event_date
    33 left JOIN 
    34       Activity AS C
    35 ON    
    36       A.player_id = C.player_id AND C.event_date = DATE_ADD(A.event_date,INTERVAL 1 DAY)
    37 WHERE 
    38       B.event_date IS NULL
    39 GROUP BY 
    40       A.event_date;
  • 相关阅读:
    CentOS7局域网下安装离线Ambari
    虚拟机怎么发送ctrl+alt+delete组合键
    RedHat6.5创建本地yum源
    RedHat7安装vmware虚拟机启动报错
    Spark基本术语表+基本架构+基本提交运行模式
    Spark官网资料学习网址
    大数据开源组件图谱
    HADOOP1.X中HDFS工作原理
    大数据时代——为什么用HADOOP?
    Linux Shell脚本中获取本机ip地址方法
  • 原文地址:https://www.cnblogs.com/shui68home/p/13525961.html
Copyright © 2020-2023  润新知