• Exploring the limitations of GROUP BY in MySQL (to be continued)


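    Here is the requirement: in the 2018 game recharge record table (main fields: user_name, money, game_id; about 60 million rows), find the users whose cumulative recharge amount within a given date range falls between 100 and 500 yuan, and return the user names of everyone who meets that condition.

    Table structure: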

    CREATE TABLE `pay_list_pay_2018` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `orderid` varchar(50) NOT NULL,
      `user_name` varchar(50) NOT NULL,
      `pay_way_id` tinyint(4) NOT NULL,
      `money` float NOT NULL,
      `paid_amount` float unsigned NOT NULL,
      `pay_date` date NOT NULL,
      `pay_time` int(11) NOT NULL DEFAULT '0',
      `agent_id` int(11) NOT NULL DEFAULT '0',
      `placeid` int(11) NOT NULL DEFAULT '0',
      `cplaceid` varchar(50) DEFAULT NULL,
      `adid` varchar(100) DEFAULT NULL,
      `game_id` int(11) NOT NULL DEFAULT '0',
      `server_id` int(11) NOT NULL DEFAULT '0',
      `reg_date` date NOT NULL,
      `reg_time` int(11) NOT NULL DEFAULT '0',
      `cid` tinyint(1) NOT NULL DEFAULT '0',
      `bank_type` int(11) NOT NULL DEFAULT '1',
      `plat_id` tinyint(1) NOT NULL DEFAULT '0',
      PRIMARY KEY (`id`),
      UNIQUE KEY `orderid` (`orderid`),
      KEY `agent_id` (`agent_id`),
      KEY `pay_date` (`pay_date`),
      KEY `game_id` (`game_id`),
      KEY `server_id` (`server_id`),
      KEY `user_name` (`user_name`),
      KEY `reg_date` (`reg_date`),
      KEY `placeid` (`placeid`),
      KEY `pay_way_id` (`pay_way_id`),
      KEY `plat_id` (`plat_id`),
      KEY `pay_time` (`pay_time`),
      KEY `reg_time` (`reg_time`),
      KEY `bank_type` (`bank_type`)
    ) ENGINE=InnoDB AUTO_INCREMENT=73555898 DEFAULT CHARSET=utf8
    

    A quick way to filter these users is GROUP BY combined with SUM:

    SELECT user_name, SUM(money) AS pay_money
    FROM db_pay.`pay_list_pay_2018`
    WHERE `game_id` IN (688, 919, 228, 179)
      AND `pay_time` BETWEEN 1545580800 AND 1545667200
    GROUP BY user_name
    HAVING pay_money BETWEEN 100 AND 500
    

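    However, when the pay_time range is wide or the list of game_id values is long, GROUP BY has to process too many groups (over 10 million), and the query takes a very long time.

    I then handled it with a PHP script: sum each user's recharge amount over 3-day windows, and accumulate user name => cumulative recharge amount in a PHP array: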

    /**
     * Compare the recharge sub-table against the total table.
     */
    public function comparePay() {
        $row_0 = $row_10 = $row_50 = $row_500 = 0;
        $user_names = []; // user_name => cumulative recharge amount
        $db = DB::getInstance('pay');
        $date1 = strtotime("2018-01-01");
        $date2 = strtotime("2018-01-06"); // treated as an exclusive end
        $step = 3600 * 24 * 3; // 3 days per window
        for ($s_date = $date1; $s_date < $date2; $s_date += $step) {
            // Clamp the last window so it does not run past $date2.
            $e_date = min($s_date + $step, $date2);
            // Half-open range: BETWEEN is inclusive on both ends, so a row
            // exactly on a boundary would be counted in two windows.
            $sql = "SELECT user_name,sum(money) as total  FROM db_pay.pay_list_pay_2018 where game_id in ('373','365','811','366','351','352','853','1260','988','1206','1232','883','871','872','881','963','884','1231','870','873','882','880','962','874','731','711','654','628','350','291','310','314','311')
    and pay_time >= {$s_date} and pay_time < {$e_date} group by user_name";
            $list = $db->find($sql);
            foreach ($list as $item) {
                $name = $item['user_name'];
                // Start unseen users at 0 to avoid an undefined-index notice.
                $user_names[$name] = ($user_names[$name] ?? 0) + $item['total'];
            }
        }
        foreach ($user_names as $total) {
            if ($total >= 0 && $total < 10) {
                $row_0++;
            } else if ($total >= 10 && $total < 50) {
                $row_10++;
            } else if ($total >= 50 && $total < 500) {
                $row_50++;
            } else if ($total >= 500) { // 500 and above
                $row_500++;
            }
        }
        echo "\n";
        echo count($user_names);
        echo "\n";
        echo json_encode(compact('row_0','row_10','row_50','row_500'));
    }
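
    One detail worth isolating is how the date range is cut into windows. Below is a minimal sketch, assuming half-open windows [start, end) so that a payment falling exactly on a boundary is counted only once, with the last window clamped to the overall end. The helper name splitIntoWindows is hypothetical and is not part of the original script.

    ```php
    <?php
    /**
     * Split [$start, $end) into consecutive half-open windows of at most $step seconds.
     * Hypothetical helper, not part of the original script.
     *
     * @return array list of [windowStart, windowEnd] pairs, end exclusive
     */
    function splitIntoWindows(int $start, int $end, int $step): array
    {
        if ($step <= 0) {
            throw new InvalidArgumentException('step must be positive');
        }
        $windows = [];
        for ($s = $start; $s < $end; $s += $step) {
            // Clamp the final window so it never extends past $end.
            $windows[] = [$s, min($s + $step, $end)];
        }
        return $windows;
    }

    // Usage: 5 days cut into 3-day windows gives one full and one clamped window.
    $day = 86400;
    $windows = splitIntoWindows(0, 5 * $day, 3 * $day);
    foreach ($windows as [$s, $e]) {
        // Each pair would feed a condition like: pay_time >= $s AND pay_time < $e
        echo "{$s} {$e}\n";
    }
    ```

    Every timestamp in the overall range then belongs to exactly one window, so summing the per-window totals gives the same result as one query over the whole range.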
    

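    In practice this approach is still quite slow: it merely shifts the GROUP BY workload onto PHP. When the date range is long, there is still room for optimization.

    Next, consider running the SQL concurrently with swoole, or processing the data some other way (to be continued).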
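
    The script above counts users per amount bucket, while the original requirement asks for the user names whose total lies between 100 and 500. A minimal sketch of that merge-and-filter step as pure functions follows. The function names and the sample rows are made up for illustration, and the rows are assumed to have the same shape as the per-window query result (user_name, total).

    ```php
    <?php
    /**
     * Add one window's rows (user_name, total) into the running per-user totals.
     * Hypothetical helper; $rows mimics what the per-window query returns.
     */
    function mergeWindowTotals(array $totals, array $rows): array
    {
        foreach ($rows as $row) {
            $name = $row['user_name'];
            // Start from 0 for users not seen in earlier windows.
            $totals[$name] = ($totals[$name] ?? 0) + (float) $row['total'];
        }
        return $totals;
    }

    /** Return the user names whose total lies in [$min, $max], both ends inclusive. */
    function usersInRange(array $totals, float $min, float $max): array
    {
        $names = [];
        foreach ($totals as $name => $total) {
            if ($total >= $min && $total <= $max) {
                // PHP turns numeric-looking array keys into ints, so cast back.
                $names[] = (string) $name;
            }
        }
        return $names;
    }

    // Made-up sample rows for two consecutive windows.
    $window1 = [['user_name' => 'alice', 'total' => '60'], ['user_name' => 'bob', 'total' => '30']];
    $window2 = [['user_name' => 'alice', 'total' => '70'], ['user_name' => 'carol', 'total' => '600']];

    $totals = mergeWindowTotals([], $window1);
    $totals = mergeWindowTotals($totals, $window2);
    $matched = usersInRange($totals, 100, 500);
    echo implode(',', $matched), "\n";
    ```

    The range filter can only run after all windows are merged, because a user may stay below 100 in every single window and still cross it in total. This is also why the PHP side has to keep every user in memory.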

  • Original post: https://www.cnblogs.com/chq3272991/p/10608468.html