MySQL: Row Numbers, Per-Group Row Numbers, and Top-N Records in SQL
select *
from (
    SELECT
        -- overall row number
        (@rowNum := @rowNum + 1) as rowNum
        , a.col1
        -- per-group row number: reset to 1 whenever the current row's
        -- group key differs from the previous row's group key
        , case when @groupItem != a.col1 then @groupRowNum := 1
               else @groupRowNum := @groupRowNum + 1
          end as groupRowNum
        -- remembers the current group key so that the groupRowNum
        -- expression can detect a change of group on the next row
        , case when @groupItem != a.col1 then @groupItem := a.col1
               else round(@groupItem, 0)
          end as groupItem
        , col2
        , num
    FROM (
        SELECT col1, col2, COUNT(*) as num
        FROM tb_test
        GROUP BY col1, col2
    ) a
    inner join (select @rowNum := 0 as rowNum) t1  -- initialize the overall row number variable
    inner join (select @groupRowNum := 0) t2       -- initialize the per-group row number variable
    inner join (select @groupItem := -1) t3        -- initialize the group key variable
    where 1 = 1
    order by a.col1, num desc  -- sort order that drives the per-group row number
    limit 100000               -- ORDER BY alone has no effect in a subquery, so a LIMIT is added
) x
where 1 = 1
  and groupRowNum <= 3  -- top-N-per-group filter
;
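To make the counter logic of the query above easier to follow, here is a minimal Python sketch that emulates it in a single pass over sorted rows. It is an illustration only, not MySQL itself: `SAMPLE_ROWS` is hypothetical data standing in for `tb_test`, and `top_n_per_group` is a made-up helper name.

```python
from collections import Counter

# Hypothetical sample standing in for tb_test: one (col1, col2) tuple per row.
SAMPLE_ROWS = (
    [(1, "a")] * 4 + [(1, "b")] * 3 + [(1, "c")] * 2 + [(1, "d")]
    + [(2, "x")] * 2 + [(2, "y")]
)


def top_n_per_group(rows, n):
    """Emulate the user-variable query: number rows, restart per col1, keep top n."""
    # Derived table a: GROUP BY col1, col2 with COUNT(*) AS num.
    grouped = [(col1, col2, num) for (col1, col2), num in Counter(rows).items()]
    # ORDER BY a.col1, num DESC.
    grouped.sort(key=lambda r: (r[0], -r[2]))

    row_num = 0        # plays the role of @rowNum
    group_row_num = 0  # plays the role of @groupRowNum
    group_item = None  # plays the role of @groupItem (the SQL uses -1 as sentinel)
    result = []
    for col1, col2, num in grouped:
        row_num += 1
        if group_item != col1:
            # New group: restart the per-group counter and remember the key.
            group_row_num = 1
            group_item = col1
        else:
            group_row_num += 1
        if group_row_num <= n:  # outer filter: groupRowNum <= n
            result.append((row_num, col1, group_row_num, col2, num))
    return result


TOP3 = top_n_per_group(SAMPLE_ROWS, 3)
for row in TOP3:
    print(row)  # (rowNum, col1, groupRowNum, col2, num)
```

With this sample, group 1 has four distinct col2 values, so its fourth row (overall row number 4) is filtered out, while the overall row numbers of group 2 continue at 5 and 6.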
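Note:
In MySQL 5.7 and later, it is best to avoid ORDER BY inside a subquery.
Official explanation:
Section 8.2.2.1 of the MySQL 5.7 manual explains:
Subqueries are optimized with semi-join strategies ("The optimizer uses semi-join strategies to improve subquery execution").
To be optimized as a semi-join, a subquery must meet certain criteria ("In MySQL, a subquery must satisfy these criteria to be handled as a semi-join").
One of these criteria is that the subquery must not combine ORDER BY with LIMIT ("It must not have ORDER BY with LIMIT.").
1. If a subquery contains both ORDER BY and LIMIT, the ORDER BY is not ignored.
Queries written this way are very slow (the exact cause is unknown); it is better to put the ORDER BY in the parent query.
2. If a subquery contains only ORDER BY, the ORDER BY is ignored.
The approach above is therefore only suitable for everyday offline data analysis.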
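One way to keep ORDER BY and LIMIT out of the subquery altogether is to use window functions. This is a swapped-in technique, not the method used above, and it assumes MySQL 8.0 or later, where `ROW_NUMBER() OVER (...)` is available. The sketch below runs the same SQL against an in-memory SQLite database (3.25 or later) purely as a stand-in so that it is runnable without a MySQL server; the sample data and the helper name `window_top_n` are hypothetical.

```python
import sqlite3

# Hypothetical sample standing in for tb_test: one (col1, col2) tuple per row.
SAMPLE_ROWS = (
    [(1, "a")] * 4 + [(1, "b")] * 3 + [(1, "c")] * 2 + [(1, "d")]
    + [(2, "x")] * 2 + [(2, "y")]
)

# Window-function variant: no user variables, no ORDER BY + LIMIT in a subquery.
WINDOW_SQL = """
SELECT rowNum, col1, groupRowNum, col2, num
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY col1, num DESC) AS rowNum,
           col1,
           ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY num DESC) AS groupRowNum,
           col2,
           num
    FROM (
        SELECT col1, col2, COUNT(*) AS num
        FROM tb_test
        GROUP BY col1, col2
    ) a
) x
WHERE groupRowNum <= ?
ORDER BY col1, num DESC
"""


def window_top_n(rows, n):
    """Load rows into an in-memory table and return the top n rows per col1."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tb_test (col1 INTEGER, col2 TEXT)")
        conn.executemany("INSERT INTO tb_test (col1, col2) VALUES (?, ?)", rows)
        return conn.execute(WINDOW_SQL, (n,)).fetchall()
    finally:
        conn.close()


WINDOW_TOP3 = window_top_n(SAMPLE_ROWS, 3)
for row in WINDOW_TOP3:
    print(row)  # (rowNum, col1, groupRowNum, col2, num)
```

The final ORDER BY sits in the outermost query, which matches the advice in the notes above. Rows with equal `num` inside one group would be numbered in an unspecified order; the sample avoids such ties.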
Top 1
select b.col1
     , max(b.col2) as col2  -- if several col2 values tie for the highest num, keep the largest col2
     , num
from (
    SELECT col1, col2, COUNT(1) as num
    FROM tb_test
    GROUP BY col1, col2
) b
where 1 = 1
  -- keep a row only if no other row in the same col1 group has a larger num
  and not exists (
      select 1
      from (
          SELECT col1, col2, COUNT(1) as num
          FROM tb_test
          GROUP BY col1, col2
      ) c
      where 1 = 1
        and b.col1 = c.col1
        and b.num < c.num
  )
group by b.col1, num
order by col1
;
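The Top 1 query uses only portable SQL, so it can be tried on a small sample. The sketch below runs it against an in-memory SQLite database as a stand-in for MySQL (an assumption made so that the example is runnable without a server); the sample data and the helper name `top_1_per_group` are hypothetical. Group 2 contains a deliberate tie to show the effect of `max(b.col2)`.

```python
import sqlite3

# Hypothetical sample standing in for tb_test. In group 2, "x" and "y" tie at num = 2.
SAMPLE_ROWS = (
    [(1, "a")] * 4 + [(1, "b")] * 3 + [(1, "c")]
    + [(2, "x")] * 2 + [(2, "y")] * 2 + [(2, "z")]
)

TOP1_SQL = """
SELECT b.col1, MAX(b.col2) AS col2, b.num
FROM (
    SELECT col1, col2, COUNT(1) AS num
    FROM tb_test
    GROUP BY col1, col2
) b
WHERE NOT EXISTS (
    SELECT 1
    FROM (
        SELECT col1, col2, COUNT(1) AS num
        FROM tb_test
        GROUP BY col1, col2
    ) c
    WHERE b.col1 = c.col1
      AND b.num < c.num
)
GROUP BY b.col1, b.num
ORDER BY b.col1
"""


def top_1_per_group(rows):
    """Return one (col1, col2, num) row per col1: the col2 with the highest count."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tb_test (col1 INTEGER, col2 TEXT)")
        conn.executemany("INSERT INTO tb_test (col1, col2) VALUES (?, ?)", rows)
        return conn.execute(TOP1_SQL).fetchall()
    finally:
        conn.close()


TOP1 = top_1_per_group(SAMPLE_ROWS)
for row in TOP1:
    print(row)  # (col1, col2, num)
```

Because both "x" and "y" survive the NOT EXISTS filter in group 2, the outer GROUP BY with MAX collapses them into a single row with col2 = "y".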
Top N by num
Note: to get the Top N, one more GROUP BY is needed.
select a.col1, a.col2, a.num, count(*)  # *
from (
    SELECT col1, col2, COUNT(1) as num
    FROM tb_test
    where 1 = 1
    GROUP BY col1, col2
) a
left join (
    SELECT col1, col2, COUNT(1) as num
    FROM tb_test
    where 1 = 1
    GROUP BY col1, col2
) b
    -- match each row with the rows of the same col1 group that have a larger num
    on a.col1 = b.col1
   and a.num < b.num
where 1 = 1
group by a.col1, a.col2, a.num
having count(b.col1) < 2  -- N = 2: keep rows that have fewer than 2 rows ranked above them
order by a.col1, a.num desc
;
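The self-join approach can also be checked on a small sample. The sketch below joins the two derived tables on col1 on both sides, which is assumed to be the intended join key, and passes N as a parameter instead of hard-coding 2. It runs against an in-memory SQLite database as a stand-in for MySQL; the sample data and the helper name `top_n_by_join` are hypothetical.

```python
import sqlite3

# Hypothetical sample standing in for tb_test: one (col1, col2) tuple per row.
SAMPLE_ROWS = (
    [(1, "a")] * 4 + [(1, "b")] * 3 + [(1, "c")] * 2 + [(1, "d")]
    + [(2, "x")] * 2 + [(2, "y")]
)

TOPN_SQL = """
SELECT a.col1, a.col2, a.num
FROM (
    SELECT col1, col2, COUNT(1) AS num
    FROM tb_test
    GROUP BY col1, col2
) a
LEFT JOIN (
    SELECT col1, col2, COUNT(1) AS num
    FROM tb_test
    GROUP BY col1, col2
) b
    ON a.col1 = b.col1
   AND a.num < b.num
GROUP BY a.col1, a.col2, a.num
HAVING COUNT(b.col1) < ?
ORDER BY a.col1, a.num DESC
"""


def top_n_by_join(rows, n):
    """Keep each (col1, col2, num) row that has fewer than n larger counts in its group."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tb_test (col1 INTEGER, col2 TEXT)")
        conn.executemany("INSERT INTO tb_test (col1, col2) VALUES (?, ?)", rows)
        return conn.execute(TOPN_SQL, (n,)).fetchall()
    finally:
        conn.close()


TOP2 = top_n_by_join(SAMPLE_ROWS, 2)
for row in TOP2:
    print(row)  # (col1, col2, num)
```

`COUNT(b.col1)` counts only matched rows, so a row with nothing ranked above it gets 0 even though the LEFT JOIN still produces one output row for it. Rows that tie on `num` all see the same number of larger counts, so a group can return more than N rows when ties occur.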