>>>>英文版 (更简洁易懂)<<<<
转载自:https://dzone.com/articles/difference-between-rownumber
One of the most obvious and useful set of window functions are ranking functions where rows from your result set are ranked according to a certain scheme. There are three ranking functions:
ROW_NUMBER()
RANK()
DENSE_RANK()
The difference is easy to remember. For the examples, let’s assume we have this table (using PostgreSQL syntax):
CREATE TABLE t(v) AS SELECT * FROM ( VALUES('a'),('a'),('a'),('b'), ('c'),('c'),('d'),('e') ) t(v)
ROW_NUMBER()
… assigns unique numbers to each row within the PARTITION
given the ORDER BY
clause. So you’d get:
SELECT v, ROW_NUMBER() OVER() FROM t
Note that some SQL dialects (e.g. SQL Server) require an explicit ORDER BY
clause in the OVER()
clause:
SELECT v, ROW_NUMBER() OVER(ORDER BY v) FROM t
The above query returns:
RANK()
… behaves like ROW_NUMBER()
, except that “equal” rows are ranked the same. If we substitute RANK()
into our previous query:
SELECT v, RANK() OVER(ORDER BY v) FROM t
… then the result we’re getting is this:
As you can see, much like in a sports ranking, we have gaps between the different ranks. We can avoid those gaps by using
DENSE_RANK()
Trivially, DENSE_RANK()
is a rank with no gaps, i.e. it is “dense”. We can write:
SELECT v, DENSE_RANK() OVER(ORDER BY v) FROM t
… to obtain
One interesting aspect of DENSE_RANK()
is the fact that it “behaves like” ROW_NUMBER()
when we add the DISTINCT
keyword.
SELECT DISTINCT v, DENSE_RANK() OVER(ORDER BY v) FROM t
… to obtain
In fact, ROW_NUMBER()
prevents you from using DISTINCT
, because ROW_NUMBER()
generates unique values across the partition beforeDISTINCT
is applied:
SELECT DISTINCT v, ROW_NUMBER() OVER(ORDER BY v) FROM t ORDER BY 1, 2
DISTINCT
has no effect:
Putting it all together
A good way to understand the three ranking functions is to see them all in action side-by-side. Run this query
SELECT v, ROW_NUMBER() OVER(ORDER BY v), RANK() OVER(ORDER BY v), DENSE_RANK() OVER(ORDER BY v) FROM t ORDER BY 1, 2
… to obtain:
Note that unfortunately, the WINDOW
clause is not supported in all databases.
>>>>中文版<<<<
转载自:https://www.cnblogs.com/SunnyZhu/p/5762898.html
SqlServer的四种排序,当场写了几句Sql让她了解,现把相关Sql放上来。
首先,我们创建一些测试数据。
if OBJECT_ID('Tempdb.dbo.#Tmp') is not null drop table #Tmp create table #Tmp ( name nvarchar(10) ) insert into #Tmp select N'张三' union select N'李四' union select N'王五' union select N'赵六' union select N'朱七' union select N'王八' union all select N'张三'
最后一个union用union all,因为我们多一行"张三"。
一、ROW_NUMBER() over(partition by columnname order by columnname)
select ROW_NUMBER()over(order by name) as num,* from #Tmp
可以得到按name排序的结果集。
ROW_NUMBER() over()还有一种用法,可以针对某列进行分组排序。
下面结果可以看到张三有1和2两个排序,而其他的名字排序都只有1。
select ROW_NUMBER()over(partition by name order by name) as num,* from #Tmp
二、RANK()over(order by columnname)
大家可以从下面的结果集看到,结果集少了5的编号,而有两个4的编号,然后直接跳到编号6。
select RANK()over(order by name),* from #Tmp
三、DENSE_RANK()over(order by columnname)
select DENSE_RANK()over(order by name),* from #Tmp
执行Sql后发现,下面的结果集有2个编号4的行,紧接着就是编号5的行。
DENSE_RANK()函数和RANK()函数差不多。
RANK()函数不管分几组,最后的编号一定和行数相同。
DENSE_RANK()函数最后的编号和分组的数目有关。
四、NTILE()OVER(ORDER BY COLUMNNAME)
select NTILE(2)over(order by name),* from #Tmp select NTILE(3)over(order by name),* from #Tmp
NTILE后面的数字,是要把查询得到的结果平均分为几组。
如下图分为2和3组。
如果行数平均划分后还有余行,那么就把行分在最前面的几组上。
比如我们的结果有7行,要分为3组。
那么第一组3行,第二组2行,第三组2行。
如果我们结果有14行,平均分为3组。
那么第一组5行,第二组5行,第三组4行。
依此类推。
One of the most obvious and useful set of window functions are ranking functions where rows from your result set are ranked according to a certain scheme. There are three ranking functions:
ROW_NUMBER()
RANK()
DENSE_RANK()
The difference is easy to remember. For the examples, let’s assume we have this table (using PostgreSQL syntax):
ROW_NUMBER()
… assigns unique numbers to each row within the PARTITION
given the ORDER BY
clause. So you’d get:
Note that some SQL dialects (e.g. SQL Server) require an explicit ORDER BY
clause in the OVER()
clause:
The above query returns:
RANK()
… behaves like ROW_NUMBER()
, except that “equal” rows are ranked the same. If we substitute RANK()
into our previous query:
… then the result we’re getting is this:
As you can see, much like in a sports ranking, we have gaps between the different ranks. We can avoid those gaps by using
DENSE_RANK()
Trivially, DENSE_RANK()
is a rank with no gaps, i.e. it is “dense”. We can write:
… to obtain
One interesting aspect of DENSE_RANK()
is the fact that it “behaves like” ROW_NUMBER()
when we add the DISTINCT
keyword.
… to obtain
In fact, ROW_NUMBER()
prevents you from using DISTINCT
, because ROW_NUMBER()
generates unique values across the partition beforeDISTINCT
is applied:
DISTINCT
has no effect:
Putting it all together
A good way to understand the three ranking functions is to see them all in action side-by-side. Run this query
… or this one (using the SQL standard WINDOW
clause, to reuse window specifications):
… to obtain:
Note that unfortunately, the WINDOW
clause is not supported in all databases.