oracle爱好者和群snowg的问题
上面的这个,有站点stationid,year,month,day和每天记录的day_tmin字段。
现在要求统计处每个stationid下面每月每日的最小day_tmin字段,因为不关注year,所以sql这样写
select stationid, month,day,min(day_tmin) tmin from history.history_day_tem_new group by stationid month,day
然后要求在上述结果行增加年,需要知道是那一年的这个月日的最低温度.
select aa.stationid,aa.year,aa.month,aa.day,bb.tmin from (select stationid,year, month,day,day_tmin from history.history_day_tem_new )aa, (select stationid, month,day,min(day_tmin) tmin from history.history_day_tem_new group by stationid month,day ) bb where aa.stationid=bb.stationid and aa.month=bb.month and aa.day=bb.day and aa.day_tmin=bb.tmin order by aa.stationid,aa.year,aa.month,aa.day
我aa表统计出所有的带有year的,然后和下面的min分组统计的进行表连接。
可是感觉复杂了,这是查询的结果。
求另外的写法,主要是这个写法感觉不靠谱,如果每天的统计有min(day_tmin) 有重复的最小就肯定不对了。
更新:我本来是使用partition
select distinct stationid, year,min(day_tmin) over(partition by station order by month,day ) tmin from history.history_day_tem_new order by stationid
经过仔细群里的辰影sql侠,他使用partition,首先建表(这个方法很好,下次不要create table了)
with w1 as(select '1' id,'2012' year,'12' month,'1' day,to_number('-10','999') tmin from dual union all select '1' id,'2012' year,'11' month,'1' day,to_number('-11','999') tmin from dual union all select '1' id,'2012' year,'10' month,'1' day,to_number('-12','999') tmin from dual union all select '1' id,'2012' year,'12' month,'1' day,to_number('-13','999') tmin from dual union all select '1' id,'2012' year,'11' month,'1' day,to_number('-6','999') tmin from dual union all select '1' id,'2012' year,'10' month,'1' day,to_number('-6','999') tmin from dual union all select '2' id,'2012' year,'12' month,'1' day,to_number('-4','999') tmin from dual union all select '2' id,'2011' year,'12' month,'1' day,to_number('-4','999') tmin from dual union all select '1' id,'2011' year,'12' month,'1' day,to_number('-20','999') tmin from dual) select * from w1 order by id,month,day,tmin
之后的sql
重要总结:可以看出partition by的字段是需要统计分析(温度)的影响字段,id,month,day(因为对年不敏感,因此不增加),后面的order by(默认升序)字段是实际统计的的
partition by的字段为主语,或者说要分析的对象,然后order by是分析对象的熟悉(最大最小)
select id,year,month,day,tmin from (select w1.*,dense_rank()over(partition by id,month,day order by tmin) rn from w1) where rn=1
再比如下面的sql:求各个公司最新月完成凭证情况表,order by 是最新月 year period,partition是unitname
select * from (select UNITCODE, unitname, year, period, num1, row_number() over(partition by UNITNAME order by YEAR desc,period desc) mm from (select BD_CORP.UNITCODE, bd_corp.unitname, gl_voucher.year, gl_voucher.period, count(*) num1 from gl_voucher, bd_corp where bd_corp.pk_corp = gl_voucher.pk_corp and gl_voucher.dr='0' group by BD_CORP.UNITCODE, bd_corp.unitname, gl_voucher.year, gl_voucher.period))
where mm = 1 ORDER BY UNITCODE;
再比如下面的这个sql:求出每个病人诊断好id最小下的疾病类别
select patiend_id,diagnosis_type,min(diagnosis_no)over(partition by patiend_id,diagnosis_type) diagnosis_no, diagnosis_class from (Select patient_id, visit_id, diagnosis_class from diagnostic_category a,DIAGNOSIS_DICT b where a.diagnosis= b.diagnosis(+) and diagnosis_class is not null and diagnosis_type='3' )
这里因为min(diagnosis_no),因此其后面的partition后面就不需要order by了
rank()应该也是对的
ROW_NUMBER() 会去掉重复一行
用min partition 不行
select distinct id,year,month,day,tmin,min(tmin)over(partition by id,month,day order by tmin) from w1 order by id,month,day,tmin
扩展一
2012-12-14 更新
按班级 查出 年龄最大的学生信息
学生表如下
select * from (select xx.*,rank()over(partition by bj order by age desc) rn from xx) where rn=1
select s_no, name, age, bj from xx, (select bj || max(age) a from xx group by bj) yy--甲15 乙18 where bj || age = a
平均工资的最大值(附加上ename):
select * from
( select ename,avgs,dense_rank()over( order by avgs desc)rn from
( select ename, avg(sal) avgs from emp group by ename))
where rn=1
扩展二
引用 Oracle:Rank,Dense_Rank,Row_Number比较
Oracle:Rank,Dense_Rank,Row_Number比较
一个员工信息表
Create Table EmployeeInfo (CODE Number(3) Not Null,EmployeeName varchar2(15),DepartmentID Number(3),Salary NUMBER(7,2),
Constraint PK_EmployeeInfo Primary Key (CODE));
Select * From EMPLOYEEINFO
现执行SQL语句:
Select EMPLOYEENAME,SALARY,
RANK() OVER (Order By SALARY Desc) "RANK",
DENSE_RANK() OVER (Order By SALARY Desc ) "DENSE_RANK",
ROW_NUMBER() OVER(Order By SALARY Desc) "ROW_NUMBER"
From EMPLOYEEINFO
结果如下:
Rank,Dense_rank,Row_number函数为每条记录产生一个从1开始至N的自然数,N的值可能小于等于记录的总数。这3个函数的唯一区别在于当碰到相同数据时的排名策略。
①ROW_NUMBER:
Row_number函数返回一个唯一的值,当碰到相同数据时,排名按照记录集中记录的顺序依次递增。
②DENSE_RANK:
Dense_rank函数返回一个唯一的值,除非当碰到相同数据时,此时所有相同数据的排名都是一样的。
③RANK:
Rank函数返回一个唯一的值,除非遇到相同的数据时,此时所有相同数据的排名是一样的,同时会在最后一条相同记录和下一条不同记录的排名之间空出排名。
同时也可以分组排序,也就是在Over从句内加入Partition by groupField:
Select DEPARTMENTID,EMPLOYEENAME,SALARY,
RANK() OVER ( Partition By DEPARTMENTID Order By SALARY Desc) "RANK",
DENSE_RANK() OVER ( Partition By DEPARTMENTID Order By SALARY Desc ) "DENSE_RANK",
ROW_NUMBER() OVER( Partition By DEPARTMENTID Order By SALARY Desc) "ROW_NUMBER"
From EMPLOYEEINFO
结果如下:
现在如果插入一条工资为空的记录,那么执行上述语句,结果如下:
会发现空值的竟然排在了第一位,这显然不是想要的结果。解决的办法是在Over从句Order By后加上 NULLS Last即:
Select EMPLOYEENAME,SALARY,
RANK() OVER (Order By SALARY Desc Nulls Last) "RANK",
DENSE_RANK() OVER (Order By SALARY Desc Nulls Last) "DENSE_RANK",
ROW_NUMBER() OVER(Order By SALARY Desc Nulls Last ) "ROW_NUMBER"
From EMPLOYEEINFO
结果如下: