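Background
Projects often come with a requirement like this: given a set of records ordered by time, find the runs of consecutive records that satisfy some condition, and report the start and end time of each run.
For example, a Xiaomi Mi Band collects a user's heart-rate data, and we need to work out for how long the user's heart rate was above 100.
Or GPS positioning gives us a vehicle's driving data, any speed over 80 counts as speeding, and we need to know for how long the vehicle was speeding within that data.
Now that collecting data is so easy, scenarios and requirements like these are everywhere.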
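To get the result you could of course write a small program that walks through the records one by one, but that tends to be slow.
More often the result is expected to be computed directly in the database with SQL.
That is why I find this algorithm so valuable.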
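For comparison, the record-by-record approach can be sketched in a few lines of Python. The sample readings and the name `time_above` are made up for illustration. The sketch assumes the readings are already sorted by time, and it counts the interval between two consecutive readings only when both exceed the threshold.

```python
from datetime import datetime, timedelta

def time_above(readings, threshold):
    """Sum the time spent above `threshold`.

    `readings` is a list of (timestamp, value) pairs sorted by timestamp.
    An interval between two consecutive readings is counted only when
    both readings exceed the threshold (an assumption of this sketch).
    """
    total = timedelta(0)
    for (t0, v0), (t1, v1) in zip(readings, readings[1:]):
        if v0 > threshold and v1 > threshold:
            total += t1 - t0
    return total

# Hypothetical heart-rate samples, one per minute.
start = datetime(2016, 6, 1, 8, 0)
heart_rate = [(start + timedelta(minutes=i), bpm)
              for i, bpm in enumerate([90, 105, 110, 120, 95, 101, 102])]

print(time_above(heart_rate, 100))  # 0:03:00
```

Every record has to be pulled out of the database before a loop like this can run, which is one reason a set-based SQL solution is attractive.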
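A concrete scenario
To be concrete, take a series of rows like those in the OrigData table queried below, each with a Car, Time, Speed and SOC value.
Split this data into segments according to the following conditions,
Speed >= 0 and Speed < 30
Speed >= 30 and Speed < 60
Speed >= 60 and Speed < 80
Speed >= 80
and work out the start and end time, and the start and end SOC, of each segment.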
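The four conditions assign every non-negative speed to exactly one band. A minimal Python sketch of that mapping follows; the name `speed_flag` and the `LABELS` dictionary are hypothetical, and the band numbers 1 to 4 simply mirror the `SpeedFlg` values used in the SQL.

```python
def speed_flag(speed):
    """Return the band number 1-4 for a speed, or None if it fits no band."""
    if speed is None or speed < 0:
        return None  # no condition matches, like a CASE with no ELSE branch
    if speed < 30:
        return 1
    if speed < 60:
        return 2
    if speed < 80:
        return 3
    return 4

# Display labels for the bands (illustrative only).
LABELS = {1: "0-30", 2: "30-60", 3: "60-80", 4: "80+"}

for speed in (0, 29.9, 30, 79, 80, 120):
    print(speed, LABELS[speed_flag(speed)])
```

Because each lower bound equals the previous upper bound, the bands neither overlap nor leave gaps.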
Approach
1. Fetch the raw data
;WITH temp AS (
    -- Number the rows in time order and tag each row with its speed band
    SELECT ROW_NUMBER() OVER (ORDER BY Car, Time ASC) AS ID,
           Time, Car, SOC,
           SpeedFlg = CASE WHEN Speed >= 0  AND Speed < 30 THEN 1
                           WHEN Speed >= 30 AND Speed < 60 THEN 2
                           WHEN Speed >= 60 AND Speed < 80 THEN 3
                           WHEN Speed >= 80 THEN 4
                      END
    FROM OrigData
    WHERE Car = 'ABCDEFK10NZ000001'
      AND Time >= '2016-06-01'
      AND Time <= GETDATE()
)
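This fetches the whole series of data and tags every row with the number of its speed band.
2. Order the data, work out the start and end time of each segment, and filter out the rows in the middle of a segment, keeping only each segment's first and last rows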
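final AS (
    -- Number the surviving rows in time order (by ID, so the order is deterministic)
    SELECT ROW_NUMBER() OVER (ORDER BY A.ID) AS tid, *
    FROM (
        SELECT a.ID, a.Car, a.SpeedFlg, a.SOC,
               -- no previous row with the same flag: this row starts a segment
               begintime = CASE WHEN a.ID = 1 OR b.SpeedFlg IS NULL THEN a.Time END,
               -- no next row with the same flag: this row ends a segment
               endtime = CASE WHEN c.SpeedFlg IS NULL THEN a.Time END
        FROM temp a
        LEFT JOIN temp b ON a.ID = b.ID + 1 AND b.SpeedFlg = a.SpeedFlg
        LEFT JOIN temp c ON a.ID = c.ID - 1 AND c.SpeedFlg = a.SpeedFlg
    ) A
    WHERE begintime IS NOT NULL OR endtime IS NOT NULL
)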
Applying this step leaves only the boundary rows: each remaining row carries a begintime, an endtime, or both.
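To see what this filtering does on a tiny input, here is a Python sketch of the same neighbor comparison. The sample rows and the name `boundaries` are made up, and the sketch assumes every row has a non-null flag.

```python
# Hypothetical flagged rows, already sorted by time: (time, soc, speed_flag)
rows = [
    ("08:00", 90, 1),
    ("08:01", 89, 1),
    ("08:02", 88, 2),
    ("08:03", 87, 2),
    ("08:04", 86, 2),
    ("08:05", 85, 1),
]

def boundaries(rows):
    """Keep only the rows that start or end a run of equal flags."""
    kept = []
    for i, (time, soc, flag) in enumerate(rows):
        prev_flag = rows[i - 1][2] if i > 0 else None
        next_flag = rows[i + 1][2] if i + 1 < len(rows) else None
        begin = time if prev_flag != flag else None  # no matching previous row
        end = time if next_flag != flag else None    # no matching next row
        if begin is not None or end is not None:
            kept.append((flag, soc, begin, end))
    return kept

for row in boundaries(rows):
    print(row)
```

The row at 08:03 sits in the middle of a run and is dropped, while the last row forms a run on its own and therefore carries both a start and an end time.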
3. Merge the start and end rows of each segment into one complete record
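SELECT a.Car,
       SpeedFlg = CASE a.SpeedFlg WHEN 1 THEN '0-30' WHEN 2 THEN '30-60'
                                  WHEN 3 THEN '60-80' WHEN 4 THEN '80+' END,
       a.begintime, b.endtime,
       a.SOC AS beginsoc, b.SOC AS endsoc
FROM final a
INNER JOIN final b
    ON a.Car = b.Car
   AND a.SpeedFlg = b.SpeedFlg
   -- a one-row segment is its own end row; otherwise the end row is the next one
   AND b.tid = a.tid + CASE WHEN a.endtime IS NULL THEN 1 ELSE 0 END
WHERE a.begintime IS NOT NULL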
The final result has one row per segment, with its speed band, start and end time, and start and end SOC.
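The pairing of start and end rows can be illustrated in Python as follows. The boundary rows and the name `merge_segments` are hypothetical. The sketch assumes the rows are in time order and treats a row that carries both times as a complete one-row segment.

```python
# Hypothetical boundary rows in time order: (flag, soc, begin_time, end_time)
boundary_rows = [
    (1, 90, "08:00", None),
    (1, 89, None, "08:01"),
    (2, 88, "08:02", None),
    (2, 86, None, "08:04"),
    (1, 85, "08:05", "08:05"),  # a single-row run is its own start and end
]

def merge_segments(boundary_rows):
    """Combine each start row with its end row into one segment record."""
    segments = []
    for i, (flag, soc, begin, end) in enumerate(boundary_rows):
        if begin is None:
            continue  # end-only row: already used by the preceding start row
        # A row with both times closes itself; otherwise the next row is the end.
        end_row = boundary_rows[i] if end is not None else boundary_rows[i + 1]
        _, end_soc, _, end_time = end_row
        segments.append((flag, begin, end_time, soc, end_soc))
    return segments

for segment in merge_segments(boundary_rows):
    print(segment)
```

Each output tuple holds the band, the start time, the end time, the starting SOC and the ending SOC of one segment.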
The complete code:
;WITH temp AS (
    -- Step 1: number the rows in time order and tag each with its speed band
    SELECT ROW_NUMBER() OVER (ORDER BY Car, Time ASC) AS ID,
           Time, Car, SOC,
           SpeedFlg = CASE WHEN Speed >= 0  AND Speed < 30 THEN 1
                           WHEN Speed >= 30 AND Speed < 60 THEN 2
                           WHEN Speed >= 60 AND Speed < 80 THEN 3
                           WHEN Speed >= 80 THEN 4
                      END
    FROM OrigData
    WHERE Car = 'ABCDEFK10NZ000001'
      AND Time >= '2016-06-01'
      AND Time <= GETDATE()
),
final AS (
    -- Step 2: keep only the first and last row of each segment
    SELECT ROW_NUMBER() OVER (ORDER BY A.ID) AS tid, *
    FROM (
        SELECT a.ID, a.Car, a.SpeedFlg, a.SOC,
               begintime = CASE WHEN a.ID = 1 OR b.SpeedFlg IS NULL THEN a.Time END,
               endtime = CASE WHEN c.SpeedFlg IS NULL THEN a.Time END
        FROM temp a
        LEFT JOIN temp b ON a.ID = b.ID + 1 AND b.SpeedFlg = a.SpeedFlg
        LEFT JOIN temp c ON a.ID = c.ID - 1 AND c.SpeedFlg = a.SpeedFlg
    ) A
    WHERE begintime IS NOT NULL OR endtime IS NOT NULL
)
-- Step 3: pair every start row with its end row
SELECT a.Car,
       SpeedFlg = CASE a.SpeedFlg WHEN 1 THEN '0-30' WHEN 2 THEN '30-60'
                                  WHEN 3 THEN '60-80' WHEN 4 THEN '80+' END,
       a.begintime, b.endtime,
       a.SOC AS beginsoc, b.SOC AS endsoc
FROM final a
INNER JOIN final b
    ON a.Car = b.Car
   AND a.SpeedFlg = b.SpeedFlg
   -- a one-row segment is its own end row; otherwise the end row is the next one
   AND b.tid = a.tid + CASE WHEN a.endtime IS NULL THEN 1 ELSE 0 END
WHERE a.begintime IS NOT NULL
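To try the idea without a SQL Server instance, the query can be restated in portable SQL and run against an in-memory SQLite database from Python. This is only a sketch under several assumptions: the sample rows and the car ID `CAR1` are invented, the CTEs are renamed `flagged` and `edges`, the T-SQL `name = expr` aliases are written as `expr AS name`, the date filter is left out, a one-row segment is paired with itself, and the SQLite build must support window functions (version 3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrigData (Car TEXT, Time TEXT, Speed REAL, SOC INTEGER)")
conn.executemany("INSERT INTO OrigData VALUES (?, ?, ?, ?)", [
    # Hypothetical sample rows: (Car, Time, Speed, SOC)
    ("CAR1", "2016-06-01 08:00", 10, 90),
    ("CAR1", "2016-06-01 08:01", 20, 89),
    ("CAR1", "2016-06-01 08:02", 40, 88),
    ("CAR1", "2016-06-01 08:03", 85, 87),
    ("CAR1", "2016-06-01 08:04", 90, 86),
])

query = """
WITH flagged AS (
    SELECT ROW_NUMBER() OVER (ORDER BY Car, Time) AS ID, Time, Car, SOC,
           CASE WHEN Speed >= 0  AND Speed < 30 THEN 1
                WHEN Speed >= 30 AND Speed < 60 THEN 2
                WHEN Speed >= 60 AND Speed < 80 THEN 3
                WHEN Speed >= 80 THEN 4 END AS SpeedFlg
    FROM OrigData
    WHERE Car = 'CAR1'
),
edges AS (
    SELECT ROW_NUMBER() OVER (ORDER BY ID) AS tid, x.*
    FROM (
        SELECT a.ID, a.Car, a.SpeedFlg, a.SOC,
               CASE WHEN b.SpeedFlg IS NULL THEN a.Time END AS begintime,
               CASE WHEN c.SpeedFlg IS NULL THEN a.Time END AS endtime
        FROM flagged a
        LEFT JOIN flagged b ON a.ID = b.ID + 1 AND b.SpeedFlg = a.SpeedFlg
        LEFT JOIN flagged c ON a.ID = c.ID - 1 AND c.SpeedFlg = a.SpeedFlg
    ) AS x
    WHERE begintime IS NOT NULL OR endtime IS NOT NULL
)
SELECT a.SpeedFlg, a.begintime, b.endtime, a.SOC AS beginsoc, b.SOC AS endsoc
FROM edges a
JOIN edges b
  ON a.Car = b.Car
 AND a.SpeedFlg = b.SpeedFlg
 AND b.tid = a.tid + CASE WHEN a.endtime IS NULL THEN 1 ELSE 0 END
WHERE a.begintime IS NOT NULL
ORDER BY a.tid
"""

segments = conn.execute(query).fetchall()
for segment in segments:
    print(segment)
```

With these sample rows the query returns three segments: two rows in band 1, a single row in band 2, and two rows in band 4.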