• Notes on implementing a piece of logic (grouping the records in an array by a specified time gap)


    Business scenario

    Given the following data (the raw input repeats these three rows many times):

      id,intime,outtime
      1190771865,2019-11-26 13:27:26,2019-11-26 13:27:26
      1190771865,2019-11-26 16:42:46,2019-11-26 16:42:46
      1190771865,2019-11-26 17:23:11,2019-11-26 17:23:11

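    Requirements:

      Regroup the data above according to these rules:

        Sort the records by intime in ascending order, then compare each record's intime with that of the record before it.

        1. If the second record is 120 min or more after the first, discard the first record.

        2. If a record is less than 120 min after the previous one, keep the previous record's intime and use this record's intime as the previous record's outtime; keep going until the last record has been visited.

        3. If a record is 120 min or more after the previous one, treat it as the start of a new record and apply the rules above again from there.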
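The rules above can be sketched without Spark as a single fold over the sorted records. This is a minimal sketch under stated assumptions, not the author's code: `Visit`, `GroupByGap`, `regroup` and `MaxGapSeconds` are hypothetical names, times are taken to be epoch seconds, the gap is measured from the intime kept in the current merged record, and a trailing record with no successor is kept unchanged.

```scala
// Hypothetical record type for illustration; times are epoch seconds (an assumption).
final case class Visit(id: String, inTime: Long, outTime: Long)

object GroupByGap {
  // 120 minutes, expressed in seconds.
  val MaxGapSeconds: Long = 120L * 60L

  def regroup(records: Seq[Visit]): List[Visit] = {
    // Drop exact duplicates, then sort by inTime ascending.
    val sorted = records.distinct.sortBy(_.inTime)

    // Fold state: finished groups (newest first) and the open group
    // together with the number of records it has absorbed.
    val start = (List.empty[Visit], Option.empty[(Visit, Int)])
    val (done, open) = sorted.foldLeft(start) {
      case ((acc, None), r) =>
        (acc, Some((r, 1)))
      case ((acc, Some((cur, n))), r) =>
        if (r.inTime - cur.inTime >= MaxGapSeconds) {
          // Gap too large: a lone record is dropped (rule 1),
          // a merged group is emitted, and r opens a new group (rule 3).
          val emitted = if (n > 1) cur :: acc else acc
          (emitted, Some((r, 1)))
        } else {
          // Rule 2: keep the group's inTime, take this record's inTime as outTime.
          (acc, Some((cur.copy(outTime = r.inTime), n + 1)))
        }
    }

    // Assumption: whatever group is still open at the end is kept as-is.
    (open.map(_._1).toList ::: done).reverse
  }
}
```

Applied to the three sample rows, the 13:27:26 record is dropped because the next record is more than three hours later, and the remaining two merge into one record running from 16:42:46 to 17:23:11.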

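    Implementation:

    1. Shape the data above into one array per key, i.e. (aaa, Array((id, intime, outtime))).
    Note: before this step, the in and out times of every record had already been converted to timestamps.

    import scala.collection.mutable.ListBuffer

    mergedDataTmp
      // Deduplicate each key's records and keep those whose intime <= outtime.
      // The source reads `.distinct.filter(x => x._2<= x._2)`; the `x._2` receiver
      // and the comparison against `_3` are assumed here.
      .map(x => (x._1, x._2.distinct.filter(r => r._2 <= r._3)))
      .mapPartitions(iter => {
        iter.map(x => {
          var count = 0   // number of records absorbed into the current group
          var iterNum = 0 // position in the sorted list
          val tList = new ListBuffer[(String, (String, String, String))]()
          val vsList = x._2.sortWith((a, b) => a._2 < b._2).toList
          val vsLength = vsList.length
          var tmpV = ""   // intime of the most recently absorbed record
          for (t <- vsList) {
            iterNum += 1
            if (count == 0) {
              tList += ((x._1, t))
              count += 1
            } else {
              // tList.last is the record that opened the current group,
              // so the gap is measured from that record's intime.
              val compareTime = if (!tList.isEmpty) {
                (DateUtil.dateToTimeStamp(t._2) - DateUtil.dateToTimeStamp(tList.last._2._2)) / 1000 >= 120 * 60
              } else {
                false
              }
              if (compareTime && count == 1) {
                // Gap >= 120 min and the previous record stands alone: drop it.
                tList.remove(tList.length - 1)
                tList += ((x._1, t))
              } else if (compareTime && count > 1) {
                // Gap >= 120 min after a merged group: close the group, start a new one.
                // The source has a 4-tuple here (with a trailing t._3), which does not
                // fit tList's element type; the 3-tuple is assumed.
                val lastRecord = tList.last
                tList(tList.length - 1) = (x._1, (t._1, lastRecord._2._2, tmpV))
                tList += ((x._1, t))
                count = 1
              } else {
                // Gap < 120 min: absorb this record into the current group.
                count += 1
                if (iterNum == vsLength) {
                  // Last record: close the group using this record's outtime.
                  val lastRecord = tList.last
                  tList(tList.length - 1) = (x._1, (t._1, lastRecord._2._2, t._3))
                }
                tmpV = t._2
              }
            }
          }
          tList
        })
      })
      .flatMap(x => x)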
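The snippet above calls `DateUtil.dateToTimeStamp`, which the post does not show. Because the result is divided by 1000 before being compared with `120 * 60`, it presumably returns epoch milliseconds. A minimal stand-in might look like the following; the name `DateUtilSketch`, the `yyyy-MM-dd HH:mm:ss` pattern and the `zone` parameter are all assumptions made for illustration.

```scala
import java.time.{LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter

// Hypothetical stand-in for the DateUtil helper, which the post does not show.
object DateUtilSketch {
  // Assumed input format, matching the sample data in the post.
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Parses a date-time string and returns epoch milliseconds in the given zone.
  def dateToTimeStamp(s: String, zone: ZoneId = ZoneId.systemDefault()): Long =
    LocalDateTime.parse(s, fmt).atZone(zone).toInstant().toEpochMilli()
}
```

Passing an explicit zone such as `ZoneId.of("UTC")` keeps the result independent of the machine's default time zone.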

      

  • Original post: https://www.cnblogs.com/Gxiaobai/p/12076583.html