• Flink


    WindowOperator

    processElement

    final Collection<W> elementWindows = windowAssigner.assignWindows(   //找出该element被assign的所有windows
        element.getValue(), element.getTimestamp(), windowAssignerContext);
    
    //if element is handled by none of assigned elementWindows
    boolean isSkippedElement = true;  //element默认是会skiped
    
    
    for (W window: elementWindows) {
    
        // drop if the window is already late
        if (isWindowLate(window)) { //如果window是late,逻辑是window.maxTimestamp() + allowedLateness <= internalTimerService.currentWatermark(),continue表示skip
            continue;
        }
        isSkippedElement = false; //只要有一个窗口非late,该element就是非late数据
    
        windowState.setCurrentNamespace(window); 
        windowState.add(element.getValue()); //把数据加到windowState中
    
        triggerContext.key = key;
        triggerContext.window = window;
    
        //EventTimeTrigger,(window.maxTimestamp() <= ctx.getCurrentWatermark(),会立即fire
        //否则只是ctx.registerEventTimeTimer(window.maxTimestamp()),注册等待后续watermark来触发
        TriggerResult triggerResult = triggerContext.onElement(element); 
    
        if (triggerResult.isFire()) { //如果Fire
            ACC contents = windowState.get();
            if (contents == null) {
                continue;
            }
            emitWindowContents(window, contents); //emit window内容, 这里会调用自己定义的user function
        }
    
        //对于比较常用的TumblingEventTimeWindows,用EventTimeTrigger,所以是不会触发purge的
        if (triggerResult.isPurge()) { //如果purge
            windowState.clear();  //将window的state清除掉
        }
        registerCleanupTimer(window); //window的数据也需要清除
    }
    
    
    // side output input event if
    // element not handled by any window
    // late arriving tag has been set
    // windowAssigner is event time and current timestamp + allowed lateness no less than element timestamp
    //如果所有的assign window都是late,再判断一下element也是late
    if (isSkippedElement && isElementLate(element)) { //isElementLate, (element.getTimestamp() + allowedLateness <= internalTimerService.currentWatermark())
        if (lateDataOutputTag != null){
            sideOutput(element); //如果定义了sideOutput,就输出late element
        } else {
            this.numLateRecordsDropped.inc(); //否则直接丢弃
        }
    }

    这里currentWatermark的默认值,
    private long currentWatermark = Long.MIN_VALUE;

    如果定期发送watermark,那么在第一次收到watermark前,不会有late数据
    
    
    继续看看,数据清除掉逻辑
    protected void registerCleanupTimer(W window) {
        long cleanupTime = cleanupTime(window); //cleanupTime, window.maxTimestamp() + allowedLateness
    
        if (windowAssigner.isEventTime()) {
            triggerContext.registerEventTimeTimer(cleanupTime); //这里只是简单的注册registerEventTimeTimer
        } else {
            triggerContext.registerProcessingTimeTimer(cleanupTime);
        }
    }

    如果clear只是简单的注册EventTimeTimer,那么在onEventTime的时候一定有clear的逻辑、

    WindowOperator.onEventTime

    if (windowAssigner.isEventTime() && isCleanupTime(triggerContext.window, timer.getTimestamp())) {  //time == cleanupTime(window);
        clearAllState(triggerContext.window, windowState, mergingWindows);
    }

    果然,onEventTime的时候会判断,如果Timer的time等于 window的cleanup time,就把all state清除掉

    所以当超过,window.maxTimestamp() + allowedLateness就会被清理掉

  • 相关阅读:
    hdu 1715 大菲波数
    Netty 应用程序的一个一般准则:尽可能的重用 EventLoop,以减少线程创建所带来的开销。
    引入 netty网关,向flume提交数据
    JavaBean的任务就是: “Write once, run anywhere, reuse everywhere” Enterprise JavaBeans
    API网关+Kubernetes集群的架构替代了传统的Nginx(Ecs)+Tomcat(Ecs)
    tmp
    全量日志 requestId
    googlr 黄金法则 监控
    数据链路层3个基本问题
    netty4.x 实现接收http请求及响应
  • 原文地址:https://www.cnblogs.com/fxjwind/p/7760798.html
Copyright © 2020-2023  润新知