Source location:
.timeWindow(Time.milliseconds(1000L))
timeWindow()
def timeWindow(size: Time): WindowedStream[T, K, TimeWindow] = {
  new WindowedStream(javaStream.timeWindow(size))
}
javaStream.timeWindow(size)
public WindowedStream<T, KEY, TimeWindow> timeWindow(Time size) {
    if (environment.getStreamTimeCharacteristic() == TimeCharacteristic.ProcessingTime) {
        return window(TumblingProcessingTimeWindows.of(size));
    } else {
        return window(TumblingEventTimeWindows.of(size));
    }
}
window(TumblingEventTimeWindows.of(size))
public Collection<TimeWindow> assignWindows(Object element, long timestamp, WindowAssignerContext context) {
    if (timestamp > Long.MIN_VALUE) {
        if (staggerOffset == null) {
            staggerOffset = windowStagger.getStaggerOffset(context.getCurrentProcessingTime(), size);
        }
        // Long.MIN_VALUE is currently assigned when no timestamp is present
        long start = TimeWindow.getWindowStartWithOffset(timestamp, (globalOffset + staggerOffset) % size, size);
        return Collections.singletonList(new TimeWindow(start, start + size));
    } else {
        throw new RuntimeException("Record has Long.MIN_VALUE timestamp (= no timestamp marker). "
                + "Is the time characteristic set to 'ProcessingTime', or did you forget to call "
                + "'DataStream.assignTimestampsAndWatermarks(...)'?");
    }
}
TimeWindow.getWindowStartWithOffset(timestamp, (globalOffset + staggerOffset) % size, size)
public static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
    return timestamp - (timestamp - offset + windowSize) % windowSize;
}
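Following the calls down to this point, we reach the formula that computes the window start time:
timestamp - (timestamp - offset + windowSize) % windowSize
Here timestamp is the event-time timestamp carried by each element, and windowSize is the window length we configured. offset defaults to 0 and is mainly used to shift windows for a time zone; this analysis assumes offset = 0.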
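To make the role of offset concrete, here is a minimal sketch in plain Java that copies only the arithmetic of getWindowStartWithOffset and applies it to a one-day window. The class name WindowOffsetExample and the choice of a -8 hour offset (to align day windows with midnight in UTC+8) are assumptions made for illustration, not something taken from the code traced above.

```java
import java.time.Instant;

public class WindowOffsetExample {
    static final long HOUR_MS = 60L * 60L * 1000L;
    static final long DAY_MS = 24L * HOUR_MS;

    // Same arithmetic as the getWindowStartWithOffset method quoted above.
    public static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    public static void main(String[] args) {
        long timestamp = 1547718199000L; // 2019-01-17T09:43:19Z

        // offset = 0: day windows start at midnight UTC.
        long utcStart = getWindowStartWithOffset(timestamp, 0L, DAY_MS);

        // Assumed offset of -8 hours: day windows start at midnight UTC+8.
        long shiftedStart = getWindowStartWithOffset(timestamp, -8L * HOUR_MS, DAY_MS);

        System.out.println("offset 0   : " + Instant.ofEpochMilli(utcStart));
        System.out.println("offset -8h : " + Instant.ofEpochMilli(shiftedStart));
    }
}
```

With offset 0 the window boundary falls on midnight UTC; with the assumed -8 hour offset it falls on 16:00 UTC of the previous day, which is midnight in UTC+8.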
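With offset = 0 the formula becomes: timestamp - (timestamp + windowSize) % windowSize
Because windowSize % windowSize is always 0, adding windowSize inside the modulo does not change the remainder for a non-negative timestamp, so this simplifies further to: timestamp - (timestamp % windowSize)
In other words, the window start is the timestamp minus its remainder modulo the window size, which is the largest integer multiple of windowSize that does not exceed timestamp.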
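The simplification can be checked with a small sketch that compares the full formula (with offset = 0) against the simplified one for a few non-negative timestamps. The class and method names here are hypothetical; only the two arithmetic expressions come from the text above.

```java
public class WindowStartSimplification {
    // Full formula, as in getWindowStartWithOffset.
    public static long full(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    // Simplified formula, valid for offset = 0 and a non-negative timestamp.
    public static long simplified(long timestamp, long windowSize) {
        return timestamp - (timestamp % windowSize);
    }

    public static void main(String[] args) {
        long windowSize = 15000L;
        long[] samples = {0L, 1L, 14999L, 15000L, 15001L, 1547718199000L};
        for (long ts : samples) {
            long a = full(ts, 0L, windowSize);
            long b = simplified(ts, windowSize);
            System.out.println(ts + " -> full=" + a + ", simplified=" + b + ", equal=" + (a == b));
        }
    }
}
```

Every sample should report equal=true, and each result is a multiple of windowSize.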
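For example, suppose an element has timestamp 1547718199000 and the window size is 15000, both in milliseconds.
window start = 1547718199000 - (1547718199000 - 0 + 15000) % 15000
= 1547718199000 - 4000
= 1547718195000
So the first time window is [1547718195000, 1547718210000), closed on the left and open on the right, and subsequent windows follow the same pattern.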
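The worked example can be reproduced with a short, self-contained sketch. It reuses only the arithmetic of getWindowStartWithOffset; the class name WindowStartExample and the helper windowEnd are hypothetical and exist only for this illustration. It also checks the boundary behavior: the last millisecond before the window end stays in the same window, while the end timestamp itself opens the next one.

```java
public class WindowStartExample {
    // Same arithmetic as the getWindowStartWithOffset method quoted above.
    public static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    // Hypothetical helper: the exclusive end of the window containing timestamp.
    public static long windowEnd(long timestamp, long offset, long windowSize) {
        return getWindowStartWithOffset(timestamp, offset, windowSize) + windowSize;
    }

    public static void main(String[] args) {
        long windowSize = 15000L;
        long timestamp = 1547718199000L;

        long start = getWindowStartWithOffset(timestamp, 0L, windowSize);
        long end = windowEnd(timestamp, 0L, windowSize);
        System.out.println("window = [" + start + ", " + end + ")");

        // The interval is half-open: end - 1 is still inside, end is not.
        System.out.println("start for end - 1: " + getWindowStartWithOffset(end - 1, 0L, windowSize));
        System.out.println("start for end    : " + getWindowStartWithOffset(end, 0L, windowSize));
    }
}
```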