• Flink -- Keyed State


        /* <pre>{@code
         * DataStream<MyType> stream = ...;
         * KeyedStream<MyType> keyedStream = stream.keyBy("id");
         *
         * keyedStream.map(new RichMapFunction<MyType, Tuple2<MyType, Long>>() {
         *
         *     private ValueState<Long> state;
         *
         *     public void open(Configuration cfg) {
         *         state = getRuntimeContext().getState(
         *                 new ValueStateDescriptor<Long>("count", LongSerializer.INSTANCE, 0L));
         *     }
         *
         *     public Tuple2<MyType, Long> map(MyType value) {
         *         long count = state.value() + 1;  // value() yields the default 0L on first access
         *         state.update(count);             // store the new count, not the input record
         *         return new Tuple2<>(value, count);
         *     }
         * });
         * }</pre>
         */

     

    Before keyed state can be used it has to be initialized. Taking ValueState as the example:

    state = getRuntimeContext().getState(new ValueStateDescriptor<Long>("count", LongSerializer.INSTANCE, 0L));

     

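    1. Every state needs an identifier, the ValueStateDescriptor, which carries a unique name, the value type (Class), and a default value. (The example above uses the overload that takes a TypeSerializer instead of a Class.)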

    public ValueStateDescriptor(String name, Class<T> typeClass, T defaultValue)
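
    As a rough illustration of what such a descriptor carries, here is a minimal plain-Java sketch. SimpleDescriptor is a hypothetical stand-in written for this note, not Flink's ValueStateDescriptor, and it leaves serializers out entirely.

```java
import java.util.Objects;

// Hypothetical stand-in for a state descriptor; NOT Flink's ValueStateDescriptor.
final class SimpleDescriptor<T> {
    private final String name;        // unique name used to look the state up
    private final Class<T> typeClass; // value type; a real system would derive a serializer from it
    private final T defaultValue;     // handed out while nothing has been stored yet

    SimpleDescriptor(String name, Class<T> typeClass, T defaultValue) {
        this.name = Objects.requireNonNull(name, "name");
        this.typeClass = Objects.requireNonNull(typeClass, "typeClass");
        this.defaultValue = defaultValue;
    }

    String getName() { return name; }

    Class<T> getTypeClass() { return typeClass; }

    T getDefaultValue() { return defaultValue; }
}
```

    The name is what matters later on: it is the key under which the backend finds an already registered state.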

     

    2. getState registers the keyed state with the state backend:

    StreamingRuntimeContext
        public <T> ValueState<T> getState(ValueStateDescriptor<T> stateProperties) {
            KeyedStateStore keyedStateStore = checkPreconditionsAndGetKeyedStateStore(stateProperties);
            stateProperties.initializeSerializerUnlessSet(getExecutionConfig());
            return keyedStateStore.getState(stateProperties);
        }

     

    This calls keyedStateStore.getState(stateProperties).

    KeyedStateStore is really just a wrapper around KeyedStateBackend:

    public class DefaultKeyedStateStore implements KeyedStateStore {
    
        private final KeyedStateBackend<?> keyedStateBackend;
        private final ExecutionConfig executionConfig;
    
        @Override
        public <T> ValueState<T> getState(ValueStateDescriptor<T> stateProperties) {
            try {
                stateProperties.initializeSerializerUnlessSet(executionConfig);
                return getPartitionedState(stateProperties);
            } catch (Exception e) {
                throw new RuntimeException("Error while getting state", e);
            }
        }

    In the end the call lands on keyedStateBackend:

       private <S extends State> S getPartitionedState(StateDescriptor<S, ?> stateDescriptor) throws Exception {
            return keyedStateBackend.getPartitionedState(
                    VoidNamespace.INSTANCE,
                    VoidNamespaceSerializer.INSTANCE,
                    stateDescriptor);
        }
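
    The layering above can be sketched without Flink. ToyBackend, ToyHeapBackend and ToyStateStore are hypothetical names, and the sketch assumes string namespaces and plain maps as "state". It only aims to show that the store holds no data of its own: it fills in a fixed namespace and delegates to the backend.

```java
import java.util.HashMap;
import java.util.Map;

// All types here are hypothetical stand-ins, not Flink classes.
interface ToyBackend {
    // The backend API is namespace-aware.
    Map<String, Object> getPartitionedState(String namespace, String stateName);
}

final class ToyHeapBackend implements ToyBackend {
    private final Map<String, Map<String, Object>> states = new HashMap<>();

    @Override
    public Map<String, Object> getPartitionedState(String namespace, String stateName) {
        // One map per (namespace, state name) pair, created on first request.
        return states.computeIfAbsent(namespace + "/" + stateName, k -> new HashMap<>());
    }
}

// The store hides the namespace argument and wraps failures, then delegates.
final class ToyStateStore {
    static final String VOID_NAMESPACE = "VoidNamespace";

    private final ToyBackend backend;

    ToyStateStore(ToyBackend backend) {
        this.backend = backend;
    }

    Map<String, Object> getState(String stateName) {
        try {
            return backend.getPartitionedState(VOID_NAMESPACE, stateName);
        } catch (Exception e) {
            throw new RuntimeException("Error while getting state", e);
        }
    }
}
```

    Anything written through the store is visible when asking the backend directly with the same namespace, which is the sense in which the store is "just a wrapper".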

     

    AbstractKeyedStateBackend
       public <N, S extends State> S getPartitionedState(
                final N namespace,
                final TypeSerializer<N> namespaceSerializer,
                final StateDescriptor<S, ?> stateDescriptor) throws Exception {
    
            final S state = getOrCreateKeyedState(namespaceSerializer, stateDescriptor);
            final InternalKvState<N> kvState = (InternalKvState<N>) state;
    
            return state;
        }

     

    getOrCreateKeyedState

        public <N, S extends State, V> S getOrCreateKeyedState(
                final TypeSerializer<N> namespaceSerializer,
                StateDescriptor<S, V> stateDescriptor) throws Exception {
    
            InternalKvState<?> existing = keyValueStatesByName.get(stateDescriptor.getName());
            if (existing != null) {
                @SuppressWarnings("unchecked")
                S typedState = (S) existing;
                return typedState;  // already present in keyValueStatesByName: return it directly
            }
    
            // create a new blank key/value state
            S state = stateDescriptor.bind(new StateBinder() {
                @Override
                public <T> ValueState<T> createValueState(ValueStateDescriptor<T> stateDesc) throws Exception {
                    return AbstractKeyedStateBackend.this.createValueState(namespaceSerializer, stateDesc);
                }
            });
    
            InternalKvState<N> kvState = (InternalKvState<N>) state;
            keyValueStatesByName.put(stateDescriptor.getName(), kvState); // register the newly created state in keyValueStatesByName
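
    The lookup-then-register pattern in this excerpt can be shown in isolation. The sketch below is a simplified model under the assumption that states are identified by name only; StateRegistry and its factory parameter are hypothetical and stand in for keyValueStatesByName and the StateBinder callback.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical name -> state registry; not Flink's AbstractKeyedStateBackend.
final class StateRegistry {
    private final Map<String, Object> statesByName = new HashMap<>();

    @SuppressWarnings("unchecked")
    <S> S getOrCreate(String name, Supplier<S> factory) {
        Object existing = statesByName.get(name);
        if (existing != null) {
            return (S) existing;          // already registered: reuse the same object
        }
        S created = factory.get();        // create a new blank state
        statesByName.put(name, created);  // register it under the descriptor name
        return created;
    }
}
```

    Requesting the same name twice returns the same object, and the factory runs only on the first request.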

     

    3. Reading and writing a ValueState: value() and update()

     

    Here is the heap-based implementation of ValueState:

    HeapValueState
    public class HeapValueState<K, N, V>
            extends AbstractHeapState<K, N, V, ValueState<V>, ValueStateDescriptor<V>>
            implements InternalValueState<N, V> {
    
        /**
         * Creates a new key/value state for the given hash map of key/value pairs.
         *
         * @param stateDesc The state identifier for the state. This contains name
         *                           and can create a default state value.
         * @param stateTable The state table to use in this key/value state. May contain initial state.
         */
        public HeapValueState(
                ValueStateDescriptor<V> stateDesc,
                StateTable<K, N, V> stateTable,
                TypeSerializer<K> keySerializer,
                TypeSerializer<N> namespaceSerializer) {
            super(stateDesc, stateTable, keySerializer, namespaceSerializer);
        }
    
        @Override
        public V value() {
            final V result = stateTable.get(currentNamespace);
    
            if (result == null) {
                return stateDesc.getDefaultValue();
            }
    
            return result;
        }
    
        @Override
        public void update(V value) {
    
            if (value == null) {
                clear();
                return;
            }
    
            stateTable.put(currentNamespace, value);
        }
    }
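
    The value()/update() contract shown above (fall back to the default when nothing is stored, treat update(null) as clear()) can be modelled for a single key as follows. DefaultingValue is a hypothetical class; it ignores keys and namespaces, which the real implementation resolves through the state table.

```java
// Hypothetical single-slot model of the value()/update() contract; not Flink's HeapValueState.
final class DefaultingValue<V> {
    private final V defaultValue;
    private V stored; // null means "nothing stored"

    DefaultingValue(V defaultValue) {
        this.defaultValue = defaultValue;
    }

    V value() {
        // Fall back to the default when no value has been written.
        return stored == null ? defaultValue : stored;
    }

    void update(V value) {
        if (value == null) {
            clear(); // writing null is the same as clearing
            return;
        }
        stored = value;
    }

    void clear() {
        stored = null;
    }
}
```

    With a default of 0L, the counter from the opening example works on first access without any null check in user code.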

     

    Both methods go through the StateTable:

    CopyOnWriteStateTable
        @Override
        public S get(N namespace) {
            return get(keyContext.getCurrentKey(), namespace);
        }
    
        @Override
        public boolean containsKey(N namespace) {
            return containsKey(keyContext.getCurrentKey(), namespace);
        }
    
        @Override
        public void put(N namespace, S state) {
            put(keyContext.getCurrentKey(), namespace, state);
        }

    As you can see, the table does not store a bare value; it stores the mapping from (key, namespace) to value.

    The key is obtained from keyContext.getCurrentKey().

     

    keyContext is the KeyedStateBackend itself.

    In StreamInputProcessor.processInput, the call

    streamOperator.setKeyContextElement1(record);

    sets the key of the current record on the KeyedStateBackend before the record is processed.

     

    This is why every operation on state is isolated per key.
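
    To see the isolation in action, the following plain-Java sketch models the two pieces: a key context that is updated for every record, and a table that always reads and writes under the current key. ToyKeyContext, ToyStateTable and KeyedCountDemo are hypothetical names, and the fixed namespace "ns" is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the roles described above; none of these are Flink classes.
final class ToyKeyContext<K> {
    private K currentKey;

    void setCurrentKey(K key) { currentKey = key; }

    K getCurrentKey() { return currentKey; }
}

final class ToyStateTable<K, N, S> {
    private final ToyKeyContext<K> keyContext;
    private final Map<K, Map<N, S>> entries = new HashMap<>();

    ToyStateTable(ToyKeyContext<K> keyContext) {
        this.keyContext = keyContext;
    }

    // The namespace-only methods silently add the current key.
    S get(N namespace) { return get(keyContext.getCurrentKey(), namespace); }

    void put(N namespace, S state) { put(keyContext.getCurrentKey(), namespace, state); }

    S get(K key, N namespace) {
        Map<N, S> byNamespace = entries.get(key);
        return byNamespace == null ? null : byNamespace.get(namespace);
    }

    void put(K key, N namespace, S state) {
        entries.computeIfAbsent(key, k -> new HashMap<>()).put(namespace, state);
    }
}

final class KeyedCountDemo {
    // Counts records per key; the user-side logic never mentions the key when touching state.
    static Map<String, Long> countPerKey(String... recordKeys) {
        ToyKeyContext<String> ctx = new ToyKeyContext<>();
        ToyStateTable<String, String, Long> table = new ToyStateTable<>(ctx);
        Map<String, Long> latest = new HashMap<>();
        for (String key : recordKeys) {
            ctx.setCurrentKey(key);              // the role played by setKeyContextElement1
            Long old = table.get("ns");          // read under the current key
            long count = (old == null ? 0L : old) + 1;
            table.put("ns", count);              // write under the current key
            latest.put(key, count);
        }
        return latest;
    }
}
```

    Calling KeyedCountDemo.countPerKey("a", "b", "a", "a") yields 3 for "a" and 1 for "b": the counting code is identical for both keys, and only the key context decides which entry it touches.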

  • Original article: https://www.cnblogs.com/fxjwind/p/7607448.html