• Flink Study Notes: DataStream API


    A DataStream job in Flink implements transformations on data streams. A data stream can come from various sources, such as message queues, sockets, or files.

    Reference:

    https://ci.apache.org/projects/flink/flink-docs-stable/zh/dev/datastream_api.html
    

    Using the DataStream API requires a stream execution environment:

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    

    DataStream supports the following kinds of data sources: file-based, socket-based, collection-based, and custom.

    1.File-based

    readTextFile(path) - Reads text files, i.e. files that respect the TextInputFormat specification, line-by-line and returns them as Strings.
    
    readFile(fileInputFormat, path) - Reads (once) files as dictated by the specified file input format.
    
    readFile(fileInputFormat, path, watchType, interval, pathFilter, typeInfo) - This is the method called internally by the two previous ones. It reads files in the path based on the given fileInputFormat. Depending on the provided watchType, this source may periodically monitor (every interval ms) the path for new data (FileProcessingMode.PROCESS_CONTINUOUSLY), or process once the data currently in the path and exit (FileProcessingMode.PROCESS_ONCE). Using the pathFilter, the user can further exclude files from being processed.
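The file-based sources above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name `FileSourceSketch` and its method names are hypothetical, and it assumes Flink 1.12 or later (for `executeAndCollect`) with `flink-streaming-java` and `flink-clients` on the classpath.

```java
import java.util.List;

import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

// Hypothetical helper class, for illustration only.
class FileSourceSketch {

    // Reads the file once, line by line, and collects up to 100 lines locally.
    static List<String> readOnce(String path) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1); // a single reader keeps the line order predictable
        return env.readTextFile(path).executeAndCollect(100);
    }

    // Builds (but does not run) a source that re-scans the path every second.
    static DataStream<String> watch(StreamExecutionEnvironment env, String path) {
        TextInputFormat format = new TextInputFormat(new Path(path));
        return env.readFile(format, path, FileProcessingMode.PROCESS_CONTINUOUSLY, 1000L);
    }
}
```

Note that a job built on `watch(...)` does not terminate on its own, because the source keeps monitoring the path until the job is cancelled.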
    

    2.Socket-based

    socketTextStream - Reads from a socket. Elements can be separated by a delimiter.
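A socket source might be wired up as in the sketch below. The class name `SocketSourceSketch`, the host `localhost`, and the port `9999` are assumptions chosen for illustration; something must be listening on that port (for example `nc -lk 9999`) before the job starts.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical helper class, for illustration only.
class SocketSourceSketch {

    // Builds a stream of text elements separated by newlines.
    static DataStream<String> lines(StreamExecutionEnvironment env, String host, int port) {
        return env.socketTextStream(host, port, "\n");
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        lines(env, "localhost", 9999).print(); // host and port are placeholders
        env.execute("socket source sketch");
    }
}
```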
    

    3.Collection-based

    fromCollection(Collection) - Creates a data stream from a Java java.util.Collection. All elements in the collection must be of the same type.
    
    fromCollection(Iterator, Class) - Creates a data stream from an iterator. The class specifies the data type of the elements returned by the iterator.
    
    fromElements(T ...) - Creates a data stream from the given sequence of objects. All objects must be of the same type.
    
    fromParallelCollection(SplittableIterator, Class) - Creates a data stream from an iterator, in parallel. The class specifies the data type of the elements returned by the iterator.
    
    generateSequence(from, to) - Generates the sequence of numbers in the given interval, in parallel.
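Collection-based sources are handy for local experiments. The sketch below is one possible way to use `fromCollection`; the class name `CollectionSourceSketch` is hypothetical, and `executeAndCollect` assumes Flink 1.12 or later with `flink-clients` on the classpath.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical helper class, for illustration only.
class CollectionSourceSketch {

    // Turns an in-memory list into a bounded stream and upper-cases each element.
    static List<String> run() throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1); // single task, so elements keep their input order

        DataStream<String> words = env.fromCollection(Arrays.asList("flink", "data", "stream"));
        return words.map(w -> w.toUpperCase()).executeAndCollect(10);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

`env.fromElements("flink", "data", "stream")` would build the same stream without the intermediate list.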
    

    4.Custom

    addSource - Attaches a new source function. For example, to read from Apache Kafka you can use addSource(new FlinkKafkaConsumer<>(...)). See the connectors documentation for more details.
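To make `addSource` more concrete, here is a minimal sketch of a hand-written source. `CounterSource` and `CounterSourceSketch` are hypothetical names, not part of Flink; the sketch assumes a Flink 1.x release in which the `SourceFunction` interface is available, and 1.12 or later for `executeAndCollect`.

```java
import java.util.List;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

// Hypothetical helper class, for illustration only.
class CounterSourceSketch {

    // Hypothetical source that emits 0 .. limit-1 and then finishes.
    static class CounterSource implements SourceFunction<Long> {
        private final long limit;
        private volatile boolean running = true;

        CounterSource(long limit) {
            this.limit = limit;
        }

        @Override
        public void run(SourceContext<Long> ctx) throws Exception {
            for (long i = 0; i < limit && running; i++) {
                // Emit while holding the checkpoint lock, as the interface contract recommends.
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(i);
                }
            }
        }

        @Override
        public void cancel() {
            running = false; // makes the loop in run() exit
        }
    }

    static List<Long> run() throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        return env.addSource(new CounterSource(5)).executeAndCollect(10);
    }
}
```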
    

    Transformation operators supported by DataStream:

    https://ci.apache.org/projects/flink/flink-docs-release-1.12/zh/dev/stream/operators/
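As a small illustration of what transformations look like, the sketch below chains `filter` and `map` on a bounded stream. It covers only two of the operators listed on the linked page; the class name `TransformSketch` is hypothetical, and Flink 1.12 or later is assumed for `executeAndCollect`.

```java
import java.util.List;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical helper class, for illustration only.
class TransformSketch {

    // Keeps the even numbers and squares them.
    static List<Integer> run() throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1); // single task, so results keep their input order

        return env.fromElements(1, 2, 3, 4, 5, 6)
                .filter(n -> n % 2 == 0) // drop odd numbers
                .map(n -> n * n)         // square what remains
                .executeAndCollect(10);
    }
}
```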
    

      

    DataStream supports the following data sinks:

    writeAsText() / TextOutputFormat - Writes elements line-wise as Strings. The Strings are obtained by calling the toString() method of each element.
    
    writeAsCsv(...) / CsvOutputFormat - Writes tuples as comma-separated value files. Row and field delimiters are configurable. The value for each field comes from the toString() method of the objects.
    
    print() / printToErr() - Prints the toString() value of each element on the standard out / standard error stream. Optionally, a prefix (msg) can be provided which is prepended to the output. This can help to distinguish between different calls to print. If the parallelism is greater than 1, the output will also be prepended with the identifier of the task which produced the output.
    
    writeUsingOutputFormat() / FileOutputFormat - Method and base class for custom file outputs. Supports custom object-to-bytes conversion.
    
    writeToSocket - Writes elements to a socket according to a SerializationSchema.
    
    addSink - Invokes a custom sink function. Flink comes bundled with connectors to other systems (such as Apache Kafka) that are implemented as sink functions.
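The sinks above can be combined on the same stream. The sketch below attaches `print` with a prefix and a hand-written sink via `addSink`. `CollectingSink` and `SinkSketch` are hypothetical names; sharing results through a static list is an assumption that only holds for local execution, where the whole job runs in one JVM.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

// Hypothetical helper class, for illustration only.
class SinkSketch {

    // Shared result buffer; usable only because a local run stays inside one JVM.
    static final List<String> COLLECTED = Collections.synchronizedList(new ArrayList<>());

    // Hypothetical sink that stores every element it receives.
    static class CollectingSink implements SinkFunction<String> {
        @Override
        public void invoke(String value, Context context) {
            COLLECTED.add(value);
        }
    }

    static void run() throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        DataStream<String> letters = env.fromElements("a", "b", "c");
        letters.print("demo");                 // prints each element with the given prefix
        letters.addSink(new CollectingSink()); // custom sink function
        env.execute("sink sketch");            // sinks run only once the job is executed
    }
}
```

Unlike `executeAndCollect`, sinks such as `print` and `addSink` only declare outputs, so the job has to be started explicitly with `env.execute(...)`.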
    

      

  • Original article: https://www.cnblogs.com/tonglin0325/p/14121337.html