• Flink (1): Basic Flink Examples


    1. Batch WordCount Example

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.operators.AggregateOperator;
    import org.apache.flink.api.java.operators.DataSource;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.util.Collector;
    
    // Create the batch execution environment
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    
    // Create an in-memory data source
    DataSource<String> dataSource = env.fromElements("flink spark hadoop", "hadoop spark", "flink flink");
    
    // Transform: split each line into words, map to (word, 1), group by word, and sum the counts
    AggregateOperator<Tuple2<String, Integer>> result = dataSource
        .flatMap(new FlatMapFunction<String, String>() {
            @Override
            public void flatMap(String s, Collector<String> collector) throws Exception {
                for (String field : s.split(" ")) {
                    collector.collect(field);
                }
            }
        })
        .map(new MapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> map(String s) throws Exception {
                return Tuple2.of(s, 1);
            }
        })
        .groupBy(0)
        .sum(1);
    
    // Print the result; in the DataSet API, print() also triggers job execution, so no explicit execute() is needed
    result.print();
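    
    With the in-memory input above, this prints (flink,3), (hadoop,2) and (spark,2), with no guaranteed ordering. For a real dataset, the same pipeline can read from a file instead of fromElements; a minimal sketch, with a hypothetical input path:
    
    // Hypothetical file-based source; drop-in replacement for fromElements(...) above
    DataSource<String> dataSource = env.readTextFile("/path/to/input.txt");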
    

    2. Streaming WordCount Example

    import org.apache.flink.api.common.RuntimeExecutionMode;
    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStreamSource;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.util.Collector;
    
    // Create the streaming execution environment
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // Execution mode (Flink 1.12+): AUTOMATIC runs in batch mode when all sources are bounded, otherwise in streaming mode
    env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);
    //env.setRuntimeMode(RuntimeExecutionMode.BATCH);
    //env.setRuntimeMode(RuntimeExecutionMode.STREAMING);
    
    // Source: read lines from a socket on localhost:9999
    DataStreamSource<String> lines = env.socketTextStream("localhost", 9999);
    
    // Transform: split each line into words, map to (word, 1), key by word, and sum the counts
    SingleOutputStreamOperator<Tuple2<String, Integer>> result = lines
        .flatMap(new FlatMapFunction<String, String>() {
            @Override
            public void flatMap(String s, Collector<String> collector) throws Exception {
                for (String word : s.split(" ")) {
                    collector.collect(word);
                }
            }
        })
        .map(new MapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> map(String s) throws Exception {
                return Tuple2.of(s, 1);
            }
        })
        .keyBy(t -> t.f0)
        .sum(1);
    
    // Sink: print the incremental counts to stdout
    result.print();
    
    // Streaming jobs must be submitted explicitly
    env.execute();
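    
    To try this locally, start a socket server before submitting the job (for example, `nc -lk 9999` on Linux/macOS) and type lines into that terminal. Counts are emitted incrementally per key: entering "flink flink" prints (flink,1) followed by (flink,2), each line usually prefixed by the index of the subtask that printed it. For a quick test without a socket, the source can be swapped for a bounded one; a minimal sketch with made-up input:
    
    // Hypothetical bounded source for local testing; replaces socketTextStream(...) above
    DataStreamSource<String> lines = env.fromElements("flink spark", "flink flink");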
    

    3. Streaming WordCount with Lambda Expressions

    import java.util.Arrays;
    
    import org.apache.flink.api.common.typeinfo.TypeHint;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStreamSource;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.util.Collector;
    
    // Create the streaming execution environment
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    //env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);
    
    // Create an in-memory data source
    DataStreamSource<String> dataStreamSource = env.fromElements("abc abc abc");
    
    // Transform the data (note the explicit output type declarations for the lambdas)
    SingleOutputStreamOperator<Tuple2<String, Integer>> result = dataStreamSource
        .flatMap((String value, Collector<String> out) -> {
            Arrays.stream(value.split(" ")).forEach(out::collect);
        }).returns(Types.STRING)
        .map((String value) -> Tuple2.of(value, 1),
                TypeInformation.of(new TypeHint<Tuple2<String, Integer>>() {}))
        .keyBy(t -> t.f0)
        .sum(1);
    
    // Print the results
    result.print();
    
    // Submit the job
    env.execute();
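    
    Because Java lambdas erase generic parameter types, the flatMap above needs returns(Types.STRING) and the map passes a TypeInformation as its second argument. An equivalent style declares the map output type with returns() as well; a minimal sketch of the transformation step under the same setup:
    
    // Same transformation, declaring the tuple type via returns() instead of the two-argument map()
    SingleOutputStreamOperator<Tuple2<String, Integer>> counts = dataStreamSource
        .flatMap((String value, Collector<String> out) ->
                Arrays.stream(value.split(" ")).forEach(out::collect))
        .returns(Types.STRING)
        .map(value -> Tuple2.of(value, 1))
        .returns(Types.TUPLE(Types.STRING, Types.INT))
        .keyBy(t -> t.f0)
        .sum(1);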
    

      

  • Original post: https://www.cnblogs.com/yangshibiao/p/14907795.html