Flink ON YARN mode

I. Installing Flink

1. Download

1.1 Download the Flink package

Official archive: https://archive.apache.org/dist/flink/

This guide installs flink-1.8.0-bin-scala_2.11.tgz. Flink does not yet ship an integration for Hadoop 2.9, so the stable (and compatible) Hadoop 2.6 build is used instead: flink-shaded-hadoop-2-uber-2.6.5-7.0.jar.

Download link for flink-1.8.0-bin-scala_2.11.tgz:
https://archive.apache.org/dist/flink/flink-1.8.0/flink-1.8.0-bin-scala_2.11.tgz

1.2 Download the Hadoop dependency jar

Download flink-shaded-hadoop-2-uber-2.6.5-7.0.jar from the official download page and copy it into ${flink_home}/lib/.
Download page: https://flink.apache.org/downloads.html

2. Extract

Extract into /opt/module:

$ cd /opt/module
$ tar -zxvf flink-1.8.0-bin-scala_2.11.tgz

3. Configure environment variables

$ vim /etc/profile
export FLINK_HOME=/opt/module/flink-1.8.0
export PATH=$FLINK_HOME/bin:$PATH
$ source /etc/profile
$ flink    #if this command runs, Flink is installed correctly

Note: for a non-root user, put these exports in ~/.bash_profile instead.

4. Add the Hadoop dependency jar

Copy flink-shaded-hadoop-2-uber-2.6.5-7.0.jar into /opt/module/flink-1.8.0/lib.

5. Modify yarn-site.xml

$ sudo vim /opt/module/hadoop-2.7.6/etc/hadoop/yarn-site.xml

5.1 Configure the maximum number of ApplicationMaster restart attempts

<property>
    <name>yarn.resourcemanager.am.max-attempts</name>
    <value>4</value>
    <description>The maximum number of application master execution attempts</description>
</property>

6. Modify flink-conf.yaml

$ sudo vim /opt/module/flink-1.8.0/conf/flink-conf.yaml

#JobManager / TaskManager settings
jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 256
taskmanager.heap.mb: 512
taskmanager.numberOfTaskSlots: 1
#Whether TaskManager-managed memory is pre-allocated when the TaskManager starts.
#By default no pre-allocation is done, so an idle Flink cluster does not hold cluster resources.
taskmanager.memory.preallocate: false
parallelism.default: 1
#default port is 8081
jobmanager.web.port: 33069

#Maximum number of failed containers the YARN ApplicationMaster accepts before the YARN session fails.
#Must be set; note the space after ":".
yarn.maximum-failed-containers: 99999

#akka config
#Must be set; note the space after ":".
akka.watch.heartbeat.interval: 5 s
akka.watch.heartbeat.pause: 20 s
akka.ask.timeout: 60 s
akka.framesize: 20971520b

#high-availability settings
#uncomment to enable
high-availability: zookeeper
##fill in according to your ZooKeeper installation
#high-availability.zookeeper.quorum: 10.141.61.226:2181,10.141.53.244:2181,10.141.18.219:2181
high-availability.zookeeper.quorum: localhost:2181
high-availability.zookeeper.path.root: /data/flink/flink-on-yarn
##HDFS directory where HA metadata is stored; adjust to your own HDFS setup.
##hdfs://mycluster/ is taken from the fs.defaultFS setting.
high-availability.zookeeper.storageDir: hdfs://mycluster/flink/recovery

##number of ApplicationMaster attempts
yarn.application-attempts: 10

#checkpoint (fault tolerance) settings
##supported backends: memory, fs, rocksdb
state.backend: rocksdb
##HDFS directory for checkpoints; adjust to your own HDFS setup
state.backend.fs.checkpointdir: hdfs://mycluster/flink/checkpoint
##HDFS directory for externalized checkpoints
state.checkpoints.dir: hdfs://mycluster/flink/savepoint

#memory config
env.java.opts: -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -server -XX:+HeapDumpOnOutOfMemoryError
yarn.heap-cutoff-ratio: 0.2
taskmanager.memory.off-heap: true

7. Start the services Flink on YARN depends on

Start the ZooKeeper cluster:

$ /opt/module/zookeeper-3.4.6/bin/zkServer.sh start
##to restart ZooKeeper instead, use:
$ /opt/module/zookeeper-3.4.6/bin/zkServer.sh restart

Start the HDFS cluster:

$ start-dfs.sh

Verify that Hadoop started successfully:
http://182.61.*.60:50070/
http://106.12.*.89:50070/

Start the YARN cluster:

$ start-yarn.sh

Check the ResourceManager service states:

[admin@145 sbin]$ yarn rmadmin -getServiceState rm1
standby
[admin@145 sbin]$ yarn rmadmin -getServiceState rm2
active

http://106.12.241.145:33069

$ sudo vim /opt/module/hadoop-2.7.6/etc/hadoop/yarn-site.xml

<!-- ResourceManager web UI address. Users can view cluster information here in a browser. -->
<property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>node145:33069</value>
</property>

II. Submitting a job in yarn-session mode

1. Create a session

Create the yarn-session:

$ yarn-session.sh -nm test -n 2 -tm 1024 -s 2
#-nm  name of the YARN application
#-n   number of TaskManagers (containers)
#-tm  memory per TaskManager, in MB
#-s   number of slots per TaskManager

The yarn-session information is written to /tmp/.yarn-properties-hadoop. This mechanism is awkward: the application id has to be retrieved from that file by hand.

Problem 1:
The configuration directory ('/opt/module/flink-1.8.0/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
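Since the application id has to be fished out of the yarn-session properties file by hand, a minimal parsing sketch may help. This assumes the file uses an applicationID=... line; the sample id below is made up, and on a real cluster the script would read /tmp/.yarn-properties-hadoop instead of a temp file:

```shell
# Hypothetical stand-in for /tmp/.yarn-properties-hadoop, written to a
# temp file so the parsing can be tried without a running cluster.
props=$(mktemp)
cat > "$props" <<'EOF'
#Generated YARN properties file
applicationID=application_1559100000000_0001
dynamicPropertiesString=
EOF

# Pull out the application id; it can then be passed to commands such as
# `yarn application -kill <appid>` when tearing the session down.
app_id=$(sed -n 's/^applicationID=//p' "$props")
echo "$app_id"
rm -f "$props"
```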
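The PATH setup from installation step 3 can also be sanity-checked in any POSIX shell before editing /etc/profile; the paths below are the ones used in this guide, and no Flink install is needed just to confirm the prepend order:

```shell
# Prepend Flink's bin directory to PATH, exactly as /etc/profile does.
export FLINK_HOME=/opt/module/flink-1.8.0
export PATH=$FLINK_HOME/bin:$PATH

# The first PATH entry should now be Flink's bin directory, so a later
# `flink` command resolves to $FLINK_HOME/bin/flink once the tarball
# has been extracted there.
first_entry=${PATH%%:*}
echo "$first_entry"
```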
Related links:
https://www.cnblogs.com/frankdeng/p/9047698.html
https://blog.csdn.net/magic_kid_2010/article/details/97004746
ZooKeeper installation tutorial: https://www.cnblogs.com/linjiqin/p/8407084.html