• Big Data_Scheduling Platform_Configuring DolphinScheduler_Scheduling Big Data Tasks


    1. Basic concepts

    01. The big data clusters
     HDFS cluster:
        stores the massive data;
        main roles in the cluster: NameNode / DataNode
     YARN cluster:
        schedules resources for computation over the massive data;
        main roles in the cluster: ResourceManager / NodeManager
        The HDFS cluster and the YARN cluster are logically separate, but usually co-located physically.
     Spark cluster:
         runs the computation over the massive data;
         main roles in the cluster: Master / Worker (cluster level),
         Driver / Executor (application level)
     02. Mapping between each service and its IP

     03. Ports
        HDFS web UI:                 50070
        YARN web UI:                 8088
        YARN ResourceManager application manager port: 8032
        HistoryServer web UI:        19888
        Hive web UI:                 10002
        Spark web UI:                8080
        ZooKeeper service port:      2181
        ZooKeeper CRUD is done through its built-in command-line client; it does not ship a web UI of its own.
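        As a sanity check before configuring anything, you can probe these ports from the worker (a sketch; 2.2.2.12 / 2.2.2.13 are the NameNode / ResourceManager hosts used in the configs later in this note, and the host-to-service mapping is an assumption):
           nc -zv 2.2.2.12 50070    # HDFS web UI (NameNode host)
           nc -zv 2.2.2.13 8088     # YARN web UI (ResourceManager host)
           nc -zv 2.2.2.13 8032     # ResourceManager application manager port
           nc -zv 2.2.2.12 2181     # ZooKeeper (host assumed)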
    

    2. Configuring the worker

     Worker task execution directory: /tmp/dolphinscheduler/exec/process/
    
     1. Configuring the HDFS service
         01. The client and the client's configuration files
    	   docker cp ~/soft/work_conf_hdfs_yarn/yarn-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/hadoop/etc/hadoop/
       	
         02. The configuration lives in the following files:
    	 Enter the container: docker exec -it docker-swarm-dolphinscheduler-worker-1 /bin/bash
    	 hostname -p 
         	core-site.xml
         	hdfs-site.xml
         	mapred-site.xml
         	yarn-site.xml
         	 of which the key properties are:
         	    core-site.xml
         	          fs.defaultFS (formerly fs.default.name)
         		hdfs-site.xml
         		mapred-site.xml
         		yarn-site.xml
         		   yarn.resourcemanager.address
     	03. Configure the local environment
     	   Environment variables (a sketch of typical values follows item 05 below)
         		   
         04. Cluster details
             Review the cluster's configuration values so they can be replicated locally
    		 
    	
    	05. Specific configuration
    	  Add this property assignment to mapred-site.xml:
              <property>
               <name>hdp.version</name>
               <value>3.0.1.0-187</value>
              </property>
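     	As referenced in item 03 above, a minimal sketch of the local environment variables for the worker (the Hadoop path /opt/soft/hadoop comes from the docker cp targets in this note; treat the rest as an assumption to adapt):
     	   export HADOOP_HOME=/opt/soft/hadoop
     	   export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
     	   export PATH=$HADOOP_HOME/bin:$PATH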
     		
     		
     2. Configuring the Spark service
      Spark provides one unified tool for submitting jobs to the various cluster managers: spark-submit.
      To run against a cluster, specify the --master parameter.
           Configuration file location
     	  
     	  Run modes (example submissions follow this list):
     	     Spark-Local (client)
             Spark-YARN (cluster)
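     	  A minimal sketch of the two submissions (the /opt/soft/spark2 path is the Spark install used later in this note; pi.py is the example task mentioned in the errors section):
     	     # local / client mode: the driver runs inside the worker container
     	     /opt/soft/spark2/bin/spark-submit --master local[2] pi.py 10
     	     # YARN cluster mode: the driver runs inside the cluster
     	     /opt/soft/spark2/bin/spark-submit --master yarn --deploy-mode cluster pi.py 10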
    

    Error types

    1. $JAVA_HOME does not exist
        -  -> welcome to use bigdata scheduling system...
          ERROR: JAVA_HOME is not set and could not be found.
     Troubleshooting and fix
     01. On each worker node
         Run java -version to check whether the JDK is installed
         Run export to check whether the JDK environment variables are set
     02. In a cluster environment, Hadoop may still fail to pick up JAVA_HOME even when every node has it configured correctly.
          Fix: declare JAVA_HOME explicitly once more in hadoop-env.sh
     	 export JAVA_HOME=/usr/local/openjdk-8
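     	 A quick check from the host that the worker container actually sees the JDK (a sketch; the container name matches the docker cp commands later in this note):
     	 docker exec docker-swarm-dolphinscheduler-worker-1 bash -lc 'echo $JAVA_HOME; java -version'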
    
    2. ResourceManager address configuration
    	INFO retry.RetryInvocationHandler: java.net.ConnectException: 
         Call From fac4f*d3**/***.**.0.5 to **.**.**.**:*32 failed on connection exception:
            java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused, 
         while invoking ApplicationClientProtocolPBClientImpl.getNewApplication over null after 1 failover attempts. 
         Trying to failover after sleeping for 30973ms.		 
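         The "Connection refused" above means the worker is dialing the wrong ResourceManager host/port; as the yarn-site.xml section below shows, yarn.resourcemanager.address has to point at the real RM. A quick check from the worker (a sketch; 2.2.2.13 and 8032 are the RM host and the application manager port from this note):
            grep -A1 yarn.resourcemanager.address /opt/soft/hadoop/etc/hadoop/yarn-site.xml
            nc -zv 2.2.2.13 8032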
    
    3. mr-framework error
       java.lang.IllegalArgumentException: Could not locate MapReduce framework name 'mr-framework' in mapreduce.application.classpath
        java.lang.IllegalArgumentException: Unable to parse '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' as a URI, 
    	check the setting for mapreduce.application.framework.path
    
    	Fix
    	  Do not set the mapreduce.framework.name parameter inside the MR program's own project; let the cluster-side configuration provide it.
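    	  This error also shows up when ${hdp.version} inside mapreduce.application.framework.path cannot be resolved; once the hdp.version property is added to mapred-site.xml (as in item 05 of the configuration section above), the path parses as a valid URI. A quick way to inspect both settings on the worker (a sketch):
    	  grep -E -A2 'hdp.version|mapreduce.application.framework.path' /opt/soft/hadoop/etc/hadoop/mapred-site.xml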
    

    4.ENOENT: No such file or directory

         ENOENT: No such file or directory
    	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
    	
        Check the input arguments; watch out for stray spaces
        		 INFO mapred.MapTask: Processing split: hdfs://**.**.**.**:*20/data/mywork.txt:0+37
    

    5. Spark runs fine in local and cluster mode, but client mode fails.

      Cluster mode succeeds.
      Client mode fails. The reason is the containerized deployment: communication between the worker and the other slaves relies on host name resolution, but the other Spark nodes do not know about the worker that submitted the job.
        It turns out the cluster nodes have to connect back to my machine, transfer my task pi.py into the node's temporary directory /tmp/spark-xxx/, and copy it under $SPARK_HOME/work/ before it actually runs.
    	
    	
     Exception in thread "main" org.apache.spark.SparkException: 
     When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
     
     
     /opt/soft/spark2/bin/spark-class: line 71: /usr/jdk64/java/bin/java: No such file or directory
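      The last error means spark-class is still pointing at the cluster's Java path, which does not exist inside the container. A sketch of the fix: override JAVA_HOME in the worker's spark-env.sh with the Java the container actually has (the same path used for hadoop-env.sh in this note):
        # append to /opt/soft/spark2/conf/spark-env.sh
        export JAVA_HOME=/usr/local/openjdk-8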
    

    6.Permission denied

      org.apache.hadoop.security.AccessControlException: Permission denied: user=linuxdis, access=EXECUTE, inode="/tmp/hadoop-yarn":linuxfirst:hdfs:drwx------
      Fix
         /tmp/hadoop-yarn
    	  Delete the temporary files under /tmp/hadoop-yarn on HDFS (or loosen their permissions; see the sketch below)
    	  Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=linuxdis, access=EXECUTE, 
    	  inode="/tmp/hadoop-yarn":linuxfirst:hdfs:drwx------
    
    	org.apache.hadoop.security.AccessControlException: Permission denied: user=hadoop, access=EXECUTE, inode="/tmp/hadoop-yarn":linuxfirst:hdfs:drwx------
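    	A sketch of clearing the stale temp directory (run as a user with HDFS superuser rights, e.g. the hdfs user):
    	  hdfs dfs -rm -r /tmp/hadoop-yarn        # remove the temp dir owned by the other user
    	  # or loosen its permissions so other submitting users can traverse it
    	  hdfs dfs -chmod -R 777 /tmp/hadoop-yarn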
    

    7. Output directory already exists

      org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://**.**.**.**:*20/d/out2 already exists
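      MapReduce refuses to write into an existing output directory; a sketch of the fix is to remove it first (or point the job at a new path):
        hdfs dfs -rm -r /d/out2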
    

    Configuring HDFS on the DS worker

    Copy the HDFS and YARN configuration onto the worker:
     docker cp ~/work_conf_hdfs_yarn/core-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/hadoop/etc/hadoop/
     docker cp ~/work_conf_hdfs_yarn/mapred-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/hadoop/etc/hadoop/
     docker cp ~/work_conf_hdfs_yarn/yarn-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/hadoop/etc/hadoop/
     docker cp ~/work_conf_hdfs_yarn/hadoop-env.sh docker-swarm-dolphinscheduler-worker-1:/opt/soft/hadoop/etc/hadoop/
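     After the copy, a quick sanity check from inside the worker (a sketch, assuming the Hadoop bin directory is on the container's PATH) confirms that both HDFS and YARN are reachable with the new configuration:
     docker exec -it docker-swarm-dolphinscheduler-worker-1 bash -lc 'hdfs dfs -ls / && yarn node -list'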
    

    core-site.xml

       <configuration>
         <property>
            <name>fs.defaultFS</name>
            <value>hdfs://2.2.2.12:*20</value>
           <final>true</final>
         </property>
           
         <property>
           <name>hadoop.proxyuser.hive.hosts</name>
           <value>2.2.2.11</value>
         </property>
       
         <property>
           <name>hadoop.proxyuser.yarn.hosts</name>
           <value>2.2.2.12,2.2.2.13</value>
         </property>
       </configuration>
    

    mapred-site.xml

       <configuration>
          <property>
           <name>hdp.version</name>
           <value>3.37</value>
          </property>
       
          <property>
           <name>mapreduce.application.classpath</name>
           <value>/usr/hdp/3.37/hadoop/conf:/usr/hdp/3.37/hadoop/lib/*:/usr/hdp/3.37/hadoop/.//*:/usr/hdp/3.37/hadoop-hdfs/./:/usr/hdp/3.37/hadoop-hdfs/lib/*:/usr/hdp/3.37/hadoop-hdfs/.//*:/usr/hdp/3.37/hadoop-mapreduce/lib/*:/usr/hdp/3.37/hadoop-mapreduce/.//*:/usr/hdp/3.37/hadoop-yarn/./:/usr/hdp/3.37/hadoop-yarn/lib/*:/usr/hdp/3.37/hadoop-yarn/./*</value>
          </property>
       </configuration>
    

    yarn-site.xml

        <configuration>
          <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>2.2.2.13</value>
          </property>

          <property>
            <name>yarn.resourcemanager.address</name>
            <value>2.2.2.13:*50</value>
          </property>

          <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>2.2.2.13:*25</value>
          </property>

          <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>2.2.2.13:*30</value>
          </property>

          <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle,spark2_shuffle</value>
          </property>

          <property>
            <name>yarn.application.classpath</name>
            <value>$HADOOP_CONF_DIR,/usr/hdp/3.37/hadoop/*,/usr/hdp/3.37/hadoop/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*</value>
          </property>
      </configuration>
    

    hadoop-env.sh: set the worker's Java path

       export JAVA_HOME=/usr/local/openjdk-8
    

    Configuring Spark on the DS worker

       Under normal circumstances only spark-env.sh needs to be configured.
       
       docker cp ~/work_conf_spark/core-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/spark2/conf
       docker cp ~/work_conf_spark/mapred-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/spark2/conf
       docker cp ~/work_conf_spark/yarn-site.xml docker-swarm-dolphinscheduler-worker-1:/opt/soft/spark2/conf
       docker cp ~/work_conf_spark/spark-env.sh docker-swarm-dolphinscheduler-worker-1:/opt/soft/spark2/conf
       
    spark-env.sh: configure the HDFS and YARN information on the worker
    export YARN_CONF_DIR=/usr/hdp/3.37/hadoop/
    export HADOOP_CONF_DIR=/usr/hdp/3.37/hadoop/
    export SPARK_MASTER_IP=2.2.2.17
    export SPARK_MASTER_PORT=7077
    export SPARK_YARN_USER_ENV=/usr/hdp/3.37/hadoop-yarn/etc/hadoop/
    
    
    The YARN configuration differs slightly here:
    <property>
      <name>yarn.resourcemanager.address</name>
      <value>2.2.2:*32</value>
    </property>
    
    
    Configure the workers list
    

    Running database / SQL tasks

    Run locally on the worker:       Python, Shell, HDFS
    Run on the distributed cluster:  MapReduce, Spark
    Databases:                       MySQL, Hive
     The Linux user: root with the highest privileges, or a user you created yourself (userA)
       versus the operator account you create inside the DolphinScheduler system itself
       The MySQL driver jar I copied into lib was version 8, while the local MySQL that DS itself uses is 5.7; after restarting the services the system got confused about which driver to use, could no longer connect to the local DB, and consequently the UI login failed as well.
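       A sketch of the recovery, with the lib path and connector version below both being assumptions: keep a single connector jar in lib that matches the MySQL 5.7 instance DS runs on, then restart the services.
          rm /opt/dolphinscheduler/lib/mysql-connector-java-8*.jar        # path is an assumption
          cp mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib/   # version is an assumption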
    

    Running DataX tasks

    # Delete the hidden files starting with an underscore from the plugin directories (if they are not deleted, a "plugin not found" error is raised, which is odd); a sketch follows
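    A sketch of that cleanup, assuming DataX is installed under /opt/soft/datax (the path is an assumption):
       find /opt/soft/datax/plugin -name "._*" | xargs rm -rf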
    

    References

     Starting Hadoop fails with "Error: JAVA_HOME is not set and could not be found": https://www.cnblogs.com/codeOfLife/p/5940642.html
     Workflow and basic concepts of the Spark client and cluster run modes: https://blog.csdn.net/m0_37758017/article/details/80469263
     A DolphinScheduler production issue: the wrong tenant was selected, so permissions were wrong: https://blog.csdn.net/u010978399/article/details/122987214
     [DolphinScheduler] Submitting a multi-file PySpark project to a YARN cluster from DS: https://blog.csdn.net/hyj_king/article/details/122976748
     [DolphinScheduler] The scare of adding MySQL and Oracle data sources: https://www.cnblogs.com/pyhy/p/15900607.html