

    Resolving slow Spark on YARN job submission

    Spark version: spark-2.0.0, Hadoop 2.7.2.

    Submitting jobs in Spark on YARN mode was noticeably slow, with a wait of several minutes before the job actually started.

    Submitting the job in cluster mode:
    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --deploy-mode cluster \
    --driver-memory 4g \
    --executor-memory 2g \
    --executor-cores 1 \
    --queue thequeue \
    examples/jars/spark-examples*.jar \
    10

    The submission printed the following warning:

    17/02/08 18:26:23 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
    17/02/08 18:26:29 INFO yarn.Client: Uploading resource file:/tmp/spark-91508860-fdda-4203-b733-e19625ef23a0/__spark_libs__4918922933506017904.zip -> hdfs://dbmtimehadoop/user/fuxin.zhao/.sparkStaging/application_1486451708427_0392/__spark_libs__4918922933506017904.zip
    
    

    After this log line Spark uploads the application's dependency jars, which takes roughly 30 seconds and makes every submission painfully slow. The official documentation describes the relevant fix:

    To make Spark runtime jars accessible from YARN side, you can specify spark.yarn.archive or spark.yarn.jars. 
    For details please refer to Spark Properties. If neither spark.yarn.archive nor spark.yarn.jars is specified, 
    Spark will create a zip file with all jars under $SPARK_HOME/jars and upload it to the distributed cache.
    

    Roughly: to make Spark's runtime jars accessible from the YARN side (the YARN nodes), specify spark.yarn.archive or spark.yarn.jars. If neither parameter is set, Spark uploads all jars under $SPARK_HOME/jars/ to the distributed cache, which is exactly why job submission was so slow.
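
    The documentation also mentions spark.yarn.archive as an alternative to spark.yarn.jars: package the jars into a single archive once and point Spark at it. A minimal sketch of that variant, reusing this cluster's dbmtimehadoop namespace (the archive name and target path are illustrative, not taken from the original setup):

    # The jars must sit at the root of the archive, so zip from inside $SPARK_HOME/jars.
    cd $SPARK_HOME/jars && zip -q spark-libs.zip *.jar
    hadoop fs -mkdir -p hdfs://dbmtimehadoop/tmp/spark/
    hadoop fs -put spark-libs.zip hdfs://dbmtimehadoop/tmp/spark/spark-libs.zip
    # Then in spark-defaults.conf:
    # spark.yarn.archive    hdfs://dbmtimehadoop/tmp/spark/spark-libs.zip

    Either property avoids re-uploading the libraries on every submission; the rest of this post uses spark.yarn.jars.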

    The fix used here:
    Upload the jars Spark depends on at runtime under $SPARK_HOME/jars/ to HDFS.

    hadoop fs -mkdir hdfs://dbmtimehadoop/tmp/spark/lib_jars/
    hadoop fs -put  $SPARK_HOME/jars/* hdfs://dbmtimehadoop/tmp/spark/lib_jars/
    
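    Before wiring up the config, a quick listing confirms that the jars actually landed on HDFS (plain hadoop fs commands, nothing specific to this setup):

    # Should list the Spark 2.0.0 jars, e.g. spark-core_2.11-2.0.0.jar and friends.
    hadoop fs -ls hdfs://dbmtimehadoop/tmp/spark/lib_jars/ | head -n 20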

    vi $SPARK_HOME/conf/spark-defaults.conf
    and add the following:
    spark.yarn.jars hdfs://dbmtimehadoop/tmp/spark/lib_jars/

    Submitting the job again now throws the following exception:

    Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
    	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
    	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
    	at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:149)
    	at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
    
    

    The container log, reachable through the ResourceManager / JobHistory server, shows the underlying error: http://db-namenode01.host-mtime.com:19888/jobhistory/logs/db-datanode03.host-mtime.com:34545/container_e08_1486451708427_0346_02_000001/

    Log Length: 191
    
    Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
    Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
    
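    The same container log can also be fetched with the YARN CLI instead of the web UI; a quick sketch using the application id embedded in the container name above (requires YARN log aggregation to be enabled):

    yarn logs -applicationId application_1486451708427_0346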

    So the configuration above was wrong and the Spark jars were never loaded on the YARN side: spark.yarn.jars expects jar paths (globs are allowed) rather than a bare directory. After some experimentation, the following configurations all work:

    # works
    spark.yarn.jars                  hdfs://dbmtimehadoop/tmp/spark/lib_jars/*.jar
    # also works:
    #spark.yarn.jars                 hdfs://dbmtimehadoop/tmp/spark/lib_jars/*
    # configuring several comma-separated jars directly also works:
    #spark.yarn.jars                 hdfs://dbmtimehadoop/tmp/spark/lib_jars/activation-1.1.1.jar,hdfs://dbmtimehadoop/tmp/spark/lib_jars/antlr-2.7.7.jar,hdfs://dbmtimehadoop/tmp/spark/lib_jars/antlr4-runtime-4.5.3.jar,hdfs://dbmtimehadoop/tmp/spark/lib_jars/antlr-runtime-3.4.jar
    
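    For a one-off test, the same property can also be passed on the spark-submit command line instead of editing spark-defaults.conf; a sketch assuming the glob form that worked above:

    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --deploy-mode cluster \
    --conf spark.yarn.jars="hdfs://dbmtimehadoop/tmp/spark/lib_jars/*.jar" \
    examples/jars/spark-examples*.jar \
    10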

    Resubmitting the job, it now runs successfully.
    Messages like the following confirm that the jars are picked up from HDFS instead of being uploaded again:

    17/02/08 19:28:21 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs://dbmtimehadoop/tmp/spark/lib_jars/spark-mllib-local_2.11-2.0.0.jar
    17/02/08 19:28:21 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs://dbmtimehadoop/tmp/spark/lib_jars/spark-mllib_2.11-2.0.0.jar
    17/02/08 19:28:21 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs://dbmtimehadoop/tmp/spark/lib_jars/spark-network-common_2.11-2.0.0.jar
    17/02/08 19:28:21 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs://dbmtimehadoop/tmp/spark/lib_jars/spark-network-shuffle_2.11-2.0.0.jar
    
    