Spark 2.x (59): How to shut down the client process when submitting a Spark job in yarn-cluster mode?


    Problem:

    The field team recently reported that after a Spark application is submitted in yarn-cluster mode, a YARN client process lingers on the submitting node and never exits. Since these applications are all Spark Structured Streaming jobs that run for months on end, the lingering processes eventually exhaust the submitting node's resources, and other operations then fail with errors such as:

    [dx@my-linux-01 bin]$ yarn logs -applicationId application_15644802175503_0189
    Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c000000, 702021632, 0) failed; error='Cannot allocate memory' (errno=12)
    #
    # There is insufficient memory for the Java Runtime Environment to continue.
    # Native memory allocation (mmap) failed to map 702021632 bytes to committing reserved memory.
    # An error report file with more information is saved as:
    # /home/dx/myProj/appApp/bin/hs_err_pid53561.log
    [dx@my-linux-01 bin]$ 

    Analysis of the submitting node showed that the resources were mostly held by the accumulated YARN client processes:

    [dx@my-linux-01 bin]$ top
    PID     USER  PR  NI    VIRT     RES  SHR   S  %CPU   %MEM   TIME+    COMMAND
    122236  dx    20  0  20.629g  1.347g  3520  S   0.3    2.1   7:02.42     java
    122246  dx    20  0  20.629g  1.311g  3520  S   0.3    2.0   7:03.42     java
    122236  dx    20  0  20.629g  1.288g  3520  S   0.3    2.2   7:05.83     java
    122346  dx    20  0  20.629g  1.344g  3520  S   0.3    2.1   7:10.42     java
    121246  dx    20  0  20.629g  1.343g  3520  S   0.3    2.3   7:01.42     java
    122346  dx    20  0  20.629g  1.341g  3520  S   0.3    2.4   7:03.39     java
    112246  dx    20  0  20.629g  1.344g  3520  S   0.3    2.0   7:02.42     java
    ............
    112260  dx    20  0  20.629g  1.344g  3520  S   0.3    2.0   7:02.02     java
    112260  dx    20  0  113116      200     0  S   0.0    0.0   0:00.00     sh
    ............

    Analysis of Spark job submission on YARN:

    There are two ways to submit a Spark application on YARN:

    1) yarn-client (spark-submit --master yarn --deploy-mode client ...):

    In this mode the driver runs on the submitting node, inside the YARN client process. Killing the client process on the submitting node therefore kills the driver, and with it the whole application.

    2) yarn-cluster (spark-submit --master yarn --deploy-mode cluster):

    In this mode the driver runs inside a container allocated by YARN: YARN starts an AM (ApplicationMaster) process in that container, and the SparkContext (the driver) runs inside the AM. A YARN client process is still started on the submitting node during submission, and by default it waits there until the application reaches a terminal state (FAILED, FINISHED, etc.); for a long-running streaming application it therefore never exits.

    Solution:

    The yarn Client honors the configuration parameter

    spark.yarn.submit.waitAppCompletion

    If it is set to true, the client process stays alive and keeps reporting the application's status until the application exits (for whatever reason).

    If it is set to false, the client process exits as soon as the application has been submitted.

    Add the setting to the spark-submit arguments:

    ./bin/spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --conf spark.yarn.submit.waitAppCompletion=false \
    ...
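
    For programmatic submission the same key can be passed through the spark-launcher API instead of the shell script. Below is a minimal Scala sketch, assuming the spark-launcher artifact is on the classpath and SPARK_HOME is set; the jar path and main class are placeholders:

    import org.apache.spark.launcher.SparkLauncher

    object FireAndForgetSubmit {
      def main(args: Array[String]): Unit = {
        val submit = new SparkLauncher()
          .setAppResource("/path/to/my-streaming-app.jar") // placeholder jar
          .setMainClass("com.example.MyStreamingApp")      // placeholder main class
          .setMaster("yarn")
          .setDeployMode("cluster")
          // Tell the yarn Client not to wait for the application to finish:
          .setConf("spark.yarn.submit.waitAppCompletion", "false")
          .launch() // spawns a plain spark-submit child process

        // With waitAppCompletion=false, spark-submit returns shortly after the
        // application is accepted instead of living as long as the streaming job.
        val exitCode = submit.waitFor()
        println(s"spark-submit exited with code $exitCode")
      }
    }

    Alternatively, the key can be set once in conf/spark-defaults.conf if every cluster-mode job submitted from this node should behave this way.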

    The corresponding code in the yarn Client class (org.apache.spark.deploy.yarn.Client):

      /**
       * Submit an application to the ResourceManager.
       * If set spark.yarn.submit.waitAppCompletion to true, it will stay alive
       * reporting the application's status until the application has exited for any reason.
       * Otherwise, the client process will exit after submission.
       * If the application finishes with a failed, killed, or undefined status,
       * throw an appropriate SparkException.
       */
      def run(): Unit = {
        this.appId = submitApplication()
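        // Fire-and-forget: taken when spark.yarn.submit.waitAppCompletion=false
        // and no external launcher is attached; log one status report and return.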
        if (!launcherBackend.isConnected() && fireAndForget) {
          val report = getApplicationReport(appId)
          val state = report.getYarnApplicationState
          logInfo(s"Application report for $appId (state: $state)")
          logInfo(formatReportDetails(report))
          if (state == YarnApplicationState.FAILED || state == YarnApplicationState.KILLED) {
            throw new SparkException(s"Application $appId finished with status: $state")
          }
        } else {
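          // Default: block here, polling YARN until the application reaches a
          // terminal state, then surface failures as SparkExceptions.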
          val (yarnApplicationState, finalApplicationStatus) = monitorApplication(appId)
          if (yarnApplicationState == YarnApplicationState.FAILED ||
            finalApplicationStatus == FinalApplicationStatus.FAILED) {
            throw new SparkException(s"Application $appId finished with failed status")
          }
          if (yarnApplicationState == YarnApplicationState.KILLED ||
            finalApplicationStatus == FinalApplicationStatus.KILLED) {
            throw new SparkException(s"Application $appId is killed")
          }
          if (finalApplicationStatus == FinalApplicationStatus.UNDEFINED) {
            throw new SparkException(s"The final status of application $appId is undefined")
          }
        }
      }
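
    Whether the fire-and-forget branch above is taken depends on how fireAndForget is initialized. Paraphrased from the Spark 2.x sources (names may vary slightly between versions), it combines the deploy mode with the configuration entry described earlier:

      // In org.apache.spark.deploy.yarn.Client (Spark 2.x), approximately:
      private val fireAndForget = isClusterMode && !sparkConf.get(WAIT_FOR_APP_COMPLETION)

      // WAIT_FOR_APP_COMPLETION is declared in org.apache.spark.deploy.yarn.config:
      private[spark] val WAIT_FOR_APP_COMPLETION =
        ConfigBuilder("spark.yarn.submit.waitAppCompletion")
          .doc("In cluster mode, whether to wait for the application to finish " +
            "before exiting the launcher process.")
          .booleanConf
          .createWithDefault(true)

    Note that isClusterMode must also be true for the early exit: in yarn-client mode the driver itself lives inside the client process, so that process cannot exit early regardless of this setting.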
Original post: https://www.cnblogs.com/yy3b2007com/p/11302886.html