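Background: YARN schedules containers through Capacity Scheduler queues, and a Spark app got stuck in a YARN queue and could not get out. After asking an expert for help, I learned that the `yarn application [option]` command can be used to manipulate YARN applications.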
usage: application
-appStates <States> Works with -list to filter applications
based on input comma-separated list of
application states. The valid application
state can be one of the following:
ALL,NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING,FINISHED,FAILED,KILLED
Filters YARN applications by state; use it together with -list, e.g.:
yarn application -list -appStates FAILED
[output description]:
[1. Total count] Total number of applications (application-types: [] and states: [FAILED]):4
[2. Column headers]
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
[3. Row matching the headers]
application_1489637571965_0120 org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 SPARK hdfs root.hdfs FAILED FAILED 0% http://m.test.com:8088/cluster/app/application_1489637571965_0120
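The listing above is easy to post-process when only the application IDs are needed. The following is a minimal sketch, not part of the yarn CLI: the function name `list_app_ids` is hypothetical, and it assumes every data row starts with an ID of the form `application_<digits>_<digits>`, as in the sample row.

```shell
# Hypothetical helper: read `yarn application -list` output on stdin and
# print only the Application-Id column. The count line and the header line
# are skipped because their first field does not look like an application ID.
list_app_ids() {
  awk '$1 ~ /^application_[0-9]+_[0-9]+$/ { print $1 }'
}

# Usage (needs a working yarn client):
#   yarn application -list -appStates FAILED | list_app_ids
```

Matching on the ID pattern rather than skipping a fixed number of lines keeps the helper working even if extra log lines appear before the table.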
-appTypes <Types> Works with -list to filter applications
based on input comma-separated list of
application types.
Filters by application type, e.g. SPARK, ZEPPELIN, and so on.
-help Displays help for all commands.
-kill <Application ID> Kills the application.
Use this command with care: if Zeppelin has only one SparkContext, killing its application may leave the notebooks unable to run.
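Given that warning, one cautious pattern is to print the kill command first and run it only after an explicit opt-in. This is a sketch under stated assumptions: `safe_kill` and the `DRY_RUN` variable are hypothetical names, and only `yarn application -kill <Application ID>` comes from the help text.

```shell
# Hypothetical wrapper around `yarn application -kill`.
# By default it only prints the command; set DRY_RUN=0 to really kill.
safe_kill() {
  app_id="$1"
  # Reject anything that does not look like an application ID.
  case "$app_id" in
    application_*_*) ;;
    *) echo "not an application ID: $app_id" >&2; return 1 ;;
  esac
  if [ "${DRY_RUN:-1}" = "0" ]; then
    yarn application -kill "$app_id"
  else
    echo "yarn application -kill $app_id"
  fi
}

# Usage:
#   safe_kill application_1489637571965_0120            # prints the command
#   DRY_RUN=0 safe_kill application_1489637571965_0120  # actually kills it
```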
-list List applications. Supports optional use
of -appTypes to filter applications based
on application type, and -appStates to
filter applications based on application
state.
-movetoqueue <Application ID> Moves the application to a different
queue.
Moves the application with the given ID to a different queue.
-queue <Queue Name> Works with the movetoqueue command to
specify which queue to move an
application to.
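Use it together with -list and -movetoqueue. Note that the sample output below still contains an application in root.default, so -queue does not appear to filter the -list result here, e.g.: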
yarn application -list -queue root.hdfs
[output]:
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):2
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1489637571965_0193 Zeppelin SPARK hdfs root.hdfs RUNNING UNDEFINED 10% http://192.168.66.49:4040
application_1489637571965_0165 org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 SPARK hdfs root.default RUNNING UNDEFINED 10% http://192.168.66.49:5050
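Since the listing above contains applications from more than one queue, the Queue column can be filtered on the client side before deciding what to move. The sketch below is an assumption-laden helper, not a yarn feature: `apps_in_queue` is a hypothetical name, and it assumes the row layout shown above, where Queue is the fifth column counted from the end (Queue, State, Final-State, Progress, Tracking-URL).

```shell
# Hypothetical helper: read `yarn application -list` output on stdin and
# print the IDs of applications whose Queue column equals the first argument.
# The queue is located from the end of the row, so an application name that
# contains spaces does not shift the column.
apps_in_queue() {
  awk -v q="$1" '$1 ~ /^application_/ && NF >= 9 && $(NF-4) == q { print $1 }'
}

# Usage (needs a working yarn client):
#   yarn application -list | apps_in_queue root.hdfs
# A selected application can then be moved with the syntax from the help text:
#   yarn application -movetoqueue <Application ID> -queue <Queue Name>
```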
-status <Application ID> Prints the status of the application.
Shows the status information of the application with the given ID.
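When only the current state is of interest, the status report can be reduced to a single value. This is a rough sketch: `status_state` is a hypothetical name, and the code assumes the report prints one `Key : Value` pair per line, including a line whose key is exactly `State`. That layout is not shown in this post, so check it against the output of your Hadoop version.

```shell
# Hypothetical helper: read `yarn application -status <Application ID>`
# output on stdin and print the value of the "State" line.
# ASSUMPTION: the report consists of "Key : Value" lines.
status_state() {
  awk -F' : ' '
    { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }  # trim indentation
    key == "State" { print $2; exit }               # exact match, so Final-State is ignored
  '
}

# Usage (needs a working yarn client):
#   yarn application -status application_1489637571965_0193 | status_state
```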