Background: a script is used to manage Spark job submission on YARN; any job that is already in the RUNNING state is skipped and not submitted again.
I. Knowledge points involved
The script itself is not the important part; the knowledge points behind it are.
1. List the YARN applications that are in the RUNNING state
yarn application -list -appStates RUNNING
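A variant worth knowing (not part of the original script): -appStates accepts a comma-separated list of states, so jobs that are queued but not yet running can be caught as well.
yarn application -list -appStates RUNNING,ACCEPTED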
2. Run Linux commands from Python
import os
# note: 'll' is usually only an interactive shell alias, so use 'ls -l' when calling through os.system/os.popen
os.system('ls -l /')
os.popen('ls -l /')
II. The complete script
import os

# Load the jobs to manage: each line of bash.txt maps an application name
# to the command that submits it, in the form "appName:submit command"
name_bash_dict = {}
with open(r'./bash.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if not line:
            continue
        # split only on the first ':' so the command itself may contain colons
        name, bash = line.split(':', 1)
        name_bash_dict[name.strip()] = bash.strip()

# Ask YARN for every application currently in the RUNNING state
running_job_lines = os.popen("yarn application -list -appStates RUNNING")
line_num = 0
for line in running_job_lines.readlines():
    line_num += 1
    # the first two lines are a summary line and the column header; skip them
    if line_num <= 2:
        continue
    # yarn separates the 9 output columns with tabs;
    # column[1] is Application-Name and column[5] is State
    column = line.split('\t')
    if len(column) == 9 and column[5].strip() == 'RUNNING':
        jobName = column[1].strip()
        # this job is already running, so drop it from the submit list
        name_bash_dict.pop(jobName, None)
running_job_lines.close()

# Submit everything that is not already running
for v in name_bash_dict.values():
    os.system(v)
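For this to work, bash.txt is assumed to hold one job per line in the form appName:submit command, where appName must match the Application-Name reported by YARN. A hypothetical example (the names and paths below are made up for illustration):

mySparkJobA:spark-submit --master yarn --deploy-mode cluster --name mySparkJobA /opt/jobs/job_a.py
mySparkJobB:spark-submit --master yarn --deploy-mode cluster --name mySparkJobB /opt/jobs/job_b.py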