• Building your own functionality on top of SaltStack events


    When a SaltStack master has many minions connected, the program below can analyze which minions executed a task successfully, which failed, and which never returned.
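
    The script is built on the master's local event bus. As a minimal, stand-alone sketch of that mechanism (separate from the full script below): open the bus with salt.utils.event.MasterEvent and iterate job events. The socket directory /var/run/salt/master is the default and may differ on your installation.

    # Minimal sketch: tail job events on the master's event bus.
    # /var/run/salt/master is the default sock_dir; adjust if yours differs.
    import salt.utils.event

    event = salt.utils.event.MasterEvent('/var/run/salt/master')
    for ev in event.iter_events(tag='salt/job', full=True):
        # ev is a dict: 'tag' is e.g. salt/job/<jid>/new or salt/job/<jid>/ret/<minion>;
        # 'data' carries the job details (jid, fun, arg, minions on the new event;
        # return and success on the ret events)
        print(ev['tag'])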

    Script notes:

    1. The script first prints the job id, command name, and other information about the task, followed by the task's execution output and results, which is the same as what you would see when running the command directly. Finally, it prints all failed tasks and all tasks that did not return, and runs them again. Two caveats: the check that decides whether a task failed is rough, since I have not examined all the relevant cases; and a minion that is not connected is also counted as "did not return" and gets re-executed, although strictly speaking a disconnected minion should not be retried.

    2. The program first forks a child process to run the salt command; once that command finishes, it re-executes the task on the minions that failed or did not return.

    3. The script

    import salt.utils.event
    import re
    import signal, time
    import sys
    import os
    def single_handler(target):
        os.execl('/usr/bin/salt', 'salt', target, 'state.sls', 'os')
        
    # SIGCHLD handler: fires when the forked "salt '*' state.sls os" child exits.
    # At that point record[jid] holds the minions that never returned and
    # failedrecord[jid] the minions whose return had success == False; both are retried.
    def handler(num1, num2):
        #signal.signal(signal.SIGCLD,signal.SIG_IGN)
        print 'We are in signal handler'
        print 'Job Not Ret: '+str(record[jid])
        print ' Job Failed: '+str(failedrecord[jid])
        print 'all done...'
        for item in failedrecord[jid]:
            #print item
            try:
               pid  = os.fork()
               if pid == 0:
                  single_handler(item)
            except OSError:
               print 'we exec. '+ item +' error!'
        for item in record[jid]:
            #print item
            try:
               print 'fork ok ' + item
               pid = os.fork()
               if pid == 0 :
                  single_handler(item)
            except OSError:
               print 'we exec. '+item + ' error!' 
        sys.stdout.flush()
        os._exit(0)
    
    
    
    fd = open('/tmp/record', 'w+')
    #sys.stdout = fd
    #sys.stderr = fd
    
    signal.signal(signal.SIGCLD, handler)
    
    #fd = open('/var/log/record', 'w+')
    # Redirect at the file-descriptor level (rather than reassigning sys.stdout/sys.stderr)
    # so the redirection is also inherited by the children that fork() and exec() salt.
    os.dup2(fd.fileno(), sys.stdout.fileno())
    os.dup2(fd.fileno(), sys.stderr.fileno())
    
    #sys.stdout = fd
    #sys.stderr = fd
    
    
    try:
       pid = os.fork()
       if pid == 0:
          time.sleep(2)
          try:
             os.execl('/usr/bin/salt', 'salt', '*', 'state.sls', 'os')
          except OSError:
             print 'exec error!'
             os._exit(1)
    except OSError:
       print 'first fork error!'
       os._exit(1)
    event = salt.utils.event.MasterEvent('/var/run/salt/master')
    flag=False
    reg=re.compile('salt/job/([0-9]+)/new')
    reg1=reg
    #a process to exec. command, but will sleep some time
    #another process listen the event
    #if we use this method, we can filter the event through func. name
    record={}
    failedrecord={}
    jid = 0
    
    
    #try:
    # The first salt/job/<jid>/new event supplies the job metadata and the target
    # minion list; each following salt/job/<jid>/ret/<minion> event marks one return.
    for eachevent in event.iter_events(tag='salt/job', full=True):
        eachevent=dict(eachevent)
        result = reg.findall(eachevent['tag'])
        if not flag and result:
           flag = True
           jid = result[0]
           print "   job_id: " + jid
           print "  Command: " + dict(eachevent['data'])['fun'] + ' ' + str(dict(eachevent['data'])['arg'])
           print "    RunAs: " + dict(eachevent['data'])['user'] 
           print "exec_time: " + dict(eachevent['data'])['_stamp'] 
           print "host_list: " + str(dict(eachevent['data'])['minions'])
           sys.stdout.flush()
           record[jid]=eachevent['data']['minions']
           failedrecord[jid]=[]
           reg1 = re.compile('salt/job/'+jid+'/ret/([0-9.]+)')
        else:
           result = reg1.findall(eachevent['tag'])
           if result:
              record[jid].remove(result[0])
              if not dict(eachevent['data'])['success']:
                 failedrecord[jid].append(result[0])
    #except:
    #   print 'we in except'
    """
       print 'Job Not Ret: '+str(record[jid])
       print ' Job Failed: '+str(failedrecord[jid])
       for item in failedrecord[jid]:
           os.system('salt '+ str(item) + ' state.sls os')
       for item in record[jid]:
           os.system('salt '+ str(item) + ' state.sls os')
       os._exit(0)
    """

    Execution result:

       job_id: 20151208025319005896
      Command: state.sls ['os']
        RunAs: root
    exec_time: 2015-12-08T02:53:19.006284
    host_list: ['172.18.1.212', '172.18.1.214', '172.18.1.213', '172.18.1.211']
    172.18.1.213:
    ----------
              ID: configfilecopy
        Function: file.managed
            Name: /root/node3
          Result: True
         Comment: File /root/node3 is in the correct state
         Started: 02:53:19.314015
        Duration: 13.033 ms
         Changes:   
    ----------
              ID: commonfile
        Function: file.managed
            Name: /root/commonfile
          Result: True
         Comment: File /root/commonfile is in the correct state
         Started: 02:53:19.327173
        Duration: 1.993 ms
         Changes:   
    
    Summary
    ------------
    Succeeded: 2
    Failed:    0
    ------------
    Total states run:     2
    172.18.1.212:
    ----------
              ID: configfilecopy
        Function: file.managed
            Name: /root/node2
          Result: True
         Comment: File /root/node2 is in the correct state
         Started: 02:53:19.337325
        Duration: 8.327 ms
         Changes:   
    ----------
              ID: commonfile
        Function: file.managed
            Name: /root/commonfile
          Result: True
         Comment: File /root/commonfile is in the correct state
         Started: 02:53:19.345787
        Duration: 1.996 ms
         Changes:   
    
    Summary
    ------------
    Succeeded: 2
    Failed:    0
    ------------
    Total states run:     2
    172.18.1.211:
    ----------
              ID: configfilecopy
        Function: file.managed
            Name: /root/node1
          Result: True
         Comment: File /root/node1 is in the correct state
         Started: 02:53:19.345017
        Duration: 12.741 ms
         Changes:   
    ----------
              ID: commonfile
        Function: file.managed
            Name: /root/commonfile
          Result: True
         Comment: File /root/commonfile is in the correct state
         Started: 02:53:19.357873
        Duration: 1.948 ms
         Changes:   
    
    Summary
    ------------
    Succeeded: 2
    Failed:    0
    ------------
    Total states run:     2
    172.18.1.214:
        Minion did not return. [Not connected]
    We are in signal handler
    Job Not Ret: ['172.18.1.214']
     Job Failed: []
    all done...
    fork ok 172.18.1.214
    172.18.1.214:
        Minion did not return. [Not connected]
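
    As noted in point 1, 172.18.1.214 is simply not connected, so re-running the state against it cannot succeed. One way to honour that caveat (a sketch, not part of the script) is to filter the retry list by connectivity first and only retry minions that currently answer test.ping:

    # Sketch: drop disconnected minions from the retry list before re-executing.
    import salt.client

    local = salt.client.LocalClient()
    to_retry = ['172.18.1.214']                     # the unreturned minions from the run above
    alive = local.cmd('*', 'test.ping', timeout=5)  # minions that answered within 5 seconds
    retry_now = [m for m in to_retry if m in alive]
    print(retry_now)                                # 172.18.1.214 is skipped while disconnected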