• Nagios DingTalk Alerts


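    Chapter 1: Create a DingTalk application (used to send alerts to an individual user)

    Open the DingTalk admin console: https://oa.dingtalk.com

    After the application is created, record three values: "AgentID", "AppKey", and "AppSecret".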
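
    The AppKey and AppSecret recorded here are what the script later exchanges for an access token at https://oapi.dingtalk.com/gettoken. A minimal sketch of checking the decoded response before using it follows. The helper name `extract_access_token` is hypothetical, and the `errcode`/`errmsg` fields are an assumption about the response shape, so verify them against a real response.

```python
def extract_access_token(payload):
    """Return the access token from a decoded gettoken response, or raise."""
    # Assumption: a failed call carries a non-zero "errcode" plus an "errmsg".
    errcode = payload.get("errcode", 0)
    if errcode != 0:
        raise RuntimeError(
            "gettoken failed: {} {}".format(errcode, payload.get("errmsg", ""))
        )
    token = payload.get("access_token")
    if not token:
        raise RuntimeError("gettoken response has no access_token")
    return token


if __name__ == "__main__":
    # Hand-written sample, not a captured response.
    sample = {"errcode": 0, "errmsg": "ok", "access_token": "example-token"}
    print(extract_access_token(sample))
```

    Failing loudly here makes a wrong AppKey or AppSecret visible in the Nagios log instead of surfacing as a KeyError deeper in the script.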

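    Chapter 2: Create a DingTalk robot (used to send alerts to the monitoring group)

    2.1 Create a DingTalk group

    (Steps omitted.)

    2.2 Add a group robot

    #Record the webhook here; the script needs to call it.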
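
    Before wiring the webhook into Nagios, it can help to confirm that the recorded URL carries an access_token and to see the message body the script will send. The sketch below does both without making a network call. The helper names are hypothetical, and the payload shape is simply the one used by the script in Chapter 3.

```python
import json
from urllib.parse import parse_qs, urlparse


def webhook_token(webhook_url):
    """Return the access_token query parameter of a robot webhook URL."""
    tokens = parse_qs(urlparse(webhook_url).query).get("access_token", [])
    if len(tokens) != 1:
        raise ValueError("webhook URL must carry exactly one access_token")
    return tokens[0]


def build_robot_text(content):
    """Encode a text message as UTF-8 JSON bytes for the robot webhook."""
    body = {"msgtype": "text", "text": {"content": content}}
    return json.dumps(body).encode("utf-8")


if __name__ == "__main__":
    # Placeholder URL; substitute the webhook recorded in step 2.2.
    url = "https://oapi.dingtalk.com/robot/send?access_token=example"
    print(webhook_token(url))
    print(build_robot_text("test message"))
```

    The bytes returned by `build_robot_text` can be posted to the webhook with `requests.post(url, data=..., headers={"Content-Type": "application/json;charset=UTF-8"})`, as the script does.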

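    Chapter 3: Write the alert script (Nagios calls this script when a server becomes abnormal)

    The script is written for Python 3. It is called with seven arguments, each of which is a Nagios macro; the arguments are described below.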
    [zhuyuliang@nagios ~]$ cat /usr/local/nagios/python/dingding.py
    #coding: utf-8
    import json
    import sys
    import requests

    '''
    Argument meanings:
    Notification type: $NOTIFICATIONTYPE$
    Service name: $SERVICEDESC$
    Host name: $HOSTALIAS$
    IP address: $HOSTADDRESS$
    Service state: $SERVICESTATE$
    Time: $LONGDATETIME$
    Log: $SERVICEOUTPUT$
    '''

    warning_type=str(sys.argv[1])
    service_name=str(sys.argv[2])
    host_name=str(sys.argv[3])
    host_IP=str(sys.argv[4])
    service_state=str(sys.argv[5])
    warning_time=str(sys.argv[6])
    warning_log=str(sys.argv[7])

    '''
    The userids are fixed, so they are hard-coded. To look them up:
    Get the department IDs:
    curl 'https://oapi.dingtalk.com/department/list?access_token=xxx' | jq '.'
    Get the userids from a department ID:
    curl 'https://oapi.dingtalk.com/user/list?access_token=xx&department_id=xx' | jq '.'
    '''

    chenning_id='09386937241216057'
    baihe_id='165726012126376472'
    tiantaotao_id='215023131029727888'
    wangfujun_id='014610392229410999'
    maoweijian_id='014506344727183149'
    caie_id='01461056511094710'
    zhaozhibo_id='121027651935582616'

    #IP lists for each project
    ITFIN=['47.99.98.249','47.110.157.52','47.99.88.4','47.99.203.235','47.99.201.252','47.98.240.44','47.99.201.132','47.96.89.81','47.99.106.12','47.99.204.155','120.55.49.10']
    cdh=['47.99.122.122','47.99.134.63','47.99.82.201','47.96.22.59','47.99.53.179']
    chess=['106.14.12.179','47.101.144.209','106.14.169.195','47.101.164.250']
    sdk=['121.40.109.196','121.40.82.16','120.26.106.206','120.26.223.154','120.26.55.62','47.97.244.135','101.37.89.187','116.62.108.28','116.62.109.7','116.62.102.197']

    #Body of the message to send
    header = {"Content-Type":"application/json;charset=UTF-8"}
    content=("** Nagios Alert **\n\n"
             "Notification type: {}\n"
             "Service name: {}\n"
             "Host name: {}\n"
             "IP address: {}\n"
             "Service state: {}\n"
             "Time: {}\n"
             "Log:\n{}").format(warning_type,service_name,host_name,host_IP,service_state,warning_time,warning_log)

    def get_accessToken(appkey,appsecret):
        '''
        Fetch the accessToken.
        '''
        json_token=requests.get(url='https://oapi.dingtalk.com/gettoken',params={'appkey':appkey,'appsecret':appsecret},timeout=10)
        return json_token.json()['access_token']

    def send_group():
        '''
        Send the alert to the DingTalk group.
        '''
        url='https://oapi.dingtalk.com/robot/send?access_token=7df4cff195905e47527602b7bfab6ecc4fc669392da1e446eebeac05049ddcf7'
        data = {
            "msgtype":"text",
            "text":{"content":content}
        }
        sendData=json.dumps(data).encode('utf-8')
        #The timeout keeps a slow API call from blocking the notification
        requests.post(url=url,data=sendData,headers=header,timeout=10)

    def send_someone_data(*args):
        '''
        Each business line has different recipients; this helper builds the
        message once so the body is not repeated.
        '''
        data={
            "touser":'|'.join(args),
            "agentid":236353484,
            "msgtype":"text",
            "text":{"content":content}
        }
        return data

    def send_someone():
        '''
        Send the message to the owners of the affected business line.
        '''
        if host_IP in ITFIN:
            data=send_someone_data(chenning_id,baihe_id)
        elif host_IP in cdh:
            data=send_someone_data(tiantaotao_id,zhaozhibo_id)
        elif host_IP in chess:
            data=send_someone_data(wangfujun_id)
        elif host_IP in sdk or host_IP.startswith('103.56.139'):
            data=send_someone_data(maoweijian_id,caie_id)
        else:
            #No owner is mapped to this IP, so only the group is notified
            return
        access_token=get_accessToken('dingg3bmym6arxwokwee','xxx')
        url="https://oapi.dingtalk.com/message/send?access_token={}".format(access_token)
        sendData=json.dumps(data).encode('utf-8')
        requests.post(url=url,data=sendData,headers=header,timeout=10)

    if __name__ == '__main__':
        send_group() #Every server alert is sent to the group you created
        send_someone() #The IP of the affected server decides which users are notified
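
    The if/elif chain in send_someone grows with every new business line. One possible alternative, sketched below, keeps the mapping in a single table and returns an empty list for unknown addresses. All project names, addresses, and user IDs in the sketch are placeholders (hypothetical), and `owners_for` is not part of the original script.

```python
# Placeholder data: documentation-range IPs and made-up user IDs.
PROJECTS = {
    "project_a": {
        "ips": {"192.0.2.10", "192.0.2.11"},
        "prefixes": (),
        "owners": ["user-a", "user-b"],
    },
    "project_b": {
        "ips": {"198.51.100.5"},
        "prefixes": ("203.0.113.",),  # whole /24, matched by string prefix
        "owners": ["user-c"],
    },
}


def owners_for(ip, projects=PROJECTS):
    """Return the user IDs responsible for ip, or [] when nobody is mapped."""
    for project in projects.values():
        # str.startswith accepts a tuple; an empty tuple never matches.
        if ip in project["ips"] or ip.startswith(project["prefixes"]):
            return project["owners"]
    return []


def touser_field(owners):
    """Join user IDs the way the script's "touser" field expects them."""
    return "|".join(owners)


if __name__ == "__main__":
    for address in ("192.0.2.10", "203.0.113.7", "10.0.0.1"):
        print(address, touser_field(owners_for(address)))
```

    Returning an empty list for an unmapped address lets the caller skip the per-user message instead of failing on an undefined variable.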

    Chapter 4: Configure DingTalk alerting

    4.1 Add the alert command in commands.cfg.

    [zhuyuliang@nagios ~]$ tail -6 /usr/local/nagios/etc/objects/commands.cfg
    ### DingTalk alert ###
    define command{
    command_name dindin-bj
    command_line /usr/local/python-3.4/bin/python3.4 /usr/local/nagios/python/dingding.py "$NOTIFICATIONTYPE$" "$SERVICEDESC$" "$HOSTALIAS$" "$HOSTADDRESS$" "$SERVICESTATE$" "$LONGDATETIME$" "$SERVICEOUTPUT$"
    register 1
    }
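
    The script reads sys.argv[1] through sys.argv[7], so the seven macros must arrive as seven separate arguments, which requires whitespace between the quoted strings on the command line. The sketch below uses Python's `shlex.split` as a stand-in for shell word splitting to show the difference. The sample values are made up, and the assumption is that Nagios hands the command line to a POSIX-style shell.

```python
import shlex


def split_command(command_line):
    """Split a command line into arguments using POSIX shell quoting rules."""
    return shlex.split(command_line)


# Adjacent quoted strings with no space between them merge into ONE argument.
joined = split_command('dingding.py "PROBLEM""Check ad""AD SERVER01"')

# With spaces, each quoted string becomes its own argument.
spaced = split_command('dingding.py "PROBLEM" "Check ad" "AD SERVER01"')

if __name__ == "__main__":
    print(len(joined) - 1, "argument(s) without spaces:", joined[1:])
    print(len(spaced) - 1, "argument(s) with spaces:", spaced[1:])
```

    If the quoted macros were merged, the script would stop with an IndexError on sys.argv[2].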

    4.2 Have a contact call the alert command

    [zhuyuliang@nagios ~]$ tail -20 /usr/local/nagios/etc/objects/contacts.cfg
    define contact{
    contact_name dingding
    service_notification_period 24x7
    host_notification_period 24x7
    service_notification_options w,u,c,r,f,s
    host_notification_options d,u,r,f,s
    service_notification_commands dindin-bj #calls the command defined in commands.cfg
    host_notification_commands dindin-bj
    register 1
    }
    define contactgroup{ #add the DingTalk contact to the group
    contactgroup_name admins
    alias Nagios Administrators
    members 139mail,dingding,zq-weixin,mao-weixin,baihe-weixin,huazhen-weixin,zhuyuliang-weixin,tiantaotao-weixin
    }
    define contactgroup{
    contactgroup_name paiyou
    alias paiyou
    members nagiosadmin,dingding,zhanghu-weixin,yujie-weixin,bietao-weixin,louchao-weixin,maxiang-weixin,liujieqing-weixin
    }

     

    4.3 Check which templates the host and service use

    [zhuyuliang@nagios ~]$ grep -vE "^$|^#" /usr/local/nagios/etc/aliyun/host.cfg
    define host{
    use generic_linux_aliyun #name of the template this host uses
    host_name ad-server01
    alias AD SERVER01
    address 120.26.121.119
    hostgroups aliyun_linux_ad_group
    }
    [zhuyuliang@nagios ~]$ grep -vE "^$|^#" /usr/local/nagios/etc/services/check_ad.cfg
    define service{
    host_name         ad-server01
    use generic_service    #name of the template this service uses
    name check_ad
    service_description Check ad
    check_command check_nrpe!check_ad
    }

    4.4 Modify the templates (so that they reference this contact)

    [zhuyuliang@nagios ~]$ grep -vE "^$|^#" /usr/local/nagios/etc/templates/host_templates.cfg
    define host{
        name        generic_linux_aliyun
        use        linux_server
    }           #this is the template the host uses, but it has a parent template, so keep going up to the parent and add the contact group there

    define host{
        name        linux_server
        use        generic_host
        … (omitted)
        contact_groups        admins    #change the contact group to the one we defined
        register        0
    }
    [zhuyuliang@nagios ~]$ grep -vE "^$|^#" /usr/local/nagios/etc/templates/service_templates.cfg
    define service{
        name        generic_service
        use        services-pnp
        … (omitted)
        contact_groups        admins    #change the contact group to the one we defined
    }

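    4.5 Overall alerting logic

    Host uses a template -> template references a contact group -> contact group contains contacts -> contact calls the alert command -> alert command runs the script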
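
    The chain above can be traced with a small model. The sketch below stores the objects from this chapter in plain dictionaries and walks from a host template to the script that finally runs. It is a simplification: real Nagios inheritance supports multiple parents and additive values, the function names are hypothetical, and the admins group is reduced to the single dingding contact.

```python
# Simplified copies of the objects defined in this chapter.
TEMPLATES = {
    "generic_linux_aliyun": {"use": "linux_server"},
    "linux_server": {"use": "generic_host", "contact_groups": ["admins"]},
    "generic_host": {},
}
CONTACT_GROUPS = {"admins": ["dingding"]}  # other members left out
CONTACTS = {"dingding": {"host_notification_commands": "dindin-bj"}}
COMMANDS = {"dindin-bj": "/usr/local/nagios/python/dingding.py"}


def resolve_contact_groups(template_name, templates=TEMPLATES):
    """Walk the "use" chain until a template defines contact_groups."""
    seen = set()
    name = template_name
    while name and name not in seen:  # "seen" guards against loops
        seen.add(name)
        template = templates.get(name, {})
        if "contact_groups" in template:
            return template["contact_groups"]
        name = template.get("use")
    return []


def scripts_for_host(host_template):
    """Follow template -> contact group -> contact -> command -> script."""
    scripts = []
    for group in resolve_contact_groups(host_template):
        for contact in CONTACT_GROUPS.get(group, []):
            command = CONTACTS[contact]["host_notification_commands"]
            scripts.append(COMMANDS[command])
    return scripts


if __name__ == "__main__":
    print(scripts_for_host("generic_linux_aliyun"))
```

    Tracing a host this way shows why the contact group had to be added on the parent template linux_server rather than on generic_linux_aliyun itself.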

     

    4.6 Check the configuration files and restart

    #/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
    #/etc/init.d/nagios restart
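
    The two commands are best run in order, with the restart skipped when the check fails. A minimal sketch of that guard follows. It assumes that `nagios -v` exits with a non-zero status when the configuration has errors, and `verify_then_restart` is a hypothetical helper. The `run` parameter exists only so the logic can be exercised without a Nagios installation.

```python
import subprocess

# Paths taken from the commands in step 4.6.
NAGIOS_BIN = "/usr/local/nagios/bin/nagios"
NAGIOS_CFG = "/usr/local/nagios/etc/nagios.cfg"
INIT_SCRIPT = "/etc/init.d/nagios"


def verify_then_restart(run=subprocess.run):
    """Restart Nagios only if the configuration check succeeds.

    Returns True when both steps exit with status 0, otherwise False.
    """
    check = run([NAGIOS_BIN, "-v", NAGIOS_CFG])
    if check.returncode != 0:
        # Assumption: a non-zero exit status means the config is invalid.
        return False
    restart = run([INIT_SCRIPT, "restart"])
    return restart.returncode == 0
```

    Calling `verify_then_restart()` as root on the Nagios server runs the real commands; passing a stub for `run` allows a dry run.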
  • Original article: https://www.cnblogs.com/SleepDragon/p/10472256.html