• Deploying Prometheus with email, WeCom (WeChat Work), and DingTalk alerts


    1 Deploy the Prometheus server

    1.1 Download the binary package
    $ wget https://github.com/prometheus/prometheus/releases/download/v2.12.0/prometheus-2.12.0.linux-amd64.tar.gz
    
    1.2 Extract and move to /work/admin
    $ tar zxvf prometheus-2.12.0.linux-amd64.tar.gz
    
    $ mv prometheus-2.12.0.linux-amd64 /work/admin/prometheus
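The download, extract, and move steps follow the same pattern for every component in this guide, so they can be wrapped in a small helper. The following is only a sketch: `release_url` and `fetch_and_unpack` are hypothetical names, and the URL layout is inferred from the release links used here, so it may not hold for other projects or versions.

```shell
#!/bin/sh
# Sketch: build the GitHub release URL from its parts and unpack the tarball.
# release_url and fetch_and_unpack are hypothetical helper names.

# Print the linux-amd64 tarball URL for a project under github.com/prometheus.
release_url() {
    name=$1
    version=$2
    printf 'https://github.com/prometheus/%s/releases/download/v%s/%s-%s.linux-amd64.tar.gz\n' \
        "$name" "$version" "$name" "$version"
}

# Download, extract, and move a release into the target directory.
fetch_and_unpack() {
    name=$1
    version=$2
    dest=$3
    archive="$name-$version.linux-amd64.tar.gz"
    wget "$(release_url "$name" "$version")" || return 1
    tar zxvf "$archive" || return 1          # x = extract
    mv "$name-$version.linux-amd64" "$dest"
}

# Example (not executed here):
#   fetch_and_unpack prometheus 2.12.0 /work/admin/prometheus
```

The same helper would cover node_exporter 0.17.0 and alertmanager 0.18.0 by changing the arguments.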
    
    1.3 Configure and start
    $ cat prometheus.yml
    # my global config
    global:
      scrape_interval:     15s # Default scrape interval: scrape each target every 15 seconds.
      scrape_timeout: 15s # Per-scrape timeout (the default is 10s); must not exceed scrape_interval.
      evaluation_interval: 20s # Evaluate rules every 20 seconds. The default is every 1 minute.
    
    # Alertmanager configuration
    alerting:
      alertmanagers:
      - static_configs:
        - targets: ['localhost:9093']
           #- alertmanager: ['localhost:9093']
    
    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      - "/work/admin/prometheus/alerts/*.rules"
      # - "first_rules.yml"
      # - "second_rules.yml"
    
    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'prometheus_product'
    
        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.
        #metrics_path: /metrics
        #scheme: http
    
        static_configs:
        - targets: ['localhost:9090','localhost:9100']
          labels: {cluster: 'product',type: 'basic',env: 'prometheus',job: 'prometheus',export: 'prometheus'}
    
    $ /work/admin/prometheus/prometheus --config.file=/work/admin/prometheus/prometheus.yml --storage.tsdb.path=/work/admin/prometheus/data
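The configuration above loads rules from /work/admin/prometheus/alerts/*.rules, but no rule file is shown. The following sketch writes one minimal rule file in the Prometheus 2.x YAML rule format. The alert name, the 1m duration, and the label values are illustrative assumptions, and `RULES_DIR` is a hypothetical variable that defaults to a scratch directory so the snippet is safe to try out.

```shell
#!/bin/sh
# Sketch: write a minimal alerting rule file that matches the rule_files glob.
# RULES_DIR defaults to a scratch directory here; on the real host it would be
# /work/admin/prometheus/alerts. Alert name, duration and labels are assumptions.
RULES_DIR=${RULES_DIR:-$(mktemp -d)}
mkdir -p "$RULES_DIR"

# The quoted 'EOF' keeps the shell from expanding $labels inside the template.
cat > "$RULES_DIR/instance_down.rules" <<'EOF'
groups:
  - name: basic
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been unreachable for more than 1 minute."
EOF

# Validate the file if promtool (shipped in the Prometheus tarball) is on PATH.
if command -v promtool >/dev/null 2>&1; then
    promtool check rules "$RULES_DIR/instance_down.rules"
fi
echo "wrote $RULES_DIR/instance_down.rules"
```

Running it with `RULES_DIR=/work/admin/prometheus/alerts` would place the file where the configuration expects it. Prometheus needs a restart or a reload before it picks up new rule files.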
    

    2 Deploy node_exporter

    2.1 Download the binary package
    $ wget https://github.com/prometheus/node_exporter/releases/download/v0.17.0/node_exporter-0.17.0.linux-amd64.tar.gz
    
    2.2 Extract and move to /work/admin
    $ tar zxvf node_exporter-0.17.0.linux-amd64.tar.gz
    
    $ mv node_exporter-0.17.0.linux-amd64 /work/admin/node_exporter
    
    2.3 Start
    $ /work/admin/node_exporter/node_exporter --web.listen-address=:9100
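After starting a component, it helps to confirm that it is actually serving HTTP before moving on. The following sketch polls an endpoint until it answers. `wait_until_ready` is a hypothetical helper, and it assumes `curl` is installed on the host.

```shell
#!/bin/sh
# Sketch: poll an HTTP endpoint until it answers, to confirm a process started.
# wait_until_ready is a hypothetical helper; it assumes curl is installed.
wait_until_ready() {
    url=$1
    tries=${2:-10}
    i=1
    while [ "$i" -le "$tries" ]; do
        # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
        if curl -fsS --max-time 2 -o /dev/null "$url" 2>/dev/null; then
            echo "ready: $url"
            return 0
        fi
        i=$((i + 1))
        # Pause between attempts, but not after the last one.
        [ "$i" -le "$tries" ] && sleep 1
    done
    echo "not ready: $url" >&2
    return 1
}

# Example (not executed here):
#   wait_until_ready http://localhost:9100/metrics      # node_exporter
#   wait_until_ready http://localhost:9090/-/healthy    # Prometheus server
```

The helper returns a non-zero status when the endpoint never answers, so it can gate later steps in a deployment script.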
    

    3 Deploy alertmanager

    3.1 Download the binary package
    $ wget https://github.com/prometheus/alertmanager/releases/download/v0.18.0/alertmanager-0.18.0.linux-amd64.tar.gz
    
    3.2 Extract and move to /work/admin
    $ tar zxvf alertmanager-0.18.0.linux-amd64.tar.gz
    
    $ mv alertmanager-0.18.0.linux-amd64 /work/admin/alertmanager
    
    3.3 Edit the configuration file and start
    $ cat alertmanager.yml
    
    global:
    
      resolve_timeout: 5m
    
      smtp_smarthost: 'smtp.163.com:25' # SMTP server (smarthost) used to send mail

      smtp_from: 'XXXXXX@163.com' # sender address

      smtp_auth_username: 'XXXXX@163.com' # SMTP account name

      smtp_auth_password: 'XXXXXXXX' # mailbox password or authorization code
    
    templates:
    
      - 'template/*.tmpl'
    
    route:
    
      group_by: ['alertname']
    
      group_wait: 10s
    
      group_interval: 10s
    
      repeat_interval: 24h
    
      receiver: 'ops_dingding'
    
    receivers:
    
      - name: 'email'
    
        email_configs:
    
        - to: 'XXXXX@163.com'  # address that receives alerts

          html: '{{ template "test.html" . }}' # template for the email body

          headers: { Subject: "[WARN] Alert email"} # subject of the email
    
      - name: 'wechat'
    
        wechat_configs:
    
        - corp_id: 'XXXXX'
    
          to_party: '1'
    
          agent_id: '1000002'
    
          api_secret: 'XXXXX'
    
      - name: 'ops_dingding'
    
        webhook_configs:
    
        - url: 'http://localhost:8060/dingtalk/ops_dingding/send'
    
    inhibit_rules:
    
      - source_match:
    
          severity: 'critical'
    
        target_match:
    
          severity: 'warning'
    
        equal: ['alertname', 'dev', 'instance']
    
    $ /work/admin/alertmanager/alertmanager --config.file=/work/admin/alertmanager/alertmanager.yml
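The email receiver above renders a template named "test.html" loaded from template/*.tmpl, but the template file itself is not shown. The following sketch creates one. The HTML layout and the chosen fields are assumptions, `TEMPLATE_DIR` is a hypothetical variable that defaults to a scratch directory, and the real location is assumed to be /work/admin/alertmanager/template, on the assumption that relative template paths are resolved against the directory of the configuration file.

```shell
#!/bin/sh
# Sketch: create a template file that defines the "test.html" template.
# TEMPLATE_DIR defaults to a scratch directory; on the real host it would
# presumably be /work/admin/alertmanager/template. The layout is an assumption.
TEMPLATE_DIR=${TEMPLATE_DIR:-$(mktemp -d)}
mkdir -p "$TEMPLATE_DIR"

# One table row per alert in the notification group.
cat > "$TEMPLATE_DIR/test.tmpl" <<'EOF'
{{ define "test.html" }}
<table border="1">
  <tr><th>Alert</th><th>Instance</th><th>Severity</th><th>Summary</th><th>Started</th></tr>
  {{ range .Alerts }}
  <tr>
    <td>{{ .Labels.alertname }}</td>
    <td>{{ .Labels.instance }}</td>
    <td>{{ .Labels.severity }}</td>
    <td>{{ .Annotations.summary }}</td>
    <td>{{ .StartsAt }}</td>
  </tr>
  {{ end }}
</table>
{{ end }}
EOF
echo "wrote $TEMPLATE_DIR/test.tmpl"
```

The alertmanager tarball also ships `amtool`; running `amtool check-config alertmanager.yml` before starting is one way to catch configuration and template errors early.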
    

    4 Push Prometheus alerts to DingTalk via webhook

    4.1 Add a DingTalk robot and obtain its webhook URL

    See https://open-doc.dingtalk.com/docs/doc.htm?treeId=257&articleId=105735&docType=1

    4.2 Download the plugin (binary)
    $ wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v0.3.0/prometheus-webhook-dingtalk-0.3.0.linux-amd64.tar.gz
    
    4.3 Extract and move the binary to /work/admin/alertmanager
    $ tar zxvf prometheus-webhook-dingtalk-0.3.0.linux-amd64.tar.gz
    
    $ mv prometheus-webhook-dingtalk-0.3.0.linux-amd64/prometheus-webhook-dingtalk /work/admin/alertmanager
    
    4.4 Write a startup script (replace the webhook URL and ding.profile with your own)
    $ cat dingding_start.sh
    
    nohup /work/admin/alertmanager/prometheus-webhook-dingtalk --ding.profile="ops_dingding=https://oapi.dingtalk.com/robot/send?access_token=XXXXXXX" >/work/admin/alertmanager/dingding.log 2>&1 &
    
    $ sh dingding_start.sh
    
    4.5 Edit alertmanager.yml, add the webhook_configs receiver, and restart alertmanager
      - name: 'ops_dingding'
    
        webhook_configs:
    
        - url: 'http://localhost:8060/dingtalk/ops_dingding/send'
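Once alertmanager has been restarted, the whole path (route, receiver, webhook, DingTalk robot) can be exercised without waiting for a real failure by posting a hand-made alert to the Alertmanager HTTP API. The following is a sketch under assumptions: `test_alert_payload` and `send_test_alert` are hypothetical helpers, the label values are made up for the test, and it assumes the v2 API endpoint is reachable on localhost:9093.

```shell
#!/bin/sh
# Sketch: push a hand-made alert into Alertmanager to exercise the route
# down to the DingTalk webhook. test_alert_payload and send_test_alert are
# hypothetical helpers; the label values are made up for the test.

# Print a JSON array holding one alert with the given alertname.
test_alert_payload() {
    alertname=${1:-TestAlert}
    printf '[{"labels":{"alertname":"%s","severity":"warning","instance":"localhost:9100"},' "$alertname"
    printf '"annotations":{"summary":"manual test alert"}}]\n'
}

# POST the payload to the Alertmanager v2 alerts endpoint.
send_test_alert() {
    am_url=${1:-http://localhost:9093}
    test_alert_payload "$2" | curl -fsS -X POST -H 'Content-Type: application/json' \
        --data-binary @- "$am_url/api/v2/alerts"
}

# Example (not executed here):
#   send_test_alert http://localhost:9093 TestAlert
```

With group_wait set to 10s in the route above, a notification should reach the DingTalk group roughly ten seconds after the alert is posted. If nothing arrives, dingding.log is the first place to look.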
    
  • Original article: https://www.cnblogs.com/ZhongzhouChen/p/11711873.html