• Using VictoriaMetrics vmalert


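    This post walks through using vmalert, mainly to test how the various components integrate.

    Environment setup

    Note that the environment bundles vmauth, vmagent, and several other VictoriaMetrics components, so it is essentially a fairly complete Prometheus-compatible stack.

    • docker-compose file

      Note: vmalert currently fails when its datasource goes through vmauth (this looks like an encoding problem), so vmalert below queries vmselect directly and the vmauth datasource flags are left commented out.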

    version:  "3"
    services: 
      vmstorage:
        image: victoriametrics/vmstorage
        ports:
          - 8482:8482
          - 8400:8400
          - 8401:8401
        volumes:
          - ./strgdata:/storage
        command:
          - '--storageDataPath=/storage'
      vmagent:
        image: victoriametrics/vmagent
        volumes: 
        - ./prometheus.yml:/etc/prometheus/prometheus.yml
        ports:
        - 8429:8429
        command:  
        - -promscrape.config=/etc/prometheus/prometheus.yml 
        - -remoteWrite.basicAuth.username=dalong-insert-account-1
        - -remoteWrite.basicAuth.password=dalong
        - -remoteWrite.url=http://vmauth:8427
      alertmanager:
        image: prom/alertmanager:latest
        volumes: 
        - "./alertmanager.yaml:/etc/alertmanager.yaml"
        command: 
        - --config.file=/etc/alertmanager.yaml
        - --storage.path=/tmp/alertmanager1
        ports:
        - 9093:9093
      vmalert:
        image: victoriametrics/vmalert
        volumes: 
        - "./alert.rules:/etc/victoriametrics/alert.rules"
        ports:
        - 8880:8880
        command: 
        - -rule=/etc/victoriametrics/alert.rules
        - -datasource.url=http://vmselect:8481/select/1/prometheus
        # - -datasource.url=http://vmauth:8427
        # - -datasource.basicAuth.password=dalong
        # - -datasource.basicAuth.username=dalong-select-account-1
        - -notifier.url=http://alertmanager:9093
      vmauth:
        image: victoriametrics/vmauth
        volumes: 
        - "./config.yaml:/etc/victoriametrics/config.yaml"
        command:
          - '-auth.config=/etc/victoriametrics/config.yaml'
        ports:
          - 8427:8427
      vminsert:
        image: victoriametrics/vminsert
        command:
          - '--storageNode=vmstorage:8400'
        ports:
          - 8480:8480
      vmselect:
        image: victoriametrics/vmselect
        command:
          - '--storageNode=vmstorage:8401'
        ports:
          - 8481:8481
      grafana:
        image: grafana/grafana
        ports:
          - 3000:3000
    • Configuration notes
      vmauth configuration:

    users:
    - username: "dalong-select-account-1"
      password: "dalong"
      url_prefix: "http://vmselect:8481/select/1/prometheus"
    - username: "dalong-insert-account-1"
      password: "dalong"
      url_prefix: "http://vminsert:8480/insert/1/prometheus"
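
    To make the mapping concrete, here is a minimal Python sketch of what the configuration above implies: the client sends HTTP Basic Auth credentials, and vmauth forwards the request to the matching url_prefix with the request path appended. The helper names basic_auth_header and route are hypothetical and only model the behavior; they are not part of vmauth.

```python
import base64

# Mirrors the users section of config.yaml above.
USERS = [
    {"username": "dalong-select-account-1", "password": "dalong",
     "url_prefix": "http://vmselect:8481/select/1/prometheus"},
    {"username": "dalong-insert-account-1", "password": "dalong",
     "url_prefix": "http://vminsert:8480/insert/1/prometheus"},
]


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value a client would send."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def route(auth_header: str, path: str, users=USERS):
    """Return the backend URL for a request, or None if auth fails.

    Hypothetical model of the proxying rule: url_prefix + request path.
    """
    scheme, _, token = auth_header.partition(" ")
    if scheme != "Basic":
        return None
    try:
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return None
    username, _, password = decoded.partition(":")
    for user in users:
        if user["username"] == username and user["password"] == password:
            return user["url_prefix"] + path
    return None
```

    Under this model, a query sent to http://vmauth:8427/api/v1/query with the select account ends up at http://vmselect:8481/select/1/prometheus/api/v1/query, which is the same base URL vmalert uses when it bypasses vmauth.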

    vmagent configuration (this is simply a Prometheus configuration)

    global:
      scrape_interval:     1s
      evaluation_interval: 1s
    scrape_configs:
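      # Note: the compose file above defines no 'prometheus' service,
      # so this target will be reported as down.
      - job_name: 'prometheus'
        static_configs:
          - targets: ['prometheus:9090']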
      - job_name: 'vminsert'
        static_configs:
          - targets: ['vminsert:8480']
      - job_name: 'vmselect'
        static_configs:
          - targets: ['vmselect:8481']
      - job_name: 'vmstorage'
        static_configs:
          - targets: ['vmstorage:8482']

    vmalert configuration (the alert.rules file, used mainly for testing)

    groups:
      - name: groupGorSingleAlert
        rules:
          - alert: VMRows
            for: 10s
            expr: vm_rows > 0
            labels:
              label: bar
              host: "{{ $labels.instance }}"
            annotations:
              summary: "{{ $value|humanize }}"
              description: "{{$labels}}"
      - name: TestGroup
        rules:
          - alert: Conns
            expr: sum(vm_tcplistener_conns) by(instance) > 1
            annotations:
              summary: "Too high connection number for {{$labels.instance}}"
              description: "It is {{ $value }} connections for {{$labels.instance}}"
          - alert: ExampleAlertAlwaysFiring
            expr: sum by(job)
              (up == 1)
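
    The for field controls how long an expression must stay true before an alert moves from pending to firing, and that check only happens when the rule is evaluated. The sketch below models this under the assumption that vmalert follows the usual Prometheus semantics (an alert fires once it has been active for at least the for duration); alert_states is a hypothetical helper, not vmalert code.

```python
def alert_states(for_seconds: float, eval_interval_seconds: float,
                 evaluations: int) -> list:
    """Return the alert state at each evaluation tick.

    Assumes the rule expression returns a result at every tick, so the
    alert has been active since the first evaluation (tick 0).
    """
    states = []
    for tick in range(evaluations):
        active_for = tick * eval_interval_seconds  # seconds since tick 0
        states.append("firing" if active_for >= for_seconds else "pending")
    return states


# VMRows has "for: 10s"; 60 is the default -evaluationInterval (1m).
vmrows = alert_states(for_seconds=10, eval_interval_seconds=60, evaluations=3)

# Conns and ExampleAlertAlwaysFiring have no "for", modelled here as 0.
no_for = alert_states(for_seconds=0, eval_interval_seconds=60, evaluations=3)
```

    Under these assumptions, VMRows is pending at the first evaluation and firing from the second one, so with the default 1m interval it takes about a minute to fire even though for is only 10s. Rules without for fire at the first evaluation that returns a result.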

    Alertmanager configuration

    global:
      resolve_timeout: 30s
    route:
      group_by: ["alertname"]
      group_wait: 5s
      group_interval: 10s
      repeat_interval: 999h
      receiver: "default"
      routes:
        - receiver: "default"
          group_by: []
          match_re:
            alertname: .*
          continue: true
        - receiver: "pagination"
          group_by: ["alertname", "instance"]
          match_re:
            alertname: Pagination Test
          continue: false
        - receiver: "by-cluster-service"
          group_by: ["alertname", "cluster", "service"]
          match_re:
            alertname: .*
          continue: true
        - receiver: "by-name"
          group_by: [alertname]
          match_re:
            alertname: .*
          continue: true
        - receiver: "by-cluster"
          group_by: [cluster]
          match_re:
            alertname: .*
          continue: true
    inhibit_rules:
      - source_match:
          severity: "critical"
        target_match:
          severity: "warning"
        # Apply inhibition if the alertname and cluster is the same in both
        equal: ["alertname", "cluster"]
    receivers:
      - name: "default"
      - name: "pagination"
      - name: "by-cluster-service"
      - name: "by-name"
      - name: "by-cluster"
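
    The routing tree above is easier to follow with a small model. The Python sketch below walks the child routes in order, treats match_re patterns as fully anchored regular expressions, and stops at the first matching route whose continue is false, which is how Alertmanager is documented to behave. It only considers the alertname label because that is the only matcher used here; receivers_for is a hypothetical helper.

```python
import re

# Child routes from alertmanager.yaml above:
# (receiver, alertname regex, continue)
ROUTES = [
    ("default", r".*", True),
    ("pagination", r"Pagination Test", False),
    ("by-cluster-service", r".*", True),
    ("by-name", r".*", True),
    ("by-cluster", r".*", True),
]
ROOT_RECEIVER = "default"


def receivers_for(alertname: str, routes=ROUTES, root=ROOT_RECEIVER) -> list:
    """Return the receivers an alert with this alertname would reach."""
    matched = []
    for receiver, pattern, keep_going in routes:
        if re.fullmatch(pattern, alertname):
            matched.append(receiver)
            if not keep_going:
                break  # continue: false stops evaluation of later siblings
    # If no child route matches, the alert falls back to the root receiver.
    return matched or [root]
```

    For the rules in alert.rules, receivers_for("VMRows") yields default, by-cluster-service, by-name, and by-cluster, while an alert named "Pagination Test" stops at pagination. Since none of the receivers defines an integration such as email or a webhook, alerts show up in Alertmanager but no notification is actually delivered.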
    • Supported command-line flags
    vmalert-20200521-152717-tags-v1.35.6-cluster-0-gdcbdc009f
    Usage of /vmalert-prod:
      -datasource.basicAuth.password string
          Optional basic auth password for -datasource.url
      -datasource.basicAuth.username string
          Optional basic auth username for -datasource.url
      -datasource.url string
          Victoria Metrics or VMSelect url. Required parameter. E.g. http://127.0.0.1:8428
      -enableTCP6
          Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP is used
      -envflag.enable
          Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set
      -envflag.prefix string
          Prefix for environment variables if -envflag.enable is set
      -evaluationInterval duration
          How often to evaluate the rules. Default 1m (default 1m0s)
      -external.url string
          External URL is used as alert's source for sent alerts to the notifier
      -http.disableResponseCompression
          Disable compression of HTTP responses for saving CPU resources. By default compression is enabled to save network bandwidth
      -http.maxGracefulShutdownDuration duration
          The maximum duration for graceful shutdown of HTTP server. Highly loaded server may require increased value for graceful shutdown (default 7s)
      -http.pathPrefix string
          An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
      -http.shutdownDelay duration
          Optional delay before http server shutdown. During this dealy the servier returns non-OK responses from /health page, so load balancers can route new requests to other servers
      -httpListenAddr string
          Address to listen for http connections (default ":8880")
      -loggerFormat string
          Format for logs. Possible values: default, json (default "default")
      -loggerLevel string
          Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
      -loggerOutput string
          Output for the logs. Supported values: stderr, stdout (default "stderr")
      -memory.allowedPercent float
          Allowed percent of system memory VictoriaMetrics caches may occupy. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
      -notifier.url string
          Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093
      -remoteRead.basicAuth.password string
          Optional basic auth password for -remoteRead.url
      -remoteRead.basicAuth.username string
          Optional basic auth username for -remoteRead.url
      -remoteRead.lookback duration
          Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. (default 1h0m0s)
      -remoteRead.url vmalert
          Optional URL to Victoria Metrics or VMSelect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428
      -remoteWrite.basicAuth.password string
          Optional basic auth password for -remoteWrite.url
      -remoteWrite.basicAuth.username string
          Optional basic auth username for -remoteWrite.url
      -remoteWrite.maxQueueSize int
          Defines the max number of pending datapoints to remote write endpoint (default 10000)
      -remoteWrite.url string
          Optional URL to Victoria Metrics or VMInsert where to persist alerts state in form of timeseries. E.g. http://127.0.0.1:8428
      -rule value
          Path to the file with alert rules. 
          Supports patterns. Flag can be specified multiple times. 
          Examples:
           -rule /path/to/file. Path to a single file with alerting rules
           -rule dir/*.yaml -rule /*.yaml. Relative path to all .yaml files in "dir" folder, 
          absolute path to all .yaml files in root.
      -rule.validateTemplates
          Indicates to validate annotation and label templates (default true)
      -version
          Show VictoriaMetrics version
    • Startup
    docker-compose up -d
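
    Once the containers are up, a quick way to confirm that everything started is to probe each service on the host ports published in the compose file. The Python sketch below assumes the stack runs on localhost, that every VictoriaMetrics component serves the /health page mentioned in the flag descriptions above, and that Alertmanager and Grafana expose /-/healthy and /api/health respectively; is_healthy and report are hypothetical helpers.

```python
import http.client
import urllib.request

# Host ports come from the docker-compose file above; "localhost" is an
# assumption about where the stack is running.
HEALTH_ENDPOINTS = {
    "vmstorage": "http://localhost:8482/health",
    "vminsert": "http://localhost:8480/health",
    "vmselect": "http://localhost:8481/health",
    "vmagent": "http://localhost:8429/health",
    "vmauth": "http://localhost:8427/health",
    "vmalert": "http://localhost:8880/health",
    "alertmanager": "http://localhost:9093/-/healthy",
    "grafana": "http://localhost:3000/api/health",
}


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and non-2xx responses land here.
        return False


def report(endpoints=HEALTH_ENDPOINTS) -> dict:
    """Probe every endpoint and return a name -> bool mapping."""
    return {name: is_healthy(url) for name, url in endpoints.items()}
```

    Calling report() after docker-compose up -d returns a mapping of service name to health status, which makes it easy to spot a container that failed to start.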

    Integration results

    Notes

    Error message when vmalert queries through vmauth (apparently an encoding problem)

    error   VictoriaMetrics/app/vmalert/group.go:148        failed to execute rule "TestGroup"."ExampleAlertAlwaysFiring": failed to execute query "sum by(job) (up == 1)": error parsing metrics for http://vmauth:8427/api/v1/query?query=sum+by%28job%29+%28up+%3D%3D+1%29:invalid character '\x1f' looking for beginning of value
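
    One observation on this message: byte 0x1f is the first byte of the gzip magic number (0x1f 0x8b). A plausible reading, which is an assumption and not something confirmed in this test, is that vmalert received a gzip-compressed body through vmauth and tried to parse it as JSON. The Python sketch below reproduces that class of failure with a made-up response body and shows that decompressing first makes it parse; parse_query_response and naive_parse_fails are hypothetical helpers.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"

# Made-up stand-in for an /api/v1/query response body.
plain = json.dumps(
    {"status": "success", "data": {"resultType": "vector", "result": []}}
).encode("utf-8")
compressed = gzip.compress(plain)


def naive_parse_fails(body: bytes) -> bool:
    """Return True if parsing the raw bytes as JSON raises an error."""
    try:
        json.loads(body)
    except ValueError:  # JSONDecodeError and UnicodeDecodeError both qualify
        return True
    return False


def parse_query_response(body: bytes) -> dict:
    """Parse a JSON body, decompressing it first if it looks like gzip."""
    if body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))
```

    If this reading is right, the fix belongs on the HTTP side (the proxy or client handling Content-Encoding) and not in the alert rules, which would also explain why the same query works when vmalert talks to vmselect directly.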

    References

    https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/vmalert
    https://www.cnblogs.com/rongfengliang/p/12937774.html
    https://www.cnblogs.com/rongfengliang/p/12937022.html
    https://github.com/prometheus/alertmanager

  • Original article: https://www.cnblogs.com/rongfengliang/p/12938491.html