• Monitoring plugin for Elasticsearch: check_es_system


    Plugin page: https://www.claudiokuenzler.com/monitoring-plugins/check_es_system.php

    Download

    #!/bin/bash
    ################################################################################
    # Script:       check_es_system.sh                                             #
    # Author:       Claudio Kuenzler www.claudiokuenzler.com                       #
    # Purpose:      Monitor ElasticSearch Store (Disk) Usage                       #
    # Official doc: www.claudiokuenzler.com/monitoring-plugins/check_es_system.php #
    # License:      GPLv2                                                          #
    # GNU General Public Licence (GPL) http://www.gnu.org/                         #
    # This program is free software; you can redistribute it and/or                #
    # modify it under the terms of the GNU General Public License                  #
    # as published by the Free Software Foundation; either version 2               #
    # of the License, or (at your option) any later version.                       #
    #                                                                              #
    # This program is distributed in the hope that it will be useful,              #
    # but WITHOUT ANY WARRANTY; without even the implied warranty of               #
    # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the                #
    # GNU General Public License for more details.                                 #
    #                                                                              #
    # You should have received a copy of the GNU General Public License            #
    # along with this program; if not, see <https://www.gnu.org/licenses/>.        #
    #                                                                              #
    # Copyright 2016,2018-2020 Claudio Kuenzler                                    #
    # Copyright 2018 Tomas Barton                                                  #
    # Copyright 2020 NotAProfessionalDeveloper                                     #
    #                                                                              #
    # History:                                                                     #
    # 20160429: Started programming plugin                                         #
    # 20160601: Continued programming. Working now as it should =)                 #
    # 20160906: Added memory usage check, check types option (-t)                  #
    # 20160906: Renamed plugin from check_es_store to check_es_system              #
    # 20160907: Change internal referenced variable name for available size        #
    # 20160907: Output now contains both used and available sizes                  #
    # 20161017: Add missing -t in usage output                                     #
    # 20180105: Fix if statement for authentication (@deric)                       #
    # 20180105: Fix authentication when wrong credentials were used                #
    # 20180313: Configure max_time for Elastic to respond (@deric)                 #
    # 20190219: Fix alternative subject name in ssl (issue 4), direct to auth      #
    # 20190220: Added status check type                                            #
    # 20190403: Check for mandatory parameter checktype, adjust help               #
    # 20190403: Catch connection refused error                                     #
    # 20190426: Catch unauthorized (403) error                                     #
    # 20190626: Added readonly check type                                          #
    # 20190905: Catch empty cluster health status (issue #13)                      #
    # 20190909: Added jthreads and tps (thread pool stats) check types             #
    # 20190909: Handle correct curl return codes                                   #
    # 20190924: Missing 'than' in tps output                                       #
    # 20191104: Added master check type                                            #
    # 20200401: Fix/handle 503 errors with curl exit code 0 (issue #20)            #
    # 20200409: Fix 503 error lookup (issue #22)                                   #
    # 20200430: Support both jshon and jq as json parsers (issue #18)              #
    # 20200609: Fix readonly check on ALL indices (issue #26)                      #
    ################################################################################
    #Variables and defaults
    STATE_OK=0              # define the exit code if status is OK
    STATE_WARNING=1         # define the exit code if status is Warning
    STATE_CRITICAL=2        # define the exit code if status is Critical
    STATE_UNKNOWN=3         # define the exit code if status is Unknown
    export PATH=$PATH:/usr/local/bin:/usr/bin:/bin # Set path
    version=1.8.1
    port=9200
    httpscheme=http
    unit=G
    indexes='_all'
    max_time=30
    parsers=(jshon jq)
    ################################################################################
    #Functions
    help () {
    echo -e "$0 $version (c) 2016-$(date +%Y) Claudio Kuenzler and contributors (open source rulez!)
    
    Usage: ./check_es_system.sh -H ESNode [-P port] [-S] [-u user] [-p pass] -t checktype [-d int] [-o unit] [-w int] [-c int] [-m int] [-X parser]
    
    Options:
    
       *  -H Hostname or ip address of ElasticSearch Node
          -P Port (defaults to 9200)
          -S Use https
          -u Username if authentication is required
          -p Password if authentication is required
       *  -t Type of check (disk, mem, status, readonly, jthreads, tps, master)
       +  -d Available size of disk or memory (ex. 20)
          -o Disk space unit (K|M|G) (defaults to G)
          -i Space separated list of indexes to be checked for readonly (default: '_all')
          -w Warning threshold (see usage notes below)
          -c Critical threshold (see usage notes below)
          -m Maximum time in seconds to wait for response (default: 30)
          -e Expect master node (used with 'master' check)
          -X The json parser to be used jshon or jq (default: jshon)
          -h Help!
    
    *mandatory options
    +mandatory option for types disk,mem
    
    Threshold format for 'disk' and 'mem': int (for percent), defaults to 80 (warn) and 95 (crit)
    Threshold format for 'tps': int,int,int (active, queued, rejected), no defaults
    Threshold format for all other check types: int, no defaults
    
    Requirements: curl, expr and one of $(IFS=,; echo "${parsers[*]}")"
    exit $STATE_UNKNOWN;
    }
    
    authlogic () {
    if [[ -z $user ]] && [[ -z $pass ]]; then echo "ES SYSTEM UNKNOWN - Authentication required but missing username and password"; exit $STATE_UNKNOWN
    elif [[ -n $user ]] && [[ -z $pass ]]; then echo "ES SYSTEM UNKNOWN - Authentication required but missing password"; exit $STATE_UNKNOWN
    elif [[ -n $pass ]] && [[ -z $user ]]; then echo "ES SYSTEM UNKNOWN - Missing username"; exit $STATE_UNKNOWN
    fi
    }
    
    unitcalc() {
    # ES presents the currently used disk space in Bytes
    if [[ -n $unit ]]; then
      case $unit in
        K) availsize=$(expr $available \* 1024); outputsize=$(expr ${size} / 1024);;
        M) availsize=$(expr $available \* 1024 \* 1024); outputsize=$(expr ${size} / 1024 / 1024);;
        G) availsize=$(expr $available \* 1024 \* 1024 \* 1024); outputsize=$(expr ${size} / 1024 / 1024 / 1024);;
      esac
      if [[ -n $warning ]] ; then
        warningsize=$(expr $warning \* ${availsize} / 100)
      fi
      if [[ -n $critical ]] ; then
        criticalsize=$(expr $critical \* ${availsize} / 100)
      fi
      usedpercent=$(expr $size \* 100 / $availsize)
    else echo "UNKNOWN - Shouldnt exit here. No units given"; exit $STATE_UNKNOWN
    fi
    }
    
    availrequired() {
    if [ -z ${available} ]; then echo "UNKNOWN - Missing parameter '-d'"; exit $STATE_UNKNOWN; fi
    }
    
    thresholdlogic () {
    if [ -n "$warning" ] && [ -z "$critical" ]; then echo "UNKNOWN - Define both warning and critical thresholds"; exit $STATE_UNKNOWN; fi
    if [ -n "$critical" ] && [ -z "$warning" ]; then echo "UNKNOWN - Define both warning and critical thresholds"; exit $STATE_UNKNOWN; fi
    }
    
    default_percentage_thresholds() {
    if [ -z $warning ] || [ "${warning}" = "" ]; then warning=80; fi
    if [ -z $critical ] || [ "${critical}" = "" ]; then critical=95; fi
    }
    
    json_parse() {
      json_parse_usage() { echo "$0: [-r] [-q] [-c] [-a] -x arg1 -x arg2 ..." 1>&2; exit; }
    
      local OPTIND opt r q c a x
      while getopts ":rqcax:" opt
      do
        case "${opt}" in
        r)  raw=1;;
        q)  quiet=1;; # only required for jshon
        c)  continue=1;; # only required for jshon
        a)  across=1;;
        x)  args+=("$OPTARG");;
        *)  json_parse_usage;;
        esac
      done
    
      case ${parser} in
      jshon)
        cmd=()
        for arg in "${args[@]}"; do
          cmd+=(-e $arg)
        done
        jshon ${quiet:+-Q} ${continue:+-C} ${across:+-a} "${cmd[@]}" ${raw:+-u}
        ;;
      jq)
        cmd=()
        for arg in "${args[@]}"; do
          cmd+=(.$arg)
        done
        jq ${raw:+-r} $(IFS=; echo ${across:+.[]}"${cmd[*]}")
        ;;
      esac
    }
    
    ################################################################################
    # Check for people who need help - aren't we all nice ;-)
    if [ "${1}" = "--help" -o "${#}" = "0" ]; then help; exit $STATE_UNKNOWN; fi
    ################################################################################
    # Get user-given variables
    while getopts "H:P:Su:p:d:o:i:w:c:t:m:e:X:" Input
    do
      case ${Input} in
      H)      host=${OPTARG};;
      P)      port=${OPTARG};;
      S)      httpscheme=https;;
      u)      user=${OPTARG};;
      p)      pass=${OPTARG};;
      d)      available=${OPTARG};;
      o)      unit=${OPTARG};;
      i)      indexes=${OPTARG};;
      w)      warning=${OPTARG};;
      c)      critical=${OPTARG};;
      t)      checktype=${OPTARG};;
      m)      max_time=${OPTARG};;
      e)      expect_master=${OPTARG};;
      X)      parser=${OPTARG:=jshon};;
      *)      help;;
      esac
    done
    
    # Check for mandatory opts
    if [ -z ${host} ]; then help; exit $STATE_UNKNOWN; fi
    if [ -z ${checktype} ]; then help; exit $STATE_UNKNOWN; fi
    ################################################################################
    # Check requirements
    for cmd in curl expr ${parser}; do
      if ! `which ${cmd} >/dev/null 2>&1`; then
        echo "UNKNOWN: ${cmd} does not exist, please check if command exists and PATH is correct"
        exit ${STATE_UNKNOWN}
      fi
    done
    # Find parser
    if [ -z ${parser} ]; then
      for cmd in ${parsers[@]}; do
        if `which ${cmd} >/dev/null 2>&1`; then
          parser=${cmd}
          break
        fi
      done
      if [ -z "${parser}" ]; then
        echo "UNKNOWN: No JSON parser found. Either one of the following is required: $(IFS=,; echo "${parsers[*]}")"
        exit ${STATE_UNKNOWN}
      fi
    fi
    
    ################################################################################
    # Retrieve information from Elasticsearch
    getstatus() {
    esurl="${httpscheme}://${host}:${port}/_cluster/stats"
    eshealthurl="${httpscheme}://${host}:${port}/_cluster/health"
    if [[ -z $user ]]; then
      # Without authentication
      esstatus=$(curl -k -s --max-time ${max_time} $esurl)
      esstatusrc=$?
      if [[ $esstatusrc -eq 7 ]]; then
        echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
        exit $STATE_CRITICAL
      elif [[ $esstatusrc -eq 28 ]]; then
        echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
        exit $STATE_CRITICAL
      elif [[ $esstatus =~ "503 Service Unavailable" ]]; then
        echo "ES SYSTEM CRITICAL - Elasticsearch not available: ${host}:${port} return error 503"
        exit $STATE_CRITICAL
      fi
      # Additionally get cluster health infos
      if [ $checktype = status ]; then
        eshealth=$(curl -k -s --max-time ${max_time} $eshealthurl)
        if [[ -z $eshealth ]]; then
          echo "ES SYSTEM CRITICAL - unable to get cluster health information"
          exit $STATE_CRITICAL
        fi
      fi
    fi
    
    if [[ -n $user ]] || [[ -n $(echo $esstatus | grep -i authentication) ]] ; then
      # Authentication required
      authlogic
      esstatus=$(curl -k -s --max-time ${max_time} --basic -u ${user}:${pass} $esurl)
      esstatusrc=$?
      if [[ $esstatusrc -eq 7 ]]; then
        echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
        exit $STATE_CRITICAL
      elif [[ $esstatusrc -eq 28 ]]; then
        echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
        exit $STATE_CRITICAL
      elif [[ $esstatus =~ "503 Service Unavailable" ]]; then
        echo "ES SYSTEM CRITICAL - Elasticsearch not available: ${host}:${port} return error 503"
        exit $STATE_CRITICAL
      elif [[ -n $(echo $esstatus | grep -i "unable to authenticate") ]]; then
        echo "ES SYSTEM CRITICAL - Unable to authenticate user $user for REST request"
        exit $STATE_CRITICAL
      elif [[ -n $(echo $esstatus | grep -i "unauthorized") ]]; then
        echo "ES SYSTEM CRITICAL - User $user is unauthorized"
        exit $STATE_CRITICAL
      fi
      # Additionally get cluster health infos
      if [[ $checktype = status ]]; then
        eshealth=$(curl -k -s --max-time ${max_time} --basic -u ${user}:${pass} $eshealthurl)
        if [[ -z $eshealth ]]; then
          echo "ES SYSTEM CRITICAL - unable to get cluster health information"
          exit $STATE_CRITICAL
        fi
      fi
    fi
    
    # Catch empty reply from server (typically happens when ssl port used with http connection)
    if [[ -z $esstatus ]] || [[ $esstatus = '' ]]; then
      echo "ES SYSTEM UNKNOWN - Empty reply from server (verify ssl settings)"
      exit $STATE_UNKNOWN
    fi
    }
    ################################################################################
    # Do the checks
    case $checktype in
    disk) # Check disk usage
      availrequired
      default_percentage_thresholds
      getstatus
      size=$(echo $esstatus | json_parse -x indices -x store -x "size_in_bytes")
      unitcalc
      if [ -n "${warning}" ] || [ -n "${critical}" ]; then
        # Handle thresholds
        thresholdlogic
        if [ $size -ge $criticalsize ]; then
          echo "ES SYSTEM CRITICAL - Disk usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_disk=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_CRITICAL
        elif [ $size -ge $warningsize ]; then
          echo "ES SYSTEM WARNING - Disk usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_disk=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_WARNING
        else
          echo "ES SYSTEM OK - Disk usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_disk=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_OK
        fi
      else
        # No thresholds
        echo "ES SYSTEM OK - Disk usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_disk=${size}B;;;;"
        exit $STATE_OK
      fi
      ;;
    
    mem) # Check memory usage
      availrequired
      default_percentage_thresholds
      getstatus
      size=$(echo $esstatus | json_parse -x nodes -x jvm -x mem -x "heap_used_in_bytes")
      unitcalc
      if [ -n "${warning}" ] || [ -n "${critical}" ]; then
        # Handle thresholds
        thresholdlogic
        if [ $size -ge $criticalsize ]; then
          echo "ES SYSTEM CRITICAL - Memory usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_memory=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_CRITICAL
        elif [ $size -ge $warningsize ]; then
          echo "ES SYSTEM WARNING - Memory usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_memory=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_WARNING
        else
          echo "ES SYSTEM OK - Memory usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_memory=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_OK
        fi
      else
        # No thresholds
        echo "ES SYSTEM OK - Memory usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_memory=${size}B;;;;"
        exit $STATE_OK
      fi
      ;;
    
    status) # Check Elasticsearch status
      getstatus
      status=$(echo $esstatus | json_parse -r -x status)
      shards=$(echo $esstatus | json_parse -r -x indices -x shards -x total)
      docs=$(echo $esstatus | json_parse -r -x indices -x docs -x count)
      nodest=$(echo $esstatus | json_parse -r -x nodes -x count -x total)
      nodesd=$(echo $esstatus | json_parse -r -x nodes -x count -x data)
      relocating=$(echo $eshealth | json_parse -r -x relocating_shards)
      init=$(echo $eshealth | json_parse -r -x initializing_shards)
      unass=$(echo $eshealth | json_parse -r -x unassigned_shards)
      if [ "$status" = "green" ]; then
        echo "ES SYSTEM OK - Elasticsearch Cluster is green (${nodest} nodes, ${nodesd} data nodes, ${shards} shards, ${docs} docs)|total_nodes=${nodest};;;; data_nodes=${nodesd};;;; total_shards=${shards};;;; relocating_shards=${relocating};;;; initializing_shards=${init};;;; unassigned_shards=${unass};;;; docs=${docs};;;;"
        exit $STATE_OK
      elif [ "$status" = "yellow" ]; then
        echo "ES SYSTEM WARNING - Elasticsearch Cluster is yellow (${nodest} nodes, ${nodesd} data nodes, ${shards} shards, ${relocating} relocating shards, ${init} initializing shards, ${unass} unassigned shards, ${docs} docs)|total_nodes=${nodest};;;; data_nodes=${nodesd};;;; total_shards=${shards};;;; relocating_shards=${relocating};;;; initializing_shards=${init};;;; unassigned_shards=${unass};;;; docs=${docs};;;;"
          exit $STATE_WARNING
      elif [ "$status" = "red" ]; then
        echo "ES SYSTEM CRITICAL - Elasticsearch Cluster is red (${nodest} nodes, ${nodesd} data nodes, ${shards} shards, ${relocating} relocating shards, ${init} initializing shards, ${unass} unassigned shards, ${docs} docs)|total_nodes=${nodest};;;; data_nodes=${nodesd};;;; total_shards=${shards};;;; relocating_shards=${relocating};;;; initializing_shards=${init};;;; unassigned_shards=${unass};;;; docs=${docs};;;;"
          exit $STATE_CRITICAL
      fi
      ;;
    
    readonly) # Check Readonly status on given indexes
      icount=0
      for index in $indexes; do
        if [[ -z $user ]]; then
          # Without authentication
          settings=$(curl -k -s --max-time ${max_time} ${httpscheme}://${host}:${port}/$index/_settings)
          settingsrc=$?  # keep curl's exit code; $? is overwritten by the first test
          if [[ $settingsrc -eq 7 ]]; then
            echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
            exit $STATE_CRITICAL
          elif [[ $settingsrc -eq 28 ]]; then
            echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
            exit $STATE_CRITICAL
          fi
          rocount=$(echo $settings | json_parse -r -q -c -a -x settings -x index -x blocks -x read_only | grep -c true)
          roadcount=$(echo $settings | json_parse -r -q -c -a -x settings -x index -x blocks -x read_only_allow_delete | grep -c true)
          if [[ $rocount -gt 0 ]]; then
            output[${icount}]=" $index is read-only -"
            roerror=true
          fi
          if [[ $roadcount -gt 0 ]]; then
            output[${icount}]+=" $index is read-only (allow delete) -"
            roerror=true
          fi
        fi
    
        if [[ -n $user ]] || [[ -n $(echo $esstatus | grep -i authentication) ]] ; then
          # Authentication required
          authlogic
          settings=$(curl -k -s --max-time ${max_time} --basic -u ${user}:${pass} ${httpscheme}://${host}:${port}/$index/_settings)
          settingsrc=$?
          if [[ $settingsrc -eq 7 ]]; then
            echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
            exit $STATE_CRITICAL
          elif [[ $settingsrc -eq 28 ]]; then
            echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
            exit $STATE_CRITICAL
          elif [[ -n $(echo $settings | grep -i "unable to authenticate") ]]; then
            echo "ES SYSTEM CRITICAL - Unable to authenticate user $user for REST request"
            exit $STATE_CRITICAL
          elif [[ -n $(echo $settings | grep -i "unauthorized") ]]; then
            echo "ES SYSTEM CRITICAL - User $user is unauthorized"
            exit $STATE_CRITICAL
          fi
          rocount=$(echo $settings | json_parse -r -q -c -a -x settings -x index -x blocks -x read_only | grep -c true)
          roadcount=$(echo $settings | json_parse -r -q -c -a -x settings -x index -x blocks -x read_only_allow_delete | grep -c true)
          if [[ $rocount -gt 0 ]]; then
            if [[ "$index" = "_all" ]]; then 
              output[${icount}]=" $rocount index(es) found read-only -"
            else output[${icount}]=" $index is read-only -"
            fi
            roerror=true
          fi
          if [[ $roadcount -gt 0 ]]; then
            if [[ "$index" = "_all" ]]; then 
              output[${icount}]+=" $roadcount index(es) found read-only (allow delete) -"
            else output[${icount}]+=" $index is read-only (allow delete) -"
            fi
            roerror=true
          fi
        fi
        let icount++
      done
    
      if [[ $roerror ]]; then
        echo "ES SYSTEM CRITICAL - ${output[*]}"
        exit $STATE_CRITICAL
      else
        echo "ES SYSTEM OK - Elasticsearch Indexes ($indexes) are writeable"
        exit $STATE_OK
      fi
      ;;
    
    jthreads) # Check JVM threads
      getstatus
      threads=$(echo $esstatus | json_parse -r -x nodes -x jvm -x "threads")
      if [ -n "${warning}" ] || [ -n "${critical}" ]; then
        # Handle thresholds
        thresholdlogic
        if [[ $threads -ge $critical ]]; then
          echo "ES SYSTEM CRITICAL - Number of JVM threads is ${threads}|es_jvm_threads=${threads};${warning};${critical};;"
          exit $STATE_CRITICAL
        elif [[ $threads -ge $warning ]]; then
          echo "ES SYSTEM WARNING - Number of JVM threads is ${threads}|es_jvm_threads=${threads};${warning};${critical};;"
          exit $STATE_WARNING
        else
          echo "ES SYSTEM OK - Number of JVM threads is ${threads}|es_jvm_threads=${threads};${warning};${critical};;"
          exit $STATE_OK
        fi
      else
        # No thresholds
        echo "ES SYSTEM OK - Number of JVM threads is ${threads}|es_jvm_threads=${threads};${warning};${critical};;"
        exit $STATE_OK
      fi
      ;;
    
    tps) # Check Thread Pool Statistics
      if [[ -z $user ]]; then
        # Without authentication
        threadpools=$(curl -k -s --max-time ${max_time} ${httpscheme}://${host}:${port}/_cat/thread_pool)
        threadpoolrc=$?
        if [[ $threadpoolrc -eq 7 ]]; then
          echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
          exit $STATE_CRITICAL
        elif [[ $threadpoolrc -eq 28 ]]; then
          echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
          exit $STATE_CRITICAL
        fi
      fi
    
      if [[ -n $user ]] || [[ -n $(echo $esstatus | grep -i authentication) ]] ; then
        # Authentication required
        authlogic
        threadpools=$(curl -k -s --max-time ${max_time} --basic -u ${user}:${pass} ${httpscheme}://${host}:${port}/_cat/thread_pool)
        threadpoolrc=$?
        if [[ $threadpoolrc -eq 7 ]]; then
          echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
          exit $STATE_CRITICAL
        elif [[ $threadpoolrc -eq 28 ]]; then
          echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
          exit $STATE_CRITICAL
        elif [[ -n $(echo $threadpools | grep -i "unable to authenticate") ]]; then
          echo "ES SYSTEM CRITICAL - Unable to authenticate user $user for REST request"
          exit $STATE_CRITICAL
        elif [[ -n $(echo $threadpools | grep -i "unauthorized") ]]; then
          echo "ES SYSTEM CRITICAL - User $user is unauthorized"
          exit $STATE_CRITICAL
        fi
      fi
    
      tpname=($(echo "$threadpools" | awk '{print $1"-"$2}' | sed "s/\n//g"))
      tpactive=($(echo "$threadpools" | awk '{print $3}' | sed "s/\n//g"))
      tpqueue=($(echo "$threadpools" | awk '{print $4}' | sed "s/\n//g"))
      tprejected=($(echo "$threadpools" | awk '{print $5}' | sed "s/\n//g"))
    
      if [ -n "${warning}" ] || [ -n "${critical}" ]; then
        # Handle thresholds. They have to come in a special format: n,n,n (active, queue, rejected)
        thresholdlogic
        wactive=$(echo ${warning} | awk -F',' '{print $1}')
        wqueue=$(echo ${warning} | awk -F',' '{print $2}')
        wrejected=$(echo ${warning} | awk -F',' '{print $3}')
        cactive=$(echo ${critical} | awk -F',' '{print $1}')
        cqueue=$(echo ${critical} | awk -F',' '{print $2}')
        crejected=$(echo ${critical} | awk -F',' '{print $3}')
    
        i=0; for tp in ${tpname[*]}; do
          perfdata[$i]="tp_${tp}_active=${tpactive[$i]};${wactive};${cactive};; tp_${tp}_queue=${tpqueue[$i]};${wqueue};${cqueue};; tp_${tp}_rejected=${tprejected[$i]};${wrejected};${crejected};; "
          let i++
        done
    
        i=0
        for tpa in $(echo ${tpactive[*]}); do
          if [[ $tpa -ge $cactive ]]; then
            echo "Thread Pool ${tpname[$i]} is critical: Active ($tpa) is equal or higher than threshold ($cactive)|${perfdata[*]}"
            exit $STATE_CRITICAL
          elif [[ $tpa -ge $wactive ]]; then
            echo "Thread Pool ${tpname[$i]} is warning: Active ($tpa) is equal or higher than threshold ($wactive)|${perfdata[*]}"
            exit $STATE_WARNING
          fi
          let i++
        done
    
        i=0
        for tpq in $(echo ${tpqueue[*]}); do
          if [[ $tpq -ge $cqueue ]]; then
            echo "Thread Pool ${tpname[$i]} is critical: Queue ($tpq) is equal or higher than threshold ($cqueue)|${perfdata[*]}"
            exit $STATE_CRITICAL
          elif [[ $tpq -ge $wqueue ]]; then
            echo "Thread Pool ${tpname[$i]} is warning: Queue ($tpq) is equal or higher than threshold ($wqueue)|${perfdata[*]}"
            exit $STATE_WARNING
          fi
          let i++
        done
    
        i=0
        for tpr in $(echo ${tprejected[*]}); do
          if [[ $tpr -ge $crejected ]]; then
            echo "Thread Pool ${tpname[$i]} is critical: Rejected ($tpr) is equal or higher than threshold ($crejected)|${perfdata[*]}"
            exit $STATE_CRITICAL
          elif [[ $tpr -ge $wrejected ]]; then
            echo "Thread Pool ${tpname[$i]} is warning: Rejected ($tpr) is equal or higher than threshold ($wrejected)|${perfdata[*]}"
            exit $STATE_WARNING
          fi
          let i++
        done
    
       echo "ES SYSTEM OK - Found ${#tpname[*]} thread pools in cluster|${perfdata[*]}"
       exit $STATE_OK
       fi
    
      # No Thresholds
      i=0; for tp in ${tpname[*]}; do
        perfdata[$i]="tp_${tp}_active=${tpactive[$i]};;;; tp_${tp}_queue=${tpqueue[$i]};;;; tp_${tp}_rejected=${tprejected[$i]};;;; "
        let i++
      done
      echo "ES SYSTEM OK - Found ${#tpname[*]} thread pools in cluster|${perfdata[*]}"
      exit $STATE_OK
      ;;
    
    master) # Check Cluster Master
      if [[ -z $user ]]; then
        # Without authentication
        master=$(curl -k -s --max-time ${max_time} ${httpscheme}://${host}:${port}/_cat/master)
        masterrc=$?
        if [[ $masterrc -eq 7 ]]; then
          echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
          exit $STATE_CRITICAL
        elif [[ $masterrc -eq 28 ]]; then
          echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
          exit $STATE_CRITICAL
        fi
      fi
    
      if [[ -n $user ]] || [[ -n $(echo $esstatus | grep -i authentication) ]] ; then
        # Authentication required
        authlogic
        master=$(curl -k -s --max-time ${max_time} --basic -u ${user}:${pass} ${httpscheme}://${host}:${port}/_cat/master)
        masterrc=$?
        if [[ $masterrc -eq 7 ]]; then
          echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
          exit $STATE_CRITICAL
        elif [[ $masterrc -eq 28 ]]; then
          echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
          exit $STATE_CRITICAL
        elif [[ -n $(echo $master | grep -i "unable to authenticate") ]]; then
          echo "ES SYSTEM CRITICAL - Unable to authenticate user $user for REST request"
          exit $STATE_CRITICAL
        elif [[ -n $(echo $master | grep -i "unauthorized") ]]; then
          echo "ES SYSTEM CRITICAL - User $user is unauthorized"
          exit $STATE_CRITICAL
        fi
      fi
    
      masternode=$(echo "$master" | awk '{print $NF}')
    
      if [[ -n ${expect_master} ]]; then
        if [[ "${expect_master}" = "${masternode}" ]]; then
          echo "ES SYSTEM OK - Master node is $masternode"
          exit $STATE_OK
        else
          echo "ES SYSTEM WARNING - Master node is $masternode but expected ${expect_master}"
          exit $STATE_WARNING
        fi
      else
        echo "ES SYSTEM OK - Master node is $masternode"
        exit $STATE_OK
      fi
      ;;
    
    *) help
    esac
    
    

    Requirements

    • curl (SUSE: zypper in curl, Debian/Ubuntu: apt-get install curl, CentOS/RHEL: yum install curl)

    • A JSON parser, one of:

      • jshon (SUSE: search for jshon, Debian/Ubuntu: apt-get install jshon)
      • jq (SUSE: zypper in jq, Debian/Ubuntu: apt-get install jq)
    • Bash internal commands/functions (the plugin checks that they exist)
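
The requirement check described above can be sketched with `command -v`. This is a minimal illustration, not the plugin's own code: the function names `missing_commands` and `pick_parser` are made up for this example, and only the jshon-before-jq preference is taken from the script's `parsers` list.

```shell
# Print the names of all given commands that cannot be found in PATH.
# missing_commands is a hypothetical helper, not part of check_es_system.
missing_commands() {
  missing=""
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      missing="${missing:+$missing }$cmd"
    fi
  done
  echo "$missing"
}

# Print the first JSON parser found, trying jshon before jq
# (the same order as the script's parsers list).
# pick_parser is a hypothetical helper as well.
pick_parser() {
  for candidate in jshon jq; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

echo "missing: $(missing_commands curl expr)"
echo "parser: $(pick_parser || echo none)"
```

An empty "missing:" line together with a parser name means the host satisfies the requirements listed above.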

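    Parameter definitions

    Parameter   Description
    -H *        Hostname or IP address of the ElasticSearch node
    -P          Port (defaults to 9200)
    -S          Use secure HTTP (https)
    -u          Username (if authentication is required)
    -p          Password (if authentication is required)
    -t *        Type of check to run (disk|mem|status|readonly|jthreads|tps|master)
    -d +        Available disk or memory size (e.g. 20)
    -o          Size unit (K|M|G) (defaults to G)
    -i          Space-separated list of indexes to check for read-only (default: '_all')
    -w          Warning threshold. Format for 'disk' and 'mem': int (percent), defaults to 80 (warn) and 95 (crit). Format for 'tps': int,int,int (active, queued, rejected), no defaults. Format for all other check types: int, no defaults
    -c          Critical threshold. Same formats and defaults as for -w
    -m          Maximum time in seconds to wait for a response from the Elasticsearch server (default: 30)
    -e          Node expected to be the master of the Elasticsearch cluster (only affects the 'master' check)
    -X          JSON parser to use, either jshon or jq (default: jshon)
    -h          Help!

    * mandatory parameter

    + mandatory for the disk and mem check types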
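
The 'tps' threshold format differs from the others, so a short sketch of splitting it may help. This is an illustration under assumptions: `split_tps_threshold` is a hypothetical helper (the script itself uses awk with a comma separator), and rejecting anything that does not have exactly three parts is a choice made here, not plugin behaviour.

```shell
# Split a tps threshold such as "10,50,1000" into its three parts.
# split_tps_threshold is a hypothetical helper, not part of check_es_system.
split_tps_threshold() {
  old_ifs=$IFS
  IFS=,
  set -- $1          # word-split the argument on commas
  IFS=$old_ifs
  if [ "$#" -ne 3 ]; then
    echo "expected int,int,int" >&2
    return 1
  fi
  echo "active=$1 queue=$2 rejected=$3"
}

split_tps_threshold "10,50,1000"    # e.g. a value for -w
split_tps_threshold "50,200,2000"   # e.g. a value for -c
```

A plain integer such as 200 does not fit this format; the sketch returns a non-zero status for it.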

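    Check type definitions

    Type        Description
    status      Checks the current state of the cluster (green, yellow, red). Additional information (nodes, shards, docs) is shown as well. When the status is yellow or red, the relevant shard information (e.g. initializing or unassigned) is shown.
    mem         Checks the current memory usage and compares it with the available memory defined with the -d parameter. Thresholds possible.
    disk        Checks the current disk usage and compares it with the available disk capacity defined with the -d parameter. Thresholds possible.
    readonly    Checks whether the indexes listed with the -i parameter (default: _all) have a read-only flag set.
    jthreads    Monitors the number of Java threads across the ES cluster. Thresholds possible.
    tps         Monitors the thread pool statistics across the ES cluster. For each thread pool, the 'active', 'queue' and 'rejected' values are checked. A growing number in some of these queues can indicate a problem in your Elasticsearch cluster. Thresholds possible.
    master      Monitors the current master node of the ES cluster. The -e parameter can be used to check that a certain node is the master and to alert if this is not the case.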
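
For the master check, the script takes the last whitespace-separated field of the `/_cat/master` response as the node name and compares it with the value given to -e. A minimal sketch of that comparison follows; `check_master` is a hypothetical function and the sample response line is made up for illustration.

```shell
# Compare the master named in a /_cat/master response line with an expected node.
# check_master and the sample line below are illustrative assumptions.
check_master() {
  cat_line=$1
  expected=$2
  masternode=$(echo "$cat_line" | awk '{print $NF}')   # node name is the last column
  if [ -z "$expected" ] || [ "$expected" = "$masternode" ]; then
    echo "OK - Master node is $masternode"
  else
    echo "WARNING - Master node is $masternode but expected $expected"
  fi
}

sample="Ntgn2DcuTjGuXlhKDUD4vA 192.0.2.30 192.0.2.30 es01"
check_master "$sample" "es01"
check_master "$sample" "es02"
```

As in the script, a mismatch is reported as WARNING, and an empty expected node always yields OK.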

    Usage / running the plugin on the command line

    Usage:

    ./check_es_system.sh -H hostname [-P port] [-S] [-u user] [-p password] -t checktype [-d capacity] [-o unit] [-i indexes] [-w warning] [-c critical] [-m time]

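    Example 1: Classic status check. Here the Elasticsearch cluster runs on escluster.example.com and is accessed over HTTPS (enabled with -S) on port 9243 with the basic auth credentials user and password. The output shows the cluster status (green) and some additional information.
    The number of nodes, the shard information and the total number of documents are used as performance data.
    Note: When the status changes to yellow (= WARNING) or red (= CRITICAL), the output also contains information about relocating, initializing and unassigned shards. The performance data stays the same, so graphing databases are not confused.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t status
    ES SYSTEM OK - Elasticsearch Cluster is green (3 nodes, 2 data nodes, 114 shards, 8426885 docs)|total_nodes=3;;;; data_nodes=2;;;; total_shards=114;;;; relocating_shards=0;;;; initializing_shards=0;;;; unassigned_shards=0;;;; docs=8426885;;;;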
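
The mapping from cluster status to plugin state described in the note can be sketched as follows. `status_to_state` is a hypothetical helper; the exit codes 0 to 3 are the STATE_* values defined at the top of the script, and reporting any other status as UNKNOWN is an assumption of this sketch.

```shell
# Map an Elasticsearch cluster status to a plugin state name and exit code.
# status_to_state is a hypothetical helper, not part of check_es_system.
status_to_state() {
  case "$1" in
    green)  echo "OK 0" ;;
    yellow) echo "WARNING 1" ;;
    red)    echo "CRITICAL 2" ;;
    *)      echo "UNKNOWN 3" ;;   # assumption: anything else is reported as UNKNOWN
  esac
}

for status in green yellow red; do
  echo "$status -> $(status_to_state "$status")"
done
```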

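    Example 2: Disk usage check. The same Elasticsearch cluster as before is accessed. Because this ES runs in the cloud, there is no host monitoring available (meaning: no file system monitoring is possible). But we know that 128GB of disk space is available. We tell the plugin that the capacity is 128GB (-d 128) and let it do the rest:

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t disk -d 128
    ES SYSTEM OK - Disk usage is at 14% (18 G from 128 G)|es_disk=19637018938B;109951162777;130567005798;;

    Example 3: Memory usage check. As before, ES runs in the cloud and memory cannot be monitored on the host itself. But we have booked an Elasticsearch service with 24GB of RAM.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t mem -d 24
    ES SYSTEM OK - Memory usage is at 58% (24 G in total, 14 G used)|es_memory=15107616304B;20615843020;24481313587;;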
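
    The threshold values in the performance data of the disk and memory examples can be reproduced with integer arithmetic: the capacity is converted to bytes with a factor of 1024 per unit step, and the default thresholds of 80% and 95% are applied to it. The sketch below is not the plugin's code; the function name usage_report is hypothetical, and the 1024-based factors for K and M are an assumption derived from the G values shown above.

```shell
#!/bin/bash
# usage_report <capacity> <unit K|M|G> <used_bytes> [warn_percent] [crit_percent]
# Prints the usage in percent and the warning/critical thresholds in bytes.
usage_report() {
  local capacity=$1 unit=$2 used=$3 warn=${4:-80} crit=${5:-95}
  local factor total
  case "$unit" in
    K) factor=$((1024)) ;;
    M) factor=$((1024 ** 2)) ;;
    G) factor=$((1024 ** 3)) ;;
    *) echo "unknown unit: $unit" >&2; return 3 ;;
  esac
  total=$((capacity * factor))
  # Integer division truncates, which matches the values in the examples
  echo "percent=$((used * 100 / total)) warn_bytes=$((total * warn / 100)) crit_bytes=$((total * crit / 100))"
}

usage_report 128 G 19637018938   # disk example:   percent=14 warn_bytes=109951162777 crit_bytes=130567005798
usage_report 24 G 15107616304    # memory example: percent=58 warn_bytes=20615843020 crit_bytes=24481313587
```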

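    Example 4: Read-only index check. The plugin checks whether the indexes given with the -i parameter have the read-only flag set. If the -i parameter is not used, all indexes are checked.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t readonly -i "filebeat-* logstash-*"
    CRITICAL - Elasticsearch Index filebeat-* is read-only (found 53 index(es) set to read-only) Elasticsearch Index logstash-* is read-only (found 125 index(es) set to read-only)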

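    Example 5: JVM threads check. The plugin checks the number of JVM threads across the whole cluster. It should warn when 200 or more threads are running and go critical when 300 or more threads are running.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t jthreads -w 200 -c 300
    ES SYSTEM CRITICAL - Number of JVM threads is 319|es_jvm_threads=319;200;300;;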

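    Example 6: TPS (thread pool statistics). The plugin goes through all detected thread pools of the cluster. Without thresholds, the plugin only outputs the number of detected thread pools and adds performance data. With thresholds (note the special format: active,queued,rejected), the plugin alerts if one of the thread pools is equal to or above a threshold.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t tps -w 10,50,1000 -c 50,200,2000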
    ES SYSTEM OK - Found 46 threadpools in cluster|tp_es02-analyze_active=0;10;50;; tp_es02-analyze_queue=0;50;200;; tp_es02-analyze_rejected=0;1000;2000;; tp_es02-ccr_active=0;10;50;; tp_es02-ccr_queue=0;50;200;; tp_es02-ccr_rejected=0;1000;2000;; tp_es02-fetch_shard_started_active=0;10;50;; tp_es02-fetch_shard_started_queue=0;50;200;; tp_es02-fetch_shard_started_rejected=0;1000;2000;; tp_es02-fetch_shard_store_active=0;10;50;; tp_es02-fetch_shard_store_queue=0;50;200;; tp_es02-fetch_shard_store_rejected=0;1000;2000;; tp_es02-flush_active=0;10;50;; tp_es02-flush_queue=0;50;200;; tp_es02-flush_rejected=0;1000;2000;; tp_es02-force_merge_active=0;10;50;; tp_es02-force_merge_queue=0;50;200;; tp_es02-force_merge_rejected=0;1000;2000;; tp_es02-generic_active=0;10;50;; tp_es02-generic_queue=0;50;200;; tp_es02-generic_rejected=0;1000;2000;; tp_es02-get_active=0;10;50;; tp_es02-get_queue=0;50;200;; tp_es02-get_rejected=0;1000;2000;; tp_es02-index_active=0;10;50;; tp_es02-index_queue=0;50;200;; tp_es02-index_rejected=0;1000;2000;; tp_es02-listener_active=0;10;50;; tp_es02-listener_queue=0;50;200;; tp_es02-listener_rejected=0;1000;2000;; tp_es02-management_active=1;10;50;; tp_es02-management_queue=0;50;200;; tp_es02-management_rejected=0;1000;2000;; tp_es02-ml_autodetect_active=0;10;50;; tp_es02-ml_autodetect_queue=0;50;200;; tp_es02-ml_autodetect_rejected=0;1000;2000;; tp_es02-ml_datafeed_active=0;10;50;; tp_es02-ml_datafeed_queue=0;50;200;; tp_es02-ml_datafeed_rejected=0;1000;2000;; tp_es02-ml_utility_active=0;10;50;; tp_es02-ml_utility_queue=0;50;200;; tp_es02-ml_utility_rejected=0;1000;2000;; tp_es02-refresh_active=1;10;50;; tp_es02-refresh_queue=0;50;200;; tp_es02-refresh_rejected=0;1000;2000;; tp_es02-rollup_indexing_active=0;10;50;; tp_es02-rollup_indexing_queue=0;50;200;; tp_es02-rollup_indexing_rejected=0;1000;2000;; tp_es02-search_active=0;10;50;; tp_es02-search_queue=0;50;200;; tp_es02-search_rejected=0;1000;2000;; tp_es02-search_throttled_active=0;10;50;; tp_es02-search_throttled_queue=0;50;200;; tp_es02-search_throttled_rejected=0;1000;2000;; tp_es02-security-token-key_active=0;10;50;; tp_es02-security-token-key_queue=0;50;200;; tp_es02-security-token-key_rejected=0;1000;2000;; tp_es02-snapshot_active=0;10;50;; tp_es02-snapshot_queue=0;50;200;; tp_es02-snapshot_rejected=0;1000;2000;; tp_es02-warmer_active=0;10;50;; tp_es02-warmer_queue=0;50;200;; tp_es02-warmer_rejected=0;1000;2000;; tp_es02-watcher_active=0;10;50;; tp_es02-watcher_queue=0;50;200;; tp_es02-watcher_rejected=0;1000;2000;; tp_es02-write_active=8;10;50;; tp_es02-write_queue=10;50;200;; tp_es02-write_rejected=0;1000;2000;; tp_es01-analyze_active=0;10;50;; tp_es01-analyze_queue=0;50;200;; tp_es01-analyze_rejected=0;1000;2000;; tp_es01-ccr_active=0;10;50;; tp_es01-ccr_queue=0;50;200;; tp_es01-ccr_rejected=0;1000;2000;; tp_es01-fetch_shard_started_active=0;10;50;; tp_es01-fetch_shard_started_queue=0;50;200;; tp_es01-fetch_shard_started_rejected=0;1000;2000;; tp_es01-fetch_shard_store_active=0;10;50;; tp_es01-fetch_shard_store_queue=0;50;200;; tp_es01-fetch_shard_store_rejected=0;1000;2000;; tp_es01-flush_active=0;10;50;; tp_es01-flush_queue=0;50;200;; tp_es01-flush_rejected=0;1000;2000;; tp_es01-force_merge_active=0;10;50;; tp_es01-force_merge_queue=0;50;200;; tp_es01-force_merge_rejected=0;1000;2000;; tp_es01-generic_active=0;10;50;; tp_es01-generic_queue=0;50;200;; tp_es01-generic_rejected=0;1000;2000;; tp_es01-get_active=0;10;50;; tp_es01-get_queue=0;50;200;; tp_es01-get_rejected=0;1000;2000;; tp_es01-index_active=0;10;50;; tp_es01-index_queue=0;50;200;; tp_es01-index_rejected=0;1000;2000;; tp_es01-listener_active=0;10;50;; tp_es01-listener_queue=0;50;200;; tp_es01-listener_rejected=0;1000;2000;; tp_es01-management_active=1;10;50;; tp_es01-management_queue=0;50;200;; tp_es01-management_rejected=0;1000;2000;; tp_es01-ml_autodetect_active=0;10;50;; tp_es01-ml_autodetect_queue=0;50;200;; tp_es01-ml_autodetect_rejected=0;1000;2000;; tp_es01-ml_datafeed_active=0;10;50;; tp_es01-ml_datafeed_queue=0;50;200;; tp_es01-ml_datafeed_rejected=0;1000;2000;; tp_es01-ml_utility_active=0;10;50;; tp_es01-ml_utility_queue=0;50;200;; tp_es01-ml_utility_rejected=0;1000;2000;; tp_es01-refresh_active=0;10;50;; tp_es01-refresh_queue=0;50;200;; tp_es01-refresh_rejected=0;1000;2000;; tp_es01-rollup_indexing_active=0;10;50;; tp_es01-rollup_indexing_queue=0;50;200;; tp_es01-rollup_indexing_rejected=0;1000;2000;; tp_es01-search_active=0;10;50;; tp_es01-search_queue=0;50;200;; tp_es01-search_rejected=78;1000;2000;; tp_es01-search_throttled_active=0;10;50;; tp_es01-search_throttled_queue=0;50;200;; tp_es01-search_throttled_rejected=0;1000;2000;; tp_es01-security-token-key_active=0;10;50;; tp_es01-security-token-key_queue=0;50;200;; tp_es01-security-token-key_rejected=0;1000;2000;; tp_es01-snapshot_active=0;10;50;; tp_es01-snapshot_queue=0;50;200;; tp_es01-snapshot_rejected=0;1000;2000;; tp_es01-warmer_active=0;10;50;; tp_es01-warmer_queue=0;50;200;; tp_es01-warmer_rejected=0;1000;2000;; tp_es01-watcher_active=0;10;50;; tp_es01-watcher_queue=0;50;200;; tp_es01-watcher_rejected=0;1000;2000;; tp_es01-write_active=8;10;50;; tp_es01-write_queue=20;50;200;; tp_es01-write_rejected=0;1000;2000;;
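
    The three-part threshold format can be evaluated per thread pool roughly as sketched below. This is an illustration under stated assumptions, not the plugin's implementation: the function name check_pool is hypothetical, and the comparison "equal to or above" follows the description of this example. The thresholds used are the ones visible in the performance data above (warning 10,50,1000 and critical 50,200,2000).

```shell
#!/bin/bash
# check_pool <active> <queue> <rejected> <warn "a,q,r"> <crit "a,q,r">
# Returns 0 (OK), 1 (WARNING) or 2 (CRITICAL) for one thread pool.
check_pool() {
  local values=("$1" "$2" "$3")
  local -a warn crit
  local i state=0
  # Split the comma-separated threshold triples into arrays
  IFS=, read -r -a warn <<< "$4"
  IFS=, read -r -a crit <<< "$5"
  for i in 0 1 2; do
    if [ "${values[i]}" -ge "${crit[i]}" ]; then
      state=2
    elif [ "${values[i]}" -ge "${warn[i]}" ] && [ "$state" -lt 1 ]; then
      state=1
    fi
  done
  return "$state"
}

# Write pool of es01 from the output above: active=8, queue=20, rejected=0
check_pool 8 20 0 "10,50,1000" "50,200,2000" && echo "write pool: OK"

# With a lower warning threshold for 'active', the same pool would warn
check_pool 8 20 0 "5,50,1000" "50,200,2000" || echo "write pool state: $?"
```

    The worst state across the three values wins, so a single queue above its critical threshold is enough to make the whole pool critical.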

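    Example 7: Master check. The plugin checks which node of the given Elasticsearch cluster is the current master node. With the optional parameter -e (expect master), a node name can be specified. If the given node name and the current master node of the cluster differ, the plugin alerts with a warning.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t master -e node1
    ES SYSTEM WARNING - Master node is node2 but expected node1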

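    Command definition

    Command definition in Nagios, Icinga 1.x, Shinken, Naemon

    The following command definition allows all optional parameters to be defined in ARG3.

    # 'check_es_system' command definition
    define command{
      command_name check_es_system
      command_line $USER1$/check_es_system.sh -H $ARG1$ -t $ARG2$ $ARG3$
    }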

    Command definition in Icinga 2.x

    object CheckCommand "check_es_system" {
      import "plugin-check-command"
      command = [ PluginContribDir + "/check_es_system.sh" ]

      arguments = {
        "-H" = {
          value = "$es_address$"
          description = "Hostname or IP Address of ElasticSearch node"
        }

        "-P" = {
          value = "$es_port$"
          description = "Port number (default: 9200)"
        }

        "-S" = {
          set_if = "$es_ssl$"
          description = "Use https"
        }

        "-u" = {
          value = "$es_user$"
          description = "Username if authentication is required"
        }

        "-p" = {
          value = "$es_password$"
          description = "Password if authentication is required"
        }

        "-d" = {
          value = "$es_available$"
          description = "Defines the available disk or memory size of the ES cluster"
        }

        "-t" = {
          value = "$es_checktype$"
          description = "Define the type of check (disk|mem|status)"
        }

        "-o" = {
          value = "$es_unit$"
          description = "Choose size unit (K|M|G) - defaults to G for GigaByte"
        }

        "-i" = {
          value = "$es_index$"
          description = "Space separated list of indexes to be checked for readonly (default: '_all')"
        }

        "-w" = {
          value = "$es_warn$"
          description = "Warning threshold"
        }

        "-c" = {
          value = "$es_crit$"
          description = "Critical threshold"
        }

        "-m" = {
          value = "$es_max_time$"
          description = "Maximum time in seconds (timeout) for Elasticsearch to respond (default: 30)"
        }

        "-e" = {
          value = "$es_expect_master$"
          description = "Given node name is expected to be the master node of the Elasticsearch cluster."
        }
      }

      vars.es_address = "$address$"
      vars.es_ssl = false
    }

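    Service definition

    Service definition in Nagios, Icinga 1.x, Shinken, Naemon

    In this example, the disk check runs against myescluster.in.the.cloud and assumes 50GB of available disk space. Access to the cluster statistics requires authentication, so the login here uses the user "read" and the password "only".

    # Check ElasticSearch Disk Usage
    define service{
      use generic-service
      host_name myesnode
      service_description ElasticSearch Disk Usage
      check_command check_es_system!myescluster.in.the.cloud!disk!-d 50 -u read -p only
    }

    In the next example, the status of the Elasticsearch cluster on myescluster.in.the.cloud is checked:

    # Check ElasticSearch Status
    define service{
      use generic-service
      host_name myesnode
      service_description ElasticSearch Status
      check_command check_es_system!myescluster.in.the.cloud!status
    }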

    Service object definition in Icinga 2.x

    In this example, the disk check runs against myescluster.in.the.cloud and assumes 50GB of available disk space. Access to the cluster statistics requires authentication, so the login here uses the user "read" and the password "only".

    # Check Elasticsearch Disk Usage
    object Service "ElasticSearch Disk Usage" {
      import "generic-service"
      host_name = "myesnode"
      check_command = "check_es_system"
      vars.es_address = "myescluster.in.the.cloud"
      vars.es_user = "read"
      vars.es_password = "only"
      vars.es_checktype = "disk"
      vars.es_available = "50"
    }
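
    Before deploying such a service object, it can be useful to run the equivalent plugin call by hand. The sketch below assembles the command line from shell variables named after the vars.es_* custom variables. It is a hypothetical helper, not part of Icinga 2 or the plugin: options with an empty value are left out, and -S is only added when es_ssl is true, mirroring the set_if behaviour of the CheckCommand. The argument order Icinga 2 actually uses may differ.

```shell
#!/bin/bash
# Hypothetical helper: append "<flag> <value>" to check_args if value is set
add_opt() {
  if [ -n "$2" ]; then
    check_args+=("$1" "$2")
  fi
}

# Build the plugin arguments from es_* shell variables
build_check_args() {
  check_args=()
  add_opt -H "$es_address"
  add_opt -P "$es_port"
  if [ "$es_ssl" = "true" ]; then
    check_args+=(-S)
  fi
  add_opt -u "$es_user"
  add_opt -p "$es_password"
  add_opt -t "$es_checktype"
  add_opt -d "$es_available"
}

# Values of the disk usage service above
es_address="myescluster.in.the.cloud"
es_user="read"
es_password="only"
es_checktype="disk"
es_available="50"
es_ssl="false"
es_port=""

build_check_args
echo "./check_es_system.sh ${check_args[*]}"
```

    Keeping the arguments in an array means values containing spaces (for example an index list for -i) would still be passed as a single argument when the command is executed with "${check_args[@]}".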

    In this example, the status of Elasticsearch running on myescluster.in.the.cloud is checked. Access to the cluster statistics requires authentication, so the login here uses the user "read" and the password "only".

    # Check Elasticsearch Status
    object Service "ElasticSearch Status" {
      import "generic-service"
      host_name = "myesnode"
      check_command = "check_es_system"
      vars.es_address = "myescluster.in.the.cloud"
      vars.es_user = "read"
      vars.es_password = "only"
      vars.es_checktype = "status"
    }

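    In this example, the thread pool statistics of Elasticsearch running on myescluster.in.the.cloud are checked. If any of the thread pool values (active, queued, rejected) reaches a threshold, the plugin exits with a warning or critical state.

    # Check Elasticsearch Thread Pool Statistics
    object Service "ElasticSearch Thread Pool Statistics" {
      import "generic-service"
      host_name = "myesnode"
      check_command = "check_es_system"
      vars.es_address = "myescluster.in.the.cloud"
      vars.es_user = "read"
      vars.es_password = "only"
      vars.es_checktype = "tps"
      vars.es_warn = "50,170,1000"
      vars.es_crit = "100,200,5000"
    }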

  • Original article: https://www.cnblogs.com/sanduzxcvbnm/p/13140212.html