Spring Boot 微服务应用集成Prometheus + Grafana 实现监控告警
部分内容原文地址:
Richard_Yi:Spring Boot 微服务应用集成Prometheus + Grafana 实现监控告警
一、添加依赖
Spring Boot 应用和Prometheus 集成,你需要增加micrometer-registry-prometheus依赖。
<!-- Micrometer Prometheus registry -->
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
添加上述依赖项之后,Spring Boot 将会自动配置 PrometheusMeterRegistry 和 CollectorRegistry来以Prometheus 可以抓取的格式(即上文提到的 Metrics 格式)收集和导出指标数据。
所有的相关数据,都会在Actuator 的 /prometheus端点暴露出来。Prometheus 可以抓取该端点以定期获取度量标准数据。
1.1 Actuator 的 /prometheus端点
加micrometer-registry-prometheus依赖后,我们访问http://localhost:8080/actuator/prometheus地址,可以看到一下内容:
# HELP jvm_buffer_total_capacity_bytes An estimate of the total capacity of the buffers in this pool
# TYPE jvm_buffer_total_capacity_bytes gauge
jvm_buffer_total_capacity_bytes{id="direct",} 90112.0
jvm_buffer_total_capacity_bytes{id="mapped",} 0.0
# HELP tomcat_sessions_expired_sessions_total
# TYPE tomcat_sessions_expired_sessions_total counter
tomcat_sessions_expired_sessions_total 0.0
# HELP jvm_classes_unloaded_classes_total The total number of classes unloaded since the Java virtual machine has started execution
# TYPE jvm_classes_unloaded_classes_total counter
jvm_classes_unloaded_classes_total 1.0
# HELP jvm_buffer_count_buffers An estimate of the number of buffers in the pool
# TYPE jvm_buffer_count_buffers gauge
jvm_buffer_count_buffers{id="direct",} 11.0
jvm_buffer_count_buffers{id="mapped",} 0.0
# HELP system_cpu_usage The "recent cpu usage" for the whole system
# TYPE system_cpu_usage gauge
system_cpu_usage 0.0939447637893599
# HELP jvm_gc_max_data_size_bytes Max size of old generation memory pool
# TYPE jvm_gc_max_data_size_bytes gauge
jvm_gc_max_data_size_bytes 2.841116672E9
# 此处省略超多字...
这些都是按照上文提到的 Metrics 格式组织起来的程序监控指标数据。
metric name>{<label name>=<label value>, ...}
二、Prometheus 配置
配置Prometheus 去收集/actuator/prometheus的指标数据。
# my global config
global:
scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# scrape_timeout is set to the global default (10s).
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: 'prometheus'
# metrics_path defaults to '/metrics'
# scheme defaults to 'http'.
static_configs:
- targets: ['localhost:9090']
# demo job
- job_name: 'springboot-actuator-prometheus-test' # job name
metrics_path: '/actuator/prometheus' # 指标获取路径
scrape_interval: 5s # 间隔
basic_auth: # Spring Security basic auth
username: 'actuator'
password: 'actuator'
static_configs:
- targets: ['10.60.45.113:8080'] # 实例的地址,默认的协议是http