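Monitoring Spring Boot with Prometheus + Grafana
Original sources for the related content:
CSDN: carson0408: Monitoring a Spring Boot project with Prometheus and Grafana
CSDN: 大老杨: Monitoring Spring Boot with Prometheus
SegmentFault (思否): 泥瓦匠: Visual monitoring for Spring Boot applications
Grafana dashboard template: 10280
1. Monitoring Spring Boot with Prometheus
1.1 Add dependencies to pom.xml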
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
        <version>1.1.3</version>
    </dependency>
</dependencies>
1.2 Edit the application.yml configuration file
server:
  port: 8002 # application (API) port
spring:
  application:
    name: mydemo
  metrics:
    servo:
      enabled: false
management:
  endpoints:
    web:
      exposure:
        include: info, health, beans, env, metrics, mappings, scheduledtasks, sessions, threaddump, docs, logfile, jolokia,prometheus
      base-path: /actuator # default is /actuator; can be omitted if unchanged
      # CORS support
      cors:
        allowed-origins: http://example.com
        allowed-methods: GET,PUT,POST,DELETE
    prometheus:
      id: springmetrics
  endpoint:
    beans:
      cache:
        time-to-live: 10s # how long the endpoint caches its response
    health:
      show-details: always # show health details to all users
  server:
    port: 8001 # management port; if unset, actuator endpoints are served on server.port
    address: 127.0.0.1 # binding to loopback means remote connections are not allowed
  # metrics
  metrics:
    export:
      datadog:
        application-key: ${spring.application.name}
    web:
      server:
        auto-time-requests: false
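Two ports are involved here: the server port and the Prometheus port. The server port (server.port, 8002 above) is the one used to call the application's endpoints. The Prometheus port (management.server.port, 8001 above) must match the port listed for this service in prometheus.yml; if the two differ, Prometheus will report the service as down.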
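1.3 Set up the Application startup class
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.actuate.autoconfigure.metrics.MeterRegistryCustomizer;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class Springboot2PrometheusApplication {
    public static void main(String[] args) {
        SpringApplication.run(Springboot2PrometheusApplication.class, args);
    }

    // Tag every metric with the application name so series from different services can be told apart.
    @Bean
    MeterRegistryCustomizer<MeterRegistry> configurer(
            @Value("${spring.application.name}") String applicationName) {
        return (registry) -> registry.config().commonTags("application", applicationName);
    }
}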
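The Spring Boot project is now fully configured. Start it and open http://localhost:8001/actuator/prometheus (8001 is the management port set in application.yml; without that setting the endpoint is served on the application port). As the figure shows, the page lists a number of metrics.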
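To make the endpoint's output easier to read, here is a minimal sketch of how one sample line of the Prometheus text format breaks down into a metric name, labels, and a value. The class name PrometheusLineParser and the sample line are illustrative assumptions, not output captured from this project, and the parser deliberately ignores escaping and timestamps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PrometheusLineParser {

    /** One parsed sample: metric name, labels, and value. */
    static final class Sample {
        final String name;
        final Map<String, String> labels;
        final double value;

        Sample(String name, Map<String, String> labels, double value) {
            this.name = name;
            this.labels = labels;
            this.value = value;
        }
    }

    /**
     * Parses a single sample line such as: name{k="v",k2="v2"} 3.0
     * Simplification: label values must not contain commas, quotes, or braces,
     * and trailing timestamps are not supported.
     */
    static Sample parse(String line) {
        String trimmed = line.trim();
        Map<String, String> labels = new LinkedHashMap<>();
        String name;
        String rest;
        int brace = trimmed.indexOf('{');
        if (brace >= 0) {
            int close = trimmed.indexOf('}', brace);
            name = trimmed.substring(0, brace);
            String body = trimmed.substring(brace + 1, close);
            for (String pair : body.split(",")) {
                if (pair.isEmpty()) {
                    continue; // empty label set
                }
                int eq = pair.indexOf('=');
                String key = pair.substring(0, eq).trim();
                String quoted = pair.substring(eq + 1).trim();
                labels.put(key, quoted.substring(1, quoted.length() - 1)); // strip quotes
            }
            rest = trimmed.substring(close + 1).trim();
        } else {
            int space = trimmed.indexOf(' ');
            name = trimmed.substring(0, space);
            rest = trimmed.substring(space + 1).trim();
        }
        return new Sample(name, labels, Double.parseDouble(rest));
    }

    public static void main(String[] args) {
        // Illustrative line; real output depends on the registered meters and tags.
        Sample s = parse("requests_add_total{application=\"mydemo\",save=\"carson\",} 3.0");
        System.out.println(s.name + " " + s.labels + " = " + s.value);
    }
}
```

Lines starting with # in the real output are HELP and TYPE comments and would need to be skipped before calling parse.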
1.4 Prometheus configuration
1.4.1 prometheus.yml
# Global configuration
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  # Monitor Prometheus itself
  - job_name: 'Prometheus monitoring'
    static_configs:
      - targets: ['ip:9090']
  # node_exporter passes host metrics to Prometheus. To monitor several servers,
  # install node_exporter on each one and list their IP addresses here.
  - job_name: 'Server monitoring'
    static_configs:
      - targets: ['ip:9100']
  - job_name: "Redis cluster monitoring"
    static_configs:
      - targets: ['ip:9121']
  - job_name: 'Docker monitoring'
    # Statically configured targets
    static_configs:
      # Instance to monitor
      - targets: ['ip:8081']
  - job_name: 'Java project monitoring'
    metrics_path: '/actuator/prometheus'
    file_sd_configs:
      - refresh_interval: 1m
        files:
          - "/home/prometheus/java_springboot.yml"
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - ip:9093
rule_files:
  - "/home/prometheus/rules/node_down.yml" # instance liveness alert rules
  - "/home/prometheus/rules/memory_over.yml" # memory alert rules
  - "/home/prometheus/rules/cpu_over.yml" # CPU alert rules
1.4.2 java_springboot.yml
# java_springboot.yml
- targets:
    - "ip.50:8085"
  labels:
    instance: Service A
- targets:
    - "ip0.50:8086"
  labels:
    instance: Service B
1.4.3 node_down.yml
# node_down.yml
groups:
  - name: Instance liveness alert rules
    rules:
      - alert: Instance liveness alert
        expr: up == 0
        for: 1m
        labels:
          user: prometheus
          severity: warning
        annotations:
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
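The effect of the "for: 1m" clause can be sketched as a small state machine: as far as I understand Prometheus's behavior, an alert whose expression matches first becomes pending and only fires once the expression has stayed true for the whole hold duration. The class AlertForClause below is a hypothetical illustration, not Prometheus code.

```java
import java.time.Duration;

final class AlertForClause {

    enum State { INACTIVE, PENDING, FIRING }

    private final Duration holdFor;
    private Long activeSinceMillis; // null while the expression is false

    AlertForClause(Duration holdFor) {
        this.holdFor = holdFor;
    }

    /** Feeds one rule evaluation: whether expr matched, and when it was evaluated. */
    State evaluate(boolean exprTrue, long nowMillis) {
        if (!exprTrue) {
            activeSinceMillis = null; // any false evaluation resets the timer
            return State.INACTIVE;
        }
        if (activeSinceMillis == null) {
            activeSinceMillis = nowMillis;
        }
        long activeFor = nowMillis - activeSinceMillis;
        return activeFor >= holdFor.toMillis() ? State.FIRING : State.PENDING;
    }

    public static void main(String[] args) {
        AlertForClause rule = new AlertForClause(Duration.ofMinutes(1));
        // Evaluations every 15 s, matching evaluation_interval in prometheus.yml.
        for (long t = 0; t <= 60_000; t += 15_000) {
            System.out.println(t / 1000 + "s -> " + rule.evaluate(true, t));
        }
    }
}
```

With a 15-second evaluation interval, a target that goes down stays pending for four evaluations and fires on the fifth, one minute after "up == 0" was first observed.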
1.4.4 memory_over.yml
# memory_over.yml
groups:
  - name: Memory alert rules
    rules:
      - alert: Memory usage alert
        expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes+node_memory_Buffers_bytes+node_memory_Cached_bytes )) / node_memory_MemTotal_bytes * 100 > 80
        for: 1m
        labels:
          user: prometheus
          severity: warning
        annotations:
          description: "Server: memory usage is above 80%! (current value: {{ $value }}%)"
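The arithmetic in the memory expression is easier to check with concrete numbers. The sketch below mirrors the expression term by term; the class name MemoryUsage and the sample figures are made up for illustration.

```java
final class MemoryUsage {

    /** (MemTotal - (MemFree + Buffers + Cached)) / MemTotal * 100 */
    static double usedPercent(double total, double free, double buffers, double cached) {
        return (total - (free + buffers + cached)) / total * 100.0;
    }

    /** True when the rule's threshold of 80% is exceeded. */
    static boolean exceedsThreshold(double usedPercent) {
        return usedPercent > 80.0;
    }

    public static void main(String[] args) {
        // Hypothetical host: 16 GiB total, 2 GiB free, 0.5 GiB buffers, 1.5 GiB cached.
        double used = usedPercent(16.0, 2.0, 0.5, 1.5);
        System.out.println(used + "% used, alert condition: " + exceedsThreshold(used));
    }
}
```

Buffers and page cache are subtracted because the kernel can reclaim them, so the percentage reflects memory that applications actually hold.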
1.4.5 cpu_over.yml
# cpu_over.yml
groups:
  - name: CPU alert rules
    rules:
      - alert: CPU usage alert
        expr: 100 - (avg by (instance)(irate(node_cpu_seconds_total{mode="idle"}[1m]) )) * 100 > 90
        for: 1m
        labels:
          user: prometheus
          severity: warning
        annotations:
          description: "Server: CPU usage is above 90%! (current value: {{ $value }}%)"
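The CPU expression combines two steps: irate turns each core's idle-seconds counter into a per-second idle fraction using the last two samples, and the average across cores is subtracted from 100. A rough sketch of that calculation follows; the class CpuUsage and the sample values are hypothetical, and counter resets are not handled.

```java
final class CpuUsage {

    /** irate-style rate: slope between the last two samples of a counter. */
    static double instantRate(double prevValue, double prevTimeSec,
                              double lastValue, double lastTimeSec) {
        return (lastValue - prevValue) / (lastTimeSec - prevTimeSec);
    }

    /** 100 - avg(idle fraction per core) * 100 */
    static double usagePercent(double[] idleRatePerCore) {
        double sum = 0.0;
        for (double rate : idleRatePerCore) {
            sum += rate;
        }
        return 100.0 - (sum / idleRatePerCore.length) * 100.0;
    }

    public static void main(String[] args) {
        // Two cores sampled 15 s apart; idle counters grew by 3 s and 4.5 s.
        double core0 = instantRate(100.0, 0.0, 103.0, 15.0);
        double core1 = instantRate(200.0, 0.0, 204.5, 15.0);
        System.out.println(usagePercent(new double[] {core0, core1}) + "% CPU used");
    }
}
```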
2. Writing the REST endpoint
The example exposes one HTTP REST-style endpoint, add. The code is as follows:
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.validation.annotation.Validated;
import org.springframework.web.bind.annotation.*;
import javax.annotation.PostConstruct;

@RestController
@RequestMapping("/api")
public class OperationController {

    @Autowired
    MeterRegistry registry;

    private Counter counter;     // successful calls
    private Counter failCounter; // failed calls

    // Register both counters once the registry has been injected.
    @PostConstruct
    private void init() {
        failCounter = registry.counter("requests_add_fail_total", "save", "carson");
        counter = registry.counter("requests_add_total", "save", "carson");
    }

    @RequestMapping(value = "/add", method = RequestMethod.POST)
    public String add(@Validated String firstName, @Validated String secondName) throws Exception {
        try {
            String name = firstName + secondName;
            counter.increment();
            return name;
        } catch (Exception e) {
            failCounter.increment();
            throw new Exception("exception", e); // keep the original cause
        }
    }
}
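The init method registers the two Counter meters that are exported to Prometheus, and the add endpoint then uses them directly. The two metrics count successful calls and failed calls respectively.
2.1 Simulating calls
Call the endpoint with Postman, as shown in the figure below: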
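As an alternative to Postman, the same call can be scripted. The sketch below uses the JDK's built-in HTTP client (Java 11 or later) and assumes the application runs locally on server port 8002 from application.yml; the class name AddEndpointClient and the sample names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class AddEndpointClient {

    /** Encodes the two request parameters as a form body. */
    static String formBody(String firstName, String secondName) {
        return "firstName=" + URLEncoder.encode(firstName, StandardCharsets.UTF_8)
                + "&secondName=" + URLEncoder.encode(secondName, StandardCharsets.UTF_8);
    }

    /** Builds a POST request for /api/add without sending it. */
    static HttpRequest buildAddRequest(String baseUrl, String firstName, String secondName) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/add"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(firstName, secondName)))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpRequest request = buildAddRequest("http://localhost:8002", "foo", "bar");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } catch (IOException e) {
            System.out.println("Request failed (is the application running?): " + e);
        }
    }
}
```

Each successful call should increase requests_add_total by one, which becomes visible on the next Prometheus scrape.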
2.2 Building the Grafana monitoring panel
After adding a new dashboard in Grafana, you arrive at the page shown below:
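Next, select the data source and write the metrics query, as shown below: sum(requests_add_total). The metric name inside sum() can be found by fuzzy search, provided the service is up in Prometheus. The resulting graph, shown below, displays how the endpoint has been called: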
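What sum() does to the counter can be sketched in a few lines: with no "by" clause it adds up the latest value of every series that carries the metric name, for example one series per service instance. The class SumOverSeries and the series values below are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SumOverSeries {

    /** Adds up the latest value of every series, like sum(metric) without a "by" clause. */
    static double sum(Map<String, Double> latestValueBySeries) {
        double total = 0.0;
        for (double value : latestValueBySeries.values()) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical latest samples, one series per instance label.
        Map<String, Double> series = new LinkedHashMap<>();
        series.put("requests_add_total{instance=\"Service A\"}", 12.0);
        series.put("requests_add_total{instance=\"Service B\"}", 30.0);
        System.out.println(sum(series)); // prints 42.0
    }
}
```

Because a counter only ever grows, the summed graph rises with every call; it shows cumulative totals rather than calls per second.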