System: CentOS 7
JDK: 1.8
Installation packages:
prometheus-2.33.4.linux-amd64.tar.gz
grafana-enterprise-8.4.3.linux-amd64.tar.gz
node_exporter-0.18.1.linux-amd64.tar.gz
[zkm@test ~]$ cd /home/data/tools/
[zkm@test tools]$ ls -l prometheus-2.33.4.linux-amd64.tar.gz
-rw-r--r--. 1 gframe root 75820407 Mar 2 17:55 prometheus-2.33.4.linux-amd64.tar.gz
[zkm@test tools]$ tar -zxvf prometheus-2.33.4.linux-amd64.tar.gz -C /home/data/
[zkm@test tools]$ cd ..
[zkm@test data]$ cd prometheus-2.33.4.linux-amd64/
[zkm@test prometheus-2.33.4.linux-amd64]$ ll
total 196068
drwxr-x---. 2 gframe root 38 Feb 23 00:59 console_libraries
drwxr-x---. 2 gframe root 173 Feb 23 00:59 consoles
-rw-r-----. 1 gframe root 11357 Feb 23 00:59 LICENSE
-rw-r-----. 1 gframe root 3773 Feb 23 00:59 NOTICE
-rwxr-x---. 1 gframe root 104419379 Feb 23 00:54 prometheus
-rw-r-----. 1 gframe root 934 Feb 23 00:59 prometheus.yml
-rwxr-x---. 1 gframe root 96326544 Feb 23 00:57 promtool
[zkm@test prometheus-2.33.4.linux-amd64]$ vim prometheus.yml
[zkm@test prometheus-2.33.4.linux-amd64]$ cat prometheus.yml| grep -v ^# |grep -v ^$
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["192.168.0.3:8006"]
        labels:
          instance: 192.168.0.3
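Before starting the server it can help to validate the edited file with the promtool binary that ships in the same directory. The following is a minimal sketch, not part of the original notes: the function name check_prom_config and the PROM_HOME variable are illustrative, and the install path is the one used above.

```shell
#!/usr/bin/env bash
# Validate a Prometheus config with promtool before (re)starting the server.
# PROM_HOME (assumed name) points at the install directory used in these notes.
PROM_HOME="${PROM_HOME:-/home/data/prometheus-2.33.4.linux-amd64}"

check_prom_config() {
    # First argument: config file to check (defaults to the one in PROM_HOME).
    local cfg="${1:-$PROM_HOME/prometheus.yml}"
    if [ ! -x "$PROM_HOME/promtool" ]; then
        echo "promtool not found in $PROM_HOME" >&2
        return 2
    fi
    # promtool exits non-zero when the file fails validation.
    "$PROM_HOME/promtool" check config "$cfg"
}

# Example:
# check_prom_config && echo "config OK"
```

Running the check before each restart catches YAML indentation mistakes, which are easy to make when editing prometheus.yml by hand.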
[zkm@test prometheus-2.33.4.linux-amd64]$ rm -rf data/
[zkm@test prometheus-2.33.4.linux-amd64]$ ./prometheus --config.file=/home/data/prometheus-2.33.4.linux-amd64/prometheus.yml --web.listen-address=:8006 &
[zkm@test prometheus-2.33.4.linux-amd64]$ ts=2022-03-03T02:38:32.897Z caller=main.go:475 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2022-03-03T02:38:32.897Z caller=main.go:512 level=info msg="Starting Prometheus" version="(version=2.33.4, branch=HEAD, revision=83032011a5d3e6102624fe58241a374a7201fee8)"
ts=2022-03-03T02:38:32.897Z caller=main.go:517 level=info build_context="(go=go1.17.7, user=root@d13bf69e7be8, date=20220222-16:51:28)"
ts=2022-03-03T02:38:32.897Z caller=main.go:518 level=info host_details="(Linux 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 jzgzdaping7 (none))"
ts=2022-03-03T02:38:32.897Z caller=main.go:519 level=info fd_limits="(soft=10240, hard=10240)"
ts=2022-03-03T02:38:32.897Z caller=main.go:520 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2022-03-03T02:38:32.900Z caller=web.go:570 level=info component=web msg="Start listening for connections" address=:8006
ts=2022-03-03T02:38:32.900Z caller=main.go:923 level=info msg="Starting TSDB ..."
ts=2022-03-03T02:38:32.902Z caller=tls_config.go:195 level=info component=web msg="TLS is disabled." http2=false
ts=2022-03-03T02:38:32.907Z caller=head.go:493 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2022-03-03T02:38:32.907Z caller=head.go:527 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=7.962µs
ts=2022-03-03T02:38:32.907Z caller=head.go:533 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2022-03-03T02:38:32.908Z caller=head.go:604 level=info component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
ts=2022-03-03T02:38:32.908Z caller=head.go:610 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=26.551µs wal_replay_duration=639.974µs total_replay_duration=694.441µs
ts=2022-03-03T02:38:32.909Z caller=main.go:944 level=info fs_type=XFS_SUPER_MAGIC
ts=2022-03-03T02:38:32.909Z caller=main.go:947 level=info msg="TSDB started"
ts=2022-03-03T02:38:32.909Z caller=main.go:1128 level=info msg="Loading configuration file" filename=/home/data/prometheus-2.33.4.linux-amd64/prometheus.yml
ts=2022-03-03T02:38:32.910Z caller=main.go:1165 level=info msg="Completed loading of configuration file" filename=/home/data/prometheus-2.33.4.linux-amd64/prometheus.yml totalDuration=845.329µs db_storage=949ns remote_storage=7.371µs web_handler=562ns query_engine=987ns scrape=222.298µs scrape_sd=28.825µs notify=35.644µs notify_sd=10.531µs rules=3.299µs
ts=2022-03-03T02:38:32.910Z caller=main.go:896 level=info msg="Server is ready to receive web requests."
Web UI address:
http://192.168.0.3:8006
View the exposed metrics:
http://192.168.0.3:8006/metrics
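The same check can be done from the command line. Prometheus exposes /-/healthy and /-/ready management endpoints; the sketch below probes both with curl. The helper names prom_endpoint and prom_is_up and the PROM_URL variable are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Probe the Prometheus health and readiness endpoints.
# PROM_URL (assumed name) defaults to the address used in these notes.
PROM_URL="${PROM_URL:-http://192.168.0.3:8006}"

# Build the URL of a management endpoint such as "healthy" or "ready".
prom_endpoint() {
    printf '%s/-/%s\n' "${PROM_URL%/}" "$1"
}

prom_is_up() {
    # -f makes curl return non-zero on HTTP errors; --max-time bounds the wait.
    curl -fsS --max-time 5 "$(prom_endpoint healthy)" >/dev/null &&
    curl -fsS --max-time 5 "$(prom_endpoint ready)" >/dev/null
}

# Example:
# prom_is_up && echo "Prometheus is up" || echo "Prometheus is not ready"
```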
[zkm@test ~]$ cd /home/data/tools/
[zkm@test tools]$ tar -zxvf node_exporter-0.18.1.linux-amd64.tar.gz -C /home/data/
[zkm@test tools]$ cd ..
[zkm@test data]$ cd node_exporter-0.18.1.linux-amd64/
[zkm@test node_exporter-0.18.1.linux-amd64]$ ./node_exporter --web.listen-address=:8007 &
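To confirm that node_exporter is serving data on port 8007, one option is to fetch its /metrics page and count the node_ samples. This is a rough sketch; count_node_metrics is a hypothetical helper name, and it only looks at line prefixes rather than parsing the exposition format properly.

```shell
#!/usr/bin/env bash
# Count sample lines whose metric name starts with "node_".
# Reads Prometheus exposition text on stdin; "# HELP" and "# TYPE" lines are skipped
# because they start with "#".
count_node_metrics() {
    grep -c '^node_'
}

# Example (address taken from these notes):
# curl -fsS --max-time 5 http://192.168.0.3:8007/metrics | count_node_metrics
```

A count of zero, or a curl error, suggests the exporter is not running or is listening on a different port.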
Edit prometheus.yml and add a scrape job for node_exporter under scrape_configs:
[zkm@test prometheus-2.33.4.linux-amd64]$ vim prometheus.yml
  - job_name: "ip-3"
    static_configs:
      - targets: ["192.168.0.3:8007"]
        labels:
          instance: 192.168.0.3
[zkm@test prometheus-2.33.4.linux-amd64]$ cat prometheus.yml | grep -v ^# |grep -v ^$
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    static_configs:
      - targets: ["192.168.0.3:8006"]
        labels:
          instance: 192.168.0.3
  - job_name: "ip-3"
    static_configs:
      - targets: ["192.168.0.3:8007"]
        labels:
          instance: 192.168.0.3
[zkm@test prometheus-2.33.4.linux-amd64]$ ps -ef|grep prometheus
gframe 7804 1 0 10:38 ? 00:00:01 ./prometheus --config.file=/home/data/prometheus-2.33.4.linux-amd64/prometheus.yml --web.listen-address=:8006
gframe 8510 8407 0 11:07 pts/3 00:00:00 grep --color=auto prometheus
[zkm@test prometheus-2.33.4.linux-amd64]$ kill -9 7804
[zkm@test prometheus-2.33.4.linux-amd64]$ ps -ef|grep prometheus
gframe 8523 8407 0 11:07 pts/3 00:00:00 grep --color=auto prometheus
[zkm@test prometheus-2.33.4.linux-amd64]$ rm -rf data/
[zkm@test prometheus-2.33.4.linux-amd64]$ ./prometheus --config.file=/home/data/prometheus-2.33.4.linux-amd64/prometheus.yml --web.listen-address=:8006 &
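The restart above uses kill -9 and removes data/, which discards the collected time series. Prometheus also reloads its configuration on SIGHUP and shuts down cleanly on SIGTERM, so a gentler alternative is possible. The sketch below assumes Prometheus was started with the command line shown in these notes; the function names prom_pids, prom_reload, and prom_stop are illustrative.

```shell
#!/usr/bin/env bash
# Find Prometheus PID(s) by command line. The [p] bracket keeps the pattern from
# matching a shell whose own command line happens to contain this text.
prom_pids() {
    pgrep -f '[p]rometheus --config\.file='
}

# Ask Prometheus to re-read prometheus.yml in place (SIGHUP triggers a config reload).
prom_reload() {
    local pids
    pids="$(prom_pids)" || { echo "prometheus is not running" >&2; return 1; }
    kill -HUP $pids
}

# Graceful stop: SIGTERM lets Prometheus shut down its storage cleanly before exiting.
prom_stop() {
    local pids
    pids="$(prom_pids)" || return 0   # nothing to stop
    kill -TERM $pids
}

# Example:
# prom_reload    # after editing prometheus.yml, no restart and no rm -rf data/
```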
Grafana setup
Reference:
https://www.jianshu.com/p/4646d60975c2
Official dashboard templates:
https://grafana.com/grafana/dashboards/
[zkm@test data]$ ls -l grafana-enterprise-8.4.3.linux-amd64.tar.gz
-rw-r--r--. 1 gframe root 83995470 Mar 3 09:21 grafana-enterprise-8.4.3.linux-amd64.tar.gz
[zkm@test data]$ tar -zxvf grafana-enterprise-8.4.3.linux-amd64.tar.gz
[zkm@test data]$ cd grafana-8.4.3/
[zkm@test grafana-8.4.3]$ ll
total 28
drwxr-x---. 2 gframe root 96 Mar 2 21:31 bin
drwxr-x---. 3 gframe root 107 Mar 2 21:31 conf
-rw-r-----. 1 gframe root 12155 Mar 2 21:20 LICENSE
-rw-r-----. 1 gframe root 105 Mar 2 21:20 NOTICE.md
drwxr-x---. 3 gframe root 22 Mar 2 21:31 plugins-bundled
drwxr-x---. 17 gframe root 270 Mar 2 21:31 public
-rw-r-----. 1 gframe root 2008 Mar 2 21:20 README.md
drwxr-x---. 2 gframe root 4096 Mar 2 21:31 scripts
-rw-r-----. 1 gframe root 5 Mar 2 21:31 VERSION
[zkm@test grafana-8.4.3]$
Change the default port to 8009:
[zkm@test conf]$ pwd
/home/data/grafana-8.4.3/conf
[zkm@test conf]$ vim defaults.ini
[zkm@test conf]$ cat defaults.ini |grep 8009
http_port = 8009
[zkm@test conf]$ cat defaults.ini |grep 192.168.0.3
domain = 192.168.0.3
[zkm@test conf]$
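Grafana's documentation recommends leaving defaults.ini untouched and placing overrides in conf/custom.ini, which Grafana reads from its home directory when the file exists. The sketch below writes such a file with the same two settings changed above. The function name write_grafana_custom_ini is hypothetical, and the default values are the ones from these notes.

```shell
#!/usr/bin/env bash
# Write a conf/custom.ini that overrides only the port and domain.
# Usage: write_grafana_custom_ini <conf_dir> [port] [domain]
write_grafana_custom_ini() {
    local conf_dir="$1" port="${2:-8009}" domain="${3:-192.168.0.3}"
    cat > "$conf_dir/custom.ini" <<EOF
[server]
http_port = $port
domain = $domain
EOF
}

# Example:
# write_grafana_custom_ini /home/data/grafana-8.4.3/conf
```

Keeping the overrides in a separate file means an upgrade that replaces defaults.ini does not silently revert the port.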
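Log in to Grafana
Grafana's default port is 3000 (changed to 8009 above); the initial username and password are both admin.
http://192.168.0.3:8009
Follow the prompts to change the password after the first login.
Browse https://grafana.com/dashboards for existing dashboard templates and pick the ones that fit.
Password after the change:
Username: admin
Password: admin798
Alertmanager download page:
https://prometheus.io/download/#alertmanager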
Common web UI operations:
Open the Prometheus UI on port 8006.
Status -> Targets: view the state of the monitored nodes
Status -> Rules: view the alerting rules
Status -> Configuration: view the loaded contents of prometheus.yml
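The target state shown under Status -> Targets can also be read through the HTTP query API by evaluating the built-in up metric. The following is a minimal sketch; prom_query_url, prom_query, and PROM_URL are names chosen for illustration, and the output is raw JSON that is not parsed here.

```shell
#!/usr/bin/env bash
# Run an instant PromQL query against the Prometheus HTTP API.
# PROM_URL (assumed name) defaults to the address used in these notes.
PROM_URL="${PROM_URL:-http://192.168.0.3:8006}"

prom_query_url() {
    printf '%s/api/v1/query\n' "${PROM_URL%/}"
}

prom_query() {
    # -G sends the data as a GET query string; --data-urlencode escapes the PromQL.
    curl -fsS --max-time 5 -G "$(prom_query_url)" --data-urlencode "query=$1"
}

# Example: in the result, 1 means the target was scraped successfully, 0 means it failed.
# prom_query 'up'
```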
Server IP: 192.168.0.3
Start Prometheus:
[zkm@test ~]$ cd /home/data/prometheus-2.33.4.linux-amd64/
[zkm@test prometheus-2.33.4.linux-amd64]$ pwd
/home/data/prometheus-2.33.4.linux-amd64
[zkm@test prometheus-2.33.4.linux-amd64]$ ./prometheus --config.file=/home/data/prometheus-2.33.4.linux-amd64/prometheus.yml --web.listen-address=:8006 &
Start Grafana:
[zkm@test ~]$ cd /home/data/grafana-8.4.3/
[zkm@test grafana-8.4.3]$ ./bin/grafana-server &
Start node_exporter:
[zkm@test data]$ cd node_exporter-0.18.1.linux-amd64/
[zkm@test node_exporter-0.18.1.linux-amd64]$ ./node_exporter --web.listen-address=:8007 &
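A process started with a bare & writes its logs to the terminal and may be terminated when the login session ends. One common workaround is to start each service with nohup and send its output to a log file. The helper below is a sketch under that assumption; start_bg is a hypothetical name and the log file path in the example is made up.

```shell
#!/usr/bin/env bash
# Start a command in the background from a given directory, detached from the
# terminal, appending stdout and stderr to a log file. Prints the new PID.
# Usage: start_bg <workdir> <logfile> <command> [args...]
start_bg() {
    local dir="$1" log="$2"
    shift 2
    (
        cd "$dir" || exit 1
        nohup "$@" >>"$log" 2>&1 &
        echo "$!"
    )
}

# Example (the log path is an assumption, not from the original notes):
# start_bg /home/data/prometheus-2.33.4.linux-amd64 /home/data/prometheus.log \
#     ./prometheus --config.file=/home/data/prometheus-2.33.4.linux-amd64/prometheus.yml \
#     --web.listen-address=:8006
```

The subshell keeps the cd from changing the caller's working directory, so the helper can be called several times in a row for Prometheus, Grafana, and node_exporter.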
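Hot-reloading Prometheus and Alertmanager
Add the --web.enable-lifecycle flag to the Prometheus startup command.
Then trigger a reload (9090 is the Prometheus default port; with the setup above, use 8006 instead):
curl -XPOST http://localhost:9090/-/reload
# Hot-reload Alertmanager
curl -XPOST http://localhost:9093/-/reload
From a Windows command prompt (DOS), run:
curl -XPOST http://ip:9090/-/reload # ip is the address of the host running Prometheus
Alternatively, send a POST request to http://ip:9090/-/reload with Postman or Apifox.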
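After a reload it is worth confirming that the new configuration was accepted. Prometheus exports the gauge prometheus_config_last_reload_successful on its own /metrics page, with 1 meaning the last reload succeeded. The sketch below combines the reload request with that check; the function names and the PROM_URL variable are illustrative, and the port assumes the 8006 listener used in these notes with --web.enable-lifecycle enabled.

```shell
#!/usr/bin/env bash
# PROM_URL (assumed name): Prometheus base URL for this setup.
PROM_URL="${PROM_URL:-http://localhost:8006}"

# Read /metrics text on stdin; succeed only if the last config reload was successful.
reload_ok() {
    grep -qx 'prometheus_config_last_reload_successful 1'
}

# Trigger a reload over HTTP, then verify the result through the metrics page.
prom_reload_http() {
    curl -fsS --max-time 10 -X POST "${PROM_URL%/}/-/reload" || return 1
    curl -fsS --max-time 5 "${PROM_URL%/}/metrics" | reload_ok
}

# Example:
# prom_reload_http && echo "reload OK" || echo "reload failed, check prometheus.yml"
```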