• Kubernetes (Part 19): Resource Metrics and Cluster Monitoring

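    Resource Metrics and Resource Monitoring

    Managing a cluster is impossible without monitoring, and Kubernetes is no exception: metrics have to be collected so that the state of the cluster can be observed. These metrics fall into two groups, monitoring the cluster itself and monitoring Pod objects. Cluster-level metrics usually cover the following:

    • Node resource status: mainly network bandwidth, disk space, and CPU and memory utilization.
    • Number of nodes: knowing the number of available nodes in real time gives users a reference for estimating server costs.
    • Running Pod objects: the number of running Pods helps assess whether there are enough available nodes and whether the load can be rebalanced when a node fails.

    Monitoring of Pod objects, in turn, falls roughly into three categories:

    • Kubernetes metrics: the deployment progress, replica count, status, health, network, and so on of the Pods that belong to a particular application.
    • Container metrics: each container's resource requests and limits, and its actual use of CPU, memory, disk space, and network bandwidth.
    • Application metrics: metrics built into the application itself and tied to its business rules.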

    metrics-server

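    In the new-generation Kubernetes monitoring architecture, metrics flow through two pipelines, the core metrics pipeline and the monitoring pipeline:

    • Core metrics pipeline: made up of the kubelet, metrics-server, and the API exposed by the API server. It supplies the core metrics that Kubernetes needs to understand and operate the cluster's internal components and programs, including cumulative CPU usage, real-time memory usage, Pod resource usage, and container disk usage. Core metrics used to be collected by Heapster, which was deprecated as of version 1.11; the newer metrics-server now aggregates them. Collecting core metrics is mandatory. (The original post shows a diagram here.)

    • Monitoring pipeline: collects all kinds of metrics from the system and provides them to end users, storage systems, and the HPA. It carries the core metrics plus many non-core metrics. Because Kubernetes itself cannot interpret non-core metrics, users have to choose a third-party solution for them. (The original post shows a diagram here.)

    HPA v2 is a component that can consume both the resource metrics API and the custom metrics API; it scales workloads out and in automatically based on the metrics it observes. The mainstream implementation of the resource metrics API today is metrics-server.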
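
As a rough illustration of how an observed metric turns into a scaling decision, the sketch below applies the replica formula documented for the Horizontal Pod Autoscaler, desired = ceil(current * currentMetric / targetMetric). The function name, the sample numbers, and the 10% tolerance band (the documented default) are illustrative assumptions and are not taken from this deployment.

```python
import math


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by the documented HPA formula (hypothetical helper)."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    ratio = current_value / target_value
    # Inside the tolerance band around 1.0 the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Assumed example: 4 replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))
# Assumed example: 63% against a 60% target stays inside the tolerance band.
print(desired_replicas(4, 63, 60))
```

The real controller also accounts for Pods that are not ready, missing metrics, and stabilization windows, none of which are modeled here.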

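    Since version 1.8, the CPU and memory usage of containers can be retrieved directly through the metrics API. Note that the API itself stores no metric data; it only serves near-real-time usage readings.

    The resource metrics API is no different from other APIs: it is accessed through the API server at the URL path /apis/metrics.k8s.io/, and it is only usable once metrics-server has been deployed in the cluster. (The original post shows a simple structure diagram here.)

    metrics-server keeps its data in memory only, so everything is lost on restart, and it retains only the most recently collected metrics. Users who need historical data therefore have to rely on a third-party monitoring system such as Prometheus.

    Normally a cluster runs a single metrics-server instance. On startup it automatically initializes connections to every node, so for security reasons it should run on a regular node rather than on the master. The resource manifests provided by the project itself are enough to deploy metrics-server.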

    Deploying metrics-server

    https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/metrics-server
    Download the YAML files:

    [root@master metrics-server]# for n in auth-delegator.yaml auth-reader.yaml metrics-apiservice.yaml metrics-server-deployment.yaml metrics-server-service.yaml resource-reader.yaml;do wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/metrics-server/$n;done
    
    [root@master metrics-server]# ll
    total 24
    -rw-r--r-- 1 root root  398 Apr 10 10:31 auth-delegator.yaml
    -rw-r--r-- 1 root root  419 Apr 10 10:31 auth-reader.yaml
    -rw-r--r-- 1 root root  393 Apr 10 10:32 metrics-apiservice.yaml
    -rw-r--r-- 1 root root 3156 Apr 10 10:32 metrics-server-deployment.yaml
    -rw-r--r-- 1 root root  336 Apr 10 10:32 metrics-server-service.yaml
    -rw-r--r-- 1 root root  801 Apr 10 10:32 resource-reader.yaml
    

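    Adjusting the manifests

    # Because of image availability and a few settings, part of the following file has to be changed:
    # the image and the command field of the metrics-server container, and the cpu and memory values of the metrics-server-nanny container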
    [root@master metrics-server]# vim metrics-server-deployment.yaml 
    ......
        spec:
          priorityClassName: system-cluster-critical
          serviceAccountName: metrics-server
          containers:
          - name: metrics-server
            #image: k8s.gcr.io/metrics-server-amd64:v0.3.1
            image: xiaobai20201/metrics-server:v0.3.1
            command:
            - /metrics-server
            - --metric-resolution=30s
            - --kubelet-insecure-tls
            - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
            ports:
            - containerPort: 443
              name: https
              protocol: TCP
          - name: metrics-server-nanny
            #image: k8s.gcr.io/addon-resizer:1.8.4
            image: xiaobai20201/addon-resizer:1.8.4
            resources:
              limits:
                cpu: 100m
                memory: 300Mi
              requests:
                cpu: 5m
                memory: 50Mi
            env:
              - name: MY_POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              - name: MY_POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
            volumeMounts:
            - name: metrics-server-config-volume
              mountPath: /etc/config
            command:
              - /pod_nanny
              - --config-dir=/etc/config
              - --cpu=100m
              - --extra-cpu=0.5m
              - --memory=100Mi
              - --extra-memory=50Mi
              - --threshold=5
              - --deployment=metrics-server-v0.3.1
              - --container=metrics-server
              - --poll-period=300000
              - --estimator=exponential
              # Specifies the smallest cluster (defined in number of nodes)
              # resources will be scaled to.
              - --minClusterSize=10
    		  
    [root@master metrics-server]# vim resource-reader.yaml 
    # The container also needs permission to read the data, so add nodes/stats to resource-reader.yaml
    
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:metrics-server
      labels:
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    rules:
    - apiGroups:
      - ""
      resources:
      - pods
      - nodes
      - nodes/stats
      - namespaces
      verbs:
      - get
      - list
      - watch
      
    

    Deploy

    [root@master metrics-server]# kubectl apply -f .
    clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
    rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
    apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
    serviceaccount/metrics-server created
    configmap/metrics-server-config created
    deployment.apps/metrics-server-v0.3.1 created
    service/metrics-server created
    clusterrole.rbac.authorization.k8s.io/system:metrics-server created
    clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
    
    [root@master metrics-server]# kubectl api-versions |grep metrics
    metrics.k8s.io/v1beta1
    
    # Check that the resource metrics API is available
    [root@master metrics-server]# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
    {"kind":"NodeMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/nodes"},"items":[]}
    
    
    # After a successful deployment, kubectl proxy --port=8080 can expose the API on a local port
    [root@master metrics-server]# kubectl proxy --port=8080
    Starting to serve on 127.0.0.1:8080
    # curl can then read node status and other data from the API
    [root@master mainfest]# curl http://localhost:8080/apis/metrics.k8s.io/v1beta1
    {
      "kind": "APIResourceList",
      "apiVersion": "v1",
      "groupVersion": "metrics.k8s.io/v1beta1",
      "resources": [
        {
          "name": "nodes",
          "singularName": "",
          "namespaced": false,
          "kind": "NodeMetrics",
          "verbs": [
            "get",
            "list"
          ]
        },
        {
          "name": "pods",
          "singularName": "",
          "namespaced": true,
          "kind": "PodMetrics",
          "verbs": [
            "get",
            "list"
          ]
        }
      ]
    }
    
    # This API group mainly serves data for nodes and pods
    
    [root@master mainfest]# curl http://localhost:8080/apis/metrics.k8s.io/v1beta1/nodes
    {
      "kind": "NodeMetricsList",
      "apiVersion": "metrics.k8s.io/v1beta1",
      "metadata": {
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
      },
      "items": [
        {
          "metadata": {
            "name": "node02",
            "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node02",
            "creationTimestamp": "2019-04-10T02:57:21Z"
          },
          "timestamp": "2019-04-10T02:57:14Z",
          "window": "30s",
          "usage": {
            "cpu": "41332743n",
            "memory": "702124Ki"
          }
        },
        {
          "metadata": {
            "name": "master",
            "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/master",
            "creationTimestamp": "2019-04-10T02:57:21Z"
          },
          "timestamp": "2019-04-10T02:57:15Z",
          "window": "30s",
          "usage": {
            "cpu": "156316878n",
            "memory": "1209616Ki"
          }
        },
        {
          "metadata": {
            "name": "node01",
            "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node01",
            "creationTimestamp": "2019-04-10T02:57:21Z"
          },
          "timestamp": "2019-04-10T02:57:09Z",
          "window": "30s",
          "usage": {
            "cpu": "47843790n",
            "memory": "800144Ki"
          }
        }
      ]
    }
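
The cpu and memory strings in this response are Kubernetes quantities: the n suffix means nanocores and Ki means kibibytes. Below is a minimal sketch of converting them into the millicore and Mi units that kubectl top prints. The helper names are made up for illustration, only a handful of suffixes are handled, and kubectl's own rounding may differ slightly.

```python
import json


def cpu_to_millicores(quantity: str) -> float:
    """Convert a CPU quantity such as '41332743n' or '146m' to millicores."""
    divisors = {"n": 1_000_000, "u": 1_000, "m": 1}
    suffix = quantity[-1]
    if suffix in divisors:
        return int(quantity[:-1]) / divisors[suffix]
    return float(quantity) * 1000.0  # a bare number means whole cores


def memory_to_mib(quantity: str) -> float:
    """Convert a memory quantity such as '702124Ki' to MiB."""
    factors = {"Ki": 1 / 1024, "Mi": 1.0, "Gi": 1024.0}
    suffix = quantity[-2:]
    if suffix not in factors:
        raise ValueError(f"unsupported memory suffix in {quantity!r}")
    return int(quantity[:-2]) * factors[suffix]


# Trimmed copy of the node02 entry from the response above.
sample = json.loads("""
{"items": [{"metadata": {"name": "node02"},
            "usage": {"cpu": "41332743n", "memory": "702124Ki"}}]}
""")

for item in sample["items"]:
    usage = item["usage"]
    print(item["metadata"]["name"],
          f"{cpu_to_millicores(usage['cpu']):.0f}m",
          f"{memory_to_mib(usage['memory']):.0f}Mi")
```

Because the API serves a fresh sample on every call, the numbers from a later kubectl top run will not match an earlier raw response exactly.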
    

    Next, use the kubectl top command to view resource usage:

    [root@master metrics-server]# kubectl top nodes
    NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
    master   146m         7%     1187Mi          68%       
    node01   45m          4%     782Mi           45%       
    node02   36m          3%     683Mi           39%   
    
    
    
    [root@master mainfest]# kubectl top pods -n kube-system
    NAME                                     CPU(cores)   MEMORY(bytes)   
    canal-nbspn                              21m          52Mi            
    canal-pj6rx                              13m          43Mi            
    canal-rgsnp                              12m          43Mi            
    coredns-78d4cf999f-6cb69                 2m           10Mi            
    coredns-78d4cf999f-tflpn                 2m           10Mi            
    etcd-master                              16m          121Mi           
    kube-apiserver-master                    31m          517Mi           
    kube-controller-manager-master           39m          82Mi            
    kube-flannel-ds-amd64-5zrk7              2m           14Mi            
    kube-flannel-ds-amd64-pql5n              2m           12Mi            
    kube-flannel-ds-amd64-ssd29              2m           14Mi            
    kube-proxy-ch4vp                         2m           15Mi            
    kube-proxy-cz2rf                         2m           23Mi            
    kube-proxy-kdp7d                         4m           21Mi            
    kube-scheduler-master                    10m          21Mi            
    kubernetes-dashboard-6f9998798-klf4t     1m           15Mi            
    metrics-server-v0.3.1-65bd5d59b9-xvmns   1m           20Mi 
    
    
    
    [root@master metrics-server]# kubectl top pod -l k8s-app=kube-dns --containers=true -n kube-system
    POD                        NAME      CPU(cores)   MEMORY(bytes)   
    coredns-78d4cf999f-6cb69   coredns   2m           10Mi            
    coredns-78d4cf999f-tflpn   coredns   2m           10Mi  
    

    Prometheus

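    Overview

    Beyond the resource metrics above (such as CPU and memory), users and administrators need many more metrics: Kubernetes metrics, container metrics, node resource metrics, application metrics, and so on. The custom metrics API allows arbitrary metrics to be requested, but an implementation of it has to be backed by a monitoring system. Prometheus was the first monitoring system for which such an adapter was developed. This Kubernetes Custom Metrics Adapter for Prometheus is provided by the k8s-prometheus-adapter project on GitHub. (The original post shows a schematic diagram here.)

    Prometheus is itself a monitoring system with a server side and an agent side. The server pulls data from the monitored hosts, and each host runs an agent, node_exporter, which collects the node's data and exposes it. Collecting Pod-level data, or data from applications such as MySQL, likewise requires deploying the corresponding exporter. The data can be queried with PromQL. However, Prometheus is a third-party solution and native Kubernetes cannot interpret its custom metrics, so k8s-prometheus-adapter is needed to translate the query interface for these metrics into the standard Kubernetes custom metrics API.

    Prometheus is an open-source service monitoring system and time-series database that provides a general data model and fast interfaces for collecting, storing, and querying data. Its core component, the Prometheus server, periodically pulls data from statically configured targets or from targets configured automatically through service discovery. When newly pulled data exceeds the configured in-memory buffer, it is persisted to the storage device. (The original post shows the Prometheus component architecture diagram here.)

    Every monitored host exposes its monitoring data through a dedicated exporter and waits for the Prometheus server to scrape it periodically. If alerting rules exist, they are evaluated after the data is scraped; when an alert condition is met, an alert is generated and sent to Alertmanager, which aggregates and dispatches alerts. When a monitored target needs to push data actively, the Pushgateway component can receive the data and hold it temporarily until the Prometheus server scrapes it.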

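    Any target has to be registered with the monitoring system before its time series can be collected, stored, alerted on, and displayed. Targets can be specified statically in the configuration, or managed dynamically by Prometheus through service discovery. The main data sources are:

    • Monitoring agents such as node_exporter: collect host metrics across many dimensions, such as load average, CPU, memory, disk, and network.
    • kubelet (cAdvisor): collects container metrics, which are also the Kubernetes core metrics. Per-container metrics include CPU usage and limits, filesystem read/write limits, memory usage and limits, and network packet send, receive, and drop rates.
    • API Server: exposes API server performance metrics, including control queue performance, request rate, and latency.
    • etcd: exposes metrics for the etcd storage cluster.
    • kube-state-metrics: derives many Kubernetes-related metrics, mainly counters and metadata for resource types, including the total number of objects of a given type, resource quotas, container states, and Pod label series.

    Prometheus can use the Kubernetes API server directly as a service discovery system and so dynamically discover and monitor every monitorable object in the cluster. Note in particular that, under the commonly used scrape configuration, a Pod has to carry the following annotations before Prometheus will discover it automatically and scrape its built-in metrics.

    • prometheus.io/scrape: whether metrics should be scraped; a boolean, true or false.
    • prometheus.io/path: the URL path used when scraping, usually /metrics.
    • prometheus.io/port: the socket port used when scraping, for example 8080.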
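
The effect of these three annotations can be modeled in a few lines. Prometheus itself implements this through relabeling rules in its scrape configuration, not through code like this, so the sketch below is only a model of the decision. The function name, the http scheme, the default port, and the Pod IP are assumptions made for illustration.

```python
from typing import Mapping, Optional


def scrape_target(pod_ip: str, annotations: Mapping[str, str],
                  default_port: int = 80) -> Optional[str]:
    """Return the URL that would be scraped, or None if the Pod has not opted in."""
    if annotations.get("prometheus.io/scrape", "").strip().lower() != "true":
        return None
    path = annotations.get("prometheus.io/path", "/metrics")
    if not path.startswith("/"):
        path = "/" + path
    port = int(annotations.get("prometheus.io/port", default_port))
    return f"http://{pod_ip}:{port}{path}"


# Hypothetical Pod that opts in and serves metrics on port 8080.
print(scrape_target("10.244.1.7", {
    "prometheus.io/scrape": "true",
    "prometheus.io/port": "8080",
}))
# A Pod without the scrape annotation is skipped.
print(scrape_target("10.244.1.8", {}))
```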

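    If Prometheus is only needed as the backend for custom metrics, deploying the Prometheus server alone is enough, and it does not even need persistent storage. For a full-featured monitoring system, however, the administrator also needs to deploy node_exporter on every host, other exporters as required, and Alertmanager.

    Deploying Prometheus

    Official manifests: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/prometheus
    The official YAML deployment requires PVCs, so this walkthrough uses the learning-oriented manifests provided by 马哥 (the iKubernetes/k8s-prom repository). For production, follow the official recommendations.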

    [root@master metrics]# git clone https://github.com/iKubernetes/k8s-prom.git 
    Cloning into 'k8s-prom'...
    remote: Enumerating objects: 49, done.
    remote: Total 49 (delta 0), reused 0 (delta 0), pack-reused 49
    Unpacking objects: 100% (49/49), done.
    

    Create the prom namespace

    [root@master metrics]# cd k8s-prom/
    [root@master k8s-prom]# ls
    k8s-prometheus-adapter  kube-state-metrics  namespace.yaml  node_exporter  podinfo  prometheus  README.md
    [root@master k8s-prom]# kubectl apply -f namespace.yaml
    namespace/prom created
    

    Deploy node_exporter

    [root@master k8s-prom]# cd node_exporter/
    [root@master node_exporter]# kubectl apply -f .
    daemonset.apps/prometheus-node-exporter created
    service/prometheus-node-exporter created
    
    [root@master node_exporter]# kubectl get ds -n prom
    NAME                       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    prometheus-node-exporter   3         3         3       3            3           <none>          100s
    [root@master node_exporter]# kubectl get pods -n prom  
    NAME                             READY   STATUS    RESTARTS   AGE
    prometheus-node-exporter-b2lk5   1/1     Running   0          104s
    prometheus-node-exporter-d4l6v   1/1     Running   0          104s
    prometheus-node-exporter-swngp   1/1     Running   0          104s
    

    Deploy prometheus-server

    [root@master node_exporter]# cd ../prometheus/
    [root@master prometheus]# ll
    total 24
    -rw-r--r-- 1 root root 10132 Apr 10 11:20 prometheus-cfg.yaml
    -rw-r--r-- 1 root root  1481 Apr 10 11:20 prometheus-deploy.yaml
    -rw-r--r-- 1 root root   716 Apr 10 11:20 prometheus-rbac.yaml
    -rw-r--r-- 1 root root   278 Apr 10 11:20 prometheus-svc.yaml
    [root@master prometheus]# kubectl apply -f .
    configmap/prometheus-config created
    deployment.apps/prometheus-server created
    clusterrole.rbac.authorization.k8s.io/prometheus created
    serviceaccount/prometheus created
    clusterrolebinding.rbac.authorization.k8s.io/prometheus created
    service/prometheus created
    # The Prometheus manifest sets a 2Gi memory limit. None of the node VMs here can satisfy it, so the Pod stays Pending; comment the limit out:
    [root@master prometheus]# vim prometheus-deploy.yaml
            #resources:
             # limits:
              #  memory: 2Gi
    [root@master prometheus]# kubectl apply -f prometheus-deploy.yaml
    deployment.apps/prometheus-server configured
    
    [root@master prometheus]# kubectl get pods -n prom -w
    NAME                                 READY   STATUS    RESTARTS   AGE
    prometheus-node-exporter-b2lk5       1/1     Running   0          9m30s
    prometheus-node-exporter-d4l6v       1/1     Running   0          9m30s
    prometheus-node-exporter-swngp       1/1     Running   0          9m30s
    prometheus-server-556b8896d6-ld7xj   1/1     Running   0          35s
    

    Check the logs after deployment

    [root@master prometheus]# kubectl logs prometheus-server-556b8896d6-ld7xj -n prom
    level=info ts=2019-04-10T03:33:57.752158604Z caller=main.go:220 msg="Starting Prometheus" version="(version=2.2.1, branch=HEAD, revision=bc6058c81272a8d938c05e75607371284236aadc)"
    level=info ts=2019-04-10T03:33:57.752221598Z caller=main.go:221 build_context="(go=go1.10, user=root@149e5b3f0829, date=20180314-14:15:45)"
    level=info ts=2019-04-10T03:33:57.752240032Z caller=main.go:222 host_details="(Linux 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 prometheus-server-556b8896d6-ld7xj (none))"
    level=info ts=2019-04-10T03:33:57.752255713Z caller=main.go:223 fd_limits="(soft=65536, hard=65536)"
    level=info ts=2019-04-10T03:33:57.755420653Z caller=main.go:504 msg="Starting TSDB ..."
    level=info ts=2019-04-10T03:33:57.7620657Z caller=web.go:382 component=web msg="Start listening for connections" address=0.0.0.0:9090
    level=info ts=2019-04-10T03:33:57.7632425Z caller=main.go:514 msg="TSDB started"
    level=info ts=2019-04-10T03:33:57.764611774Z caller=main.go:588 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
    level=info ts=2019-04-10T03:33:57.765669001Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
    level=info ts=2019-04-10T03:33:57.76626263Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
    level=info ts=2019-04-10T03:33:57.76668914Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
    level=info ts=2019-04-10T03:33:57.767331363Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
    level=info ts=2019-04-10T03:33:57.768433541Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
    level=info ts=2019-04-10T03:33:57.768948262Z caller=main.go:491 msg="Server is ready to receive web requests."
    

    Prometheus can now be reached at NodeIP:30090 to view the monitoring data; a number of metrics are already built in.
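
Besides the web UI, the same server answers PromQL queries over its HTTP API at /api/v1/query. The sketch below only builds the request URL and decodes an instant-vector response. The node address 10.0.0.10 is borrowed from the Grafana URL later in this post, and the sample response body is hand-written for illustration, not captured from this cluster.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def vector_values(body: str) -> dict:
    """Map each series' instance label to its float value for a vector result."""
    doc = json.loads(body)
    if doc.get("status") != "success" or doc["data"]["resultType"] != "vector":
        raise ValueError("expected a successful instant-vector response")
    return {
        series["metric"].get("instance", ""): float(series["value"][1])
        for series in doc["data"]["result"]
    }


print(instant_query_url("http://10.0.0.10:30090", "up"))

# Hand-written response in the documented shape (assumed values).
sample_body = json.dumps({
    "status": "success",
    "data": {"resultType": "vector", "result": [
        {"metric": {"__name__": "up", "instance": "node01:9100"},
         "value": [1554867237.0, "1"]},
    ]},
})
print(vector_values(sample_body))
```

Sample values arrive as strings in the JSON body, which is why the helper converts them with float().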


    Deploy kube-state-metrics

    [root@master prometheus]# cd ../kube-state-metrics/
    # Change the image address in kube-state-metrics-deploy.yaml
            image: xiaobai20201/kube-state-metrics-amd64:v1.3.1
    
    [root@master kube-state-metrics]# kubectl apply -f .
    deployment.apps/kube-state-metrics created
    serviceaccount/kube-state-metrics created
    clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
    clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
    service/kube-state-metrics created
    
    [root@master kube-state-metrics]# kubectl get pods -n prom          
    NAME                                 READY   STATUS    RESTARTS   AGE
    kube-state-metrics-84c69bb8-87l7n    1/1     Running   0          19s
    prometheus-node-exporter-b2lk5       1/1     Running   0          21m
    prometheus-node-exporter-d4l6v       1/1     Running   0          21m
    prometheus-node-exporter-swngp       1/1     Running   0          21m
    prometheus-server-556b8896d6-ld7xj   1/1     Running   0          12m
    

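    Create the certificate
    By default a Kubernetes cluster serves everything over HTTPS, whereas k8s-prometheus-adapter serves plain HTTP by default. It therefore needs a serving certificate signed by the cluster's CA, which has to be created by hand: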

    [root@master kube-state-metrics]# cd /etc/kubernetes/pki/
    [root@master pki]# (umask 077;openssl genrsa -out serving.key)
    Generating RSA private key, 2048 bit long modulus
    .........+++
    ..+++
    e is 65537 (0x10001)
    
    [root@master pki]# openssl req -new -key serving.key -out serving.csr -subj "/CN=serving"
    [root@master pki]# openssl x509 -req -in serving.csr -CA ./ca.crt -CAkey ./ca.key -CAcreateserial -out serving.crt -days 3650
    Signature ok
    subject=/CN=serving
    Getting CA Private Key
    
    
    [root@master pki]#  kubectl create secret generic cm-adapter-serving-certs --from-file=serving.crt=./serving.crt --from-file=serving.key -n prom
    secret/cm-adapter-serving-certs created
    [root@master pki]# kubectl get secret -n prom
    NAME                             TYPE                                  DATA   AGE
    cm-adapter-serving-certs         Opaque                                2      13s
    default-token-r88nt              kubernetes.io/service-account-token   3      37m
    kube-state-metrics-token-4rrqw   kubernetes.io/service-account-token   3      14m
    prometheus-token-jdm5f           kubernetes.io/service-account-token   3      31m
    

    Deploy k8s-prometheus-adapter
    The bundled custom-metrics-apiserver-deployment.yaml and custom-metrics-config-map.yaml have some problems, so download these two files from the k8s-prometheus-adapter project instead:

    [root@master k8s-prometheus-adapter]#  wget https://raw.githubusercontent.com/DirectXMan12/k8s-prometheus-adapter/master/deploy/manifests/custom-metrics-apiserver-deployment.yaml
    [root@master k8s-prometheus-adapter]#  wget https://raw.githubusercontent.com/DirectXMan12/k8s-prometheus-adapter/master/deploy/manifests/custom-metrics-config-map.yaml 
    # Change the namespace in the downloaded files to prom
    

    Apply

    [root@master k8s-prometheus-adapter]# kubectl apply -f .
    clusterrolebinding.rbac.authorization.k8s.io/custom-metrics:system:auth-delegator created
    rolebinding.rbac.authorization.k8s.io/custom-metrics-auth-reader created
    deployment.apps/custom-metrics-apiserver created
    clusterrolebinding.rbac.authorization.k8s.io/custom-metrics-resource-reader created
    serviceaccount/custom-metrics-apiserver created
    service/custom-metrics-apiserver created
    apiservice.apiregistration.k8s.io/v1beta1.custom.metrics.k8s.io created
    clusterrole.rbac.authorization.k8s.io/custom-metrics-server-resources created
    configmap/adapter-config created
    clusterrole.rbac.authorization.k8s.io/custom-metrics-resource-reader created
    clusterrolebinding.rbac.authorization.k8s.io/hpa-controller-custom-metrics created
    
    [root@master k8s-prometheus-adapter]# kubectl get pods -n prom
    NAME                                      READY   STATUS    RESTARTS   AGE
    custom-metrics-apiserver-c86bfc77-dtkcn   1/1     Running   0          58s
    kube-state-metrics-84c69bb8-87l7n         1/1     Running   0          140m
    prometheus-node-exporter-b2lk5            1/1     Running   0          161m
    prometheus-node-exporter-d4l6v            1/1     Running   0          161m
    prometheus-node-exporter-swngp            1/1     Running   0          161m
    prometheus-server-556b8896d6-ld7xj        1/1     Running   0          152m
    
    [root@master k8s-prometheus-adapter]# kubectl api-versions |grep custom
    custom.metrics.k8s.io/v1beta1
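
Once the API group is registered, metrics are read from paths under /apis/custom.metrics.k8s.io/v1beta1, for example with kubectl get --raw. Below is a small sketch of how such paths are composed for namespaced objects. The helper name, the http_requests metric, and the app=podinfo selector are hypothetical; which metrics actually exist depends on the adapter's rules and on what Prometheus has scraped.

```python
from urllib.parse import urlencode


def custom_metric_path(namespace: str, resource: str, name: str, metric: str,
                       label_selector: str = "",
                       version: str = "v1beta1") -> str:
    """Compose a custom metrics API path; name may be '*' for all objects."""
    path = (f"/apis/custom.metrics.k8s.io/{version}"
            f"/namespaces/{namespace}/{resource}/{name}/{metric}")
    if label_selector:
        path += "?" + urlencode({"labelSelector": label_selector})
    return path


# Hypothetical metric for every Pod in the default namespace.
print(custom_metric_path("default", "pods", "*", "http_requests"))
# The same metric, restricted to Pods carrying an assumed app=podinfo label.
print(custom_metric_path("default", "pods", "*", "http_requests",
                         label_selector="app=podinfo"))
```

The printed path can be passed to kubectl get --raw in the same way as the resource metrics paths shown earlier.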
    

    Visualizing the data with Grafana

    [root@master metrics]# vim grafana.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: monitoring-grafana
      namespace: prom    # namespace changed to prom
    spec:
      replicas: 1
      selector:
        matchLabels:
          task: monitoring
          k8s-app: grafana
      template:
        metadata:
          labels:
            task: monitoring
            k8s-app: grafana
        spec:
          containers:
          - name: grafana
            image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-grafana-amd64:v5.0.4
            ports:
            - containerPort: 3000
              protocol: TCP
            volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ca-certificates
              readOnly: true
            - mountPath: /var
              name: grafana-storage
            env:    # taken from the Heapster Grafana manifest; the InfluxDB variable is commented out
            #- name: INFLUXDB_HOST
            #  value: monitoring-influxdb
            - name: GF_SERVER_HTTP_PORT
              value: "3000"
            - name: GF_AUTH_BASIC_ENABLED
              value: "false"
            - name: GF_AUTH_ANONYMOUS_ENABLED
              value: "true"
            - name: GF_AUTH_ANONYMOUS_ORG_ROLE
              value: Admin
            - name: GF_SERVER_ROOT_URL
              value: /
          volumes:
          - name: ca-certificates
            hostPath:
              path: /etc/ssl/certs
          - name: grafana-storage
            emptyDir: {}
    ---
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        kubernetes.io/cluster-service: 'true'
        kubernetes.io/name: monitoring-grafana
      name: monitoring-grafana
      namespace: prom
    spec:
      type: NodePort
      ports:
      - port: 80
        targetPort: 3000
      selector:
        k8s-app: grafana
    	
    [root@master metrics]# kubectl apply -f grafana.yaml
    deployment.apps/monitoring-grafana created
    service/monitoring-grafana created
    
    [root@master metrics]# kubectl get pods -n prom 
    NAME                                      READY   STATUS    RESTARTS   AGE
    custom-metrics-apiserver-c86bfc77-dtkcn   1/1     Running   0          8m56s
    kube-state-metrics-84c69bb8-87l7n         1/1     Running   0          148m
    monitoring-grafana-dcf785fd8-f7q4g        1/1     Running   0          2m4s
    prometheus-node-exporter-b2lk5            1/1     Running   0          169m
    prometheus-node-exporter-d4l6v            1/1     Running   0          169m
    prometheus-node-exporter-swngp            1/1     Running   0          169m
    prometheus-server-556b8896d6-ld7xj        1/1     Running   0          160m
    
    [root@master metrics]# kubectl get svc -n prom
    NAME                       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
    custom-metrics-apiserver   ClusterIP   10.107.119.218   <none>        443/TCP          9m16s
    kube-state-metrics         ClusterIP   10.103.206.116   <none>        8080/TCP         149m
    monitoring-grafana         NodePort    10.109.0.252     <none>        80:30215/TCP     2m23s
    prometheus                 NodePort    10.101.97.208    <none>        9090:30090/TCP   166m
    prometheus-node-exporter   ClusterIP   None             <none>        9100/TCP         169m
    
    

    monitoring-grafana is exposed on NodePort 30215.
    Open http://10.0.0.10:30215 in a browser.

    No Kubernetes dashboards are included by default; Kubernetes dashboard templates can be downloaded from grafana.com:
    https://grafana.com/dashboards

    References

    https://www.cnblogs.com/linuxk
    Ma Yongliang (马永亮). Kubernetes进阶实战 (Cloud Computing and Virtualization Technology series)
    Kubernetes-handbook-jimmysong-20181218

  • Original article: https://www.cnblogs.com/wlbl/p/10694389.html