Installing the GlusterFS Distributed File System on Kubernetes
This section on dynamic storage focuses on using GlusterFS (GFS).
1. Preparation
For Pods to use GFS as backend storage, the GFS client tools must be installed in advance on every node that will run Pods.
Install the GFS client on all nodes:
[root@k8s-master01 ~]#yum install glusterfs glusterfs-fuse -y Loaded plugins: fastestmirror Determining fastest mirrors * base: mirrors.aliyun.com * extras: mirrors.aliyun.com * updates: mirrors.aliyun.com aliyun-docker-ce | 3.5 kB 00:00:00 base | 3.6 kB 00:00:00 epel | 4.7 kB 00:00:00 extras | 2.9 kB 00:00:00 updates | 2.9 kB 00:00:00 (1/2): epel/x86_64/updateinfo | 1.0 MB 00:00:05 (2/2): epel/x86_64/primary_db | 6.9 MB 00:00:25 Resolving Dependencies --> Running transaction check ---> Package glusterfs.x86_64 0:6.0-49.1.el7 will be installed --> Processing Dependency: glusterfs-libs(x86-64) = 6.0-49.1.el7 for package: glusterfs-6.0-49.1.el7.x86_64 --> Processing Dependency: libglusterfs.so.0()(64bit) for package: glusterfs-6.0-49.1.el7.x86_64 --> Processing Dependency: libgfxdr.so.0()(64bit) for package: glusterfs-6.0-49.1.el7.x86_64 --> Processing Dependency: libgfrpc.so.0()(64bit) for package: glusterfs-6.0-49.1.el7.x86_64 ---> Package glusterfs-fuse.x86_64 0:6.0-49.1.el7 will be installed --> Processing Dependency: glusterfs-client-xlators(x86-64) = 6.0-49.1.el7 for package: glusterfs-fuse-6.0-49.1.el7.x86_64 --> Processing Dependency: attr for package: glusterfs-fuse-6.0-49.1.el7.x86_64 --> Running transaction check ---> Package attr.x86_64 0:2.4.46-13.el7 will be installed ---> Package glusterfs-client-xlators.x86_64 0:6.0-49.1.el7 will be installed ---> Package glusterfs-libs.x86_64 0:6.0-49.1.el7 will be installed --> Finished Dependency Resolution Dependencies Resolved ===================================================================================================================== Package Arch Version Repository Size ===================================================================================================================== Installing: glusterfs x86_64 6.0-49.1.el7 updates 622 k glusterfs-fuse x86_64 6.0-49.1.el7 updates 130 k Installing for dependencies: attr x86_64 2.4.46-13.el7 base 66 k glusterfs-client-xlators x86_64 6.0-49.1.el7 updates 839 k glusterfs-libs x86_64 6.0-49.1.el7 updates 398 k Transaction Summary ===================================================================================================================== Install 2 Packages (+3 Dependent packages) Total download size: 2.0 M Installed size: 9.0 M Downloading packages: (1/5): attr-2.4.46-13.el7.x86_64.rpm | 66 kB 00:00:00 (2/5): glusterfs-client-xlators-6.0-49.1.el7.x86_64.rpm | 839 kB 00:00:02 (3/5): glusterfs-fuse-6.0-49.1.el7.x86_64.rpm | 130 kB 00:00:00 (4/5): glusterfs-6.0-49.1.el7.x86_64.rpm | 622 kB 00:00:03 (5/5): glusterfs-libs-6.0-49.1.el7.x86_64.rpm | 398 kB 00:00:01 --------------------------------------------------------------------------------------------------------------------- Total 435 kB/s | 2.0 MB 00:00:04 Running transaction check Running transaction test Transaction test succeeded Running transaction Installing : glusterfs-libs-6.0-49.1.el7.x86_64 1/5 Installing : glusterfs-6.0-49.1.el7.x86_64 2/5 Installing : glusterfs-client-xlators-6.0-49.1.el7.x86_64 3/5 Installing : attr-2.4.46-13.el7.x86_64 4/5 Installing : glusterfs-fuse-6.0-49.1.el7.x86_64 5/5 Verifying : attr-2.4.46-13.el7.x86_64 1/5 Verifying : glusterfs-fuse-6.0-49.1.el7.x86_64 2/5 Verifying : glusterfs-6.0-49.1.el7.x86_64 3/5 Verifying : glusterfs-client-xlators-6.0-49.1.el7.x86_64 4/5 Verifying : glusterfs-libs-6.0-49.1.el7.x86_64 5/5 Installed: glusterfs.x86_64 0:6.0-49.1.el7 glusterfs-fuse.x86_64 0:6.0-49.1.el7 Dependency Installed: attr.x86_64 0:2.4.46-13.el7 glusterfs-client-xlators.x86_64 0:6.0-49.1.el7 
glusterfs-libs.x86_64 0:6.0-49.1.el7 Complete! [root@k8s-master01 ~]#
Label the nodes that will provide storage as GFS nodes:
[root@k8s-master01 ~]#kubectl label node k8s-master01 storagenode=glusterfs node/k8s-master01 labeled [root@k8s-master01 ~]#kubectl get nodes --show-labels NAME STATUS ROLES AGE VERSION LABELS k8s-master01 Ready matser 18d v1.20.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master01,kubernetes.io/os=linux,node-role.kubernetes.io/matser=,node.kubernetes.io/node=,storagenode=glusterfs k8s-master02 Ready matser 18d v1.20.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master02,kubernetes.io/os=linux,node-role.kubernetes.io/matser=,node.kubernetes.io/node= k8s-master03 Ready matser 18d v1.20.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master03,kubernetes.io/os=linux,node-role.kubernetes.io/matser=,node.kubernetes.io/node= [root@k8s-master01 ~]#kubectl label node k8s-master02 storagenode=glusterfs node/k8s-master02 labeled [root@k8s-master01 ~]#kubectl label node k8s-master03 storagenode=glusterfs node/k8s-master03 labeled [root@k8s-master01 ~]#kubectl get nodes --show-labels NAME STATUS ROLES AGE VERSION LABELS k8s-master01 Ready matser 18d v1.20.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master01,kubernetes.io/os=linux,node-role.kubernetes.io/matser=,node.kubernetes.io/node=,storagenode=glusterfs k8s-master02 Ready matser 18d v1.20.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master02,kubernetes.io/os=linux,node-role.kubernetes.io/matser=,node.kubernetes.io/node=,storagenode=glusterfs k8s-master03 Ready matser 18d v1.20.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master03,kubernetes.io/os=linux,node-role.kubernetes.io/matser=,node.kubernetes.io/node=,storagenode=glusterfs [root@k8s-master01 ~]#
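For convenience, the same label can be applied to all three nodes in one loop (a small sketch; --overwrite simply makes the command idempotent if the label is already present):

for node in k8s-master01 k8s-master02 k8s-master03; do
    kubectl label node "${node}" storagenode=glusterfs --overwrite
done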
Load the required kernel modules on all nodes:
[root@k8s-master01 ~]# modprobe dm_snapshot
[root@k8s-master01 ~]# modprobe dm_mirror
[root@k8s-master01 ~]# modprobe dm_thin_pool
cat >/etc/sysconfig/modules/glusterfs.modules <<'EOF'
#!/bin/bash
for kernel_module in dm_snapshot dm_mirror dm_thin_pool; do
    /sbin/modinfo -F filename ${kernel_module} > /dev/null 2>&1
    if [ $? -eq 0 ]; then
        /sbin/modprobe ${kernel_module}
    fi
done
EOF
[root@kube-node1 ~]# chmod +x /etc/sysconfig/modules/glusterfs.modules
Verify that the modules loaded successfully:
[root@k8s-master01 ~]#lsmod | egrep '(dm_snapshot|dm_mirror|dm_thin_pool)' dm_thin_pool 69632 0 dm_persistent_data 73728 1 dm_thin_pool dm_bio_prison 20480 1 dm_thin_pool dm_snapshot 40960 0 dm_bufio 28672 2 dm_persistent_data,dm_snapshot dm_mirror 24576 0 dm_region_hash 20480 1 dm_mirror dm_log 20480 2 dm_region_hash,dm_mirror dm_mod 126976 13 dm_thin_pool,dm_log,dm_snapshot,dm_mirror,dm_bufio
2. Deploying GlusterFS
Here the GFS cluster is deployed in containers, though it can also be deployed the traditional way. For production it is recommended to deploy GlusterFS independently, outside the Kubernetes cluster, and then simply create the corresponding Endpoints resource (see the sketch below).
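For reference, if GlusterFS runs outside the cluster as suggested above, Pods reach it through a manually maintained Endpoints/Service pair. A minimal sketch, assuming a service name of glusterfs-cluster and the node IPs used later in this walkthrough (the port value is arbitrary but required by the API; adjust names and addresses to your setup):

cat > glusterfs-endpoints.yaml <<'EOF'
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster
subsets:
  - addresses:
      - ip: 192.168.153.41
      - ip: 192.168.153.42
      - ip: 192.168.153.43
    ports:
      - port: 1
---
apiVersion: v1
kind: Service
metadata:
  name: glusterfs-cluster
spec:
  ports:
    - port: 1
EOF
kubectl apply -f glusterfs-endpoints.yaml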
This walkthrough deploys GlusterFS as a DaemonSet, which ensures that every labeled node runs exactly one GFS Pod; each of those nodes must also have a disk available to provide storage.
As the Heketi project describes it: "Heketi provides a RESTful management interface which can be used to manage the life cycle of GlusterFS volumes. With Heketi, cloud services like OpenStack Manila, Kubernetes, and OpenShift can dynamically provision GlusterFS volumes with any of the supported durability types. Heketi will automatically determine the location for bricks across the cluster, making sure to place bricks and its replicas across different failure domains. Heketi also supports any number of GlusterFS clusters, allowing cloud services to provide network file storage without being limited to a single GlusterFS cluster."
Download the Heketi client release, which ships the Kubernetes deployment templates used below:
[root@k8s-master01 GFS]#wget https://github.com/heketi/heketi/releases/download/v7.0.0/heketi-client-v7.0.0.linux.amd64.tar.gz --2021-06-29 16:45:53-- https://github.com/heketi/heketi/releases/download/v7.0.0/heketi-client-v7.0.0.linux.amd64.tar.gz Resolving github.com (github.com)... 13.250.177.223 Connecting to github.com (github.com)|13.250.177.223|:443... connected. HTTP request sent, awaiting response... 302 Found Location: https://github-releases.githubusercontent.com/37446835/88bdaaa2-68bf-11e8-8915-37b7ef02cfc9?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20210629%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20210629T084555Z&X-Amz-Expires=300&X-Amz-Signature=30369a37c801c4e5d2ee74e8eff1cf4e80b710ecb7f7236549830233f0b438a4&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=37446835&response-content-disposition=attachment%3B%20filename%3Dheketi-client-v7.0.0.linux.amd64.tar.gz&response-content-type=application%2Foctet-stream [following] --2021-06-29 16:45:54-- https://github-releases.githubusercontent.com/37446835/88bdaaa2-68bf-11e8-8915-37b7ef02cfc9?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20210629%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20210629T084555Z&X-Amz-Expires=300&X-Amz-Signature=30369a37c801c4e5d2ee74e8eff1cf4e80b710ecb7f7236549830233f0b438a4&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=37446835&response-content-disposition=attachment%3B%20filename%3Dheketi-client-v7.0.0.linux.amd64.tar.gz&response-content-type=application%2Foctet-stream Resolving github-releases.githubusercontent.com (github-releases.githubusercontent.com)... 185.199.110.154, 185.199.108.154, 185.199.111.154, ... Connecting to github-releases.githubusercontent.com (github-releases.githubusercontent.com)|185.199.110.154|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 10520043 (10M) [application/octet-stream] Saving to: ‘heketi-client-v7.0.0.linux.amd64.tar.gz.1’ 100%[===========================================================================>] 10,520,043 3.13MB/s in 3.6s 2021-06-29 16:45:59 (2.79 MB/s) - ‘heketi-client-v7.0.0.linux.amd64.tar.gz.1’ saved [10520043/10520043] [root@k8s-master01 GFS]#
Extract heketi-client-v7.0.0.linux.amd64.tar.gz:
[root@k8s-master01 GFS]#tar -xf heketi-client-v7.0.0.linux.amd64.tar.gz
[root@k8s-master01 GFS]#cd heketi-client/share/heketi/kubernetes/
[root@k8s-master01 kubernetes]#ll
total 40
-rw-rw-r-- 1 1000 1000 5222 Jun 5 2018 glusterfs-daemonset.json
-rw-rw-r-- 1 1000 1000 3513 Jun 5 2018 heketi-bootstrap.json
-rw-rw-r-- 1 1000 1000 4113 Jun 5 2018 heketi-deployment.json
-rw-rw-r-- 1 1000 1000 1109 Jun 5 2018 heketi.json
-rw-rw-r-- 1 1000 1000 111 Jun 5 2018 heketi-service-account.json
-rwxrwxr-x 1 1000 1000 584 Jun 5 2018 heketi-start.sh
-rw-rw-r-- 1 1000 1000 977 Jun 5 2018 README.md
-rw-rw-r-- 1 1000 1000 1827 Jun 5 2018 topology-sample.json
[root@k8s-master01 kubernetes]#
Create the cluster:
[root@k8s-master01 kubernetes]#kubectl apply -f glusterfs-daemonset.json
Applying it produced errors, so glusterfs-daemonset.json was converted to glusterfs-daemonset.yaml with an online tool and the reported issues were fixed; the final manifest is as follows:
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: glusterfs
  labels:
    glusterfs: deployment
  annotations:
    description: GlusterFS Daemon Set
    tags: glusterfs
spec:
  selector:
    matchLabels:
      glusterfs-node: daemonset
  template:
    metadata:
      name: glusterfs
      labels:
        glusterfs-node: daemonset
    spec:
      nodeSelector:
        storagenode: glusterfs
      hostNetwork: true
      containers:
        - image: 'gluster/gluster-centos:latest'
          imagePullPolicy: IfNotPresent
          name: glusterfs
          volumeMounts:
            - name: glusterfs-heketi
              mountPath: /var/lib/heketi
            - name: glusterfs-run
              mountPath: /run
            - name: glusterfs-lvm
              mountPath: /run/lvm
            - name: glusterfs-etc
              mountPath: /etc/glusterfs
            - name: glusterfs-logs
              mountPath: /var/log/glusterfs
            - name: glusterfs-config
              mountPath: /var/lib/glusterd
            - name: glusterfs-dev
              mountPath: /dev
            - name: glusterfs-cgroup
              mountPath: /sys/fs/cgroup
          securityContext:
            capabilities: {}
            privileged: true
          readinessProbe:
            timeoutSeconds: 3
            initialDelaySeconds: 60
            exec:
              command:
                - /bin/bash
                - '-c'
                - systemctl status glusterd.service
          livenessProbe:
            timeoutSeconds: 3
            initialDelaySeconds: 60
            exec:
              command:
                - /bin/bash
                - '-c'
                - systemctl status glusterd.service
      volumes:
        - name: glusterfs-heketi
          hostPath:
            path: /var/lib/heketi
        - name: glusterfs-run
        - name: glusterfs-lvm
          hostPath:
            path: /run/lvm
        - name: glusterfs-etc
          hostPath:
            path: /etc/glusterfs
        - name: glusterfs-logs
          hostPath:
            path: /var/log/glusterfs
        - name: glusterfs-config
          hostPath:
            path: /var/lib/glusterd
        - name: glusterfs-dev
          hostPath:
            path: /dev
        - name: glusterfs-cgroup
          hostPath:
            path: /sys/fs/cgroup
Create the cluster again:
[root@k8s-master01 kubernetes]#kubectl apply -f glusterfs-daemonset.yaml
daemonset.apps/glusterfs configured
[root@k8s-master01 kubernetes]#kubectl get daemonset --all-namespaces
NAMESPACE     NAME          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
devops        glusterfs     3         3         3       3            3           storagenode=glusterfs    4h54m
kube-system   calico-node   3         3         3       3            3           kubernetes.io/os=linux   19d
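To double-check that glusterd is actually running inside the DaemonSet Pods, you can exec into one of them; a quick sketch that picks the first Pod via the glusterfs-node=daemonset label from the manifest above:

GFS_POD=$(kubectl -n devops get pods -l glusterfs-node=daemonset -o jsonpath='{.items[0].metadata.name}')
kubectl -n devops exec "${GFS_POD}" -- gluster --version
kubectl -n devops exec "${GFS_POD}" -- systemctl status glusterd.service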
Note 1: The default mount paths are used here; other disks can be used as the GFS working directories.
Note 2: The Namespace used here is devops. If you follow along without changes, the default Namespace is default; mine has been changed, so adjust as needed (see the sketch after these notes).
Note 3: The gluster/gluster-centos:gluster4u0_centos7 image can also be used.
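If you want to reproduce the devops namespace used here instead of working in default, a minimal sketch (plain kubectl; adjust the manifest file name to whatever you converted):

kubectl create namespace devops
kubectl -n devops apply -f glusterfs-daemonset.yaml
# optionally make devops the default namespace for subsequent kubectl commands
kubectl config set-context --current --namespace=devops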
3. Deploying the Heketi Service
Heketi is a framework that manages GFS volumes through a RESTful API. It enables dynamic storage provisioning on platforms such as Kubernetes, OpenShift, and OpenStack, supports managing multiple GFS clusters, and makes GFS easier for administrators to operate. In a Kubernetes cluster, storage requests from Pods are sent to Heketi, which then drives the GFS cluster to create the corresponding volumes.
Inspect Heketi's ServiceAccount object:
[root@k8s-master01 kubernetes]#cat heketi-service-account.json
{
  "apiVersion": "v1",
  "kind": "ServiceAccount",
  "metadata": {
    "name": "heketi-service-account"
  }
}
Create the Heketi ServiceAccount:
[root@k8s-master01 kubernetes]#kubectl apply -f heketi-service-account.json serviceaccount/heketi-service-account created [root@k8s-master01 kubernetes]#kubectl get sa NAME SECRETS AGE default 1 14d heketi-service-account 1 2s jenkins 1 14d [root@k8s-master01 kubernetes]#
Create the corresponding RBAC binding and Secret for Heketi:
[root@k8s-master01 kubernetes]#kubectl create clusterrolebinding heketi-gluster-admin --clusterrole=edit --serviceaccount=devops:heketi-service-account
clusterrolebinding.rbac.authorization.k8s.io/heketi-gluster-admin created
[root@k8s-master01 kubernetes]#
[root@k8s-master01 kubernetes]#kubectl create secret generic heketi-config-secret --from-file=./heketi.json
secret/heketi-config-secret created
heketi-bootstrap.json, likewise converted to YAML (heketi-bootstrap.yaml) with the online tool:
kind: List
apiVersion: v1
items:
  - kind: Service
    apiVersion: v1
    metadata:
      name: deploy-heketi
      labels:
        glusterfs: heketi-service
        deploy-heketi: support
      annotations:
        description: Exposes Heketi Service
    spec:
      selector:
        name: deploy-heketi
      ports:
        - name: deploy-heketi
          port: 8080
          targetPort: 8080
  - kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: deploy-heketi
      labels:
        glusterfs: heketi-deployment
        deploy-heketi: deployment
      annotations:
        description: Defines how to deploy Heketi
    spec:
      replicas: 1
      selector:
        matchLabels:
          glusterfs: heketi-pod
          deploy-heketi: pod
      template:
        metadata:
          name: deploy-heketi
          labels:
            name: deploy-heketi
            glusterfs: heketi-pod
            deploy-heketi: pod
        spec:
          serviceAccountName: heketi-service-account
          containers:
            - image: 'heketi/heketi:dev'
              imagePullPolicy: Always
              name: deploy-heketi
              env:
                - name: HEKETI_EXECUTOR
                  value: kubernetes
                - name: HEKETI_DB_PATH
                  value: /var/lib/heketi/heketi.db
                - name: HEKETI_FSTAB
                  value: /var/lib/heketi/fstab
                - name: HEKETI_SNAPSHOT_LIMIT
                  value: '14'
                - name: HEKETI_KUBE_GLUSTER_DAEMONSET
                  value: 'y'
              ports:
                - containerPort: 8080
              volumeMounts:
                - name: db
                  mountPath: /var/lib/heketi
                - name: config
                  mountPath: /etc/heketi
              readinessProbe:
                timeoutSeconds: 3
                initialDelaySeconds: 3
                httpGet:
                  path: /hello
                  port: 8080
              livenessProbe:
                timeoutSeconds: 3
                initialDelaySeconds: 30
                httpGet:
                  path: /hello
                  port: 8080
          volumes:
            - name: db
            - name: config
              secret:
                secretName: heketi-config-secret
Check the current Service and Deployment resources:
[root@k8s-master01 kubernetes]#kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE jenkins NodePort 10.111.57.164 <none> 80:32433/TCP,50000:30752/TCP 12d postgres NodePort 10.99.208.124 <none> 5432:31692/TCP 6d1h sonarqube NodePort 10.102.29.13 <none> 9000:30003/TCP 4d23h [root@k8s-master01 kubernetes]#kubectl get pods NAME READY STATUS RESTARTS AGE glusterfs-2l5jf 1/1 Running 0 38m glusterfs-4l88m 1/1 Running 0 38m glusterfs-6fswc 1/1 Running 0 37m jenkins-0 1/1 Running 6 8d postgres-57f59c66fd-bfg7n 1/1 Running 4 5d23h sonarqube-649955d9b-7hgnz 1/1 Running 3 4d23h [root@k8s-master01 kubernetes]#
Now deploy the Heketi bootstrap:
[root@k8s-master01 kubernetes]#kubectl create -f heketi-bootstrap.yaml service/deploy-heketi created deployment.apps/deploy-heketi created [root@k8s-master01 kubernetes]#kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE deploy-heketi ClusterIP 10.102.18.156 <none> 8080/TCP 4s jenkins NodePort 10.111.57.164 <none> 80:32433/TCP,50000:30752/TCP 12d postgres NodePort 10.99.208.124 <none> 5432:31692/TCP 6d1h sonarqube NodePort 10.102.29.13 <none> 9000:30003/TCP 4d23h [root@k8s-master01 kubernetes]#kubectl get pods NAME READY STATUS RESTARTS AGE deploy-heketi-6565469fdf-wcnjc 0/1 ContainerCreating 0 6s glusterfs-2l5jf 1/1 Running 0 42m glusterfs-4l88m 1/1 Running 0 42m glusterfs-6fswc 1/1 Running 0 42m jenkins-0 1/1 Running 6 8d postgres-57f59c66fd-bfg7n 1/1 Running 4 5d23h sonarqube-649955d9b-7hgnz 1/1 Running 3 4d23h [root@k8s-master01 kubernetes]#kubectl rollout status deployments/deploy-heketi Waiting for deployment "deploy-heketi" rollout to finish: 0 of 1 updated replicas are available... deployment "deploy-heketi" successfully rolled out [root@k8s-master01 kubernetes]#kubectl get pods NAME READY STATUS RESTARTS AGE deploy-heketi-6565469fdf-wcnjc 1/1 Running 0 55s glusterfs-2l5jf 1/1 Running 0 43m glusterfs-4l88m 1/1 Running 0 43m glusterfs-6fswc 1/1 Running 0 43m jenkins-0 1/1 Running 6 8d postgres-57f59c66fd-bfg7n 1/1 Running 4 5d23h sonarqube-649955d9b-7hgnz 1/1 Running 3 4d23h [root@k8s-master01 kubernetes]#
4. Creating the GFS Cluster
This part uses Heketi to create the GFS cluster, which makes management simpler and more efficient.
Copy heketi-cli to /usr/local/bin/:
[root@k8s-master01 bin]#pwd /root/GFS/heketi-client/bin [root@k8s-master01 bin]#ll total 29784 -rwxr-xr-x 1 root root 30498281 Apr 7 21:38 heketi-cli [root@k8s-master01 bin]#cp heketi-cli /usr/local/bin/ [root@k8s-master01 bin]#ls -l /usr/local/bin/ total 582732 -rwxr-xr-x 1 root root 10376657 Apr 17 03:17 cfssl -rwxr-xr-x 1 root root 2277873 Apr 17 03:17 cfssljson -rwxr-xr-x 1 root root 23847904 Aug 25 2020 etcd -rwxr-xr-x 1 root root 17620576 Aug 25 2020 etcdctl -rwxr-xr-x 1 root root 30498281 Jun 29 13:24 heketi-cli -rwxr-xr-x 1 root root 45109248 Jun 17 00:09 helm -rwxr-xr-x 1 root root 118128640 Dec 9 2020 kube-apiserver -rwxr-xr-x 1 root root 112308224 Dec 9 2020 kube-controller-manager -rwxr-xr-x 1 root root 40230912 Dec 9 2020 kubectl -rwxr-xr-x 1 root root 113974120 Dec 9 2020 kubelet -rwxr-xr-x 1 root root 39485440 Dec 9 2020 kube-proxy -rwxr-xr-x 1 root root 42848256 Dec 9 2020 kube-scheduler [root@k8s-master01 bin]#
Sync heketi-cli to /usr/local/bin/ on the other nodes:
[root@k8s-master01 bin]#rsync -avzpP heketi-cli root@192.168.153.42:/usr/local/bin/ sending incremental file list heketi-cli 30,498,281 100% 22.95MB/s 0:00:01 (xfr#1, to-chk=0/1) sent 12,258,006 bytes received 35 bytes 4,903,216.40 bytes/sec total size is 30,498,281 speedup is 2.49 [root@k8s-master01 bin]#rsync -avzpP heketi-cli root@192.168.153.43:/usr/local/bin/ sending incremental file list heketi-cli 30,498,281 100% 21.81MB/s 0:00:01 (xfr#1, to-chk=0/1) sent 12,258,006 bytes received 35 bytes 4,903,216.40 bytes/sec total size is 30,498,281 speedup is 2.49 [root@k8s-master01 bin]#rsync -avzpP heketi-cli root@192.168.153.44:/usr/local/bin/ The authenticity of host '192.168.153.44 (192.168.153.44)' can't be established. ECDSA key fingerprint is SHA256:AqR5ZL4OLkrfdBddeQVMjgrUGyAGLw1C7mTCQXAy7xE. ECDSA key fingerprint is MD5:18:1c:bd:c3:e6:0c:24:b9:1e:09:e7:1a:25:ee:e8:e0. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '192.168.153.44' (ECDSA) to the list of known hosts. root@192.168.153.44's password: bash: rsync: command not found rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: remote command not found (code 127) at io.c(226) [sender=3.1.2] [root@k8s-master01 bin]#rsync -avzpP heketi-cli root@192.168.153.44:/usr/local/bin/ root@192.168.153.44's password: sending incremental file list heketi-cli 30,498,281 100% 22.56MB/s 0:00:01 (xfr#1, to-chk=0/1) sent 12,258,006 bytes received 35 bytes 2,724,009.11 bytes/sec total size is 30,498,281 speedup is 2.49 [root@k8s-master01 bin]#
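The first attempt against 192.168.153.44 failed only because rsync was not installed on the target node; installing it there (or falling back to scp, which needs nothing extra) avoids the error:

# on the target node (192.168.153.44)
yum install -y rsync
# or, from the master, copy without rsync at all
scp /root/GFS/heketi-client/bin/heketi-cli root@192.168.153.44:/usr/local/bin/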
Check the heketi-cli version:
[root@k8s-master01 ~]#heketi-cli -v
heketi-cli v7.0.0
Edit topology-sample.json: manage is the hostname of the node running the GFS management service, storage is the node's IP address, and devices lists the raw block devices on that node; the disks used to provide storage should preferably be raw, unused devices:
[root@k8s-master01 kubernetes]#cat topology-sample.json
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [
                "k8s-master01"
              ],
              "storage": [
                "192.168.153.41"
              ]
            },
            "zone": 1
          },
          "devices": [
            {
              "name": "/dev/sdb",
              "destroydata": false
            }
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "k8s-master02"
              ],
              "storage": [
                "192.168.153.42"
              ]
            },
            "zone": 1
          },
          "devices": [
            {
              "name": "/dev/sdb",
              "destroydata": false
            }
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "k8s-master03"
              ],
              "storage": [
                "192.168.153.43"
              ]
            },
            "zone": 1
          },
          "devices": [
            {
              "name": "/dev/sdb",
              "destroydata": false
            }
          ]
        }
      ]
    }
  ]
}
Check Heketi's current ClusterIP:
[root@k8s-master01 kubernetes]#kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE deploy-heketi ClusterIP 10.102.18.156 <none> 8080/TCP 5m42s jenkins NodePort 10.111.57.164 <none> 80:32433/TCP,50000:30752/TCP 12d postgres NodePort 10.99.208.124 <none> 5432:31692/TCP 6d1h sonarqube NodePort 10.102.29.13 <none> 9000:30003/TCP 4d23h [root@k8s-master01 kubernetes]#curl 10.102.18.156:8080/hello Hello from Heketi[root@k8s-master01 kubernetes]# [root@k8s-master01 kubernetes]# [root@k8s-master01 kubernetes]#export HEKETI_CLI_SERVER="http://10.102.18.156:8080" [root@k8s-master01 kubernetes]#export |grep HEKETI declare -x HEKETI_CLI_SERVER="http://10.102.18.156:8080"
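Because the ClusterIP changes if the Service is ever recreated, it can be handier to derive HEKETI_CLI_SERVER from the Service object instead of hard-coding the address (a small convenience sketch, assuming the Service lives in the devops namespace):

export HEKETI_CLI_SERVER="http://$(kubectl -n devops get svc deploy-heketi -o jsonpath='{.spec.clusterIP}'):8080"
curl "${HEKETI_CLI_SERVER}/hello"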
Create the GFS cluster with Heketi:
[root@k8s-master01 kubernetes]#heketi-cli topology load --json=topology-sample.json
Error: Unable to get topology information: Invalid JWT token: Token missing iss claim
This happens because newer versions of Heketi require a username and secret to be passed when creating the GFS cluster; the corresponding values are configured in the heketi.json file.
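For reference, the admin secret corresponds to the jwt section of the heketi.json that was packed into heketi-config-secret earlier. The relevant fragment looks roughly like the following (an illustration of the stock heketi.json layout, not a copy of the exact file used here):

  "use_auth": true,
  "jwt": {
    "admin": {
      "key": "My Secret"
    },
    "user": {
      "key": "My Secret"
    }
  }

Passing those credentials explicitly on the command line: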
[root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' topology load --json=topology-sample.json
Creating cluster ... ID: 8e17d5f80328a9e8c7d141ab4034e2e6
    Allowing file volumes on cluster.
    Allowing block volumes on cluster.
    Creating node k8s-master01 ... Unable to create node: New Node doesn't have glusterd running
    Creating node k8s-master02 ... Unable to create node: New Node doesn't have glusterd running
    Creating node k8s-master03 ... Unable to create node: New Node doesn't have glusterd running
The previous error is gone, but a new one appears: "Unable to create node: New Node doesn't have glusterd running". Check the running Pods and then the Heketi Deployment's Pod logs:
[root@k8s-master01 kubernetes]#kubectl get pods NAME READY STATUS RESTARTS AGE deploy-heketi-6565469fdf-wcnjc 1/1 Running 0 12m glusterfs-2l5jf 1/1 Running 0 54m glusterfs-4l88m 1/1 Running 0 54m glusterfs-6fswc 1/1 Running 0 54m jenkins-0 1/1 Running 6 8d postgres-57f59c66fd-bfg7n 1/1 Running 4 6d sonarqube-649955d9b-7hgnz 1/1 Running 3 4d23h
The logs show "Failed to get list of pods":
[root@k8s-master01 kubernetes]#kubectl logs -f deploy-heketi-6565469fdf-wcnjc
[heketi] ERROR 2021/06/29 09:10:57 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[negroni] 2021-06-29T09:10:57Z | 400 | 3.867841ms | 10.102.18.156:8080 | POST /nodes
[cmdexec] INFO 2021/06/29 09:10:57 Check Glusterd service status in node k8s-master03
[negroni] 2021-06-29T09:10:57Z | 400 | 4.219108ms | 10.102.18.156:8080 | POST /nodes
[kubeexec] ERROR 2021/06/29 09:10:57 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: pods is forbidden: User "system:serviceaccount:devops:heketi-service-account" cannot list resource "pods" in API group "" in the namespace "devops"
[kubeexec] ERROR 2021/06/29 09:10:57 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods
[cmdexec] ERROR 2021/06/29 09:10:57 heketi/executors/cmdexec/peer.go:80:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods
[heketi] ERROR 2021/06/29 09:10:57 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list of pods
[heketi] ERROR 2021/06/29 09:10:57 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[negroni] 2021-06-29T09:10:57Z | 200 | 353.242µs | 10.102.18.156:8080 | GET /clusters/8e17d5f80328a9e8c7d141ab4034e2e6
[heketi] INFO 2021/06/29 09:10:57 Deleted cluster [8e17d5f80328a9e8c7d141ab4034e2e6]
[negroni] 2021-06-29T09:10:57Z | 200 | 3.360667ms | 10.102.18.156:8080 | DELETE /clusters/8e17d5f80328a9e8c7d141ab4034e2e6
[heketi] INFO 2021/06/29 09:12:03 Starting Node Health Status refresh
[heketi] INFO 2021/06/29 09:12:03 Cleaned 0 nodes from health cache
[heketi] INFO 2021/06/29 09:14:03 Starting Node Health Status refresh
[heketi] INFO 2021/06/29 09:14:03 Cleaned 0 nodes from health cache
Attempted fix: create a ClusterRole and bind it to the ServiceAccount:
[root@k8s-master01 kubernetes]#kubectl create clusterrole foo --verb=get,list,watch,create --resource=pods,pods/status,pods/exec
clusterrole.rbac.authorization.k8s.io/foo created
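The transcript above only shows the ClusterRole being created; to have any effect it would also need to be bound to the ServiceAccount, for example (the binding name heketi-pod-reader is arbitrary):

kubectl create clusterrolebinding heketi-pod-reader \
    --clusterrole=foo \
    --serviceaccount=devops:heketi-service-account

As the retry below shows, this attempt still did not clear the error in this environment; the fix that finally worked is the cluster-admin binding further down.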
Run the gluster topology load command again and watch the logs:
[heketi] INFO 2021/06/29 09:12:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:12:03 Cleaned 0 nodes from health cache [heketi] INFO 2021/06/29 09:14:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:14:03 Cleaned 0 nodes from health cache [heketi] INFO 2021/06/29 09:16:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:16:03 Cleaned 0 nodes from health cache [heketi] INFO 2021/06/29 09:18:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:18:03 Cleaned 0 nodes from health cache [heketi] INFO 2021/06/29 09:20:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:20:03 Cleaned 0 nodes from health cache
Create the GFS cluster with Heketi again:
[root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' topology load --json=topology-sample.json
Creating cluster ... ID: ba8f4a8a29e5c436d0c84c45ad9e00d3
    Allowing file volumes on cluster.
    Allowing block volumes on cluster.
    Creating node k8s-master01 ... Unable to create node: New Node doesn't have glusterd running
    Creating node k8s-master02 ... Unable to create node: New Node doesn't have glusterd running
    Creating node k8s-master03 ... Unable to create node: New Node doesn't have glusterd running
[root@k8s-master01 kubernetes]#
Open another terminal window and check the logs:
[heketi] INFO 2021/06/29 09:42:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:42:03 Cleaned 0 nodes from health cache [heketi] INFO 2021/06/29 09:44:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:44:03 Cleaned 0 nodes from health cache [heketi] INFO 2021/06/29 09:46:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:46:03 Cleaned 0 nodes from health cache [heketi] INFO 2021/06/29 09:48:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:48:03 Cleaned 0 nodes from health cache [heketi] INFO 2021/06/29 09:50:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:50:03 Cleaned 0 nodes from health cache [negroni] 2021-06-29T09:51:42Z | 200 | 102.535µs | 10.102.18.156:8080 | GET /clusters [negroni] 2021-06-29T09:51:42Z | 201 | 3.432335ms | 10.102.18.156:8080 | POST /clusters [cmdexec] INFO 2021/06/29 09:51:42 Check Glusterd service status in node k8s-master01 [kubeexec] ERROR 2021/06/29 09:51:42 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: pods is forbidden: User "system:serviceaccount:devops:heketi-service-account" cannot list resource "pods" in API group "" in the namespace "devops" [kubeexec] ERROR 2021/06/29 09:51:42 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods [cmdexec] ERROR 2021/06/29 09:51:42 heketi/executors/cmdexec/peer.go:80:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods [heketi] ERROR 2021/06/29 09:51:42 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list of pods [heketi] ERROR 2021/06/29 09:51:42 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running [negroni] 2021-06-29T09:51:42Z | 400 | 34.875543ms | 10.102.18.156:8080 | POST /nodes [cmdexec] INFO 2021/06/29 09:51:42 Check Glusterd service status in node k8s-master02 [kubeexec] ERROR 2021/06/29 09:51:42 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: pods is forbidden: User "system:serviceaccount:devops:heketi-service-account" cannot list resource "pods" in API group "" in the namespace "devops" [kubeexec] ERROR 2021/06/29 09:51:42 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods [cmdexec] ERROR 2021/06/29 09:51:42 heketi/executors/cmdexec/peer.go:80:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods [heketi] ERROR 2021/06/29 09:51:42 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list of pods [heketi] ERROR 2021/06/29 09:51:42 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running [negroni] 2021-06-29T09:51:42Z | 400 | 5.317761ms | 10.102.18.156:8080 | POST /nodes [cmdexec] INFO 2021/06/29 09:51:42 Check Glusterd service status in node k8s-master03 [kubeexec] ERROR 2021/06/29 09:51:42 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: pods is forbidden: User "system:serviceaccount:devops:heketi-service-account" cannot list resource "pods" in API group "" in the namespace "devops" [kubeexec] ERROR 2021/06/29 09:51:42 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods [cmdexec] ERROR 2021/06/29 09:51:42 heketi/executors/cmdexec/peer.go:80:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods [heketi] ERROR 2021/06/29 09:51:42 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list 
of pods [heketi] ERROR 2021/06/29 09:51:42 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running [negroni] 2021-06-29T09:51:42Z | 400 | 4.586467ms | 10.102.18.156:8080 | POST /nodes [negroni] 2021-06-29T09:51:42Z | 200 | 237.734µs | 10.102.18.156:8080 | GET /clusters/ba8f4a8a29e5c436d0c84c45ad9e00d3 [heketi] INFO 2021/06/29 09:51:42 Deleted cluster [ba8f4a8a29e5c436d0c84c45ad9e00d3] [negroni] 2021-06-29T09:51:42Z | 200 | 895.405µs | 10.102.18.156:8080 | DELETE /clusters/ba8f4a8a29e5c436d0c84c45ad9e00d3 [heketi] INFO 2021/06/29 09:52:03 Starting Node Health Status refresh [heketi] INFO 2021/06/29 09:52:03 Cleaned 0 nodes from health cache
Finally, the root cause:
[kubeexec] ERROR 2021/06/29 09:51:42 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: pods is forbidden: User "system:serviceaccount:devops:heketi-service-account" cannot list resource "pods" in API group "" in the namespace "devops"
Delete the old clusterrolebinding and recreate it with cluster-admin; after that, it succeeds:
[root@k8s-master01 kubernetes]#kubectl delete clusterrolebinding heketi-gluster-admin
clusterrolebinding.rbac.authorization.k8s.io "heketi-gluster-admin" deleted
[root@k8s-master01 kubernetes]#kubectl create clusterrolebinding heketi-gluster-admin --clusterrole=cluster-admin --serviceaccount=devops:heketi-service-account
clusterrolebinding.rbac.authorization.k8s.io/heketi-gluster-admin created
[root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' topology load --json=topology-sample.json
Creating cluster ... ID: c934f76dfae0fc21e0d8820c5e2ee401
    Allowing file volumes on cluster.
    Allowing block volumes on cluster.
    Creating node k8s-master01 ... ID: aaf700d47bfa7d2c0bd2a08e66a0d1f3
        Adding device /dev/sdb ... Unable to add device: Initializing device /dev/sdb failed (already initialized or contains data?): Device /dev/sdb excluded by a filter.
    Creating node k8s-master02 ... ID: 04b711a1eb44601f8d6b5c002b28aaf9
        Adding device /dev/sdb ... Unable to add device: Initializing device /dev/sdb failed (already initialized or contains data?): Device /dev/sdb excluded by a filter.
    Creating node k8s-master03 ... ID: cca811a225c58034b3d79fc2c2d01be4
        Adding device /dev/sdb ... Unable to add device: Initializing device /dev/sdb failed (already initialized or contains data?): Device /dev/sdb excluded by a filter.
[root@k8s-master01 kubernetes]#
The next day, after powering the machine back on, the Heketi configuration turned out to be gone because no persistent volume had been configured for it. The relevant steps above were repeated, and when it came to creating the GFS cluster with Heketi, the command failed again, as follows:
[root@k8s-master01 kubernetes]#kubectl logs -f deploy-heketi-6565469fdf-n2wnh -n devops^C [root@k8s-master01 kubernetes]#kubectl create clusterrole foo --verb=get,list,watch,create --resource=pods,pods/status,pods/exec Error from server (AlreadyExists): clusterroles.rbac.authorization.k8s.io "foo" already exists [root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' topology load --json=topology-sample.json Found node k8s-master01 on cluster 221eddbc9d9ec714e6de6c19f5e86e09 Adding device /dev/sdb ... Unable to add device: Initializing device /dev/sdb failed (already initialized or contains data?): Device /dev/sdb excluded by a filter. Found node k8s-master02 on cluster 221eddbc9d9ec714e6de6c19f5e86e09 Adding device /dev/sdb ... Unable to add device: Initializing device /dev/sdb failed (already initialized or contains data?): Device /dev/sdb excluded by a filter. Found node k8s-master03 on cluster 221eddbc9d9ec714e6de6c19f5e86e09 Adding device /dev/sdb ... Unable to add device: Initializing device /dev/sdb failed (already initialized or contains data?): Device /dev/sdb excluded by a filter. [root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' topology load --json=topology-sample.json Found node k8s-master01 on cluster 221eddbc9d9ec714e6de6c19f5e86e09 Adding device /dev/sdb ... OK Found node k8s-master02 on cluster 221eddbc9d9ec714e6de6c19f5e86e09 Adding device /dev/sdb ... Unable to add device: Initializing device /dev/sdb failed (already initialized or contains data?): Device /dev/sdb excluded by a filter. Found node k8s-master03 on cluster 221eddbc9d9ec714e6de6c19f5e86e09 Adding device /dev/sdb ... Unable to add device: Initializing device /dev/sdb failed (already initialized or contains data?): Device /dev/sdb excluded by a filter.
To troubleshoot, first check the Pod's logs:
[root@k8s-master01 kubernetes]#kubectl logs -f deploy-heketi-6565469fdf-n2wnh [negroni] 2021-07-01T02:46:30Z | 200 | 77.337µs | 10.102.18.156:8080 | GET /clusters [negroni] 2021-07-01T02:46:30Z | 200 | 210.78µs | 10.102.18.156:8080 | GET /clusters/221eddbc9d9ec714e6de6c19f5e86e09 [negroni] 2021-07-01T02:46:30Z | 200 | 446.48µs | 10.102.18.156:8080 | GET /nodes/1e515e038850e2f725125cd55a19d278 [negroni] 2021-07-01T02:46:30Z | 200 | 256.658µs | 10.102.18.156:8080 | GET /nodes/4154491e2eb27e7017f9b8dab8046076 [negroni] 2021-07-01T02:46:30Z | 200 | 244.247µs | 10.102.18.156:8080 | GET /nodes/45d66ab47e0e299b25c66a57c667b1de [negroni] 2021-07-01T02:46:30Z | 200 | 334.914µs | 10.102.18.156:8080 | GET /clusters/221eddbc9d9ec714e6de6c19f5e86e09 [negroni] 2021-07-01T02:46:30Z | 200 | 460.879µs | 10.102.18.156:8080 | GET /clusters/221eddbc9d9ec714e6de6c19f5e86e09 [heketi] INFO 2021/07/01 02:46:30 Adding device /dev/sdb to node 4154491e2eb27e7017f9b8dab8046076 [negroni] 2021-07-01T02:46:30Z | 202 | 4.574525ms | 10.102.18.156:8080 | POST /devices [asynchttp] INFO 2021/07/01 02:46:30 Started job 0c89db58f2ffcf410c0777d2f20a08b3 [negroni] 2021-07-01T02:46:30Z | 200 | 74.084µs | 10.102.18.156:8080 | GET /queue/0c89db58f2ffcf410c0777d2f20a08b3 [kubeexec] DEBUG 2021/07/01 02:46:30 heketi/pkg/remoteexec/log/commandlog.go:34:log.(*CommandLogger).Before: Will run command [/usr/sbin/lvm pvcreate -qq --metadatasize=128M --dataalignment=256K '/dev/sdb'] on [pod:glusterfs-d2glt c:glusterfs ns:devops (from host:k8s-master02 selector:glusterfs-node)] [kubeexec] DEBUG 2021/07/01 02:46:30 heketi/pkg/remoteexec/kube/exec.go:72:kube.ExecCommands: Current kube connection count: 0 [kubeexec] ERROR 2021/07/01 02:46:30 heketi/pkg/remoteexec/log/commandlog.go:56:log.(*CommandLogger).Error: Failed to run command [/usr/sbin/lvm pvcreate -qq --metadatasize=128M --dataalignment=256K '/dev/sdb'] on [pod:glusterfs-d2glt c:glusterfs ns:devops (from host:k8s-master02 selector:glusterfs-node)]: Err[command terminated with exit code 5]: Stdout []: Stderr [WARNING: dos signature detected on /dev/sdb at offset 510. Wipe it? [y/n]: [n] Aborted wiping of dos. 1 existing signature left on the device. ] [kubeexec] DEBUG 2021/07/01 02:46:30 heketi/pkg/remoteexec/log/commandlog.go:34:log.(*CommandLogger).Before: Will run command [/usr/sbin/lvm pvs -o pv_name,pv_uuid,vg_name --reportformat=json /dev/sdb] on [pod:glusterfs-d2glt c:glusterfs ns:devops (from host:k8s-master02 selector:glusterfs-node)] [kubeexec] DEBUG 2021/07/01 02:46:30 heketi/pkg/remoteexec/kube/exec.go:72:kube.ExecCommands: Current kube connection count: 0 [asynchttp] INFO 2021/07/01 02:46:30 Completed job 0c89db58f2ffcf410c0777d2f20a08b3 in 343.470109ms [kubeexec] ERROR 2021/07/01 02:46:30 heketi/pkg/remoteexec/log/commandlog.go:56:log.(*CommandLogger).Error: Failed to run command [/usr/sbin/lvm pvs -o pv_name,pv_uuid,vg_name --reportformat=json /dev/sdb] on [pod:glusterfs-d2glt c:glusterfs ns:devops (from host:k8s-master02 selector:glusterfs-node)]: Err[command terminated with exit code 5]: Stdout [ { "report": [ { "pv": [ ] } ] } ]: Stderr [ Failed to find physical volume "/dev/sdb". 
] [negroni] 2021-07-01T02:46:31Z | 500 | 75.41µs | 10.102.18.156:8080 | GET /queue/0c89db58f2ffcf410c0777d2f20a08b3 [negroni] 2021-07-01T02:46:31Z | 200 | 200.176µs | 10.102.18.156:8080 | GET /clusters/221eddbc9d9ec714e6de6c19f5e86e09 [heketi] INFO 2021/07/01 02:46:31 Adding device /dev/sdb to node 45d66ab47e0e299b25c66a57c667b1de [negroni] 2021-07-01T02:46:31Z | 202 | 1.013933ms | 10.102.18.156:8080 | POST /devices [asynchttp] INFO 2021/07/01 02:46:31 Started job eee9aed41f9be12d74592b3f1d9212ef [negroni] 2021-07-01T02:46:31Z | 200 | 73.998µs | 10.102.18.156:8080 | GET /queue/eee9aed41f9be12d74592b3f1d9212ef [kubeexec] DEBUG 2021/07/01 02:46:31 heketi/pkg/remoteexec/log/commandlog.go:34:log.(*CommandLogger).Before: Will run command [/usr/sbin/lvm pvcreate -qq --metadatasize=128M --dataalignment=256K '/dev/sdb'] on [pod:glusterfs-ttv65 c:glusterfs ns:devops (from host:k8s-master03 selector:glusterfs-node)] [kubeexec] DEBUG 2021/07/01 02:46:31 heketi/pkg/remoteexec/kube/exec.go:72:kube.ExecCommands: Current kube connection count: 0 [kubeexec] ERROR 2021/07/01 02:46:31 heketi/pkg/remoteexec/log/commandlog.go:56:log.(*CommandLogger).Error: Failed to run command [/usr/sbin/lvm pvcreate -qq --metadatasize=128M --dataalignment=256K '/dev/sdb'] on [pod:glusterfs-ttv65 c:glusterfs ns:devops (from host:k8s-master03 selector:glusterfs-node)]: Err[command terminated with exit code 5]: Stdout []: Stderr [WARNING: dos signature detected on /dev/sdb at offset 510. Wipe it? [y/n]: [n] Aborted wiping of dos. 1 existing signature left on the device. ] [kubeexec] DEBUG 2021/07/01 02:46:31 heketi/pkg/remoteexec/log/commandlog.go:34:log.(*CommandLogger).Before: Will run command [/usr/sbin/lvm pvs -o pv_name,pv_uuid,vg_name --reportformat=json /dev/sdb] on [pod:glusterfs-ttv65 c:glusterfs ns:devops (from host:k8s-master03 selector:glusterfs-node)] [kubeexec] DEBUG 2021/07/01 02:46:31 heketi/pkg/remoteexec/kube/exec.go:72:kube.ExecCommands: Current kube connection count: 0 [kubeexec] ERROR 2021/07/01 02:46:31 heketi/pkg/remoteexec/log/commandlog.go:56:log.(*CommandLogger).Error: Failed to run command [/usr/sbin/lvm pvs -o pv_name,pv_uuid,vg_name --reportformat=json /dev/sdb] on [pod:glusterfs-ttv65 c:glusterfs ns:devops (from host:k8s-master03 selector:glusterfs-node)]: Err[command terminated with exit code 5]: Stdout [ { "report": [ { "pv": [ ] } ] } ]: Stderr [ Failed to find physical volume "/dev/sdb".
The key error is:
[kubeexec] ERROR 2021/07/01 02:46:30 heketi/pkg/remoteexec/log/commandlog.go:56:log.(*CommandLogger).Error: Failed to run command [/usr/sbin/lvm pvcreate -qq --metadatasize=128M --dataalignment=256K '/dev/sdb'] on [pod:glusterfs-d2glt c:glusterfs ns:devops (from host:k8s-master02 selector:glusterfs-node)]: Err[command terminated with exit code 5]: Stdout []: Stderr [WARNING: dos signature detected on /dev/sdb at offset 510. Wipe it? [y/n]: [n]
Aborted wiping of dos.
1 existing signature left on the device.
]
First umount /dev/sdb (if it is mounted), then run:
[root@k8s-master01 ~]#parted /dev/sdb
GNU Parted 3.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel msdos
Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? yes
(parted) quit
Information: You may need to update /etc/fstab.
[root@k8s-master01 ~]#pvcreate /dev/sdb
WARNING: dos signature detected on /dev/sdb at offset 510. Wipe it? [y/n]: y
  Wiping dos signature on /dev/sdb.
  Physical volume "/dev/sdb" successfully created.
[root@k8s-master01 ~]#
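An alternative to the parted/pvcreate steps, assuming the disk genuinely holds nothing worth keeping, is to wipe every filesystem and partition-table signature in one go and let Heketi run pvcreate itself:

umount /dev/sdb 2>/dev/null || true   # ignore the error if it was not mounted
wipefs -a /dev/sdb                    # remove the dos label and any other old signatures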
Run the Heketi topology load again; this time it succeeds:
[root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' topology load --json=topology-sample.json
Found node k8s-master01 on cluster 221eddbc9d9ec714e6de6c19f5e86e09
    Found device /dev/sdb
Found node k8s-master02 on cluster 221eddbc9d9ec714e6de6c19f5e86e09
    Adding device /dev/sdb ... OK
Found node k8s-master03 on cluster 221eddbc9d9ec714e6de6c19f5e86e09
    Adding device /dev/sdb ... OK
[root@k8s-master01 kubernetes]#
Learning from the earlier mistake of running Heketi without persistent storage (if the Heketi Pod restarts, its configuration can be lost), now create a persistent volume for Heketi so its data survives restarts. Here the persistence uses the dynamic storage provided by GFS itself; other persistence methods work as well:
# Install device-mapper* on all nodes
[root@k8s-master01 ~]#yum install -y device-mapper* Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirrors.aliyun.com * extras: mirrors.aliyun.com * updates: mirrors.aliyun.com aliyun-docker-ce | 3.5 kB 00:00:00 base | 3.6 kB 00:00:00 epel | 4.7 kB 00:00:00 extras | 2.9 kB 00:00:00 updates | 2.9 kB 00:00:00 (1/2): epel/x86_64/updateinfo | 1.0 MB 00:00:01 (2/2): epel/x86_64/primary_db | 6.9 MB 00:00:12 Package device-mapper-persistent-data-0.8.5-3.el7_9.2.x86_64 already installed and latest version Package 7:device-mapper-event-libs-1.02.170-6.el7_9.5.x86_64 already installed and latest version Package 7:device-mapper-1.02.170-6.el7_9.5.x86_64 already installed and latest version Package 7:device-mapper-libs-1.02.170-6.el7_9.5.x86_64 already installed and latest version Package 7:device-mapper-event-1.02.170-6.el7_9.5.x86_64 already installed and latest version Resolving Dependencies --> Running transaction check ---> Package device-mapper-devel.x86_64 7:1.02.170-6.el7_9.5 will be installed --> Processing Dependency: pkgconfig(libudev) for package: 7:device-mapper-devel-1.02.170-6.el7_9.5.x86_64 ---> Package device-mapper-event-devel.x86_64 7:1.02.170-6.el7_9.5 will be installed ---> Package device-mapper-multipath.x86_64 0:0.4.9-134.el7_9 will be installed ---> Package device-mapper-multipath-devel.x86_64 0:0.4.9-134.el7_9 will be installed ---> Package device-mapper-multipath-libs.x86_64 0:0.4.9-134.el7_9 will be installed ---> Package device-mapper-multipath-sysvinit.x86_64 0:0.4.9-134.el7_9 will be installed --> Running transaction check ---> Package systemd-devel.x86_64 0:219-78.el7_9.3 will be installed --> Finished Dependency Resolution Dependencies Resolved ============================================================================================================ Package Arch Version Repository Size ============================================================================================================ Installing: device-mapper-devel x86_64 7:1.02.170-6.el7_9.5 updates 206 k device-mapper-event-devel x86_64 7:1.02.170-6.el7_9.5 updates 174 k device-mapper-multipath x86_64 0.4.9-134.el7_9 updates 148 k device-mapper-multipath-devel x86_64 0.4.9-134.el7_9 updates 79 k device-mapper-multipath-libs x86_64 0.4.9-134.el7_9 updates 268 k device-mapper-multipath-sysvinit x86_64 0.4.9-134.el7_9 updates 64 k Installing for dependencies: systemd-devel x86_64 219-78.el7_9.3 updates 216 k Transaction Summary ============================================================================================================ Install 6 Packages (+1 Dependent package) Total download size: 1.1 M Installed size: 1.3 M Downloading packages: (1/7): device-mapper-event-devel-1.02.170-6.el7_9.5.x86_64.rpm | 174 kB 00:00:00 (2/7): device-mapper-devel-1.02.170-6.el7_9.5.x86_64.rpm | 206 kB 00:00:00 (3/7): device-mapper-multipath-0.4.9-134.el7_9.x86_64.rpm | 148 kB 00:00:00 (4/7): device-mapper-multipath-devel-0.4.9-134.el7_9.x86_64.rpm | 79 kB 00:00:00 (5/7): device-mapper-multipath-sysvinit-0.4.9-134.el7_9.x86_64.rpm | 64 kB 00:00:00 (6/7): device-mapper-multipath-libs-0.4.9-134.el7_9.x86_64.rpm | 268 kB 00:00:00 (7/7): systemd-devel-219-78.el7_9.3.x86_64.rpm | 216 kB 00:00:00 ------------------------------------------------------------------------------------------------------------ Total 713 kB/s | 1.1 MB 00:00:01 Running transaction check Running transaction test Transaction test succeeded Running transaction Installing :
device-mapper-multipath-libs-0.4.9-134.el7_9.x86_64 1/7 Installing : device-mapper-multipath-0.4.9-134.el7_9.x86_64 2/7 Installing : systemd-devel-219-78.el7_9.3.x86_64 3/7 Installing : 7:device-mapper-devel-1.02.170-6.el7_9.5.x86_64 4/7 Installing : 7:device-mapper-event-devel-1.02.170-6.el7_9.5.x86_64 5/7 Installing : device-mapper-multipath-devel-0.4.9-134.el7_9.x86_64 6/7 Installing : device-mapper-multipath-sysvinit-0.4.9-134.el7_9.x86_64 7/7 Verifying : device-mapper-multipath-libs-0.4.9-134.el7_9.x86_64 1/7 Verifying : device-mapper-multipath-sysvinit-0.4.9-134.el7_9.x86_64 2/7 Verifying : 7:device-mapper-devel-1.02.170-6.el7_9.5.x86_64 3/7 Verifying : device-mapper-multipath-0.4.9-134.el7_9.x86_64 4/7 Verifying : device-mapper-multipath-devel-0.4.9-134.el7_9.x86_64 5/7 Verifying : 7:device-mapper-event-devel-1.02.170-6.el7_9.5.x86_64 6/7 Verifying : systemd-devel-219-78.el7_9.3.x86_64 7/7 Installed: device-mapper-devel.x86_64 7:1.02.170-6.el7_9.5 device-mapper-event-devel.x86_64 7:1.02.170-6.el7_9.5 device-mapper-multipath.x86_64 0:0.4.9-134.el7_9 device-mapper-multipath-devel.x86_64 0:0.4.9-134.el7_9 device-mapper-multipath-libs.x86_64 0:0.4.9-134.el7_9 device-mapper-multipath-sysvinit.x86_64 0:0.4.9-134.el7_9 Dependency Installed: systemd-devel.x86_64 0:219-78.el7_9.3 Complete!
Save the configuration to a file and create the persistence-related resources:
[root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' setup-openshift-heketi-storage
Saving heketi-storage.json
Error: Volume group "vg_1649049e5e56ca80f5d8c999fe5f7e44" not found
  Cannot process volume group vg_1649049e5e56ca80f5d8c999fe5f7e44
Check the topology info:
[root@k8s-master01 kubernetes]#heketi-cli topology info Error: Invalid JWT token: Token missing iss claim [root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' topology info Cluster Id: 221eddbc9d9ec714e6de6c19f5e86e09 File: true Block: true Volumes: Nodes: Node Id: 1e515e038850e2f725125cd55a19d278 State: online Cluster Id: 221eddbc9d9ec714e6de6c19f5e86e09 Zone: 1 Management Hostnames: k8s-master01 Storage Hostnames: 192.168.153.41 Devices: Id:1649049e5e56ca80f5d8c999fe5f7e44 Name:/dev/sdb State:online Size (GiB):499 Used (GiB):2 Free (GiB):497 Bricks: Id:5c27b9746dd3f888a2e1b24b569a3341 Size (GiB):2 Path: /var/lib/heketi/mounts/vg_1649049e5e56ca80f5d8c999fe5f7e44/brick_5c27b9746dd3f888a2e1b24b569a3341/brick Node Id: 4154491e2eb27e7017f9b8dab8046076 State: online Cluster Id: 221eddbc9d9ec714e6de6c19f5e86e09 Zone: 1 Management Hostnames: k8s-master02 Storage Hostnames: 192.168.153.42 Devices: Id:407895720847863c230097f19c566a02 Name:/dev/sdb State:online Size (GiB):499 Used (GiB):2 Free (GiB):497 Bricks: Id:324b16b31c6d82ebb6d0c088cd461025 Size (GiB):2 Path: /var/lib/heketi/mounts/vg_407895720847863c230097f19c566a02/brick_324b16b31c6d82ebb6d0c088cd461025/brick Node Id: 45d66ab47e0e299b25c66a57c667b1de State: online Cluster Id: 221eddbc9d9ec714e6de6c19f5e86e09 Zone: 1 Management Hostnames: k8s-master03 Storage Hostnames: 192.168.153.43 Devices: Id:55d6a98d41a7e405b8b5ceef20432344 Name:/dev/sdb State:online Size (GiB):499 Used (GiB):2 Free (GiB):497 Bricks: Id:30302ee4860525e756588ed1de2f0d88 Size (GiB):2 Path: /var/lib/heketi/mounts/vg_55d6a98d41a7e405b8b5ceef20432344/brick_30302ee4860525e756588ed1de2f0d88/brick [root@k8s-master01 kubernetes]#
The topology shows that vg_1649049e5e56ca80f5d8c999fe5f7e44 belongs to the node with Storage Hostname 192.168.153.41 (brick Id:5c27b9746dd3f888a2e1b24b569a3341, Size (GiB):2, Path: /var/lib/heketi/mounts/vg_1649049e5e56ca80f5d8c999fe5f7e44/brick_5c27b9746dd3f888a2e1b24b569a3341/brick).
Run the command that failed once more:
[root@k8s-master01 kubernetes]# heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' setup-openshift-heketi-storage
Saving heketi-storage.json
Error: Failed to allocate new volume: Volume name 'heketidbstorage' already in use
[root@k8s-master01 kubernetes]#
Check the logs with kubectl logs -f deploy-heketi-6565469fdf-n2wnh:
[negroni] 2021-07-01T03:10:40Z | 200 | 150.487µs | 10.102.18.156:8080 | GET /clusters
[negroni] 2021-07-01T03:10:40Z | 200 | 328.205µs | 10.102.18.156:8080 | GET /clusters/221eddbc9d9ec714e6de6c19f5e86e09
[heketi] ERROR 2021/07/01 03:10:40 heketi/apps/glusterfs/volume_entry.go:919:glusterfs.eligibleClusters.func1: Name heketidbstorage already in use in cluster 221eddbc9d9ec714e6de6c19f5e86e09
[heketi] ERROR 2021/07/01 03:10:40 heketi/apps/glusterfs/volume_entry.go:939:glusterfs.eligibleClusters: No clusters eligible to satisfy create volume request
[heketi] ERROR 2021/07/01 03:10:40 heketi/apps/glusterfs/operations_manage.go:220:glusterfs.AsyncHttpOperation: Create Volume Build Failed: Volume name 'heketidbstorage' already in use
[negroni] 2021-07-01T03:10:40Z | 500 | 579.071µs | 10.102.18.156:8080 | POST /volumes
[heketi] INFO 2021/07/01 03:11:14 Starting Node Health Status refresh
[cmdexec] INFO 2021/07/01 03:11:14 Check Glusterd service status in node k8s-master01
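The "already in use" error means an earlier (possibly half-finished) run already created the heketidbstorage volume. If that volume turns out to be stale, it can be listed and removed with heketi-cli before re-running setup-openshift-heketi-storage (a hedged sketch; <volume_id> is whatever volume list reports for heketidbstorage, so inspect it before deleting anything):

heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' volume list
heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' volume delete <volume_id>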
Continue troubleshooting.
Common commands for inspecting the cluster:
# topology info
heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' topology info [flags]
# node info
heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' node info [node_id] [flags]
# device info
heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' device info [device_id] [flags]
# cluster list
heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' cluster list [flags]
# cluster info
heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' cluster info [cluster_id] [flags]
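To avoid repeating --user/--secret on every call, heketi-cli can also read the credentials from environment variables (HEKETI_CLI_USER and HEKETI_CLI_KEY per the Heketi documentation; fall back to the flags above if your build ignores them):

export HEKETI_CLI_SERVER="http://10.102.18.156:8080"
export HEKETI_CLI_USER=admin
export HEKETI_CLI_KEY='My Secret'
heketi-cli cluster list    # no extra flags needed now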
Check node info:
[root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' node info 0b5ec08be973e47535ed25a36b44141a
Node Id: 0b5ec08be973e47535ed25a36b44141a
State: online
Cluster Id: 1a24bdf9bc6a82a0530dcfbff24aad54
Zone: 1
Management Hostname: k8s-master03
Storage Hostname: 192.168.153.43
Devices:
Id:936bddeece0f76fec700998c5520c6eb   Name:/dev/sdb   State:online   Size (GiB):499   Used (GiB):2   Free (GiB):497   Bricks:1
[root@k8s-master01 kubernetes]#
Check device info:
[root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' device info 936bddeece0f76fec700998c5520c6eb
Device Id: 936bddeece0f76fec700998c5520c6eb
Name: /dev/sdb
State: online
Size (GiB): 499
Used (GiB): 2
Free (GiB): 497
Bricks:
Id:6b33d59f6da059a7d8e38696f8549001   Size (GiB):2   Path: /var/lib/heketi/mounts/vg_936bddeece0f76fec700998c5520c6eb/brick_6b33d59f6da059a7d8e38696f8549001/brick
[root@k8s-master01 kubernetes]#
Check the cluster list:
[root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' cluster list
Clusters:
Id:1a24bdf9bc6a82a0530dcfbff24aad54 [file][block]
[root@k8s-master01 kubernetes]#
Check cluster info:
[root@k8s-master01 kubernetes]#heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret 'My Secret' cluster info 1a24bdf9bc6a82a0530dcfbff24aad54
Cluster id: 1a24bdf9bc6a82a0530dcfbff24aad54
Nodes:
0b5ec08be973e47535ed25a36b44141a
3bfa2d1f005fe540df39843b8f8ea283
9c678039658836b8ed4e96c97bdc8c2b
Volumes:
Block: true
File: true
[root@k8s-master01 kubernetes]#