• Restoring accidentally deleted Kubernetes data from an etcd backup


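    An accidental deletion or a machine crash can lose etcd data or leave one node's etcd data in an abnormal state. Don't panic: the steps below show how to recover. Having to restore data after an accidental deletion is unavoidable in real environments. This walkthrough covers the case where the Pods in two namespaces were deleted and the data for those namespaces has to be brought back. Note that restoring a snapshot rolls the whole cluster back to the state it had when the snapshot was taken, not only those two namespaces.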

    Back up etcd

     ETCDCTL_API=3 etcdctl snapshot save snap.db --endpoints=https://172.16.230.84:2379  --cacert=/etc/kubernetes/ssl/ca.pem  --cert=/etc/kubernetes/ssl/etcd.pem  --key=/etc/kubernetes/ssl/etcd-key.pem
    {"level":"info","ts":1628844708.147097,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"snap.db.part"}
    {"level":"info","ts":"2021-08-13T16:51:48.169+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
    {"level":"info","ts":1628844708.1696196,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"https://172.16.230.84:2379"}
    {"level":"info","ts":"2021-08-13T16:51:48.441+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
    {"level":"info","ts":1628844708.467766,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"https://172.16.230.84:2379","size":"16 MB","took":0.320511222}
    {"level":"info","ts":1628844708.46799,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"snap.db"}
    Snapshot saved at snap.db
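The backup step above can be wrapped in a small function that names each snapshot by timestamp and prints its metadata afterwards. This is a minimal sketch: the endpoint and certificate paths are copied from the command above, while the `etcd-snap-<timestamp>.db` naming scheme and the function names are assumptions made for illustration.

```shell
# Endpoint and certificate directory copied from the backup command above;
# override them through the environment for a different cluster.
ENDPOINT="${ENDPOINT:-https://172.16.230.84:2379}"
CERT_DIR="${CERT_DIR:-/etc/kubernetes/ssl}"

# Build a file name such as etcd-snap-20210813-165148.db (hypothetical naming scheme).
snapshot_name() {
  printf 'etcd-snap-%s.db\n' "$(date +%Y%m%d-%H%M%S)"
}

# Save a snapshot, then print its hash, revision, key count and size.
backup_etcd() {
  local out
  out="$(snapshot_name)"
  ETCDCTL_API=3 etcdctl snapshot save "$out" \
    --endpoints="$ENDPOINT" \
    --cacert="$CERT_DIR/ca.pem" \
    --cert="$CERT_DIR/etcd.pem" \
    --key="$CERT_DIR/etcd-key.pem" || return 1
  ETCDCTL_API=3 etcdctl snapshot status "$out" --write-out=table
}
```

Calling `backup_etcd` from a cron job or timer would give a series of dated snapshots to choose from when a restore is needed.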

    Stop kube-apiserver on all master nodes

    systemctl stop kube-apiserver

    Stop etcd on all three master nodes

    systemctl stop etcd
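With etcd stopped, the old data directory usually has to be moved out of the way before restoring, because `etcdctl snapshot restore` refuses to write into a data directory that already exists (newer releases: one that is not empty). The sketch below renames it instead of deleting it, so the old state stays available. The `/var/lib/etcd` default matches the restore commands in this article; the `.bak-<timestamp>` suffix and the function name are assumptions.

```shell
# Move an existing etcd data directory aside so the restore can create a fresh one.
# Argument: data directory (defaults to /var/lib/etcd, as used in this article).
move_data_dir_aside() {
  local data_dir="${1:-/var/lib/etcd}"
  if [ -d "$data_dir" ]; then
    # Hypothetical naming scheme: <data-dir>.bak-YYYYmmdd-HHMMSS
    local backup="${data_dir}.bak-$(date +%Y%m%d-%H%M%S)"
    mv "$data_dir" "$backup" && echo "moved $data_dir to $backup"
  else
    echo "$data_dir does not exist, nothing to move"
  fi
}
```

Run `move_data_dir_aside` on each etcd node after `systemctl stop etcd` and before the restore command.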

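    Restore the backup

    Paths differ between environments; run systemctl status etcd to see the parameters etcd was started with. Pay particular attention to the values of name, initial-cluster, initial-cluster-token, initial-advertise-peer-urls and data-dir.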
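To pull just those five parameters out of a long etcd command line, a small filter can help. This is a sketch that assumes the values are passed as command-line flags in the systemd unit; if your installation uses an environment file or a YAML config file instead, the same values appear there under different spellings. The function name is hypothetical.

```shell
# Read an etcd command line or unit file on stdin and print only the
# flags that matter for a snapshot restore, one per line.
extract_etcd_flags() {
  grep -oE -- '--(name|initial-cluster|initial-cluster-token|initial-advertise-peer-urls|data-dir)[= ][^ \\]+'
}

# Usage (prints the unit file and filters it):
#   systemctl cat etcd | extract_etcd_flags
```

The printed values are the ones to copy into `--name`, `--initial-cluster`, `--initial-cluster-token`, `--initial-advertise-peer-urls` and `--data-dir` when restoring on that node.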

    • On the first etcd node. Check ETCDCTL_API=3, the name value, the IP addresses, the path to the snapshot file and the data-dir directory. The file passed to restore is the snapshot saved during the backup step (snap.db there, snapshot.db in the commands below), so use the path your snapshot actually has.
    export ETCDCTL_API=3
    As a single line, convenient for editing the parameters directly in the terminal:
    etcdctl snapshot restore snapshot.db --name etcd1 --initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://192.168.0.25:2380 --data-dir=/var/lib/etcd
    The same command, split across lines for readability:
    etcdctl snapshot restore snapshot.db --name etcd1 \
      --initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" \
      --initial-cluster-token k8s_etcd \
      --initial-advertise-peer-urls https://192.168.0.25:2380 \
      --data-dir=/var/lib/etcd
    
    2021-01-19 11:17:06.773113 I | mvcc: restore compact to 96139
    2021-01-19 11:17:06.800086 I | etcdserver/membership: added member 7370b1d3dc967c [https://192.168.0.25:2380] to cluster e4d7f96e88cc9d71
    2021-01-19 11:17:06.800159 I | etcdserver/membership: added member 2ef3cfc4ca48ad38 [https://192.168.0.26:2380] to cluster e4d7f96e88cc9d71
    2021-01-19 11:17:06.800190 I | etcdserver/membership: added member 3a0c86c4c744477c [https://192.168.0.28:2380] to cluster e4d7f96e88cc9d71
    • On the second and third etcd nodes, restore the same way, again adjusting ETCDCTL_API=3, the name value, the IP address, the path to the snapshot file and the data-dir directory. Second node (etcd2):
    export ETCDCTL_API=3
    As a single line, convenient for editing the parameters directly in the terminal:
    etcdctl snapshot restore snapshot.db --name etcd2 --initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://192.168.0.26:2380 --data-dir=/var/lib/etcd
    The same command, split across lines for readability:
    etcdctl snapshot restore snapshot.db --name etcd2 \
      --initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" \
      --initial-cluster-token k8s_etcd \
      --initial-advertise-peer-urls https://192.168.0.26:2380 \
      --data-dir=/var/lib/etcd
    
    2021-01-19 11:19:59.857363 I | mvcc: restore compact to 96139
    2021-01-19 11:19:59.873793 I | etcdserver/membership: added member 7370b1d3dc967c [https://192.168.0.25:2380] to cluster e4d7f96e88cc9d71
    2021-01-19 11:19:59.873837 I | etcdserver/membership: added member 2ef3cfc4ca48ad38 [https://192.168.0.26:2380] to cluster e4d7f96e88cc9d71
    2021-01-19 11:19:59.873852 I | etcdserver/membership: added member 3a0c86c4c744477c [https://192.168.0.28:2380] to cluster e4d7f96e88cc9d71
    
    Third node (etcd3):
    export ETCDCTL_API=3
    As a single line, convenient for editing the parameters directly in the terminal:
    etcdctl snapshot restore snapshot.db --name etcd3 --initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://192.168.0.28:2380 --data-dir=/var/lib/etcd
    The same command, split across lines for readability:
    etcdctl snapshot restore snapshot.db --name etcd3 \
      --initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" \
      --initial-cluster-token k8s_etcd \
      --initial-advertise-peer-urls https://192.168.0.28:2380 \
      --data-dir=/var/lib/etcd
    
    2021-01-19 11:22:21.423215 I | mvcc: restore compact to 96139
    2021-01-19 11:22:21.438319 I | etcdserver/membership: added member 7370b1d3dc967c [https://192.168.0.25:2380] to cluster e4d7f96e88cc9d71
    2021-01-19 11:22:21.438357 I | etcdserver/membership: added member 2ef3cfc4ca48ad38 [https://192.168.0.26:2380] to cluster e4d7f96e88cc9d71
    2021-01-19 11:22:21.438371 I | etcdserver/membership: added member 3a0c86c4c744477c [https://192.168.0.28:2380] to cluster e4d7f96e88cc9d71
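The three restore commands above differ only in the member name and its peer URL, so they can be generated from one template to avoid copy-paste mistakes. The sketch below only prints the command for a given member; the cluster layout, token and data directory are copied from the commands above, and the function name is an assumption.

```shell
# Cluster layout copied from the restore commands above.
INITIAL_CLUSTER="etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380"

# Print the restore command for one member; run the result on that member's own host.
# Arguments: member name, member IP, snapshot file (defaults to snapshot.db).
restore_cmd() {
  local name="$1" ip="$2" snapshot="${3:-snapshot.db}"
  printf 'ETCDCTL_API=3 etcdctl snapshot restore %s --name %s --initial-cluster "%s" --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://%s:2380 --data-dir=/var/lib/etcd\n' \
    "$snapshot" "$name" "$INITIAL_CLUSTER" "$ip"
}
```

For example, `restore_cmd etcd2 192.168.0.26` prints the command for the second node, and `restore_cmd etcd3 192.168.0.28 /root/snap.db` does the same for the third node with a different snapshot path (the path here is only an illustration).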
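    • Start etcd on all three nodes, then check cluster health. Start the members at roughly the same time: a restored member only becomes ready once it can reach a quorum of its peers.
    systemctl start etcd
    The health check as a single line, convenient for editing the parameters directly in the terminal:
    etcdctl --cacert=/etc/ssl/etcd/ssl/ca.pem --cert=/etc/ssl/etcd/ssl/node-node3.pem --key=/etc/ssl/etcd/ssl/node-node3-key.pem --endpoints=https://192.168.0.25:2379,https://192.168.0.26:2379,https://192.168.0.28:2379 endpoint health
    The same command, split across lines for readability:
    etcdctl --cacert=/etc/ssl/etcd/ssl/ca.pem \
      --cert=/etc/ssl/etcd/ssl/node-node3.pem \
      --key=/etc/ssl/etcd/ssl/node-node3-key.pem \
      --endpoints=https://192.168.0.25:2379,https://192.168.0.26:2379,https://192.168.0.28:2379 \
      endpoint health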
    
    https://192.168.0.28:2379 is healthy: successfully committed proposal: took = 11.664519ms
    https://192.168.0.26:2379 is healthy: successfully committed proposal: took = 5.04665ms
    https://192.168.0.25:2379 is healthy: successfully committed proposal: took = 1.837265ms
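Right after a restart the members may need a few seconds to elect a leader, so a single health check can fail even when the restore went fine. The following sketch retries the health check shown above; the endpoints and certificate paths are copied from that command, while the retry count, the delay and the function names are arbitrary assumptions.

```shell
# Endpoints and certificate paths copied from the health check above.
ENDPOINTS="https://192.168.0.25:2379,https://192.168.0.26:2379,https://192.168.0.28:2379"
CERT_DIR="/etc/ssl/etcd/ssl"

# Run one health check against all endpoints.
etcd_health() {
  ETCDCTL_API=3 etcdctl \
    --cacert="$CERT_DIR/ca.pem" \
    --cert="$CERT_DIR/node-node3.pem" \
    --key="$CERT_DIR/node-node3-key.pem" \
    --endpoints="$ENDPOINTS" \
    endpoint health
}

# Retry the health check until it succeeds or the attempts run out.
# Arguments: number of attempts (default 10), delay in seconds (default 3).
wait_for_etcd() {
  local attempts="${1:-10}" delay="${2:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    if etcd_health; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "etcd did not become healthy after $attempts attempts" >&2
  return 1
}
```

A call such as `wait_for_etcd 20 5` fits between starting etcd and starting kube-apiserver.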

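    Summary

    Backing up a Kubernetes cluster mainly means backing up its etcd cluster. When restoring, the main thing to get right is the order of the steps:
    
    Stop kube-apiserver --> stop etcd --> restore data --> start etcd --> start kube-apiserver

    Note: when backing up an etcd cluster, a snapshot taken from a single member is enough; on restore, every member is restored from that same snapshot file.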
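The ordering above matters across nodes: each phase should finish on every node before the next phase begins. The sketch below prints that sequence as a dry run. The node IPs are taken from the restore commands in this article; reaching the nodes over SSH as root, the `DRY_RUN` switch and all function names are assumptions for illustration. `systemctl start --no-block` is used for etcd so that the first node's start does not wait for the other members.

```shell
# Node IPs copied from the restore commands in this article.
NODES="192.168.0.25 192.168.0.26 192.168.0.28"

# Print each command when DRY_RUN=1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run one remote command on every node (SSH as root is an assumption).
on_all_nodes() {
  local node
  for node in $NODES; do
    run ssh "root@$node" "$@"
  done
}

# Walk through the phases in the order given in the summary.
recover_cluster() {
  on_all_nodes systemctl stop kube-apiserver
  on_all_nodes systemctl stop etcd
  echo "# restore the snapshot on each node here, with that node's own --name and peer URL"
  on_all_nodes systemctl start --no-block etcd
  echo "# check etcd endpoint health here before continuing"
  on_all_nodes systemctl start kube-apiserver
}
```

Running `recover_cluster` with the default `DRY_RUN=1` only prints the commands, which makes it easy to review the order before executing anything for real.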

    Reference: https://mp.weixin.qq.com/s/4b2COdr5q4SFfJTy3wl8gA

  • Original article: https://www.cnblogs.com/fengjian2016/p/15138137.html