A Record of a Serious Ceph Cluster Failure (Repost)

Problem: one disk has failed, and the PG states in the cluster status look wrong
[root@ceph-1 ~]# ceph -s
    cluster 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
     health HEALTH_WARN
            64 pgs degraded
            64 pgs stuck degraded
            64 pgs stuck unclean
            64 pgs stuck undersized
            64 pgs undersized
            recovery 269/819 objects degraded (32.845%)
     monmap e1: 1 mons at {ceph-1=192.168.101.11:6789/0}
            election epoch 6, quorum 0 ceph-1
     osdmap e38: 3 osds: 2 up, 2 in; 64 remapped pgs
            flags sortbitwise,require_jewel_osds
      pgmap v14328: 72 pgs, 2 pools, 420 bytes data, 275 objects
            217 MB used, 40720 MB / 40937 MB avail
            269/819 objects degraded (32.845%)
                  64 active+undersized+degraded
                   8 active+clean
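
A quick check on how to read the "objects degraded" line: the denominator (819) seems to count object copies rather than objects, since the same output reports only 275 objects. Under that assumption, the percentage is simply degraded copies divided by total copies. A minimal sketch (the function name `degraded_pct` is made up for illustration):

```shell
# Reproduce the percentage that `ceph -s` prints next to "objects degraded".
# $1 = degraded object copies, $2 = total object copies (assumed meaning).
degraded_pct() {
    awk -v d="$1" -v t="$2" 'BEGIN { printf "%.3f\n", d * 100 / t }'
}

degraded_pct 269 819   # prints 32.845
```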

[root@ceph-1 ~]# ceph osd tree
ID WEIGHT TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.05846 root default
-2 0.01949     host ceph-1
0 0.01949         osd.0        up 1.00000          1.00000
-3 0.01949     host ceph-2
1 0.01949         osd.1        up 1.00000          1.00000
-4 0.01949     host ceph-3
2 0.01949         osd.2      down        0          1.00000

Mark osd.2 as out
root@ceph-1:~# ceph osd out osd.2
osd.2 is already out.

Remove it from the cluster
root@ceph-1:~# ceph osd rm osd.2
removed osd.2

Remove it from the CRUSH map
root@ceph-1:~# ceph osd crush rm osd.2
removed item id 2 name 'osd.2' from crush map

Delete the authentication key for osd.2
root@ceph02:~# ceph auth del osd.2
updated

umount fails
[root@ceph-3 ~]# umount /dev/vdb1
umount: /var/lib/ceph/osd/ceph-2: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))

Kill the process owned by the ceph user that is holding the mount point
[root@ceph-3 ~]# fuser -mv /var/lib/ceph/osd/ceph-2
                     USER        PID ACCESS COMMAND
/var/lib/ceph/osd/ceph-2:
                     root     kernel mount /var/lib/ceph/osd/ceph-2
                     ceph       1517 F.... ceph-osd
[root@ceph-3 ~]# kill -9 1517
[root@ceph-3 ~]# fuser -mv /var/lib/ceph/osd/ceph-2
                     USER        PID ACCESS COMMAND
/var/lib/ceph/osd/ceph-2:
                     root     kernel mount /var/lib/ceph/osd/ceph-2
[root@ceph-3 ~]# umount /var/lib/ceph/osd/ceph-2
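
The removal steps above can be collected into one reviewable plan. The sketch below only prints the commands, it does not run them. It stops the daemon through systemd before unmounting, instead of using kill -9. Assumptions not taken from the session above: the systemd unit is named `ceph-osd@<id>`, the mount point is the default `/var/lib/ceph/osd/ceph-<id>`, and the ordering (stop the daemon, then crush rm, auth del, osd rm) is one common sequence rather than the only valid one. The helper name `osd_removal_plan` is hypothetical.

```shell
# Print (do not execute) the commands for retiring a failed OSD.
# The systemctl and umount lines are meant to be run on the OSD's own host.
osd_removal_plan() {
    id="$1"
    echo "ceph osd out osd.$id"
    echo "systemctl stop ceph-osd@$id"                # assumed unit name
    echo "ceph osd crush rm osd.$id"
    echo "ceph auth del osd.$id"
    echo "ceph osd rm osd.$id"
    echo "umount /var/lib/ceph/osd/ceph-$id"          # assumed default mount point
}

osd_removal_plan 2
```

Printing the plan first makes it easy to double-check the OSD id before anything destructive happens.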

Prepare the disk again
[root@ceph-deploy my-cluster]# ceph-deploy --overwrite-conf osd prepare ceph-3:/dev/vdb1

Activate the OSD disks or partitions on all nodes
[root@ceph-deploy my-cluster]# ceph-deploy osd activate ceph-1:/dev/vdb1 ceph-2:/dev/vdb1 ceph-3:/dev/vdb1

It fails:
[ceph-3][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/vdb1

In frustration, I shut the machine down and started it again
[root@ceph-3 ~]# init 0
Connection to 192.168.101.13 closed by remote host.
Connection to 192.168.101.13 closed.

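After the restart the OSD is back up, but the PG problem does not seem to be resolved. In the ceph osd tree output osd.2 is up but has a CRUSH weight of 0, so CRUSH will not place any data on it, which probably explains why the PGs are still undersized.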
[root@ceph-1 ~]# ceph -s
    cluster 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
     health HEALTH_WARN
            64 pgs degraded
            64 pgs stuck degraded
            64 pgs stuck unclean
            64 pgs stuck undersized
            64 pgs undersized
            recovery 269/819 objects degraded (32.845%)
     monmap e1: 1 mons at {ceph-1=192.168.101.11:6789/0}
            election epoch 6, quorum 0 ceph-1
     osdmap e53: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v14368: 72 pgs, 2 pools, 420 bytes data, 275 objects
            5446 MB used, 55960 MB / 61406 MB avail
            269/819 objects degraded (32.845%)
                  64 active+undersized+degraded
                   8 active+clean
[root@ceph-1 ~]# ceph osd tree
ID WEIGHT TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.03897 root default
-2 0.01949     host ceph-1
0 0.01949         osd.0        up 1.00000          1.00000
-3 0.01949     host ceph-2
1 0.01949         osd.1        up 1.00000          1.00000
-4       0     host ceph-3
2       0 osd.2                up 1.00000          1.00000
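
A small filter makes this kind of problem easier to spot. The sketch below reads `ceph osd tree` text on stdin and prints every OSD whose CRUSH weight is 0. It assumes the column layout shown above (ID, WEIGHT, then the OSD name in the third field); the function name `zero_weight_osds` is made up for illustration.

```shell
# Read `ceph osd tree` output on stdin and print OSDs with CRUSH weight 0.
zero_weight_osds() {
    awk '$3 ~ /^osd\./ && $2 + 0 == 0 { print $3 }'
}

# Sample input copied from the output above.
zero_weight_osds <<'EOF'
ID WEIGHT TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.03897 root default
-2 0.01949     host ceph-1
0 0.01949         osd.0        up 1.00000          1.00000
-3 0.01949     host ceph-2
1 0.01949         osd.1        up 1.00000          1.00000
-4       0     host ceph-3
2       0 osd.2                up 1.00000          1.00000
EOF
```

On a live cluster this would be used as `ceph osd tree | zero_weight_osds`. If an OSD shows up here, `ceph osd crush reweight <name> <weight>` is one way to give it a weight again; whether that would have been the right fix in this case is untested.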

Add one disk each to ceph-1 and ceph-2, then create OSDs on them
[root@ceph-deploy my-cluster]# ceph-deploy --overwrite-conf osd create ceph-1:/dev/vdd ceph-2:/dev/vdd

Check the cluster status; the PG count now appears to be too small
[root@ceph-1 ~]# ceph -s
    cluster 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
     health HEALTH_WARN
            14 pgs degraded
            14 pgs stuck degraded
            64 pgs stuck unclean
            14 pgs stuck undersized
            14 pgs undersized
            recovery 188/819 objects degraded (22.955%)
            recovery 200/819 objects misplaced (24.420%)
            too few PGs per OSD (28 < min 30)
     monmap e1: 1 mons at {ceph-1=192.168.101.11:6789/0}
            election epoch 6, quorum 0 ceph-1
     osdmap e63: 5 osds: 5 up, 5 in; 50 remapped pgs
            flags sortbitwise,require_jewel_osds
      pgmap v14408: 72 pgs, 2 pools, 420 bytes data, 275 objects
            5663 MB used, 104 GB / 109 GB avail
            188/819 objects degraded (22.955%)
            200/819 objects misplaced (24.420%)
                  26 active+remapped
                  24 active
                  14 active+undersized+degraded
                   8 active+clean
Increase pg_num and pgp_num
[root@ceph-1 ~]# ceph osd pool set rbd pg_num 128
[root@ceph-1 ~]# ceph osd pool set rbd pgp_num 128
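
For picking the number, a rule of thumb from the Ceph documentation of that era is roughly (number of OSDs x 100) / replica size, rounded up to the next power of two, as a total across all pools. The sketch below just does that arithmetic. It is a guideline, not what the author used (128 was chosen above), and the function name `suggest_pg_total` is hypothetical.

```shell
# Rule-of-thumb total PG count: (OSDs * 100) / replica size,
# rounded up to the next power of two.
suggest_pg_total() {
    osds="$1"
    size="$2"
    target=$(( osds * 100 / size ))
    pg=1
    while [ "$pg" -lt "$target" ]; do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
}

suggest_pg_total 5 3   # 5 OSDs, 3 replicas
suggest_pg_total 5 2   # 5 OSDs, 2 replicas
```

Keep in mind that raising pg_num splits PGs and makes them re-peer, so some temporary unhealthy states right after the change are to be expected.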

The status then turns to HEALTH_ERR
[root@ceph-1 ~]# ceph -s
    cluster 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
     health HEALTH_ERR
            118 pgs are stuck inactive for more than 300 seconds
            118 pgs peering
            118 pgs stuck inactive
            128 pgs stuck unclean
            recovery 16/657 objects misplaced (2.435%)
     monmap e2: 2 mons at {ceph-1=192.168.101.11:6789/0,ceph-3=192.168.101.13:6789/0}
            election epoch 8, quorum 0,1 ceph-1,ceph-3
     osdmap e74: 5 osds: 5 up, 5 in; 55 remapped pgs
            flags sortbitwise,require_jewel_osds
      pgmap v14459: 136 pgs, 2 pools, 356 bytes data, 221 objects
            5665 MB used, 104 GB / 109 GB avail
            16/657 objects misplaced (2.435%)
                  73 peering
                  45 remapped+peering
                  10 active+remapped
                   8 active+clean
[root@ceph-1 ~]# less /etc/ceph/ceph.co

So I rebooted all three OSD machines again, and after the reboot another OSD was down
[root@ceph-1 ~]# ceph -s
2018-07-25 15:18:17.207665 7fb4ec2ee700 0 -- :/1038496581 >> 192.168.101.12:6789/0 pipe(0x7fb4e8063fa0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb4e805c610).fault
    cluster 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
     health HEALTH_WARN
            16 pgs degraded
            59 pgs stuck unclean
            16 pgs undersized
            recovery 134/819 objects degraded (16.361%)
            recovery 88/819 objects misplaced (10.745%)
            1/5 in osds are down
     monmap e2: 2 mons at {ceph-1=192.168.101.11:6789/0,ceph-3=192.168.101.13:6789/0}
            election epoch 12, quorum 0,1 ceph-1,ceph-3
     osdmap e95: 5 osds: 4 up, 5 in; 43 remapped pgs
            flags sortbitwise,require_jewel_osds
      pgmap v14529: 136 pgs, 2 pools, 420 bytes data, 275 objects
            5668 MB used, 104 GB / 109 GB avail
            134/819 objects degraded (16.361%)
            88/819 objects misplaced (10.745%)
                  77 active+clean
                  39 active+remapped
                  16 active+undersized+degraded
                   4 active

[root@ceph-1 ~]# ceph osd tree
2018-07-25 15:22:25.573039 7fe5ff87c700 0 -- :/3787750993 >> 192.168.101.12:6789/0 pipe(0x7fe604063fd0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fe60405c640).fault
ID WEIGHT TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.10725 root default
-2 0.04388     host ceph-1
0 0.01949         osd.0        up 1.00000          1.00000
3 0.02440         osd.3        up 1.00000          1.00000
-3 0.04388     host ceph-2
1 0.01949         osd.1      down        0          1.00000
4 0.02440         osd.4        up 1.00000          1.00000
-4 0.01949     host ceph-3
2 0.01949         osd.2        up 1.00000          1.00000

After running out, rm, crush rm and auth del for the failed disk's OSD, the cluster is healthy again
[root@ceph-1 ~]# ceph -s
    cluster 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
     health HEALTH_OK
     monmap e2: 2 mons at {ceph-1=192.168.101.11:6789/0,ceph-3=192.168.101.13:6789/0}
            election epoch 12, quorum 0,1 ceph-1,ceph-3
     osdmap e102: 4 osds: 4 up, 4 in
            flags sortbitwise,require_jewel_osds
      pgmap v14597: 136 pgs, 2 pools, 356 bytes data, 270 objects
            5559 MB used, 86551 MB / 92110 MB avail
                 136 active+clean

Replace the failed disk and add the new one back into the Ceph cluster (expanding capacity uses the same procedure)
[root@ceph-deploy my-cluster]# ceph-deploy disk list ceph-2
[root@ceph-deploy my-cluster]# ceph-deploy disk zap ceph-2:vdb
[root@ceph-deploy my-cluster]# ceph-deploy --overwrite-conf osd create ceph-2:vdb:/dev/vdc1

Right now the status shows HEALTH_ERR
[root@ceph-1 ~]# ceph -s
    cluster 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
     health HEALTH_ERR
            13 pgs are stuck inactive for more than 300 seconds
            50 pgs degraded
            2 pgs peering
            1 pgs recovering
            17 pgs recovery_wait
            13 pgs stuck inactive
            23 pgs stuck unclean
            recovery 67/798 objects degraded (8.396%)
     monmap e2: 2 mons at {ceph-1=192.168.101.11:6789/0,ceph-3=192.168.101.13:6789/0}
            election epoch 12, quorum 0,1 ceph-1,ceph-3
     osdmap e110: 5 osds: 5 up, 5 in
            flags sortbitwise,require_jewel_osds
      pgmap v14633: 136 pgs, 2 pools, 356 bytes data, 268 objects
            5669 MB used, 104 GB / 109 GB avail
            67/798 objects degraded (8.396%)
                  79 active+clean
                  32 activating+degraded
                  17 active+recovery_wait+degraded
                   5 activating
                   2 peering
                   1 active+recovering+degraded
client io 0 B/s wr, 0 op/s rd, 5 op/s wr

A little while later everything is back to normal
[root@ceph-1 ~]# ceph -s
    cluster 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
     health HEALTH_OK
     monmap e2: 2 mons at {ceph-1=192.168.101.11:6789/0,ceph-3=192.168.101.13:6789/0}
            election epoch 12, quorum 0,1 ceph-1,ceph-3
     osdmap e110: 5 osds: 5 up, 5 in
            flags sortbitwise,require_jewel_osds
      pgmap v14666: 136 pgs, 2 pools, 356 bytes data, 267 objects
            5669 MB used, 104 GB / 109 GB avail
                 136 active+clean

Problem: adding a mon fails
[root@ceph-deploy my-cluster]# ceph-deploy --overwrite-conf mon create ceph-2
[ceph-2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph-2][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors

[root@ceph-2 ~]# less /var/log/ceph/ceph-mon.ceph-2.log
2018-07-25 15:52:02.566212 7efeec7d9780 -1 no public_addr or public_network specified, and mon.ceph-2 not present in monmap or ceph.conf

Cause: public_network is not configured in ceph.conf
[global]
fsid = 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
mon_initial_members = ceph-1,ceph-2,ceph-3
mon_host = 192.168.101.11,192.168.101.12,192.168.101.13
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 2

Edit ceph.conf
[root@ceph-deploy my-cluster]# vi ceph.conf
[global]
fsid = 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
mon_initial_members = ceph-1,ceph-2,ceph-3
mon_host = 192.168.101.11,192.168.101.12,192.168.101.13
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 2
public_network = 192.168.122.0/24
cluster_network = 192.168.101.0/24

Push the new configuration file to every node
[root@ceph-deploy my-cluster]# ceph-deploy --overwrite-conf config push ceph-1 ceph-2 ceph-3

Add ceph-2 as a mon
[root@ceph-deploy my-cluster]# ceph-deploy mon add ceph-2

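After it is added successfully, I notice that the IP of ceph-2 in the monmap differs from the others: it is on the 122 subnet. According to the configuration file, ceph-1 and ceph-3 should be on the 122 subnet as well.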
[root@ceph-1 ~]# ceph -s
    cluster 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
     health HEALTH_OK
     monmap e3: 3 mons at {ceph-1=192.168.101.11:6789/0,ceph-2=192.168.122.12:6789/0,ceph-3=192.168.101.13:6789/0}
            election epoch 14, quorum 0,1,2 ceph-1,ceph-3,ceph-2
     osdmap e110: 5 osds: 5 up, 5 in
            flags sortbitwise,require_jewel_osds
      pgmap v14666: 136 pgs, 2 pools, 356 bytes data, 267 objects
            5669 MB used, 104 GB / 109 GB avail
                 136 active+clean
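
This kind of mismatch can be caught before pushing a config by checking each mon address against public_network. Below is a minimal sketch in plain shell arithmetic, IPv4 only; the helper names `ip_to_int` and `in_cidr` are made up for illustration, and the addresses are the ones from the monmap above.

```shell
# Pack a dotted-quad IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on "." into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds if address $1 lies inside network $2 (CIDR notation).
in_cidr() {
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "${2%/*}") & mask )) ]
}

for addr in 192.168.101.11 192.168.122.12 192.168.101.13; do
    if in_cidr "$addr" 192.168.122.0/24; then
        echo "$addr is inside public_network"
    else
        echo "$addr is OUTSIDE public_network"
    fi
done
```

With the monmap above, only ceph-2 falls inside public_network; the two original mons are still registered with their 192.168.101.x addresses.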

So I change the mon addresses in ceph.conf to the 122 subnet
[root@ceph-deploy my-cluster]# vi ceph.conf
[global]
fsid = 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe
mon_initial_members = ceph-1,ceph-2,ceph-3
mon_host = 192.168.122.11,192.168.122.12,192.168.122.13
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 2
public_network = 192.168.122.0/24
cluster_network = 192.168.101.0/24

Push the configuration again
[root@ceph-deploy my-cluster]# ceph-deploy --overwrite-conf config push ceph-1 ceph-2 ceph-3

Destroy two of the mons
[root@ceph-deploy my-cluster]# ceph-deploy mon destroy ceph-1 ceph-3

Then the entire cluster stops working
[root@ceph-1 ~]# ceph -s
2018-07-25 16:35:21.723736 7f47dedfb700 0 -- 192.168.122.11:0/4277586904 >> 192.168.122.13:6789/0 pipe(0x7f47c8000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f47c8001f90).fault with nothing to send, going to standby
2018-07-25 16:35:27.723930 7f47dedfb700 0 -- 192.168.122.11:0/4277586904 >> 192.168.122.11:6789/0 pipe(0x7f47c8005330 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f47c8002410).fault with nothing to send, going to standby
2018-07-25 16:35:33.725130 7f47deffd700 0 -- 192.168.122.11:0/4277586904 >> 192.168.122.13:6789/0 pipe(0x7f47c8005330 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f47c80046e0).fault with nothing to send, going to standby

[root@ceph-1 ~]# ceph osd tree
2018-07-25 16:35:21.723736 7f47dedfb700 0 -- 192.168.122.11:0/4277586904 >> 192.168.122.13:6789/0 pipe(0x7f47c8000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f47c8001f90).fault with nothing to send, going to standby
2018-07-25 16:35:27.723930 7f47dedfb700 0 -- 192.168.122.11:0/4277586904 >> 192.168.122.11:6789/0 pipe(0x7f47c8005330 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f47c8002410).fault with nothing to send, going to standby
2018-07-25 16:35:33.725130 7f47deffd700 0 -- 192.168.122.11:0/4277586904 >> 192.168.122.13:6789/0 pipe(0x7f47c8005330 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f47c80046e0).fault with nothing to send, going to standby
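
As general background (not necessarily the exact cause of the failure above): Ceph monitors need a strict majority of the monitors listed in the monmap to form a quorum, so it is worth doing the arithmetic before taking monitors away. A minimal sketch, with hypothetical helper names `quorum_size` and `has_quorum`:

```shell
# Smallest number of monitors that forms a strict majority of the monmap.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds if $2 reachable monitors out of $1 in the monmap can form a quorum.
has_quorum() {
    [ "$2" -ge "$(quorum_size "$1")" ]
}

for alive in 3 2 1; do
    if has_quorum 3 "$alive"; then
        echo "3 mons in monmap, $alive reachable: quorum"
    else
        echo "3 mons in monmap, $alive reachable: NO quorum"
    fi
done
```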

It seems they cannot be added back either
[root@ceph-deploy my-cluster]# ceph-deploy mon add ceph-1 ceph-3
[ceph-1][WARNIN] 2018-07-25 16:37:52.760218 7f06739b9700 0 -- 192.168.122.11:0/2929495808 >> 192.168.122.11:6789/0 pipe(0x7f0668000c80 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f0668005c20).fault with nothing to send, going to standby
[ceph-1][WARNIN] 2018-07-25 16:37:55.760830 7f06738b8700 0 -- 192.168.122.11:0/2929495808 >> 192.168.122.13:6789/0 pipe(0x7f066800d5e0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f066800e8a0).fault with nothing to send, going to standby
[ceph-1][WARNIN] 2018-07-25 16:37:58.760748 7f06739b9700 0 -- 192.168.122.11:0/2929495808 >> 192.168.122.11:6789/0 pipe(0x7f0668000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f066800be40).fault with nothing to send, going to standby

Making matters worse, I destroy the last mon as well
[root@ceph-deploy my-cluster]# ceph-deploy mon destroy ceph-2

[root@ceph-deploy my-cluster]# ceph-deploy new ceph-1 ceph-2 ceph-3

[root@ceph-deploy my-cluster]# ceph-deploy --overwrite-conf mon create-initial
[ceph-1][ERROR ] "ceph auth get-or-create for keytype admin returned 22
[ceph-1][DEBUG ] Error EINVAL: unknown cap type 'mgr'
[ceph-1][ERROR ] Failed to return 'admin' key from host ceph-1
[ceph-2][ERROR ] "ceph auth get-or-create for keytype admin returned 22
[ceph-2][DEBUG ] Error EINVAL: unknown cap type 'mgr'
[ceph-2][ERROR ] Failed to return 'admin' key from host ceph-2
[ceph-3][ERROR ] "ceph auth get-or-create for keytype admin returned 22
[ceph-3][DEBUG ] Error EINVAL: unknown cap type 'mgr'
[ceph-3][ERROR ] Failed to return 'admin' key from host ceph-3
[ceph_deploy.gatherkeys][ERROR ] Failed to connect to host:ceph-1, ceph-2, ceph-3
[ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmpnPWk4d
[ceph_deploy][ERROR ] RuntimeError: Failed to connect any mon

[root@ceph-deploy my-cluster]# ceph-deploy mon add ceph-1
[ceph-1][INFO ] monitor: mon.ceph-1 is running

[root@ceph-deploy my-cluster]# ceph-deploy mon add ceph-2
[ceph-2][INFO ] monitor: mon.ceph-2 is running

[root@ceph-deploy my-cluster]# ceph-deploy mon add ceph-3
[ceph-3][INFO ] monitor: mon.ceph-3 is running

[root@ceph-1 ceph-ceph-1]# ceph -s
2018-07-25 20:42:07.965513 7f1482a91700 0 librados: client.admin authentication error (1) Operation not permitted
Error connecting to cluster: PermissionError

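Normally, running ceph -s starts a client that connects to the Ceph cluster, and by default this client logs in with the client.admin account and its key. So a plain ceph -s is equivalent to ceph -s --name client.admin --keyring /etc/ceph/ceph.client.admin.keyring. Note that every Ceph command run on the command line starts a client, talks to the cluster, and then shuts the client down. Here is a very common error that is easy to run into when you are new to Ceph: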

[root@blog ~]# ceph -s
2017-08-03 02:22:27.352516 7fbd157b7700 0 librados: client.admin authentication error (1) Operation not permitted
Error connecting to cluster: PermissionError

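The error message is easy to understand: the operation is not permitted, meaning authentication failed. Since we are using the default client.admin user and its key, the key in the keyring does not match what the Ceph cluster has on record. In other words, /etc/ceph/ceph.client.admin.keyring was most likely left over from a previous cluster, or it contains a wrong key. In that case, run ceph auth list (or ceph auth get for a single entity) as the mon. user to see the correct key: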

[root@ceph-1 ceph]# ceph auth get client.admin --name mon. --keyring /var/lib/ceph/mon/ceph-ceph-1/keyring
Error ENOENT: failed to find client.admin in keyring
[root@ceph-1 ceph]#
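
The equivalence described above, between a bare command and its explicit --name/--keyring form, can be spelled out with a tiny helper that only prints the expanded command. It assumes the cluster name is "ceph" and that the keyring sits at /etc/ceph/ceph.<entity>.keyring; the function name `explicit_ceph_cmd` is made up for illustration.

```shell
# Print the explicit form of a ceph command for a given entity.
# Usage: explicit_ceph_cmd <entity> <ceph arguments...>
explicit_ceph_cmd() {
    entity="$1"
    shift
    echo "ceph $* --name $entity --keyring /etc/ceph/ceph.$entity.keyring"
}

explicit_ceph_cmd client.admin -s
# prints: ceph -s --name client.admin --keyring /etc/ceph/ceph.client.admin.keyring
```

The mon. user is different: its keyring lives in the monitor's data directory, which is why the commands in this post pass --keyring /var/lib/ceph/mon/ceph-ceph-1/keyring by hand.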

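Take a look at the cluster as the mon. user. Note that the fsid is now 053670e9-9b12-4297-aa04-41c430091f90 rather than the original 72f44b06-b8d3-44cc-bb8b-2048f5b4acfe, and there are no OSDs: this looks like the brand-new cluster produced by the ceph-deploy new and mon create-initial run above, not the old one.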
[root@ceph-1 ceph]# ceph -s --name mon. --keyring /var/lib/ceph/mon/ceph-ceph-1/keyring
    cluster 053670e9-9b12-4297-aa04-41c430091f90
     health HEALTH_ERR
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs stuck inactive
            64 pgs stuck unclean
            no osds
     monmap e1: 3 mons at {ceph-1=192.168.101.11:6789/0,ceph-2=192.168.101.12:6789/0,ceph-3=192.168.101.13:6789/0}
            election epoch 8, quorum 0,1,2 ceph-1,ceph-2,ceph-3
     osdmap e1: 0 osds: 0 up, 0 in
            flags sortbitwise,require_jewel_osds
      pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating

Fetch the key for client.admin
[root@ceph-1 ceph]# ceph auth get client.admin --name mon. --keyring /var/lib/ceph/mon/ceph-ceph-1/keyring
Error ENOENT: failed to find client.admin in keyring

Add the client.admin user
[root@ceph-1 ceph]# ceph auth add client.admin --name mon. --keyring /var/lib/ceph/mon/ceph-ceph-1/keyring

Fetch the client.admin key again
[root@ceph-1 ceph]# ceph auth get client.admin --name mon. --keyring /var/lib/ceph/mon/ceph-ceph-1/keyring
exported keyring for client.admin
[client.admin]
   key = AQAIf1hbmuPXBxAA5Q3g/Jz8gerf+S6znEHLBQ==

Update the key in the local client.admin keyring
[root@ceph-1 ceph]# vi ceph.client.admin.keyring
[client.admin]
#       key = AQAnPVBbJJWsMhAAKEqaHkWdwEWndOvqDjtjXA==
        key = AQAIf1hbmuPXBxAA5Q3g/Jz8gerf+S6znEHLBQ==
        caps mds = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"
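
Instead of comparing the two keys by eye, the active key can be pulled out of keyring-format text with a small filter. The sketch below reads keyring text on stdin and prints the value of the uncommented `key = ...` line; the function name `keyring_key` is hypothetical, and the sample input is the file shown above.

```shell
# Print the key from keyring-format text on stdin, ignoring commented lines.
keyring_key() {
    awk '$1 == "key" && $2 == "=" { print $3 }'
}

keyring_key <<'EOF'
[client.admin]
#       key = AQAnPVBbJJWsMhAAKEqaHkWdwEWndOvqDjtjXA==
        key = AQAIf1hbmuPXBxAA5Q3g/Jz8gerf+S6znEHLBQ==
        caps mds = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"
EOF
```

On a live node the same filter could be fed from both `/etc/ceph/ceph.client.admin.keyring` and the output of `ceph auth get client.admin` run as mon., and the two results compared. A matching key is only half of the story, though: the capabilities that count are the ones stored on the monitors, not the caps lines in the local file.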

Check the cluster status
[root@ceph-1 ceph]# ceph -s
2018-07-25 21:50:40.512039 7f0ca92d0700 0 librados: client.admin authentication error (13) Permission denied

Grant capabilities to the client.admin user (it was added above without any caps)
[root@ceph-1 ceph]# ceph auth add client.admin mon 'allow r' osd 'allow rw'
2018-07-25 21:57:45.263271 7f68398ea700 0 librados: client.admin authentication error (13) Permission denied

I had forgotten to add read permission to the ceph.client.admin.keyring that was newly generated by the earlier mon create-initial
[root@ceph-1 ceph]# chmod +r /etc/ceph/ceph.client.admin.keyring

[root@ceph-1 ceph]# ceph -s
2018-07-25 22:06:17.167512 7f449b116700 0 librados: client.admin authentication error (13) Permission denied

Try again to grant capabilities to client.admin, this time as the mon. user
[root@ceph-1 ceph]# ceph auth add client.admin mon 'allow r' osd 'allow rw' --name mon. --keyring /var/lib/ceph/mon/ceph-ceph-1/keyring
Error EINVAL: entity client.admin exists but caps do not match

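After a long struggle I finally found a method via Google: use ceph auth caps, which updates the capabilities of an existing entity, instead of ceph auth add. Once the client.admin capabilities were restored, the cluster showed that all OSDs were gone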
[root@ceph-1 ~]# cd /var/lib/ceph/mon
[root@ceph-1 mon]# ls
ceph-ceph-1
[root@ceph-1 mon]# cd ceph-ceph-1/
[root@ceph-1 ceph-ceph-1]# ls
done keyring store.db systemd
[root@ceph-1 ceph-ceph-1]# ceph -n mon. --keyring keyring auth caps client.admin mds 'allow *' osd 'allow *' mon 'allow *'
updated caps for client.admin
[root@ceph-1 ceph-ceph-1]# ceph -s
    cluster 053670e9-9b12-4297-aa04-41c430091f90
     health HEALTH_ERR
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs stuck inactive
            64 pgs stuck unclean
            no osds
     monmap e1: 3 mons at {ceph-1=192.168.101.11:6789/0,ceph-2=192.168.101.12:6789/0,ceph-3=192.168.101.13:6789/0}
            election epoch 16, quorum 0,1,2 ceph-1,ceph-2,ceph-3
     osdmap e1: 0 osds: 0 up, 0 in
            flags sortbitwise,require_jewel_osds
      pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating

[root@ceph-1 ceph-ceph-1]# ceph osd tree
ID WEIGHT TYPE NAME    UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1      0 root default

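Running lsblk on each node shows that all the OSD mount points have been unmounted automatically. I take this opportunity to resize the disks so that they are all 20G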
[root@ceph-1 ceph-ceph-1]# lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0              11:0    1 1024M  0 rom
vda             252:0    0  100G  0 disk
├─vda1          252:1    0    1G  0 part /boot
└─vda2          252:2    0   99G  0 part
  ├─centos-root 253:0    0   50G  0 lvm  /
  ├─centos-swap 253:1    0    2G  0 lvm  [SWAP]
  └─centos-home 253:2    0   47G  0 lvm  /home
vdb             252:16   0   20G  0 disk
└─vdb1          252:17   0   20G  0 part
vdc             252:32   0   20G  0 disk
└─vdc1          252:33   0    5G  0 part
vdd             252:48   0   30G  0 disk
├─vdd1          252:49   0   25G  0 part
└─vdd2          252:50   0    5G  0 part

Zap the disks and prepare them again
[root@ceph-deploy my-cluster]# ceph-deploy disk zap ceph-1:vdb ceph-2:vdb ceph-3:vdb
[root@ceph-deploy my-cluster]# ceph-deploy osd prepare ceph-1:vdb:vdc ceph-2:vdb:vdc ceph-3:vdb:vdc

Activate the OSD; the failure appears to be caused by OSD authentication
[root@ceph-deploy my-cluster]# ceph-deploy osd activate ceph-1:vdb1:vdc
[ceph-1][WARNIN] ceph_disk.main.Error: Error: ceph osd create failed: Command '/usr/bin/ceph' returned non-zero exit status 1: 2018-07-26 10:34:36.851527 7f678c625700 0 librados: client.bootstrap-osd authentication error (1) Operation not permitted
[ceph-1][WARNIN] Error connecting to cluster: PermissionError
[ceph-1][WARNIN]
[ceph-1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/vdb1

I will stop here for now and leave this cluster as it is, and come back to it once I understand cephx properly

For a reinstall, see the following
ceph-deploy purgedata {ceph-node} [{ceph-node}] ## wipe the data
ceph-deploy forgetkeys                ## discard the previously generated keys
ceph-deploy purge {ceph-node} [{ceph-node}]     ## uninstall the Ceph packages
If you execute purge, you must re-install Ceph.

ceph-deploy new {initial-monitor-node(s)}
ceph-deploy install {ceph-node} [{ceph-node}]
ceph-deploy mon create-initial
ceph-deploy disk list {node-name [node-name]...}
ceph-deploy disk zap osdserver1:sda
ceph-deploy osd prepare ceph-osd1:/dev/sda ceph-osd1:/dev/sdb
ceph-deploy osd activate ceph-osd1:/dev/sda1 ceph-osd1:/dev/sdb1
ceph-deploy admin {admin-node} {ceph-node}
chmod +r /etc/ceph/ceph.client.admin.keyring

Original source: https://www.cnblogs.com/wangbin/p/11661726.html