• Troubleshooting why rexray cannot create a Ceph RBD docker volume on CentOS


    Background

      We use Docker's rexray plugin to create docker volumes backed by Ceph RBD devices, but creation always fails.

    # docker volume create --driver=rexray --opt=size=5 --name=cephrbd-book
    Error response from daemon: create test_cephrbd_volume: VolumeDriver.Create: {"Error":"Failed to create new volume"}
    

      OS version:

    # lsb_release -a
    LSB Version:	:core-4.1-amd64:core-4.1-noarch
    Distributor ID:	CentOS
    Description:	CentOS Linux release 7.4.1708 (Core) 
    Release:	    7.4.1708
    Codename:   	Core
    

      Ceph version:

    # ceph version
    ceph version 0.94.5
    

      rexray version:

    # rexray version
    REX-Ray
    -------
    Binary: /usr/bin/rexray
    Flavor: client+agent+controller
    SemVer: 0.9.0
    OsArch: Linux-x86_64
    Branch: (detached from 2a7458d
    Commit: 2a7458dd90a79c673463e14094377baf9fc8695e
    Formed: Wed, 26 Jul 2017 14:35:37 CST
    
    libStorage
    ----------
    SemVer: 0.6.0
    OsArch: Linux-x86_64
    Branch: (detached from fa055d6
    Commit: fa055d6da595602715bdfd5541b4aa6d4dcbcbd9
    Formed: Wed, 26 Jul 2017 14:35:11 CST
    

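    Analysis

      We had previously set up the same docker + ceph + rexray stack on Ubuntu 16.04, and creating volumes with the docker command worked without errors there. The normal flow is:

    1. Create the docker volume: rexray runs rbd create to create an RBD device.
    2. Create a docker container that uses the volume: rexray maps the RBD device onto the docker host as a /dev/rbd device, then uses mount to mount that device under /var/lib/libstorage/volumes/ for the container to use.
    3. Remove the container: the /dev/rbd device is first unmounted from /var/lib/libstorage/volumes/, then rbd unmap removes the device mapping from the docker host.
    4. Remove the docker volume: rexray runs rbd rm to delete the RBD device.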
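
      The four stages above can be sketched as the command lines that would be run at each step. This is only an illustration of the flow, not rexray's actual code: the function name lifecycleCommands, the device path /dev/rbd0 and the mount subdirectory are hypothetical, and the size is passed as a plain number with no unit suffix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// lifecycleCommands returns, for illustration only, the commands that
// correspond to the four stages of an RBD-backed docker volume.
// image, device and mountPoint are hypothetical example parameters.
func lifecycleCommands(image string, sizeMB int64, device, mountPoint string) [][]string {
	return [][]string{
		// 1. docker volume create: create the RBD image.
		{"rbd", "create", image, "--size", strconv.FormatInt(sizeMB, 10)},
		// 2. container start: map the image to a block device, then mount it.
		{"rbd", "map", image},
		{"mount", device, mountPoint},
		// 3. container removal: unmount, then unmap the device.
		{"umount", mountPoint},
		{"rbd", "unmap", device},
		// 4. docker volume rm: delete the RBD image.
		{"rbd", "rm", image},
	}
}

func main() {
	cmds := lifecycleCommands(
		"cephrbd-book", 5120, "/dev/rbd0",
		"/var/lib/libstorage/volumes/cephrbd-book", // assumed subdirectory
	)
	// Print the commands instead of executing them.
	for _, c := range cmds {
		fmt.Println(strings.Join(c, " "))
	}
}
```

      The sketch only prints the commands, so it can be run on a machine without Ceph installed.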
    

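      In our environment only the first step fails; the other three steps show no problems. Moreover, if we skip docker volume create and create the RBD device directly with "rbd create", no error occurs and the device works normally. That is the puzzling part.
      The docker log only reports "Failed to create new volume" and contains nothing else useful.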

    # journalctl -xu docker
    ...
    Jan 22 10:45:13 dcos-agent2 dockerd[66886]: time="2018-01-22T10:45:13.937645601+08:00" level=error msg="Handler for POST /v1.29/volumes/create returned error: create cephrbd-book: VolumeDriver.Create: {"Error":"Failed to create new volume"}"
    

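      Since Docker's log has nothing more useful, the next step is the rexray log. This requires enabling rexray's debug option (see the reference for how to enable it) so that more detail is logged.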

    # vim /var/log/rexray/rexray.log
    ...
    time="2018-01-19T19:12:32+08:00" level=info msg="    -------------------------- HTTP REQUEST (CLIENT) -------------------------"
    time="2018-01-19T19:12:32+08:00" level=info msg="    GET /volumes/rbd?attachments=0 HTTP/1.1"
    time="2018-01-19T19:12:32+08:00" level=info msg="    Host: libstorage-server"
    time="2018-01-19T19:12:32+08:00" level=info msg="    Libstorage-Instanceid: rbd=109.105.115.73"
    time="2018-01-19T19:12:32+08:00" level=info msg="    Libstorage-Localdevices: rbd="
    time="2018-01-19T19:12:32+08:00" level=info msg="    Libstorage-Tx: txID=b9fb044b-d47a-4ac5-7533-5824b8dc737a, txCR=1516360352"
    time="2018-01-19T19:12:32+08:00" level=info msg="    "
    time="2018-01-19T19:12:33+08:00" level=info
    time="2018-01-19T19:12:33+08:00" level=info msg="    -------------------------- HTTP RESPONSE (CLIENT) -------------------------"
    time="2018-01-19T19:12:33+08:00" level=info msg="    HTTP/1.1 200 OK"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Content-Length: 228"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Content-Type: application/json"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Date: Fri, 19 Jan 2018 11:12:33 GMT"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Libstorage-Servername: jade-chopper-ky"
    time="2018-01-19T19:12:33+08:00" level=info msg="    "
    time="2018-01-19T19:12:33+08:00" level=info msg="    {"
    time="2018-01-19T19:12:33+08:00" level=info msg="      "rbd.test_fs_device2": {"
    time="2018-01-19T19:12:33+08:00" level=info msg="        "name": "test_fs_device2","
    time="2018-01-19T19:12:33+08:00" level=info msg="        "size": 16,"
    time="2018-01-19T19:12:33+08:00" level=info msg="        "id": "rbd.test_fs_device2","
    time="2018-01-19T19:12:33+08:00" level=info msg="        "type": "rbd""
    time="2018-01-19T19:12:33+08:00" level=info msg="      },"
    time="2018-01-19T19:12:33+08:00" level=info msg="      "rbd.test_majk": {"
    time="2018-01-19T19:12:33+08:00" level=info msg="        "name": "test_majk","
    time="2018-01-19T19:12:33+08:00" level=info msg="        "id": "rbd.test_majk","
    time="2018-01-19T19:12:33+08:00" level=info msg="        "type": "rbd""
    time="2018-01-19T19:12:33+08:00" level=info msg="      }"
    time="2018-01-19T19:12:33+08:00" level=info msg="    }"
    time="2018-01-19T19:12:33+08:00" level=info
    time="2018-01-19T19:12:33+08:00" level=info msg="    -------------------------- HTTP REQUEST (CLIENT) -------------------------"
    time="2018-01-19T19:12:33+08:00" level=info msg="    POST /volumes/rbd HTTP/1.1"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Host: libstorage-server"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Libstorage-Instanceid: rbd=109.105.115.73"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Libstorage-Localdevices: rbd="
    time="2018-01-19T19:12:33+08:00" level=info msg="    Libstorage-Tx: txID=b1565614-1b5f-4de2-5756-b74fb99887aa, txCR=1516360353"
    time="2018-01-19T19:12:33+08:00" level=info msg="    "
    time="2018-01-19T19:12:33+08:00" level=info msg="    {"name":"cephrbd-book","availabilityZone":"","iops":0,"size":5,"type":"","opts":{"size":"5"}}"
    time="2018-01-19T19:12:33+08:00" level=info
    time="2018-01-19T19:12:33+08:00" level=info msg="    -------------------------- HTTP RESPONSE (CLIENT) -------------------------"
    time="2018-01-19T19:12:33+08:00" level=info msg="    HTTP/1.1 500 Internal Server Error"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Content-Length: 319"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Content-Type: application/json"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Date: Fri, 19 Jan 2018 11:12:33 GMT"
    time="2018-01-19T19:12:33+08:00" level=info msg="    Libstorage-Servername: jade-chopper-ky"
    time="2018-01-19T19:12:33+08:00" level=info msg="    "
    time="2018-01-19T19:12:33+08:00" level=info msg="    {"
    time="2018-01-19T19:12:33+08:00" level=info msg="      "message": "Failed to create new volume","
    time="2018-01-19T19:12:33+08:00" level=info msg="      "status": 500,"
    time="2018-01-19T19:12:33+08:00" level=info msg="      "error": {"
    time="2018-01-19T19:12:33+08:00" level=info msg="        "driverName": "rbd","
    time="2018-01-19T19:12:33+08:00" level=info msg="        "inner": {"
    time="2018-01-19T19:12:33+08:00" level=info msg="          "inner": "Error running command: [rbd: strict_strtoll: garbage at end of string. got: '5G'\n]","
    time="2018-01-19T19:12:33+08:00" level=info msg="          "msg": "unable to create rbd""
    time="2018-01-19T19:12:33+08:00" level=info msg="        },"
    time="2018-01-19T19:12:33+08:00" level=info msg="        "opts.Size": 5,"
    time="2018-01-19T19:12:33+08:00" level=info msg="        "volumeName": "cephrbd-book""
    time="2018-01-19T19:12:33+08:00" level=info msg="      }"
    time="2018-01-19T19:12:33+08:00" level=info msg="    }"
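
      The useful detail in the 500 response above is buried two levels down, under error.inner.inner. As a rough sketch, the nested cause could be pulled out of such a payload as follows. The function name innermostError is hypothetical, and the payload shape is assumed from this one logged response rather than from any libStorage specification.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// innermostError walks the nested "inner" fields of an error payload shaped
// like the logged response and returns the deepest string it finds.
// If there is no nested cause, it falls back to the top-level message.
func innermostError(body []byte) (string, error) {
	var resp struct {
		Message string                 `json:"message"`
		Error   map[string]interface{} `json:"error"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	var node interface{} = resp.Error
	for {
		m, ok := node.(map[string]interface{})
		if !ok {
			break
		}
		inner, exists := m["inner"]
		if !exists {
			break
		}
		if s, isStr := inner.(string); isStr {
			return s, nil
		}
		node = inner // descend one level and keep looking
	}
	return resp.Message, nil
}

func main() {
	// Payload copied from the logged HTTP response.
	body := []byte(`{
	  "message": "Failed to create new volume",
	  "status": 500,
	  "error": {
	    "driverName": "rbd",
	    "inner": {
	      "inner": "Error running command: [rbd: strict_strtoll: garbage at end of string. got: '5G'\n]",
	      "msg": "unable to create rbd"
	    },
	    "opts.Size": 5,
	    "volumeName": "cephrbd-book"
	  }
	}`)
	cause, err := innermostError(body)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%q\n", cause)
}
```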
    

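      The log shows that each docker volume create actually sends two HTTP requests: the first lists the existing RBD devices, and the second is the POST that creates the new RBD device. The response to the second request carries more information: "rbd: strict_strtoll: garbage at end of string. got: '5G'". In other words, the failure most likely happens when rexray invokes the rbd create command. To follow up, we opened the rexray source; the relevant file is rexray/blob/master/libstorage/drivers/storage/rbd/utils/utils.go.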

    //RBDCreate creates a new RBD volume on the cluster
    func RBDCreate(
    	ctx types.Context,
    	pool *string,
    	image *string,
    	sizeGB *int64,
    	objectSize *string,
    	features []*string) error {
    
    	cmd := exec.Command(
    		rbdCmd, "create", poolOpt, *pool,
    		"--object-size", *objectSize,
    		"--size", strconv.FormatInt(*sizeGB, 10)+"G",
    	)
    
    	for _, feature := range features {
    		cmd.Args = append(cmd.Args, "--image-feature")
    		cmd.Args = append(cmd.Args, *feature)
    	}
    
    	cmd.Args = append(cmd.Args, *image)
    	_, _, err := RunCommand(ctx, cmd)
    	if err != nil {
    		return goof.WithError("unable to create rbd", err)
    	}
    
    	return nil
    }
    

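      The code shows that rexray simply calls rbd create to create the RBD device. The difference from our manual invocation is that it formats the size in GB as a string and appends a "G" suffix. We assembled an rbd command the same way and tested it on CentOS: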

    # rbd create cephrbd-test --size 5G
    rbd: strict_strtoll: garbage at end of string. got: '5G'
    

      Sure enough, it fails with the same error. Running the same command on Ubuntu:

    # rbd create cephrbd-test --size 5G
    

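      On Ubuntu it runs without error. The same command gives different results in the two environments, so the most likely cause is a difference in the rbd (Ceph) version. The Ceph version on Ubuntu: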

    # ceph version
    ceph version 10.2.7
    

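      The Ceph version installed by default on CentOS is only 0.94.5 (Hammer), whereas Ubuntu installs the newer 10.2.7 (Jewel) by default. The rbd command in Ceph 0.94.5 cannot parse size arguments that carry a unit suffix such as "G" or "M". That is the root cause.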
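
      To illustrate the failure mode, the sketch below uses Go's strconv.ParseInt as a stand-in for a strict integer parser: like the strict_strtoll in the error message, it rejects "5G" because of the trailing letter. It also shows a suffix-free way a caller could express the size, assuming rbd treats a bare --size value as megabytes (check the rbd man page for the installed version before relying on this). The helper sizeArgMB is hypothetical and is not how rexray builds the argument.

```go
package main

import (
	"fmt"
	"strconv"
)

// sizeArgMB converts a size in GB into a plain megabyte count with no unit
// suffix. Assumption: rbd interprets a bare --size value as megabytes.
func sizeArgMB(sizeGB int64) string {
	return strconv.FormatInt(sizeGB*1024, 10)
}

func main() {
	// A strict integer parse fails on "5G" because of the trailing "G".
	if _, err := strconv.ParseInt("5G", 10, 64); err != nil {
		fmt.Println("strict parse of \"5G\" failed:", err)
	}

	// The suffix-free form parses cleanly as an integer.
	arg := sizeArgMB(5)
	n, err := strconv.ParseInt(arg, 10, 64)
	fmt.Println(arg, n, err)
}
```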

    Solution

      Install a newer Ceph release (Jewel or later) on CentOS.

    P.S. A later post will cover how to install a newer Ceph release on CentOS.
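
      After upgrading, it can be worth checking the installed version before relying on suffixed sizes. The sketch below parses output of the form "ceph version 0.94.5" and treats major version 10 (Jewel) or higher as supporting the "G" suffix. That threshold is a conservative assumption drawn only from the two versions tested in this post; releases in between were not checked. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorVersion extracts the major version number from output such as
// "ceph version 10.2.7" (anything after the version field is ignored).
func majorVersion(output string) (int, error) {
	fields := strings.Fields(output)
	if len(fields) < 3 || fields[0] != "ceph" || fields[1] != "version" {
		return 0, fmt.Errorf("unexpected version output: %q", output)
	}
	parts := strings.SplitN(fields[2], ".", 2)
	return strconv.Atoi(parts[0])
}

// supportsSizeSuffix reports whether the version is Jewel (10.x) or newer.
// Assumption: 10 is used as the cutoff because 0.94.5 failed and 10.2.7
// worked in the tests described in this post.
func supportsSizeSuffix(output string) (bool, error) {
	major, err := majorVersion(output)
	if err != nil {
		return false, err
	}
	return major >= 10, nil
}

func main() {
	for _, out := range []string{"ceph version 0.94.5", "ceph version 10.2.7"} {
		ok, err := supportsSizeSuffix(out)
		fmt.Println(out, ok, err)
	}
}
```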

  • Original article: https://www.cnblogs.com/styshoo/p/8339811.html