• Hardware RDMA driver configuration and testing


    author:headsen chen

    date: 2019-01-18  10:22:20

    notice: written by headsen chen himself; copying is not permitted and may lead to legal action.

    Environment: CentOS 6.8, 64-bit, kernel 2.6.32

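    1. Set up the NIC:
       Install the new card in the machine and connect the fiber. Both fibers are connected, in a crossed (reversed) arrangement; once the link is up, the fiber LEDs light.
    2. Follow the same procedure as the software RDMA install: build the kernel and the user-space components, then reboot into the new 4.7 kernel.
    3. Install the driver:
    The normal installation method for the stock kernel (2.6) is:
    # /mnt/mlnx-en-4.4-2.0.7.0-rhel6.8-x86_64/install
    Here the following method must be used instead, because the machine is running the new 4.7 kernel: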

    # tar fx mlnx-en-4.4-2.0.7.0-rhel6.8-x86_64.tgz
    # cd mlnx-en-4.4-2.0.7.0-rhel6.8-x86_64
    # ./install --add-kernel-support --skip-repo
    
    Logs dir: /tmp/mlnx-en.28728.logs
    General log file: /tmp/mlnx-en.28728.logs/general.log
    Verifying KMP rpms compatibility with target kernel...
    This program will install the mlnx-en package on your machine.
    Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
    Those packages are removed due to conflicts with mlnx-en, do not reinstall them.
    
    Do you want to continue?[y/N]:y
    
    
    rpm --nosignature -e --allmatches --nodeps rdma rdma
    
    Starting mlnx-en-4.4-2.0.7.0 installation ...
    
    Installing mlnx-en-utils 4.4 RPM
    Preparing...                ##################################################
    mlnx-en-utils               ##################################################
    Installing kmod-mlnx-en 4.4 RPM
    Preparing...                ##################################################
    kmod-mlnx-en                ##################################################
    Installing mlnx-en-sources 4.4 RPM
    Preparing...                ##################################################
    mlnx-en-sources             ##################################################
    Installing mlnx-en-doc 4.4 RPM
    Preparing...                ##################################################
    mlnx-en-doc                 ##################################################
    Installing user level RPMs:
    Preparing...                ##################################################
    ofed-scripts                ##################################################
    Preparing...                ##################################################
    mstflint                    ##################################################
    Device (83:00.0):
        83:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
        Link Width: x8
        PCI Link Speed: 8GT/s
    
    Device (83:00.1):
        83:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
        Link Width: x8
        PCI Link Speed: 8GT/s
    
    
    Installation finished successfully.
    
    
    Preparing...                ########################################### [100%]
       1:mlnx-fw-updater        ########################################### [100%]
    Updated /usr/share/hwdata/pci.ids
    Attempting to perform Firmware update...
    Querying Mellanox devices firmware ...
    
    Device #1:
    ----------
    
      Device Type:      ConnectX4LX
      Part Number:      MCX4121A-XCA_Ax
      Description:      ConnectX-4 Lx EN network interface card; 10GbE dual-port SFP28; PCIe3.0 x8; ROHS R6
      PSID:             MT_2420110004
      PCI Device Name:  83:00.0
      Base MAC:         ec0d9ad2fd68
      Versions:         Current        Available     
         FW             14.20.1010     14.23.1020    
         PXE            3.5.0210       3.5.0504      
         UEFI           N/A            14.16.0017    
    
      Status:           Update required
    
    ---------
    Found 1 device(s) requiring firmware update...
    
    Device #1: Updating FW ...                                                                                               Done
    
    Restart needed for updates to take effect.
    Log File: /tmp/mlnx-en.28728.logs/fw_update.log
    Configuring /etc/security/limits.conf.
    To load the new driver, run:
    /etc/init.d/mlnx-en.d restart
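
    Step 3 uses two different installer invocations depending on the running kernel. The sketch below captures that choice in a small shell function. The function name install_cmd_for_kernel is hypothetical, and the assumption that every non-2.6.32 kernel needs --add-kernel-support is a simplification of what this article describes, not a rule taken from the Mellanox documentation.

```shell
# Hypothetical helper (not part of mlnx-en): prints the installer command
# line that step 3 uses for a given kernel release string.
install_cmd_for_kernel() {
    case "$1" in
        # Stock CentOS 6.8 kernel: the plain installer is enough.
        2.6.32*) echo "./install" ;;
        # Any other kernel (e.g. the self-built 4.7): rebuild the modules.
        *)       echo "./install --add-kernel-support --skip-repo" ;;
    esac
}

# Show the command for the kernel this host is currently running.
install_cmd_for_kernel "$(uname -r)"
```

    Run it from inside the extracted mlnx-en directory and review the printed command before executing it.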

    4. Restart the service:

    /etc/init.d/mlnx-en.d restart

    5. Install MLNX_OFED_LINUX-4.4
    Unlike with software RDMA, there is no need to start rxe_cfg here.

    yum -y install tcl tk libmnl
    tar fx MLNX_OFED_LINUX-4.4-2.0.7.0-rhel6.8-x86_64.tgz
    cd MLNX_OFED_LINUX-4.4-2.0.7.0-rhel6.8-x86_64
    ./mlnxofedinstall --add-kernel-support --skip-repo
    /etc/init.d/openibd restart  # best run from the management card (out-of-band console); running it over Xshell may cause the NIC to lose its IP
    /etc/init.d/network restart
    chkconfig openibd on
    ibv_devices
    The following output indicates success:
    # ibv_devices
        device                 node GUID
        ------              ----------------
        mlx5_1              ec0d9a0300d2fc99
        mlx5_0              ec0d9a0300d2fc98
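
    The success check for ibv_devices can be scripted by counting the mlx5_* entries in its output. This is a minimal sketch: count_mlx5_devices is a hypothetical helper name, and the sample text is copied from the listing above so that the snippet runs without the hardware.

```shell
# Hypothetical helper (not part of MLNX_OFED): counts mlx5_* entries
# in `ibv_devices` output read from stdin.
count_mlx5_devices() {
    awk '$1 ~ /^mlx5_[0-9]+$/ { n++ } END { print n + 0 }'
}

# Sample text copied from the listing above; on a real host use:
#   ibv_devices | count_mlx5_devices
printf '%s\n' \
    '    device                 node GUID' \
    '    ------              ----------------' \
    '    mlx5_1              ec0d9a0300d2fc99' \
    '    mlx5_0              ec0d9a0300d2fc98' | count_mlx5_devices
```

    For the dual-port card in this article the sample prints 2, one entry per port.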
    
    If this step fails (sometimes it works without starting rxe_cfg):
    # rxe_cfg start (and bind the eth4 NIC)
    # ibv_devices
        device                 node GUID
        ------              ----------------
        rxe0                ee0d9afffed2fd68
    # ibv_devinfo rxe0
    hca_id:    rxe0
        transport:            InfiniBand (0)
        fw_ver:                0.0.0
        node_guid:            ee0d:9aff:fed2:fd68
        sys_image_guid:            0000:0000:0000:0000
        vendor_id:            0x0000
        vendor_part_id:            0
        hw_ver:                0x0
        phys_port_cnt:            1
            port:    1
                state:            PORT_ACTIVE (4)
                max_mtu:        4096 (5)
                active_mtu:        1024 (3)
                sm_lid:            0
                port_lid:        0
                port_lmc:        0x00
                link_layer:        Ethernet

    6. Test with the rping command:
    Start the server side:

    [root@bj01-prd-hadoop499.vivo.lan:/root]
    # rping -s -a 10.20.15.23 -v -C 10

    Start the client side:
    The client is installed the same way as the server; the command to start it is:

    # rping  -c -a 10.20.15.23 -v -C 10

    The following output then appears, confirming that the installation succeeded:

    # rping -s -a 10.20.15.23 -v -C 10
    server ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqr
    server ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrs
    server ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrst
    server ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstu
    server ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuv
    server ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvw
    server ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwx
    server ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxy
    server ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz
    server ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyzA
    server DISCONNECT EVENT...
    wait for RDMA_READ_ADV state 10
    # rping -c -a 10.20.15.23 -v -C 10
    ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqr
    ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrs
    ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrst
    ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstu
    ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuv
    ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvw
    ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwx
    ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxy
    ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz
    ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyzA
    client DISCONNECT EVENT...
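
    The rping transcript above can be checked automatically by counting the "ping data" lines and comparing the count with the value passed to -C. The helper name check_rping_count is hypothetical, and the sketch assumes rping writes these lines to standard output as shown in the transcript.

```shell
# Hypothetical helper: succeeds only if stdin contains exactly $1 lines of
# the form "ping data: rdma-ping-N: ..." (client or server side).
check_rping_count() {
    expected=$1
    # grep -c still prints 0 when nothing matches; ignore its exit status.
    got=$(grep -c 'ping data: rdma-ping-[0-9][0-9]*:' || true)
    [ "$got" -eq "$expected" ]
}

# On a real client (assumes the server from step 6 is already running):
#   rping -c -a 10.20.15.23 -v -C 10 | check_rping_count 10 && echo "rping OK"
```

    A non-zero exit status means fewer or more pings completed than requested, which is a hint to inspect the link before moving on to udaddy.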

    ------------------------------------------------------
    Test with udaddy; the following output indicates success:

    Server:
    [root@bj01-prd-hadoop499.vivo.lan:/mnt/MLNX_OFED_LINUX-4.4-2.0.7.0-rhel6.8-x86_64]
    # udaddy
    udaddy: starting server
    receiving data transfers
    sending replies
    data transfers complete
    test complete
    return status 0
    Client:
    [root@bj01-prd-hadoop500.vivo.lan:/root]
    # udaddy -s 10.20.15.23
    udaddy: starting client
    udaddy: connecting
    initiating data transfers
    receiving data transfers
    data transfers complete
    test complete
    return status 0
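
    Both udaddy transcripts end with "return status 0", so a script can treat that final line as the pass condition. The helper below is a sketch under that assumption; udaddy_ok is a hypothetical name, and other udaddy versions may word the last line differently.

```shell
# Hypothetical helper: succeeds if the last line read from stdin is
# "return status 0", as in the transcripts above.
udaddy_ok() {
    tail -n 1 | grep -q 'return status 0[[:space:]]*$'
}

# On a real client (assumes `udaddy` is already running on the server):
#   udaddy -s 10.20.15.23 | udaddy_ok && echo "udaddy OK"
```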



  • Original article: https://www.cnblogs.com/kaishirenshi/p/10286270.html