• Testing InfiniBand Adapters


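    Introduction

    I am building an RDMA-based distributed system, so I bought two second-hand InfiniBand adapters for development. This post records how I tested them.

    • Adapter model: Mellanox ConnectX-2 MHQH29B Dual Port 4x QDR PCIe 2.0 x8
    • Machine environment: Ubuntu 14.04 and Ubuntu 12.04

    The two adapters can be cabled directly to each other without a switch, as long as OpenSM runs on one of the hosts.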

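    Installation steps

    1. Install the adapter and connect the cable (the LEDs on the adapter stay off until the driver is installed)
    2. Download and install the Mellanox driver: http://www.mellanox.com/page/software_overview_ib
      • Adding the --force option during installation is recommended
      • After installation, the system checks the adapter's PCIe configuration; for example, it may warn that an adapter is currently plugged into an x4 slot
    3. Reboot the machine (the LED of each connected port should now light up)
    4. Give each adapter a static IP address (on Ubuntu, in /etc/network/interfaces), for example: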
    auto ib1
    iface ib1 inet static
    address 10.0.0.1
    netmask 255.255.255.0
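
    For the two directly cabled hosts to reach each other over IPoIB, both addresses have to fall in the same subnet. A minimal sketch of that check with Python's standard ipaddress module, assuming the peer is configured as 10.0.0.2 (the address pinged later in this post); the helper name same_subnet is made up for illustration:

```python
import ipaddress


def same_subnet(addr_a: str, addr_b: str, netmask: str) -> bool:
    """Return True if both addresses belong to the same network for this netmask."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{netmask}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{netmask}").network
    return net_a == net_b


local = "10.0.0.1"      # address from the stanza above
peer = "10.0.0.2"       # assumed address of the second host
mask = "255.255.255.0"

print(same_subnet(local, peer, mask))
# Broadcast address implied by the address/netmask pair
print(ipaddress.ip_interface(f"{local}/{mask}").network.broadcast_address)
```

    The broadcast address printed here should agree with the Bcast field that ifconfig reports for the interface.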
    

    Test steps

    Viewing adapter information

    • ibnodes lists the nodes reachable through the connected ports; here it shows both hosts
    mlx@m04:~$ ibnodes
    Ca	: 0x0002c903000ae254 ports 2 "up75 HCA-1"
    Ca	: 0x0002c903000ec606 ports 2 "m04 HCA-1"
    
    • ifconfig shows the ib interfaces
    ib0       Link encap:UNSPEC  HWaddr A0-00-02-20-FE-80-00-00-00-00-00-00-00-00-00-00  
              UP BROADCAST MULTICAST  MTU:4092  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:256 
              RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
    
    ib1       Link encap:UNSPEC  HWaddr A0-00-03-00-FE-80-00-00-00-00-00-00-00-00-00-00  
              inet addr:10.0.0.1  Bcast:10.0.0.255  Mask:255.255.255.0
              inet6 addr: fe80::202:c903:e:c608/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
              RX packets:54575 errors:0 dropped:0 overruns:0 frame:0
              TX packets:67623 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:256 
              RX bytes:3174514 (3.1 MB)  TX bytes:891903946 (891.9 MB)
    
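    • ibstatus shows the adapter status. In the output below, port 2 has negotiated 4X QDR (40 Gb/sec), while port 1 has no link (state DOWN, phys state Polling)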
    mlx@m04:~$ ibstatus
    Infiniband device 'mlx4_0' port 1 status:
    	default gid:	 fe80:0000:0000:0000:0002:c903:000e:c607
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 2: Polling
    	rate:		 10 Gb/sec (4X)
    	link_layer:	 InfiniBand
    
    Infiniband device 'mlx4_0' port 2 status:
    	default gid:	 fe80:0000:0000:0000:0002:c903:000e:c608
    	base lid:	 0x1
    	sm lid:		 0x1
    	state:		 4: ACTIVE
    	phys state:	 5: LinkUp
    	rate:		 40 Gb/sec (4X QDR)
    	link_layer:	 InfiniBand
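
    When checking several machines, it can help to pull the state and rate out of this output automatically. The following is only a sketch: it assumes the text layout shown above (a header line per port followed by "key: value" lines), and parse_ibstatus is a hypothetical helper, not part of any Mellanox tool. The sample text is a shortened copy of the output above.

```python
import re

SAMPLE = """\
Infiniband device 'mlx4_0' port 1 status:
    state:       1: DOWN
    phys state:  2: Polling
    rate:        10 Gb/sec (4X)

Infiniband device 'mlx4_0' port 2 status:
    state:       4: ACTIVE
    phys state:  5: LinkUp
    rate:        40 Gb/sec (4X QDR)
"""


def parse_ibstatus(text: str) -> list:
    """Split ibstatus-style output into one dict per port, keyed by field name."""
    ports = []
    for line in text.splitlines():
        stripped = line.strip()
        header = re.match(r"Infiniband device '(\S+)' port (\d+) status:", stripped)
        if header:
            ports.append({"device": header.group(1), "port": int(header.group(2))})
        elif ":" in stripped and ports:
            # Split on the first colon only, since values such as "4: ACTIVE" contain colons
            key, _, value = stripped.partition(":")
            ports[-1][key.strip()] = value.strip()
    return ports


for p in parse_ibstatus(SAMPLE):
    if p["state"].endswith("ACTIVE"):
        print(p["device"], p["port"], p["rate"])
```

    In practice the text would come from running ibstatus on each host rather than from a string constant.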
    
    

    Connecting two machines without a switch

    Run opensm on one of the machines (requires root privileges)

    mlx@m04:~$ sudo opensm
    [sudo] password for mlx: 
    -------------------------------------------------
    OpenSM 4.7.0.MLNX20160523.25f7c7a
    Command Line Arguments:
     Log File: /var/log/opensm.log
    -------------------------------------------------
    OpenSM 4.7.0.MLNX20160523.25f7c7a
    
    Using default GUID 0x2c903000ec608
    Entering DISCOVERING state
    
    Entering MASTER state
    

    The two machines can now ping each other:

    mlx@m04:~$ ping 10.0.0.2
    PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
    64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.294 ms
    64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.155 ms
    64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=0.151 ms
    64 bytes from 10.0.0.2: icmp_seq=4 ttl=64 time=0.155 ms
    ^C
    --- 10.0.0.2 ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 3000ms
    rtt min/avg/max/mdev = 0.151/0.188/0.294/0.063 ms
    

    Bandwidth test

    • Start opensm on one machine (requires root privileges), then measure with ib_send_bw

    • Use one machine as the server:

    mlx@m04:~$ ib_send_bw -a -c UD -d mlx4_0 -i 2
    
    ************************************
    * Waiting for client to connect... *
    ************************************
    
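    • Use the other machine as the client. Because the adapter in up75 sits in a PCIe 2.0 x4 slot, throughput is capped by the x4 link: it peaks at about 1493 MB/sec (roughly 12 Gb/s) and does not reach 40 Gb/s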
    mlx@up75:~$ ib_send_bw -a -c UD -d mlx4_0 -i 2 10.0.0.1
     Max msg size in UD is MTU 4096
     Changing to this MTU
    ---------------------------------------------------------------------------------------
                        Send BW Test
     Dual-port       : OFF		Device         : mlx4_0
     Number of qps   : 1		Transport type : IB
     Connection type : UD		Using SRQ      : OFF
     TX depth        : 128
     CQ Moderation   : 100
     Mtu             : 4096[B]
     Link type       : IB
     Max inline data : 0[B]
     rdma_cm QPs	 : OFF
     Data ex. method : Ethernet
    ---------------------------------------------------------------------------------------
     local address: LID 0x02 QPN 0x0238 PSN 0xf162c2
     remote address: LID 0x01 QPN 0x021a PSN 0xbc213c
    ---------------------------------------------------------------------------------------
     #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
     2          1000             5.72               5.20   		   2.727911
     4          1000             11.49              11.34  		   2.972020
     8          1000             22.99              22.61  		   2.963387
     16         1000             45.98              45.31  		   2.969666
     32         1000             91.70              90.55  		   2.967229
     64         1000             183.14             180.77 		   2.961664
     128        1000             366.79             361.35 		   2.960143
     256        1000             727.44             718.16 		   2.941597
     512        1000             1088.50            1044.70		   2.139549
     1024       1000             1264.96            1263.29		   1.293610
     2048       1000             1407.22            1406.43		   0.720094
     4096       1000             1492.93            1492.75		   0.382143
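
    The gap to 40 Gb/s can be sanity-checked with a back-of-the-envelope calculation. The sketch below rests on assumptions that are not in the original measurements: 8b/10b line coding for both QDR InfiniBand (10 Gb/s signaling per lane) and PCIe 2.0 (5 GT/s per lane), and "MB" in the tool's report meaning 2^20 bytes. PCIe packet overhead is ignored, so the real ceiling is lower than the figures computed here.

```python
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line coding: 8 data bits per 10 signaled bits


def usable_gbps(per_lane_signal_gbps: float, lanes: int) -> float:
    """Data rate left after line coding, in Gb/s."""
    return per_lane_signal_gbps * lanes * ENCODING_EFFICIENCY


qdr_4x = usable_gbps(10.0, 4)    # InfiniBand 4X QDR link
pcie2_x4 = usable_gbps(5.0, 4)   # slot the up75 adapter is plugged into
pcie2_x8 = usable_gbps(5.0, 8)   # slot width the adapter is designed for

# Best average from the table above; assumes 1 MB = 2**20 bytes
measured_mb_per_sec = 1492.75
measured_gbps = measured_mb_per_sec * 2**20 * 8 / 1e9

print(f"4X QDR data rate : {qdr_4x:.1f} Gb/s")
print(f"PCIe 2.0 x4      : {pcie2_x4:.1f} Gb/s")
print(f"PCIe 2.0 x8      : {pcie2_x8:.1f} Gb/s")
print(f"measured         : {measured_gbps:.2f} Gb/s "
      f"({measured_gbps / pcie2_x4:.0%} of the x4 figure)")
```

    Under these assumptions the x4 slot, not the InfiniBand link, is the narrowest stage, which is consistent with the explanation above. An x8 slot would roughly match the data rate of the 4X QDR link.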
    
    

    Latency test

    • Start opensm on one machine (requires root privileges), then measure with ib_send_lat

    • Use one machine as the server:

    mlx@m04:~$ ib_send_lat -a -c UD -d mlx4_0 -i 2
    
    ************************************
    * Waiting for client to connect... *
    ************************************
    
    • Use the other machine as the client:
    mlx@up75:~$ ib_send_lat -a -c UD -d mlx4_0 -i 2 10.0.0.1
     Max msg size in UD is MTU 4096
     Changing to this MTU
    ---------------------------------------------------------------------------------------
                        Send Latency Test
     Dual-port       : OFF		Device         : mlx4_0
     Number of qps   : 1		Transport type : IB
     Connection type : UD		Using SRQ      : OFF
     TX depth        : 1
     Mtu             : 4096[B]
     Link type       : IB
     Max inline data : 188[B]
     rdma_cm QPs	 : OFF
     Data ex. method : Ethernet
    ---------------------------------------------------------------------------------------
     local address: LID 0x02 QPN 0x0239 PSN 0x29d370
     remote address: LID 0x01 QPN 0x021b PSN 0xbc98c4
    ---------------------------------------------------------------------------------------
     #bytes #iterations    t_min[usec]    t_max[usec]  t_typical[usec]
     2       1000          1.25           14.72        1.34   
     4       1000          1.24           88.94        1.27   
     8       1000          1.20           77.49        1.22   
     16      1000          1.21           66.69        1.23   
     32      1000          1.23           61.58        1.25   
     64      1000          1.27           12.92        1.30   
     128     1000          1.42           6.98         1.44   
     256     1000          1.94           173.62       1.97   
     512     1000          2.22           41.65        2.25   
     1024    1000          2.79           37.47        2.81   
     2048    1000          3.91           18.85        3.94   
     4096    1000          6.16           38.06        6.20   
    ---------------------------------------------------------------------------------------
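
    Two things stand out in this table: t_typical stays close to t_min, while t_max is occasionally far higher, and latency grows with message size. The sketch below quantifies both from the rows above. It is a rough reading, not a rigorous model: the per-byte figure is just the slope between the two largest message sizes, and parse_rows is a hypothetical helper written for this table layout.

```python
ROWS = """\
2       1000          1.25           14.72        1.34
4       1000          1.24           88.94        1.27
8       1000          1.20           77.49        1.22
16      1000          1.21           66.69        1.23
32      1000          1.23           61.58        1.25
64      1000          1.27           12.92        1.30
128     1000          1.42           6.98         1.44
256     1000          1.94           173.62       1.97
512     1000          2.22           41.65        2.25
1024    1000          2.79           37.47        2.81
2048    1000          3.91           18.85        3.94
4096    1000          6.16           38.06        6.20
"""


def parse_rows(text: str) -> list:
    """Parse rows of (bytes, iterations, t_min, t_max, t_typical); times in usec."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5:
            size, iters = int(fields[0]), int(fields[1])
            t_min, t_max, t_typ = (float(f) for f in fields[2:])
            rows.append((size, iters, t_min, t_max, t_typ))
    return rows


rows = parse_rows(ROWS)

# Slope of typical latency between the two largest sizes, in ns per byte
(size_a, _, _, _, typ_a), (size_b, _, _, _, typ_b) = rows[-2], rows[-1]
ns_per_byte = (typ_b - typ_a) / (size_b - size_a) * 1000

# Message size whose worst case deviates most from its typical latency
worst = max(rows, key=lambda r: r[3] / r[4])

print(f"incremental cost: {ns_per_byte:.2f} ns/byte")
print(f"largest t_max / t_typical ratio at {worst[0]} bytes: {worst[3] / worst[4]:.0f}x")
```

    With only 1000 iterations per size, a single slow sample determines t_max, so those outliers say more about occasional host-side delays than about the link itself.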
    
    

  • Original post: https://www.cnblogs.com/xysmlx/p/5711069.html