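1. Requirements
By default, the network interface LLD rule in Zabbix's built-in "Template OS Linux" template discovers every interface except the loopback interface, but that is not necessarily what we want.
For example, I have a server running KVM with four physical interfaces (eth0-eth3), three bridge interfaces (br0-br2), and many virtual machine NICs (vnetXX), as shown below: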
    [root@host00 ~]# ifconfig |grep HW
    br0       Link encap:Ethernet  HWaddr EC:F4:BB:D6:37:69
    br1       Link encap:Ethernet  HWaddr EC:F4:BB:D6:37:6A
    br2       Link encap:Ethernet  HWaddr EC:F4:BB:D6:37:6B
    eth0      Link encap:Ethernet  HWaddr EC:F4:BB:D6:37:68
    eth1      Link encap:Ethernet  HWaddr EC:F4:BB:D6:37:69
    eth2      Link encap:Ethernet  HWaddr EC:F4:BB:D6:37:6A
    eth3      Link encap:Ethernet  HWaddr EC:F4:BB:D6:37:6B
    vnet0     Link encap:Ethernet  HWaddr FE:54:00:00:00:15
    vnet1     Link encap:Ethernet  HWaddr FE:54:00:00:00:17
    vnet2     Link encap:Ethernet  HWaddr FE:54:01:00:00:15
    vnet3     Link encap:Ethernet  HWaddr FE:54:01:00:00:17
    vnet4     Link encap:Ethernet  HWaddr FE:54:00:00:00:2F
    vnet5     Link encap:Ethernet  HWaddr FE:54:01:00:00:2D
    vnet6     Link encap:Ethernet  HWaddr FE:54:00:00:00:30
    vnet7     Link encap:Ethernet  HWaddr FE:54:01:00:00:2E
    vnet8     Link encap:Ethernet  HWaddr FE:54:00:00:00:1A
    vnet9     Link encap:Ethernet  HWaddr FE:54:01:00:00:1A
    vnet10    Link encap:Ethernet  HWaddr FE:54:00:00:00:16
    vnet11    Link encap:Ethernet  HWaddr FE:54:01:00:00:16
    vnet12    Link encap:Ethernet  HWaddr FE:54:00:00:00:11
    vnet13    Link encap:Ethernet  HWaddr FE:54:01:00:00:11
    vnet14    Link encap:Ethernet  HWaddr FE:54:00:A1:61:A1
    vnet15    Link encap:Ethernet  HWaddr FE:54:00:A2:61:A1
    vnet16    Link encap:Ethernet  HWaddr FE:54:00:A3:61:A1
    vnet20    Link encap:Ethernet  HWaddr FE:54:00:00:00:2B
    vnet21    Link encap:Ethernet  HWaddr FE:54:01:00:00:2A
    vnet24    Link encap:Ethernet  HWaddr FE:54:00:00:00:2C
    vnet25    Link encap:Ethernet  HWaddr FE:54:01:00:00:2B
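Here we only want to monitor the four physical interfaces and do not care about the bridge or VM interfaces. Because VMs are created and deleted frequently, this server ends up with many items in the Not supported state in Zabbix, and waiting for those unsupported items to be removed automatically takes about a month.
In addition, the zabbix-server log records a large number of error messages: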
    24102:20160701:145019.036 item "10.12.29.100:net.if.in[vnet24]" became supported
    24101:20160701:145020.037 item "10.12.29.100:net.if.in[vnet25]" became supported
    24102:20160701:145101.129 item "10.12.29.100:net.if.out[vnet24]" became supported
    24100:20160701:145102.130 item "10.12.29.100:net.if.out[vnet25]" became supported
    24101:20160701:155859.871 item "10.12.29.100:net.if.in[vnet6]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24100:20160701:155909.880 item "10.12.29.100:net.if.in[vnet17]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24100:20160701:155941.908 item "10.12.29.100:net.if.out[vnet6]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24102:20160701:155951.916 item "10.12.29.100:net.if.out[vnet17]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24102:20160701:160343.187 item "10.12.29.100:net.if.out[vnet7]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24101:20160701:160358.201 item "10.12.29.100:net.if.out[vnet22]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24102:20160701:160359.201 item "10.12.29.100:net.if.out[vnet23]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24100:20160701:160400.202 item "10.12.29.100:net.if.in[vnet7]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24101:20160701:160417.217 item "10.12.29.100:net.if.in[vnet22]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24100:20160701:160418.219 item "10.12.29.100:net.if.in[vnet23]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24102:20160701:160420.220 item "10.12.29.100:net.if.in[vnet19]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24101:20160701:160424.224 item "10.12.29.100:net.if.out[vnet19]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24101:20160701:161339.730 item "10.12.29.100:net.if.out[vnet4]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24100:20160701:161340.730 item "10.12.29.100:net.if.out[vnet5]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24100:20160701:161358.750 item "10.12.29.100:net.if.in[vnet4]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24100:20160701:161358.750 item "10.12.29.100:net.if.in[vnet5]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24100:20160701:161408.760 item "10.12.29.100:net.if.in[vnet16]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24102:20160701:161410.761 item "10.12.29.100:net.if.in[vnet18]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24102:20160701:161413.764 item "10.12.29.100:net.if.in[vnet14]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24101:20160701:161414.765 item "10.12.29.100:net.if.in[vnet15]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24101:20160701:161450.795 item "10.12.29.100:net.if.out[vnet16]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24100:20160701:161452.798 item "10.12.29.100:net.if.out[vnet18]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24103:20160701:161455.800 item "10.12.29.100:net.if.out[vnet14]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24100:20160701:161456.801 item "10.12.29.100:net.if.out[vnet15]" became not supported: Cannot find information for this network interface in /proc/net/dev.
    24103:20160701:161958.077 item "10.12.29.100:net.if.in[vnet4]" became supported
    24101:20160701:161959.075 item "10.12.29.100:net.if.in[vnet5]" became supported
    24102:20160701:162040.114 item "10.12.29.100:net.if.out[vnet4]" became supported
    24101:20160701:162041.116 item "10.12.29.100:net.if.out[vnet5]" became supported
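To get a feel for how noisy these state changes are, the log can be tallied per interface. The following is only a minimal sketch: the regular expression is derived from the sample lines above rather than from any documented log format, and the function name `tally_state_changes` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines above (an assumption, not a spec):
#   PID:YYYYMMDD:HHMMSS.mmm item "host:key[param]" became [not ]supported...
LINE_RE = re.compile(
    r'^\d+:\d{8}:\d{6}\.\d{3} item "(?P<host>[^:"]+):(?P<key>[^\[\]"]+)'
    r'\[(?P<param>[^\]]*)\]" became (?P<state>not supported|supported)'
)


def tally_state_changes(lines):
    """Count state changes per (interface, new state) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("param"), match.group("state"))] += 1
    return counts


# Three lines copied from the log excerpt above.
sample = [
    '24102:20160701:145019.036 item "10.12.29.100:net.if.in[vnet24]" became supported',
    '24101:20160701:155859.871 item "10.12.29.100:net.if.in[vnet6]" became not supported: '
    'Cannot find information for this network interface in /proc/net/dev.',
    '24100:20160701:155941.908 item "10.12.29.100:net.if.out[vnet6]" became not supported: '
    'Cannot find information for this network interface in /proc/net/dev.',
]

counts = tally_state_changes(sample)
for (iface, state), n in sorted(counts.items()):
    print(iface, state, n)
```

In practice the lines would be read from the zabbix-server log file instead of the inline `sample` list.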
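2. How it works
First, look at the network interface discovery rule in "Template OS Linux".
The three boxes in the figure below mark:
The key used for network interface discovery on Linux servers
The interval between network interface discovery runs
How long a discovered object is kept after it is no longer found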
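The result of zabbix_get -s 10.12.29.100 -k net.if.discovery is filtered with a regular expression, and each name that passes the filter is assigned to {#IFNAME}. Zabbix can then monitor the traffic of interface {#IFNAME} and generate a graph from it.
Next, look at the regular expression "@Network interfaces for discovery". "Network interfaces for discovery" is simply the name of a set of regular expressions; its contents can be found at the location shown in the figure below.
The two existing rules both exclude loopback interfaces.
Now let us analyze the output of zabbix_get -s 10.12.29.100 -k net.if.discovery (when Zabbix runs LLD, it requests this same key from the agent to obtain the list of NICs on the monitored server; zabbix_get simply issues that request by hand):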
    [root@zabbix-server zabbix]# zabbix_get -s 10.12.29.100 -k net.if.discovery
    {"data":[{"{#IFNAME}":"lo"},{"{#IFNAME}":"eth0"},{"{#IFNAME}":"eth1"},{"{#IFNAME}":"eth2"},{"{#IFNAME}":"eth3"},{"{#IFNAME}":"br0"},{"{#IFNAME}":"br1"},{"{#IFNAME}":"br2"},{"{#IFNAME}":"vnet12"},{"{#IFNAME}":"vnet13"},{"{#IFNAME}":"vnet0"},{"{#IFNAME}":"vnet2"},{"{#IFNAME}":"vnet10"},{"{#IFNAME}":"vnet11"},{"{#IFNAME}":"vnet1"},{"{#IFNAME}":"vnet3"},{"{#IFNAME}":"vnet8"},{"{#IFNAME}":"vnet9"},{"{#IFNAME}":"vnet20"},{"{#IFNAME}":"vnet21"},{"{#IFNAME}":"vnet24"},{"{#IFNAME}":"vnet25"},{"{#IFNAME}":"vnet4"},{"{#IFNAME}":"vnet5"},{"{#IFNAME}":"vnet6"},{"{#IFNAME}":"vnet7"},{"{#IFNAME}":"vnet14"},{"{#IFNAME}":"vnet15"},{"{#IFNAME}":"vnet16"}]}
Pretty-printing the returned JSON gives:
    {
        "data": [
            {
                "{#IFNAME}": "lo"
            },
            {
                "{#IFNAME}": "eth0"
            },
            {
                "{#IFNAME}": "eth1"
            },
            {
                "{#IFNAME}": "eth2"
            },
            {
                "{#IFNAME}": "eth3"
            },
            {
                "{#IFNAME}": "br0"
            },
            {
                "{#IFNAME}": "br1"
            },
            {
                "{#IFNAME}": "br2"
            },
            {
                "{#IFNAME}": "vnet12"
            },
            {
                "{#IFNAME}": "vnet13"
            },
            {
                "{#IFNAME}": "vnet0"
            },
            {
                "{#IFNAME}": "vnet2"
            },
            {
                "{#IFNAME}": "vnet10"
            },
            {
                "{#IFNAME}": "vnet11"
            },
            {
                "{#IFNAME}": "vnet1"
            },
            {
                "{#IFNAME}": "vnet3"
            },
            {
                "{#IFNAME}": "vnet8"
            },
            {
                "{#IFNAME}": "vnet9"
            },
            {
                "{#IFNAME}": "vnet20"
            },
            {
                "{#IFNAME}": "vnet21"
            },
            {
                "{#IFNAME}": "vnet24"
            },
            {
                "{#IFNAME}": "vnet25"
            },
            {
                "{#IFNAME}": "vnet4"
            },
            {
                "{#IFNAME}": "vnet5"
            },
            {
                "{#IFNAME}": "vnet6"
            },
            {
                "{#IFNAME}": "vnet7"
            },
            {
                "{#IFNAME}": "vnet14"
            },
            {
                "{#IFNAME}": "vnet15"
            },
            {
                "{#IFNAME}": "vnet16"
            }
        ]
    }
Zabbix filters the result above with the global regular expression "Network interfaces for discovery"; only the entries that pass the filter are monitored.
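The filtering step can be imitated offline to see which names would survive. This is a rough sketch, not Zabbix's own code: it assumes a rule set is a list of (pattern, expected result) pairs that must all hold, it assumes the loopback exclusion is written as `^lo$`, and it uses Python's `re` module, whose dialect may differ from the one Zabbix uses. The helper names `passes` and `filter_discovery` are hypothetical.

```python
import json
import re

# Shortened copy of the discovery output shown above.
raw = ('{"data":[{"{#IFNAME}":"lo"},{"{#IFNAME}":"eth0"},'
       '{"{#IFNAME}":"br0"},{"{#IFNAME}":"vnet12"}]}')

# Each rule is (pattern, expected_result). Assumed stand-in for the
# loopback exclusion in "Network interfaces for discovery".
DEFAULT_RULES = [(r"^lo$", False)]


def passes(name, rules):
    """A name passes only if every rule yields its expected result."""
    return all((re.search(pattern, name) is not None) == expected
               for pattern, expected in rules)


def filter_discovery(raw_json, rules, macro="{#IFNAME}"):
    """Return the macro values from the LLD JSON that pass all rules."""
    rows = json.loads(raw_json)["data"]
    return [row[macro] for row in rows if passes(row[macro], rules)]


kept = filter_discovery(raw, DEFAULT_RULES)
print(kept)
```

With only the loopback rule in place, eth0, br0, and vnet12 all pass, which matches the behaviour described above: everything except lo gets monitored.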
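3. Solution
With the mechanism understood, the fix is straightforward: we only need to add two rules to Zabbix's default regular expression "Network interfaces for discovery" (do not monitor VM interfaces and bridge interfaces), as shown in the figure below: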
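Before entering new expressions in the Zabbix UI, it can help to try them against the known interface names. The patterns below are hypothetical examples of what the two extra rules might look like (the exact expressions from the screenshot are not reproduced here), and Python's `re` is used only as an approximation of how Zabbix would evaluate them.

```python
import re

# Interface names taken from the discovery output earlier in this article.
names = ["lo", "eth0", "eth1", "eth2", "eth3", "br0", "br1", "br2",
         "vnet0", "vnet12", "vnet25"]

# Hypothetical exclusion patterns: loopback, VM NICs, bridges.
EXCLUDE = [r"^lo$", r"^vnet[0-9]+$", r"^br[0-9]+$"]


def is_monitored(name):
    """True when no exclusion pattern matches the interface name."""
    return not any(re.search(pattern, name) for pattern in EXCLUDE)


monitored = [name for name in names if is_monitored(name)]
print(monitored)
```

Anchoring the patterns with `^` and `$` is a deliberate choice in this sketch, so that an interface whose name merely contains "br" is not excluded by accident.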
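To make sure the new regular expression takes effect and to see the corresponding result, unlink the host (10.12.29.100) from the template (Template OS Linux) and clear the historical data.
Before unlinking, the host has 56 network interface items.
Unlink and clear the historical data, as shown in the figure below.
Then link the template "Template OS Linux" again (screenshot omitted) and wait for Zabbix LLD to discover the matching interfaces and start monitoring them. With the default LLD interval, this takes at most one hour (3600s).
Based on the interface information we collected from 10.12.29.100 above, the interfaces to monitor are eth0-eth3, with incoming and outgoing traffic for each, 8 items in total, as shown in the figure below: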
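The item count can be checked by expanding the item prototypes by hand. This sketch assumes the template has exactly two network item prototypes, `net.if.in[{#IFNAME}]` and `net.if.out[{#IFNAME}]`, which is what the keys in the server log above suggest; the `expand` helper is made up for illustration.

```python
# Assumed item prototypes, inferred from the keys seen in the server log.
PROTOTYPES = ["net.if.in[{#IFNAME}]", "net.if.out[{#IFNAME}]"]


def expand(prototypes, interfaces, macro="{#IFNAME}"):
    """Substitute each interface name into each item prototype key."""
    return [proto.replace(macro, iface)
            for iface in interfaces
            for proto in prototypes]


items = expand(PROTOTYPES, ["eth0", "eth1", "eth2", "eth3"])
for key in items:
    print(key)
print(len(items), "items")
```

The same arithmetic is consistent with the earlier figure: the discovery output lists 28 interfaces besides lo, and 28 interfaces times two prototypes gives the 56 items seen before unlinking.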
At this point, the problem is solved.