• Fixing a Zabbix host that suddenly starts alerting frequently with "Zabbix agent on xxxxxx is unreachable for x minutes"


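    I. One day a host suddenly began raising frequent "Zabbix agent unreachable" alerts

    The Zabbix agent log on that host showed nothing abnormal.

    II. The Zabbix server log contained a large number of "first network error" and "another network error" messages for this host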

    [root@zabbix_server etc]# cat /tmp/zabbix_server.log|grep 172.28.5.63|more
    
     27849:20191218:094413.077 Zabbix agent item "perf_counter[2250]" on host "172.28.5.63" failed: another network error, wait fo
    r 15 seconds
     27848:20191218:094428.098 resuming Zabbix agent checks on host "172.28.5.63": connection restored
     27837:20191218:094446.128 Zabbix agent item "net.if.in[Microsoft ISATAP Adapter #2]" on host "172.28.5.63" failed: first networ
    k error, wait for 15 seconds
     27849:20191218:094504.088 Zabbix agent item "net.if.out[WAN Miniport (Network Monitor)-QoS Packet Scheduler-0000]" on host "172
    .28.5.63" failed: another network error, wait for 15 seconds
     27845:20191218:094519.094 resuming Zabbix agent checks on host "172.28.5.63": connection restored
     27836:20191218:094536.258 Zabbix agent item "net.if.in[Broadcom NetXtreme Gigabit Ethernet #4]" on host "172.28.5.63" failed: f
    irst network error, wait for 15 seconds
     27846:20191218:094551.117 resuming Zabbix agent checks on host "172.28.5.63": connection restored
     27843:20191218:094600.102 Zabbix agent item "net.if.out[Broadcom NetXtreme Gigabit Ethernet-WFP LightWeight Filter-0000]" on ho
    st "172.28.5.63" failed: first network error, wait for 15 seconds
     27843:20191218:094615.127 resuming Zabbix agent checks on host "172.28.5.63": connection restored
     27837:20191218:094623.818 Zabbix agent item "net.if.in[Broadcom NetXtreme Gigabit Ethernet #4-QoS Packet Scheduler-0000]" on ho
    st "172.28.5.63" failed: first network error, wait for 15 seconds
     27847:20191218:094641.112 Zabbix agent item "net.if.in[WAN Miniport (SSTP)]" on host "172.28.5.63" failed: another network erro
    r, wait for 15 seconds
     27845:20191218:094657.134 resuming Zabbix agent checks on host "172.28.5.63": connection restored
     27834:20191218:094702.464 Zabbix agent item "vfs.fs.size[D:,free]" on host "172.28.5.63" failed: first network error, wait for 
    15 seconds
     27852:20191218:094720.139 resuming Zabbix agent checks on host "172.28.5.63": connection restored
     27840:20191218:094723.709 Zabbix agent item "vm.memory.size[pavailable]" on host "172.28.5.63" failed: first network error, wai
    t for 15 seconds
     27847:20191218:094738.149 resuming Zabbix agent checks on host "172.28.5.63": connection restored
     27836:20191218:094802.499 Zabbix agent item "net.if.out[Broadcom NetXtreme Gigabit Ethernet #3]" on host "172.28.5.63" failed: 
    first network error, wait for 15 seconds
     27843:20191218:094818.149 resuming Zabbix agent checks on host "172.28.5.63": connection restored
     27832:20191218:094825.129 Zabbix agent item "net.if.in[Broadcom NetXtreme Gigabit Ethernet #3-QoS Packet Scheduler-0000]" on ho
    st "172.28.5.63" failed: first network error, wait for 15 seconds
     27851:20191218:094859.175 resuming Zabbix agent checks on host "172.28.5.63": connection restored
     27832:20191218:094903.413 Zabbix agent item "vfs.fs.size[E:,free]" on host "172.28.5.63" failed: first network error, wait for 
    15 seconds
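
To get a quick sense of how often the host flaps, the server log can be summarized per event type. The following is only a minimal sketch: the regular expression is derived from the log lines shown above, and the names `EVENT_RE` and `summarize_events` are made up for this illustration. It assumes the log is read from the file itself, so each entry is on a single line rather than wrapped by the terminal.

```python
import re
from collections import Counter

# Pattern derived from the log excerpts above (an assumption, not an official format spec).
EVENT_RE = re.compile(
    r'on host "(?P<host>[^"]+)"(?: failed)?: '
    r'(?P<event>first network error|another network error|connection restored)'
)

def summarize_events(log_text):
    """Count (host, event) pairs found in Zabbix server log text."""
    counts = Counter()
    for match in EVENT_RE.finditer(log_text):
        counts[(match.group("host"), match.group("event"))] += 1
    return counts

# Three entries copied from the log above, unwrapped to one line each.
sample_log = (
    ' 27849:20191218:094413.077 Zabbix agent item "perf_counter[2250]" on host "172.28.5.63" failed: another network error, wait for 15 seconds\n'
    ' 27848:20191218:094428.098 resuming Zabbix agent checks on host "172.28.5.63": connection restored\n'
    ' 27834:20191218:094702.464 Zabbix agent item "vfs.fs.size[D:,free]" on host "172.28.5.63" failed: first network error, wait for 15 seconds\n'
)

event_counts = summarize_events(sample_log)
for (host, event), n in sorted(event_counts.items()):
    print(host, event, n)
```

A host that alternates between network errors and "connection restored" every few seconds, as here, points to intermittent connection failures rather than an agent that is simply down.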

    III. Checking the host's TCP connections showed a large number of connections in the TIME_WAIT state
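
The original post does not show the command used for this check. One common approach is to run `netstat -an` on the host and count connections per state. The sketch below does that counting on captured output; the function name `count_tcp_states` and the sample rows (including the peer address 172.28.5.10) are hypothetical and only illustrate the shape of the output.

```python
from collections import Counter

def count_tcp_states(netstat_output):
    """Count TCP connections per state in `netstat -an` style output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows end with the state; UDP rows and header lines are skipped.
        if len(fields) >= 4 and fields[0].upper().startswith("TCP"):
            counts[fields[-1]] += 1
    return counts

# Hypothetical sample in the layout printed by Windows `netstat -an`.
sample_output = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:10050          0.0.0.0:0              LISTENING
  TCP    172.28.5.63:10050      172.28.5.10:51234      TIME_WAIT
  TCP    172.28.5.63:10050      172.28.5.10:51240      TIME_WAIT
  TCP    172.28.5.63:3389       172.28.5.10:60112      ESTABLISHED
  UDP    0.0.0.0:123            *:*
"""

state_counts = count_tcp_states(sample_output)
print(state_counts.most_common())
```

On an affected host the TIME_WAIT count keeps growing instead of draining after a few minutes.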

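    IV. A web search (Baidu) turned up the following cause

    All the TCP/IP ports that are in a TIME_WAIT status are not closed after 497 days from system startup in Windows Vista, in Windows 7, in Windows Server 2008 and in Windows Server 2008 R2

    In other words, once the system has been up for 497 days, TCP connections in the TIME_WAIT state are no longer closed. TCP ports are gradually used up until no new TCP/IP connections can be created.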
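
The post does not explain where the figure of 497 days comes from. It matches the time a 32-bit counter takes to wrap when it is incremented every 10 ms; treating this as the mechanism is an assumption on my part, not something stated in the original. The arithmetic is easy to check:

```python
# Assumption: a 32-bit tick counter that advances once every 10 ms.
TICK_SECONDS = 0.01
SECONDS_PER_DAY = 24 * 60 * 60

wrap_seconds = 2**32 * TICK_SECONDS
wrap_days = wrap_seconds / SECONDS_PER_DAY

print(round(wrap_days, 1))  # about 497.1
```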

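    V. Log in to the host and check the system uptime

    The 497-day mark fell exactly in the early hours of the day before yesterday, which is when the frequent alerts began.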
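
The same check can be scripted once the boot time is known (on Windows it is commonly read from `systeminfo`). This is a rough sketch: `uptime_exceeds` is a name invented here, and the dates are hypothetical, not the actual boot time of the host in this post.

```python
from datetime import datetime, timedelta

def uptime_exceeds(boot_time, now, threshold_days=497):
    """Return True if the host has been up for at least threshold_days."""
    return now - boot_time >= timedelta(days=threshold_days)

# Hypothetical boot time and check time, chosen only for illustration.
boot_time = datetime(2018, 8, 8)
check_time = datetime(2019, 12, 18)

print((check_time - boot_time).days)
print(uptime_exceeds(boot_time, check_time))
```

Running such a check across all hosts on the affected Windows versions would flag machines that are approaching the limit before they start alerting.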

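    VI. Solutions

    1. Restart the server. This is only a workaround: the problem returns after another 497 days of uptime.

    2. Install the Microsoft patch

    Microsoft's official article:

    https://support.microsoft.com/zh-cn/help/2553549/all-the-tcp-ip-ports-that-are-in-a-time-wait-status-are-not-closed-aft

    The standalone patch package can no longer be downloaded, but the fix can be installed through Windows Update.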

  • Original article: https://www.cnblogs.com/sky-cheng/p/12066143.html