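Recently the load on one user's server became very high. It normally sits around 0.3 to 0.5, but when things go wrong it reaches 3, and rebooting the machine brings it back to normal. At first I assumed our program was at fault, but later, while observing the machine, I killed the program and then restarted it, and the load was still high. So I went through the processes one by one.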
Looking at the currently running processes, I found that the kipmi0 process was at 100% CPU usage.
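A check like this could be scripted roughly as follows. This is only a sketch: the helper name filter_busy and the 50% threshold are illustrative choices of mine, not part of the original investigation, and it assumes a procps-style ps that accepts -eo pid,comm,pcpu and command names that contain no spaces.

```shell
# filter_busy MIN: read "PID COMMAND %CPU" rows (with one header line) on
# stdin and print the rows whose %CPU is at least MIN.
# Hypothetical helper; assumes command names contain no spaces.
filter_busy() {
    awk -v min="$1" 'NR > 1 && ($3 + 0) >= min { print $1, $2, $3 }'
}

# Typical use on a system with procps ps:
#   ps -eo pid,comm,pcpu | filter_busy 50
```

Keep in mind that ps reports CPU usage averaged over the lifetime of the process, while top shows a recent sample, so the two numbers may differ.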
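So I looked up some information about this process.
Google did not turn up much. One article described it as a manager for certain platform interfaces (kipmi0 is the kernel thread of the IPMI driver, the Intelligent Platform Management Interface). I did not dare kill it outright, so I kept searching.
Here is a more expert take: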
kipmi is supposed to run with low priority. When you say it consumes 70-90% of the CPUs, is that constant (does it still consume the processor when they are other tasks in the process queue that should have a larger slice of the CPU time) or the 70%/90% comes when the machine is idle?
A second issue to investigate is whether you have pending controller issues (alarms of varying nature that are not resolved) and/or older versions of controller firmware.
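Although this is a task that uses spare CPU time to do some automatic interface adjustment, seeing it take up that many resources still made me worry that something would go wrong. And in this case something already had gone wrong, so either way it was worth trying a fix.
Fix: none required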
No fix required. You should ignore increased CPU utilization as it has no impact on actual system performance.
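In other words, it is a task that uses spare CPU time to do some automatic interface adjustment.
Temporary reduction (takes effect immediately; CPU usage drops to under 10%):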
echo 100 > /sys/module/ipmi_si/parameters/kipmid_max_busy_us
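The one-liner above could be wrapped so that the old and new values are both visible. A minimal sketch, assuming it runs as root with the ipmi_si module loaded; the function name set_kipmid_max_busy and the PARAM override variable are hypothetical names introduced here for illustration.

```shell
# Path of the ipmi_si tunable used in this post; PARAM can be overridden
# to point at a scratch file for a dry run (hypothetical convenience).
PARAM="${PARAM:-/sys/module/ipmi_si/parameters/kipmid_max_busy_us}"

# set_kipmid_max_busy VALUE: write VALUE (microseconds) to the tunable
# and print the value before and after the change.
set_kipmid_max_busy() {
    if [ ! -w "$PARAM" ]; then
        echo "cannot write $PARAM (module not loaded, or not root?)" >&2
        return 1
    fi
    echo "before: $(cat "$PARAM")"
    echo "$1" > "$PARAM"
    echo "after: $(cat "$PARAM")"
}
```

Calling set_kipmid_max_busy 100 reproduces the temporary change shown above, and the printed "before" value is what to restore if the change needs to be undone.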
Permanent reduction (edit the configuration file; takes effect after the module is reloaded or the system is rebooted):
To make the changes persistent you can configure the options for the ipmi_si kernel module.
Create a file in /etc/modprobe.d/, e.g. /etc/modprobe.d/ipmi.conf, and add the following content (the echo command below writes it in one step):
# Prevent kipmi0 from consuming 100% CPU
echo "options ipmi_si kipmid_max_busy_us=100" > /etc/modprobe.d/ipmi.conf
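If the file might already hold other options, overwriting it is risky. The following is one possible way to add the line only when it is missing. It is a sketch under my own assumptions: the function name persist_kipmid_limit and the CONF override variable are hypothetical, and it does not remove an older kipmid_max_busy_us line that carries a different value.

```shell
# Target file from this post; CONF can be overridden for a dry run
# (hypothetical convenience).
CONF="${CONF:-/etc/modprobe.d/ipmi.conf}"

# persist_kipmid_limit VALUE: append the options line for ipmi_si unless
# exactly the same line is already present in CONF.
persist_kipmid_limit() {
    line="options ipmi_si kipmid_max_busy_us=$1"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if [ -f "$CONF" ] && grep -qxF "$line" "$CONF"; then
        return 0
    fi
    echo "$line" >> "$CONF"
}
```

Running persist_kipmid_limit 100 twice leaves a single options line in the file. As noted above, the setting only applies once ipmi_si is reloaded or the machine is rebooted.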
After making the change I checked the system load again, and sure enough it had dropped back to its normal value.