  • Troubleshooting faulty memory modules (DIMMs) on Linux


    2 memory errors detected, application name (kernel:), log content (hangzhou-jishuan-DDS0248 kernel: sbridge: HANDLING MCE MEMORY ERROR hangzhou-jishuan-DDS0248 kernel: EDAC MC0: CE row 5, channel 0, label CPU_SrcID#0_Channel#3_DIMM#1:1 Unknown error(s): memory scrubbing on FATAL area : cpu=0 Err=0008:00c1 (ch=1), addr = 0x1c9bea000 = socket=0, Channel=3(mask=8), rank=5)
     
    How do you work out which memory module this is?
     
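    First, collect the server's memory information. (You can pass this to the hardware vendor's engineer when you request a repair; remember to tell them it is for reference only.)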
     
    Shell command (dmidecode normally needs root): dmidecode | grep -A 9 -B 6 DIMM | grep Bank
     
            Bank Locator: BRANCH 0 CHANNEL 1 DIMM 0
            Bank Locator: BRANCH 0 CHANNEL 1 DIMM 1
            Bank Locator: BRANCH 0 CHANNEL 2 DIMM 0
            Bank Locator: BRANCH 0 CHANNEL 2 DIMM 1
            Bank Locator: BRANCH 0 CHANNEL 3 DIMM 0
            Bank Locator: BRANCH 0 CHANNEL 3 DIMM 1
            Bank Locator: BRANCH 1 CHANNEL 1 DIMM 0
            Bank Locator: BRANCH 1 CHANNEL 1 DIMM 1
            Bank Locator: BRANCH 1 CHANNEL 2 DIMM 0
            Bank Locator: BRANCH 1 CHANNEL 2 DIMM 1
            Bank Locator: BRANCH 1 CHANNEL 3 DIMM 0
            Bank Locator: BRANCH 1 CHANNEL 3 DIMM 1
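
    To make the top-to-bottom position of each slot explicit, the Bank Locator lines can be numbered. This is a minimal sketch, not the original author's method: sample_bank_locators and number_banks are hypothetical helper names, and the sample data only reproduces the 12 lines listed above so that the script runs without root or real hardware.

```shell
#!/bin/sh
# Reproduce the 12 "Bank Locator" lines shown above (sample data only).
# On a real server, replace this function's output with:
#   dmidecode -t memory | grep 'Bank Locator'
sample_bank_locators() {
    for branch in 0 1; do
        for channel in 1 2 3; do
            for dimm in 0 1; do
                printf '\tBank Locator: BRANCH %s CHANNEL %s DIMM %s\n' \
                    "$branch" "$channel" "$dimm"
            done
        done
    done
}

# Strip leading whitespace and prefix each Bank Locator line with its
# 1-based position in the list.
number_banks() {
    awk '/Bank Locator:/ { sub(/^[ \t]+/, ""); printf "%2d  %s\n", ++n, $0 }'
}

sample_bank_locators | number_banks
```

    With the numbers in front, a slot's position can be read directly instead of being counted by hand.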
     
    The modules are numbered 1 to 12 from top to bottom. The label in the error message, CPU_SrcID#0_Channel#3_DIMM#1, gives CPU_SrcID 0, CHANNEL 3, DIMM 1.
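
    The three numbers can also be pulled out of the log line mechanically. The sketch below assumes the label always has the form CPU_SrcID#N_Channel#N_DIMM#N, as in the alert above; parse_edac_label is a hypothetical helper name, and the log line is a shortened copy of the one in the alert.

```shell
#!/bin/sh
# Read EDAC log lines on stdin and print "src channel dimm" for each line
# that carries a CPU_SrcID#..._Channel#..._DIMM#... label.
# Lines without such a label produce no output.
parse_edac_label() {
    sed -n 's/.*CPU_SrcID#\([0-9][0-9]*\)_Channel#\([0-9][0-9]*\)_DIMM#\([0-9][0-9]*\).*/\1 \2 \3/p'
}

# Shortened copy of the kernel message from the alert.
log_line='kernel: EDAC MC0: CE row 5, channel 0, label CPU_SrcID#0_Channel#3_DIMM#1:1 Unknown error(s)'

printf '%s\n' "$log_line" | parse_edac_label
```

    The three values are printed space-separated, so they can be passed on as arguments to another command.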
     
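    Taking BRANCH in the dmidecode output to correspond to CPU_SrcID, the matching entry is BRANCH 0 CHANNEL 3 DIMM 1, the sixth line of the list. So the faulty module is the sixth DIMM: it sits in the memory region controlled by the first CPU, on channel 3, with DIMM ID 1.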
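
    The lookup itself can be sketched as follows. It rests on the same assumption as the reasoning above, namely that BRANCH n in the dmidecode output corresponds to CPU_SrcID n in the EDAC label; that mapping may differ between vendors, so treat the result as a hint only. find_dimm_position and sample_bank_locators are hypothetical names, and the sample data is the list shown earlier.

```shell
#!/bin/sh
# Sample data: the "Bank Locator" lines from the dmidecode output above.
sample_bank_locators() {
    cat <<'EOF'
Bank Locator: BRANCH 0 CHANNEL 1 DIMM 0
Bank Locator: BRANCH 0 CHANNEL 1 DIMM 1
Bank Locator: BRANCH 0 CHANNEL 2 DIMM 0
Bank Locator: BRANCH 0 CHANNEL 2 DIMM 1
Bank Locator: BRANCH 0 CHANNEL 3 DIMM 0
Bank Locator: BRANCH 0 CHANNEL 3 DIMM 1
Bank Locator: BRANCH 1 CHANNEL 1 DIMM 0
Bank Locator: BRANCH 1 CHANNEL 1 DIMM 1
Bank Locator: BRANCH 1 CHANNEL 2 DIMM 0
Bank Locator: BRANCH 1 CHANNEL 2 DIMM 1
Bank Locator: BRANCH 1 CHANNEL 3 DIMM 0
Bank Locator: BRANCH 1 CHANNEL 3 DIMM 1
EOF
}

# Usage: find_dimm_position SRC CHANNEL DIMM < bank-locator-lines
# Prints the 1-based position of the matching line; exits non-zero
# if no line matches.
find_dimm_position() {
    awk -v b="$1" -v c="$2" -v d="$3" '
        /Bank Locator:/ {
            n++
            # Fields: Bank Locator: BRANCH <b> CHANNEL <c> DIMM <d>
            if ($4 == b && $6 == c && $8 == d) { print n; found = 1; exit }
        }
        END { if (!found) exit 1 }
    '
}

# CPU_SrcID 0, Channel 3, DIMM 1, taken from the error label.
sample_bank_locators | find_dimm_position 0 3 1
```

    On a real server the sample function would be replaced by the dmidecode pipeline, run as root.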
  • Original article: https://www.cnblogs.com/dailidong/p/7571211.html