• Record of a Linux boot failure


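    Background:

    After a kernel upgrade on 2.6.32 systems, several machines failed to boot. Every failing machine used an SSD as its system disk. Once the BIOS handed over control, the screen went black and nothing was printed.

    At first I suspected a corrupted MBR, so I attached the boot disk to another server. The MBR turned out to be identical to the copy backed up before the upgrade, and also identical to the MBR of machines that booted normally after the upgrade.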
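    The comparison described above can be sketched as follows. This is a minimal illustration, not the tool actually used on site: the helper names `read_mbr` and `diff_offsets` are hypothetical, and the sketch assumes each dump holds at least the first 512-byte sector of a disk.

```python
from typing import List

MBR_SIZE = 512  # one sector; assumed size of the dumps being compared


def read_mbr(path: str, size: int = MBR_SIZE) -> bytes:
    """Read the first `size` bytes of a disk, image, or backup file."""
    with open(path, "rb") as f:
        data = f.read(size)
    if len(data) != size:
        raise ValueError(f"{path}: expected {size} bytes, got {len(data)}")
    return data


def diff_offsets(a: bytes, b: bytes, limit: int = MBR_SIZE) -> List[int]:
    """Return the offsets within the first `limit` bytes where a and b differ."""
    a, b = a[:limit], b[:limit]
    if len(a) != len(b):
        raise ValueError("dumps have different lengths")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

    For example, `diff_offsets(read_mbr("mbr.bak"), read_mbr("/dev/sda"))` returns an empty list when the two sectors are byte-for-byte identical, which is what was observed here. Reading `/dev/sda` normally requires root.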

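    I could not get any further on my own, so I asked Meng En (蒙恩), a colleague on the OS team, to investigate. The conclusions are recorded below:

    The bootloader is GRUB, and the sector number recorded in the MBR does not point to the stage2 file.

    Process:

    1. Took boot.bak and mbr.bak from the site and rebuilt the environment in a virtual machine. The kernel did not boot. Because the VM's BIOS prints milestone messages, we could confirm that the BIOS had already loaded the MBR.

    2. Confirmed that the MBR was bad: the starting sector number of stage2 written in the MBR was wrong.

    3. Traced the upgrade and confirmed that it did not touch the MBR or the key boot files (stage2 and so on).

    grub-install failed because of the way the device map file was written on site. To reproduce, create a device.map like the one below and run "grub-install /dev/sda" (sda is the system disk):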

    [root@XJ-Center-VS3000-4 /]# cat /boot/grub/device.map

    (hd0)   /dev/disk/by-id/ata-INTEL_SSDSC2BB240G4_BTWL4020041Z240NGN
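    To see which kernel device node a by-id entry like this refers to, the device.map can be parsed and each path resolved through its symlink. The sketch below assumes the simple "(hdN) path" line format shown above. The function names are hypothetical, and this is not how grub-install itself parses the file.

```python
import os
import re
from typing import Dict

# One entry per line: "(hd0)   /some/device/path"
_ENTRY = re.compile(r"^\((\w+)\)\s+(\S+)$")


def parse_device_map(text: str) -> Dict[str, str]:
    """Map GRUB drive names (e.g. 'hd0') to the device paths as written."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _ENTRY.match(line)
        if match is None:
            raise ValueError(f"unrecognized device.map line: {line!r}")
        entries[match.group(1)] = match.group(2)
    return entries


def resolve_device_map(text: str) -> Dict[str, str]:
    """Same mapping, with symlinks such as /dev/disk/by-id/* followed."""
    return {drive: os.path.realpath(path)
            for drive, path in parse_device_map(text).items()}
```

    On a machine where the by-id link exists, `resolve_device_map(open("/boot/grub/device.map").read())` would show the underlying node (for example `/dev/sda`) next to `hd0`.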

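    How it works:

    =====

    Boot flow: MBR (/boot/grub/stage1) -> /boot/grub/stage2 -> kernel image (vmlinuz). The MBR loads stage2, and stage2 loads the kernel image.

    The relationship between the MBR, /boot/grub/stage1 and /boot/grub/stage2 is as follows:

    The stage1 binary cannot understand file systems, so it can only read data through BIOS interrupts.

    The stage1 binary is written into the MBR. A few stage1 variables sit at offsets in the binary that are fixed at build time. The most important one is the starting sector number of stage2 on the boot partition. The MBR is therefore the stage1 file, plus a few variables patched by the installer, plus the partition table.

    stage2 has built-in support for the ext family of file systems, so it can load the kernel image, grub.conf and so on by reading the file system on the boot partition directly.

    Basis for the statements above (quoted from the GRUB Legacy manual):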

    Stage 1 and Stage 2 have embedded variables whose locations are

    well-defined, so that the installation can patch the binary file

    directly without recompilation of the stages.

       In Stage 1, these are defined:

    `0x3E'

         The version number (not GRUB's, but the installation mechanism's).

    `0x40'

         The boot drive. If it is 0xFF, use a drive passed by BIOS.

    `0x41'

         The flag for if forcing LBA.

    `0x42'

         The starting address of Stage 2.

    `0x44'

         The first sector of Stage 2.

    `0x48'

         The starting segment of Stage 2.

    `0x1FE'

         The signature (`0xAA55').
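    The offsets listed above can be read out of an MBR dump with a few `struct` calls. This is a sketch under assumptions: the field widths are inferred from the gaps between the documented offsets (for instance, 4 bytes for the sector at 0x44), values are taken to be little-endian, and treating 0x3E/0x3F as a major/minor pair is also an assumption. The function name `parse_stage1` is hypothetical.

```python
import struct
from typing import Dict


def parse_stage1(mbr: bytes) -> Dict[str, int]:
    """Extract the embedded stage1 variables from a 512-byte MBR dump."""
    if len(mbr) < 512:
        raise ValueError("need at least one 512-byte sector")
    # Widths below are assumptions inferred from the offset table.
    ver_major, ver_minor = struct.unpack_from("<BB", mbr, 0x3E)
    boot_drive, force_lba = struct.unpack_from("<BB", mbr, 0x40)
    (stage2_address,) = struct.unpack_from("<H", mbr, 0x42)
    (stage2_sector,) = struct.unpack_from("<I", mbr, 0x44)
    (stage2_segment,) = struct.unpack_from("<H", mbr, 0x48)
    (signature,) = struct.unpack_from("<H", mbr, 0x1FE)
    return {
        "version_major": ver_major,
        "version_minor": ver_minor,
        "boot_drive": boot_drive,      # 0xFF means "use the drive passed by BIOS"
        "force_lba": force_lba,
        "stage2_address": stage2_address,
        "stage2_sector": stage2_sector,  # the value that was wrong in this incident
        "stage2_segment": stage2_segment,
        "signature": signature,        # expected to be 0xAA55
    }
```

    Running this on mbr.bak and printing `stage2_sector` gives the number that stage1 will ask the BIOS to read, which can then be compared against where stage2 really lives on disk.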

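    We traced whether the upgrade patch ever invoked grub or opened any stage file. The result is below: nothing invoked the grub command (grub-install itself also calls grub to perform the installation).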

    [root@localhost home]# ./test.stap |grep -E 'stage|grub'

    open===/boot/grub/grub.conf

    open===/boot/grub/sedgzxf68

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting10.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting11.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting08.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting08.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting01.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting11.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting10.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting04.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting09.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting01.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting03.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting11.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting08.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting07.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting07.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting03.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting06.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting05.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting02.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting07.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting02.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting01.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting09.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting06.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting09.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting05.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting05.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting03.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting10.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting06.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting04.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting04.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting02.png

    execve===>/sbin/grubby

    open===/etc/grub.conf

    open===../boot/grub/grub.conf-

    execve===>/sbin/grubby

    open===/etc/grub.conf

    execve===>/sbin/grubby

    open===/etc/grub.conf

    open===/etc/sysconfig/grub

    execve===>/sbin/grubby

    open===/etc/grub.conf

    open===../boot/grub/grub.conf-

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting10.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting11.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting08.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting08.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting01.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting11.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting10.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting04.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting09.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting01.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting03.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting11.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting08.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting07.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting07.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting03.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting06.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting05.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting02.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting07.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting02.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting01.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting09.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting06.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting09.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting05.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting05.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage01-connecting03.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting10.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting06.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage02-connecting04.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting04.png

    open===/usr/share/icons/hicolor/22x22/apps/nm-stage03-connecting02.png

    open===/boot/grub/grub.conf

    open===/boot/grub/grub.conf
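    Most of the matches above are NetworkManager icon files that only match because their names contain "stage". A tighter filter over the same trace output could look like the sketch below. It assumes the `open===path` and `execve===>path` line formats shown in the output, and the function name `boot_related` is hypothetical.

```python
import re
from typing import Iterable, List

# Matches "open===<path>" and "execve===><path>" as printed by the trace script.
_LINE = re.compile(r"^(open|execve)===>?(.*)$")


def boot_related(lines: Iterable[str]) -> List[str]:
    """Keep trace lines whose path has a 'grub' component or a stage* file name."""
    kept: List[str] = []
    for raw in lines:
        line = raw.strip()
        match = _LINE.match(line)
        if match is None:
            continue
        parts = [p for p in match.group(2).split("/") if p]
        if not parts:
            continue
        # "nm-stage03-..." icons are dropped: no component contains "grub"
        # and the file name does not start with "stage".
        if any("grub" in p for p in parts) or parts[-1].startswith("stage"):
            kept.append(line)
    return kept
```

    Applied to the output above, only the grub.conf, /etc/sysconfig/grub, /boot/grub/sedgzxf68 and grubby lines remain, which makes it easier to see that neither grub nor grub-install was ever executed.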

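    We then went through the grub-install script and found that its parsing of the device.map file is too simplistic and does not handle a device.map of our kind. Even before the upgrade, the stage2 sector recorded in our MBR was already wrong. Because that sector still held the data of an older stage2 file, booting happened to work. After the upgrade, new files were written to the boot partition, possibly because of the backup step, and that sector was allocated to something else.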
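    A rough way to detect the post-upgrade state is to read the sector that the MBR points to and compare it with the first sector of the current stage2 file. The sketch below assumes 512-byte sectors, a 32-bit little-endian absolute sector number at offset 0x44, and an MBR that points directly at stage2 rather than at an embedded stage1.5. The function name `stage2_sector_matches` is hypothetical.

```python
import struct

SECTOR_SIZE = 512  # assumed logical sector size


def stage2_sector_matches(disk_path: str, stage2_path: str) -> bool:
    """True if the sector named in the MBR holds the first sector of stage2."""
    with open(disk_path, "rb") as disk:
        mbr = disk.read(SECTOR_SIZE)
        if len(mbr) != SECTOR_SIZE:
            raise ValueError("could not read a full MBR sector")
        # Offset 0x44: first sector of stage2 (width and endianness assumed).
        (sector,) = struct.unpack_from("<I", mbr, 0x44)
        disk.seek(sector * SECTOR_SIZE)
        on_disk = disk.read(SECTOR_SIZE)
    with open(stage2_path, "rb") as stage2:
        expected = stage2.read(SECTOR_SIZE)
    return len(expected) == SECTOR_SIZE and on_disk == expected
```

    A match only shows that the sector currently holds stage2-like data. As this incident shows, a stale copy of an old stage2 can also match if its contents are unchanged, so a `True` result does not prove that the sector still belongs to the current file. A `False` result, however, corresponds to the broken state seen after the upgrade.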

    References:

    https://www.gnu.org/software/grub/manual/legacy

  • Original article: https://www.cnblogs.com/10087622blog/p/9896701.html