1. Symptom
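In a production environment, a filesystem repair on a compute node caused data loss in the image files of some VMs. The damaged images left those VMs unable to start. The nova log on the affected compute node showed the following error: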
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2560, in power_on
nova-compute: self._hard_reboot(context, instance, network_info, block_device_info)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2449, in _hard_reboot
nova-compute: vifs_already_plugged=True)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5191, in _create_domain_and_network
nova-compute: destroy_disks_on_failure)
nova-compute: File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
nova-compute: self.force_reraise()
nova-compute: File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
nova-compute: six.reraise(self.type_, self.value, self.tb)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5163, in _create_domain_and_network
nova-compute: post_xml_callback=post_xml_callback)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5081, in _create_domain
nova-compute: guest.launch(pause=pause)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 145, in launch
nova-compute: self._encoded_xml, errors='ignore')
nova-compute: File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
nova-compute: self.force_reraise()
nova-compute: File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
nova-compute: six.reraise(self.type_, self.value, self.tb)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 140, in launch
nova-compute: return self._domain.createWithFlags(flags)
nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
nova-compute: result = proxy_call(self._autowrap, f, *args, **kwargs)
nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
nova-compute: rv = execute(f, *args, **kwargs)
nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
nova-compute: six.reraise(c, e, tb)
nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
nova-compute: rv = meth(*args, **kwargs)
nova-compute: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1065, in createWithFlags
nova-compute: if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
nova-compute: libvirtError: internal error: process exited while connecting to monitor: 2020-03-16T01:44:43.128499Z qemu-kvm: -drive file=/os_instance/3dc75704-f729-4c33-865b-313f0e8a8df8/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none: qcow2: Image is corrupt; cannot be opened read/write
2. Fix
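Change into the directory that holds the VM's disk file and run qemu-img check disk to check the consistency of the image data. In this case the check reported many errors. Run qemu-img check -r all disk to repair the disk image, then restart the VM.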
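The steps above can be scripted. The following is a minimal sketch under stated assumptions, not an established tool: the helper name check_and_repair and the .bak backup suffix are made up for illustration, and the VM is assumed to be shut off so nothing else has the image open. It copies the image first, runs a plain check, and attempts a repair only when the check reports corruption (exit code 2) or leaked clusters (exit code 3).

```python
import shutil
import subprocess


def check_and_repair(disk_path, run=subprocess.call, copy=shutil.copy2):
    """Back up a disk image, check it, and repair it only if needed.

    Returns the exit code of the last qemu-img invocation.
    `run` and `copy` are hypothetical hooks (not part of qemu-img) so the
    flow can be exercised without touching a real image.
    """
    # Keep an untouched copy: a repair rewrites image metadata in place.
    copy(disk_path, disk_path + ".bak")

    # Plain consistency check, no repair requested.
    rc = run(["qemu-img", "check", disk_path])

    # 2 = image is corrupted, 3 = leaked clusters only.
    # Any other code (0 = consistent, 1 = internal error, 63 = format
    # unsupported) is returned unchanged without attempting a repair.
    if rc not in (2, 3):
        return rc

    # Attempt to fix every kind of error that the check found.
    return run(["qemu-img", "check", "-r", "all", disk_path])
```

Called as check_and_repair("/os_instance/3dc75704-f729-4c33-865b-313f0e8a8df8/disk"), a return value of 0 means the image is now consistent and the VM can be started again.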
3. The qemu-img check command in detail
qemu-img check [-f fmt] [--output=ofmt] [-r [leaks | all]] filename
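Performs a consistency check on a disk image file and looks for errors in it. In the QEMU version described here, only the qcow2, qed, and vdi formats support the check; newer QEMU releases also list further formats such as vhdx and vmdk. qcow2 is the image format introduced in QEMU 0.8.3 and is the most widely used format today. qed (QEMU enhanced disk) was added in QEMU 0.14 to avoid some of qcow2's shortcomings and to improve performance, but it is not yet mature. vdi (Virtual Disk Image) is the storage format used by Oracle's VirtualBox.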
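The -f fmt option specifies the image format; if it is omitted, qemu-img detects the format automatically. filename is the name of the disk image file, including its path.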
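To make the synopsis concrete, here is a small sketch that assembles the argument list for qemu-img check from its optional parts. The function name build_check_command and its keyword names are assumptions chosen for illustration; only the flags themselves (-f, --output, -r) come from the synopsis above, and the accepted --output values (human, json) follow the qemu-img manual.

```python
def build_check_command(filename, fmt=None, output=None, repair=None):
    """Assemble the argv list for `qemu-img check`.

    filename: path to the disk image (required).
    fmt:      image format for -f; None lets qemu-img auto-detect it.
    output:   "human" or "json" for --output; None keeps the default.
    repair:   "leaks" or "all" for -r; None means check only.
    """
    cmd = ["qemu-img", "check"]
    if fmt is not None:
        cmd += ["-f", fmt]
    if output is not None:
        if output not in ("human", "json"):
            raise ValueError("output must be 'human' or 'json'")
        cmd.append("--output=" + output)
    if repair is not None:
        if repair not in ("leaks", "all"):
            raise ValueError("repair must be 'leaks' or 'all'")
        cmd += ["-r", repair]
    cmd.append(filename)
    return cmd
```

The resulting list can be passed directly to subprocess.call. Building a list rather than a single string avoids shell quoting problems when the image path contains spaces.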
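If -r is specified, qemu-img tries to repair any inconsistencies found during the check. Back up the disk file before running qemu-img check -r. -r leaks repairs only cluster leaks, while -r all fixes all kinds of errors, at a higher risk of choosing the wrong fix or hiding corruption that has already occurred.
When the command finishes it returns an exit code, and each value indicates a different check result. With -r, the codes that describe the image state refer to the state after the repair attempt: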
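0 Check completed, the image is (now) consistent
1 Check not completed because of internal errors
2 Check completed, the image is corrupted
3 Check completed, the image has leaked clusters but is not corrupted
63 Checks are not supported by the image format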
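When the check is driven from a script, the exit codes listed above can be turned into readable messages. The sketch below is illustrative only: the names CHECK_EXIT_CODES and describe_check_exit_code are assumptions, and the table simply restates the list.

```python
# Exit codes of `qemu-img check`, as listed above.
CHECK_EXIT_CODES = {
    0: "check completed, the image is (now) consistent",
    1: "check not completed because of internal errors",
    2: "check completed, the image is corrupted",
    3: "check completed, the image has leaked clusters but is not corrupted",
    63: "checks are not supported by the image format",
}


def describe_check_exit_code(code):
    """Return a readable description of a `qemu-img check` exit code."""
    # Codes outside the documented set are reported rather than guessed at.
    return CHECK_EXIT_CODES.get(code, "undocumented exit code %d" % code)
```

For example, describe_check_exit_code(subprocess.call(["qemu-img", "check", "disk"])) gives a one-line summary suitable for a log entry.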