Earlier we covered QEMU's qcow2 internal snapshots and external snapshots, both features of the virtual machine disk-file format.
Those are file-level snapshots.
Snapshots can also be taken at the filesystem level; many filesystems, such as OCFS2, support them.
They can also be taken at the block level, for example with LVM snapshots.
In this section we analyze how the various kinds of snapshots are implemented in OpenStack.
In OpenStack, an instance can be booted in roughly two ways: from an image, or from a bootable volume.
A running instance can additionally attach a volume.
The libvirt XML of an instance booted from an image, with a volume attached, contains:
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='none'/>
<source file='/var/lib/nova/instances/59ca11ea-0978-4f7d-8385-480649e63a1d/disk'/>
<target dev='vda' bus='virtio'/>
<alias name='virtio-disk0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/disk/by-path/ip-16.158.166.197:3260-iscsi-iqn.2010-10.org.openstack:volume-f6ba87f7-d0b6-4fdb-ac82-346371e78c48-lun-1'/>
<target dev='vdb' bus='virtio'/>
<serial>f6ba87f7-d0b6-4fdb-ac82-346371e78c48</serial>
<alias name='virtio-disk1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>
The libvirt XML of an instance booted from a bootable volume contains:
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/disk/by-path/ip-16.158.166.197:3260-iscsi-iqn.2010-10.org.openstack:volume-640a10f7-3965-4a47-9641-002a94526444-lun-1'/>
<target dev='vda' bus='virtio'/>
<serial>640a10f7-3965-4a47-9641-002a94526444</serial>
<alias name='virtio-disk0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
Snapshots fall into the following categories:
- Snapshot of an instance
  - Instance booted from an image
    - The running instance's ephemeral disk is snapshotted and the result is uploaded to glance
    - Depending on the VM state and the libvirt version, this is either a live snapshot or a cold snapshot
    - When an instance is snapshotted, its attached volumes are not snapshotted along with it
  - Instance booted from a bootable volume
    - When snapshotting the running instance, the disk is found to be a volume, so cinder is called to snapshot the LVM volume
    - The snapshot's metadata appears in glance, but the data is not uploaded to glance; it stays in LVM
    - This snapshot also appears in cinder's database
- Snapshot of a volume
  - Ultimately cinder is called to snapshot the backend LVM volume
  - This snapshot appears in cinder's database
So there are essentially only two kinds of snapshot: snapshotting a libvirt disk, and snapshotting an LVM volume.
The command to take a snapshot is:
nova --debug image-create d9793e05-111c-43bb-93ed-672e94ad096e myInstanceWithVolume-snapshot3
The REST call it issues is:
curl -i 'http://16.158.166.197:8774/v2/c24c59846a7f44538d958e7548cc74a3/servers/d9793e05-111c-43bb-93ed-672e94ad096e/action' -X POST -H "X-Auth-Project-Id: openstack" -H "User-Agent: python-novaclient" -H "Content-Type: application/json" -H "Accept: application/json" -H "X-Auth-Token: +Onwmvjs6T0CELsN48ON4PUNMUhF-" -d '{"createImage": {"name": "myInstanceWithVolume-snapshot3", "metadata": {}}}'
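The JSON body of that POST can be assembled with a small helper. This is a minimal sketch: create_image_body is a hypothetical name, and the endpoint, token, and transport handling are omitted.

```python
import json

def create_image_body(name, metadata=None):
    """Build the JSON body for the createImage server action."""
    return json.dumps({'createImage': {'name': name,
                                       'metadata': metadata or {}}})

body = create_image_body('myInstanceWithVolume-snapshot3')
```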
This invokes the following handler in /usr/lib/python2.7/dist-packages/nova/api/openstack/compute/servers.py:
@wsgi.response(202)
@wsgi.serializers(xml=FullServerTemplate)
@wsgi.deserializers(xml=ActionDeserializer)
@wsgi.action('createImage')
@common.check_snapshots_enabled
def _action_create_image(self, req, id, body):
In this function:
# if the instance is volume-backed, take a volume snapshot
image = self.compute_api.snapshot_volume_backed(
context,
instance,
image_meta,
image_name,
extra_properties=props)
# if it is an ordinary instance, take an ordinary snapshot
image = self.compute_api.snapshot(context,
instance,
image_name,
extra_properties=props)
Let us analyze the volume snapshot first.
In /usr/lib/python2.7/dist-packages/nova/compute/api.py there is the function:
@check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.STOPPED])
def snapshot_volume_backed(self, context, instance, image_meta, name,
extra_properties=None):
It first calls:
# take the snapshot through the volume API
snapshot = self.volume_api.create_snapshot_force(
context, volume['id'], name, volume['display_description'])
Then:
# add this snapshot's metadata to glance
return self.image_service.create(context, image_meta, data='')
In /usr/lib/python2.7/dist-packages/cinder/volume/api.py there is the function:
def create_snapshot_force(self, context,
volume, name,
description, metadata=None):
return self._create_snapshot(context, volume, name, description,
True, metadata)
In the _create_snapshot function:
# create a record in the cinder database
snapshot = self.db.snapshot_create(context, options)
# create the volume snapshot
self.volume_rpcapi.create_snapshot(context, volume, snapshot)
In /usr/lib/python2.7/dist-packages/cinder/volume/manager.py there is the function:
def create_snapshot(self, context, volume_id, snapshot_id):
# call the driver to create the snapshot
model_update = self.driver.create_snapshot(snapshot_ref)
The default volume_driver is cinder.volume.drivers.lvm.LVMISCSIDriver, which inherits from LVMVolumeDriver.
LVMVolumeDriver has the function:
def create_snapshot(self, snapshot):
"""Creates a snapshot."""
self.vg.create_lv_snapshot(self._escape_snapshot(snapshot['name']),
snapshot['volume_name'],
self.configuration.lvm_type)
Here vg is:
self.vg = lvm.LVM(self.configuration.volume_group,
root_helper,
lvm_type=self.configuration.lvm_type,
executor=self._execute)
In /usr/lib/python2.7/dist-packages/cinder/brick/local_dev/lvm.py there is the function:
def create_lv_snapshot(self, name, source_lv_name, lv_type='default'):
It mainly executes the following command:
cmd = ['lvcreate', '--name', name,
'--snapshot', '%s/%s' % (self.vg_name, source_lv_name)]
for example:
lvcreate --size 100M --snapshot --name snap /dev/vg00/lvol1
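The argv that create_lv_snapshot assembles can be sketched as follows. build_lv_snapshot_cmd is a hypothetical helper name mirroring the cmd list above; it omits the size and lvm_type handling that the real cinder code adds.

```python
def build_lv_snapshot_cmd(vg_name, snap_name, source_lv_name):
    """Assemble the lvcreate argv for a copy-on-write LVM snapshot."""
    return ['lvcreate', '--name', snap_name,
            '--snapshot', '%s/%s' % (vg_name, source_lv_name)]

cmd = build_lv_snapshot_cmd('vg00', 'snap', 'lvol1')
```

The argv would then be handed to cinder's rootwrap-based executor rather than run directly.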
Now let us analyze the instance snapshot.
In /usr/lib/python2.7/dist-packages/nova/compute/api.py there is the function:
@wrap_check_policy
@check_instance_cell
@check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.STOPPED,
vm_states.PAUSED, vm_states.SUSPENDED])
def snapshot(self, context, instance, name, extra_properties=None):
# create an image record in glance
image_meta = self._create_image(context, instance, name,
'snapshot',
extra_properties=extra_properties)
# actually create the snapshot
self.compute_rpcapi.snapshot_instance(context, instance,
image_meta['id'])
In /usr/lib/python2.7/dist-packages/nova/compute/manager.py there is the function:
@wrap_exception()
@reverts_task_state
@wrap_instance_fault
@delete_image_on_error
def snapshot_instance(self, context, image_id, instance):
It calls _snapshot_instance, which ultimately calls the driver:
self.driver.snapshot(context, instance, image_id,
update_task_state)
The nova compute driver here is libvirt.
In /usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py there is the function:
def snapshot(self, context, instance, image_href, update_task_state):
which is the core snapshot function.
1) Get the libvirt domain:
virt_dom = self._lookup_by_name(instance['name'])
(Pdb) p instance['name']
'instance-0000000d'
(Pdb) p virt_dom
<libvirt.virDomain object at 0x7f22a87227d0>
2) Get the image service:
(image_service, image_id) = glance.get_remote_image_service(context, instance['image_ref'])
(Pdb) p image_service
<nova.image.glance.GlanceImageService object at 0x7f22a8722bd0>
(Pdb) p image_id
u'd96b0e41-8264-41de-8dbb-6b31ce9bfbfc'
3) Get the image metadata:
base = compute_utils.get_image_metadata(context, image_service, image_id, instance)
(Pdb) p base
{u'min_disk': 20, u'container_format': u'bare', u'min_ram': 0, u'disk_format': u'qcow2', 'properties': {u'instance_type_memory_mb': u'2048', u'instance_type_swap': u'0', u'instance_type_root_gb': u'20', u'instance_type_name': u'm1.small', u'instance_type_id': u'5', u'instance_type_ephemeral_gb': u'0', u'instance_type_rxtx_factor': u'1.0', u'network_allocated': u'True', u'instance_type_flavorid': u'2', u'instance_type_vcpus': u'1', u'base_image_ref': u'd96b0e41-8264-41de-8dbb-6b31ce9bfbfc'}}
4) Get the snapshot image record:
snapshot = snapshot_image_service.show(context, snapshot_image_id)
(Pdb) p snapshot
{'status': u'queued', 'name': u'myinstancewithvolume-snapshot6', 'deleted': False, 'container_format': u'bare', 'created_at': datetime.datetime(2014, 7, 3, 20, 48, 46, tzinfo=<iso8601.iso8601.Utc object at 0x7f22a873d8d0>), 'disk_format': u'qcow2', 'updated_at': datetime.datetime(2014, 7, 3, 20, 48, 46, tzinfo=<iso8601.iso8601.Utc object at 0x7f22a873d8d0>), 'id': u'2b2da7ba-9cba-4cc9-87da-a47666e62f57', 'owner': u'c24c59846a7f44538d958e7548cc74a3', 'min_ram': 0, 'checksum': None, 'min_disk': 20, 'is_public': False, 'deleted_at': None, 'properties': {u'instance_uuid': u'd9793e05-111c-43bb-93ed-672e94ad096e', u'instance_type_memory_mb': u'2048', u'user_id': u'5df0e89888364c0b80d47a7a426a9a67', u'image_type': u'snapshot', u'instance_type_id': u'5', u'instance_type_name': u'm1.small', u'instance_type_ephemeral_gb': u'0', u'instance_type_rxtx_factor': u'1.0', u'instance_type_root_gb': u'20', u'network_allocated': u'True', u'instance_type_flavorid': u'2', u'instance_type_vcpus': u'1', u'instance_type_swap': u'0', u'base_image_ref': u'd96b0e41-8264-41de-8dbb-6b31ce9bfbfc'}, 'size': 0}
5) Get the path and format of the ephemeral disk:
disk_path = libvirt_utils.find_disk(virt_dom)
source_format = libvirt_utils.get_disk_type(disk_path)
(Pdb) p disk_path
'/var/lib/nova/instances/d9793e05-111c-43bb-93ed-672e94ad096e/disk'
(Pdb) p source_format
'qcow2'
6) Create the snapshot metadata:
metadata = self._create_snapshot_metadata(base, instance, image_format, snapshot['name'])
(Pdb) p metadata
{'status': 'active', 'name': u'myinstancewithvolume-snapshot6', 'container_format': u'bare', 'disk_format': 'qcow2', 'is_public': False, 'properties': {'kernel_id': u'', 'image_location': 'snapshot', 'image_state': 'available', 'ramdisk_id': u'', 'owner_id': u'c24c59846a7f44538d958e7548cc74a3'}}
7) Generate a file name for the snapshot:
snapshot_name = uuid.uuid4().hex
(Pdb) p snapshot_name
'7d9d745f5bf5482eb43886f6e69ed6e5'
8) Get the snapshots directory:
snapshot_directory = CONF.libvirt.snapshots_directory
(Pdb) p snapshot_directory
'/var/lib/nova/instances/snapshots'
9) Perform the live snapshot:
(Pdb) p virt_dom
<libvirt.virDomain object at 0x7f22a87227d0>
(Pdb) p disk_path
'/var/lib/nova/instances/d9793e05-111c-43bb-93ed-672e94ad096e/disk'
(Pdb) p out_path
'/var/lib/nova/instances/snapshots/tmps2A_Uy/7d9d745f5bf5482eb43886f6e69ed6e5'
(Pdb) p image_format
'qcow2'
self._live_snapshot(virt_dom, disk_path, out_path, image_format)
10) _live_snapshot: abort or cancel the current block operation:
domain.blockJobAbort(disk_path, 0)
This is similar to running the command:
virsh blockjob <domain> <path> [--abort] [--async] [--pivot] [--info] [<bandwidth>]
This invokes libvirt's virDomainBlockJobAbort:
http://libvirt.org/html/libvirt-libvirt.html#virDomainBlockJobAbort
The result depends on the current job type and on the flags:
If the job is of type VIR_DOMAIN_BLOCK_JOB_TYPE_PULL, cancelling it can take a long time, so the flag is usually set to VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC to cancel asynchronously; once the job has really been cancelled, an event is emitted.
If the job is VIR_DOMAIN_BLOCK_JOB_TYPE_COPY, the operation stops immediately and the disk reverts to its original state.
If the job is VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT, it also stops immediately, leaving the state unchanged.
If the job is one of those two types and the flag VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT is set, the job instead pivots: the destination of the copy or commit becomes the active disk image.
11) _live_snapshot: get the location of the original disk's backing file and generate the new disk's path:
src_disk_size = libvirt_utils.get_disk_size(disk_path)
src_back_path = libvirt_utils.get_disk_backing_file(disk_path, basename=False)
(Pdb) p src_disk_size
21474836480
(Pdb) p src_back_path
'/var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678'
disk_delta = out_path + '.delta'
(Pdb) p disk_delta
'/var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta'
12) _live_snapshot: create a COW disk:
libvirt_utils.create_cow_image(src_back_path, disk_delta, src_disk_size)
This actually executes the following commands:
qemu-img info /var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678
qemu-img create -f qcow2 -o backing_file=/var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678,size=21474836480 /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta
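The qemu-img invocation behind create_cow_image can be sketched as argv construction. build_cow_create_cmd is a hypothetical name; the real nova helper also probes the backing file with qemu-img info first, as shown above.

```python
def build_cow_create_cmd(backing_file, size, out_path):
    """Assemble the qemu-img argv that creates a qcow2 overlay (COW image)
    whose unwritten clusters fall through to backing_file."""
    opts = 'backing_file=%s,size=%d' % (backing_file, size)
    return ['qemu-img', 'create', '-f', 'qcow2', '-o', opts, out_path]

cmd = build_cow_create_cmd('/var/lib/nova/instances/_base/base',
                           21474836480, '/tmp/disk.delta')
```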
13) _live_snapshot: copy the original disk to the new disk:
domain.blockRebase(disk_path, disk_delta, 0,
libvirt.VIR_DOMAIN_BLOCK_REBASE_COPY |
libvirt.VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT |
libvirt.VIR_DOMAIN_BLOCK_REBASE_SHALLOW)
(Pdb) p disk_path
'/var/lib/nova/instances/d9793e05-111c-43bb-93ed-672e94ad096e/disk'
(Pdb) p disk_delta
'/var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta'
After this step, we can see that the new disk now has content:
root:/var/lib/nova/instances# qemu-img info d9793e05-111c-43bb-93ed-672e94ad096e/disk
image: d9793e05-111c-43bb-93ed-672e94ad096e/disk
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 379M
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678
Format specific information:
compat: 1.1
lazy refcounts: false
root:/var/lib/nova/instances# qemu-img info /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta
image: /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 18G
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678
Format specific information:
compat: 1.1
lazy refcounts: false
This is analogous to a blockpull operation (blockRebase without the COPY flag performs a block pull).
For example:
[root@moon ~]# virsh snapshot-list daisy --tree
snap1-daisy
  |
  +- snap2-daisy
      |
      +- snap3-daisy
snap3's base is snap2, and snap2's base is snap1.
We want snap3's base to be snap1 directly:
[root@moon ~]# virsh blockpull --domain daisy --path /export/vmimgs/snap3-daisy.qcow2 --base /export/vmimgs/snap1-daisy.qcow2 --wait --verbose
Block Pull: [100 %]
Pull complete
After this, snap3's base is snap1 directly:
[root@moon ~]# qemu-img info /export/vmimgs/snap3-daisy.qcow2
image: /export/vmimgs/snap3-daisy.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 145M
cluster_size: 65536
backing file: /export/vmimgs/snap1-daisy.qcow2
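The flag combination passed to blockRebase in step 13 can be made explicit. The constant values come from libvirt's virDomainBlockRebaseFlags enum; the helper name is an assumption for illustration.

```python
# Values from libvirt's virDomainBlockRebaseFlags enum
VIR_DOMAIN_BLOCK_REBASE_SHALLOW   = 1 << 0  # copy only the top of the chain
VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT = 1 << 1  # reuse the pre-created delta file
VIR_DOMAIN_BLOCK_REBASE_COPY      = 1 << 3  # start a block copy job

def live_snapshot_rebase_flags():
    """Flags nova combines for the shallow block copy into the delta file."""
    return (VIR_DOMAIN_BLOCK_REBASE_COPY |
            VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT |
            VIR_DOMAIN_BLOCK_REBASE_SHALLOW)
```

SHALLOW is what keeps the delta small relative to the whole chain: only clusters allocated in the top qcow2 layer are copied, and the delta keeps the same backing file.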
14) _live_snapshot: compact the new disk into a standalone image:
libvirt_utils.extract_snapshot(disk_delta, 'qcow2', out_path, image_format)
This invokes the following command:
qemu-img convert -f qcow2 -O qcow2 /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030
root:/var/lib/nova/instances# qemu-img info /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030
image: /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 862M
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
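The convert step of extract_snapshot can be sketched the same way. build_convert_cmd is a hypothetical name; depending on version and configuration, nova may add further options (e.g. compression) that the command above does not show.

```python
def build_convert_cmd(src, dest, src_format='qcow2', dest_format='qcow2'):
    """Assemble the qemu-img argv that flattens the delta into a
    standalone image: backing-file contents are pulled in, and
    unallocated clusters are dropped, which is why the output shrinks."""
    return ['qemu-img', 'convert', '-f', src_format, '-O', dest_format,
            src, dest]

cmd = build_convert_cmd('/tmp/snap.delta', '/tmp/snap')
```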
15) Upload the image to glance.
In the snapshot function:
image_service.update(context, image_href, metadata, image_file)