[pve-devel] [PATCH v3 qemu-server 1/7] migration: only migrate disks used by the guest
Fiona Ebner
f.ebner at proxmox.com
Fri Jun 2 11:45:52 CEST 2023
Am 01.06.23 um 15:53 schrieb Aaron Lauterer:
> When scanning all configured storages for disk images belonging to the
> VM, the migration could easily fail if a storage is not available, but
> enabled. That storage might not even be used by the VM at all.
>
> By not doing that and only looking at the disk images referenced in the
> VM config, we can avoid that.
> Extra handling is needed for disk images currently in the 'pending'
> section of the VM config. These disk images used to be detected by
> scanning all storages before.
> It is also necessary to fetch some information (size, format) about the
> disk images explicitly that used to be provided by the initial scan of
> all storages.
>
> The big change regarding behavior is that disk images not referenced in
> the VM config file will be ignored. They are already orphans that used
> to be migrated as well, but are now left where they are. The tests have
> been adapted to that changed behavior.
>
> Signed-off-by: Aaron Lauterer <a.lauterer at proxmox.com>
> ---
> changes sind v2:
> - move handling of pending changes into QemuSerer::foreach_volid
> Seems to not have any bad side-effects
This should really be its own patch and talk about what the side effects
actually are and why they are okay in the commit message.
Maybe even keep old behavior for replication at first and then add a
second patch which explicitly enables replication for pending volumes by
removing the filtering initially added. Easiest probably is an option
$exclude_pending for foreach_volid() used only by replication and then
remove it in the following patch.
Like that, each patch has its own clear purpose and effect.
> - style fixes
> - use 'volume_size_info()' to get format and size of the image
>
> PVE/QemuMigrate.pm | 88 ++++++++-------------------
> PVE/QemuServer.pm | 9 ++-
> test/MigrationTest/QemuMigrateMock.pm | 9 +++
> test/run_qemu_migrate_tests.pl | 12 ++--
> 4 files changed, 50 insertions(+), 68 deletions(-)
>
> diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> index 09cc1d8..163a721 100644
> --- a/PVE/QemuMigrate.pm
> +++ b/PVE/QemuMigrate.pm
> @@ -149,6 +149,22 @@ sub lock_vm {
> return PVE::QemuConfig->lock_config($vmid, $code, @param);
> }
>
> +sub target_storage_check_available {
> + my ($self, $storecfg, $targetsid, $volid) = @_;
> +
> + if (!$self->{opts}->{remote}) {
> + # check if storage is available on target node
> + my $target_scfg = PVE::Storage::storage_check_enabled(
> + $storecfg,
> + $targetsid,
> + $self->{node},
> + );
> + my ($vtype) = PVE::Storage::parse_volname($storecfg, $volid);
> + die "$volid: content type '$vtype' is not available on storage '$targetsid'\n"
> + if !$target_scfg->{content}->{$vtype};
> + }
> +}
This should also be its own patch, please.
> +
> sub prepare {
> my ($self, $vmid) = @_;
>
> @@ -396,17 +358,21 @@ sub scan_local_volumes {
> $targetsid = PVE::JSONSchema::map_id($self->{opts}->{storagemap}, $sid);
> }
>
> - # check target storage on target node if intra-cluster migration
> - if (!$self->{opts}->{remote}) {
> - PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
> -
> - return if $scfg->{shared};
> - }
> + return if $scfg->{shared} && !$self->{opts}->{remote};
> + $self->target_storage_check_available($storecfg, $targetsid, $volid);
Shouldn't these two lines be switched?
Before, the enabled check would be done if !$self->{opts}->{remote} and
then early return.
But now, if $scfg->{shared} && !$self->{opts}->{remote};, you early
return without doing the check at all.
With switched lines, you call the helper, which does the check if
!$self->{opts}->{remote} and then you do the early return like before.
>
> $local_volumes->{$volid}->{ref} = $attr->{referenced_in_config} ? 'config' : 'snapshot';
> $local_volumes->{$volid}->{ref} = 'storage' if $attr->{is_unused};
Not yours, but this is actually wrong, because if the same volume is
already referenced in a snapshot, the reference will be overwritten to
be 'storage' here :P
> + $local_volumes->{$volid}->{ref} = 'storage' if $attr->{is_pending};
But yours is also not correct ;) It should only be done for volumes that
are only referenced in pending, but not in the current config. You don't
touch the...
> @@ -4888,6 +4888,8 @@ sub foreach_volid {
$volhash->{$volid}->{referenced_in_config} = 1 if !defined($snapname);
...line in PVE/QemuServer.pm's foreach_volid(), so is_pending being 1
means referenced_in_config is also 1 and there's no way to distinguish
via attributes if a volume is only in pending or both in pending and the
current config.
Not sure if we should even bother to distinguish between more than
'config' and 'snapshot' anymore. Everything that's just in a snapshot
gets 'snapshot' and everything that's in the current config (including
pending) gets 'config'. No need for 'storage' at all.
> $volhash->{$volid}->{is_unused} //= 0;
> $volhash->{$volid}->{is_unused} = 1 if $key =~ /^unused\d+$/;
>
> + $volhash->{$volid}->{is_pending} = 1 if $pending;
> +
> $volhash->{$volid}->{drivename} = $key if is_valid_drivename($key);
> };
>
> diff --git a/test/MigrationTest/QemuMigrateMock.pm b/test/MigrationTest/QemuMigrateMock.pm
> index 94fe686..cec34b7 100644
> --- a/test/MigrationTest/QemuMigrateMock.pm
> +++ b/test/MigrationTest/QemuMigrateMock.pm
> @@ -230,6 +230,15 @@ $MigrationTest::Shared::storage_module->mock(
> }
> return $res;
> },
> + volume_size_info => sub {
> + my ($scfg, $volid) = @_;
> + my ($storeid, $volname) = PVE::Storage::parse_volume_id($volid);
> +
> + for my $v ($source_vdisks->{$storeid}->@*) {
> + return wantarray ? ($v->{size}, $v->{format}, $v->{used}, $v->{parent}) : $v->{size}
> + if $v->{volid} eq $volid;
> + }
Please die if the expected volid was not among the ones in the list, so
that we get a clear error.
> + },
> vdisk_free => sub {
> my ($scfg, $volid) = @_;
>
More information about the pve-devel
mailing list