[pve-devel] [PATCH v5 qemu-server 14/19] Update volume IDs in one go

Fabian Grünbichler f.gruenbichler at proxmox.com
Wed Apr 15 11:50:46 CEST 2020


On April 15, 2020 10:04 am, Fabian Ebner wrote:
> On 14.04.20 12:41, Fabian Ebner wrote:
>> On 09.04.20 09:53, Fabian Grünbichler wrote:
>>> small nit inline
>>>
>>> On April 8, 2020 11:25 am, Fabian Ebner wrote:
>>>> Use 'update_volume_ids' for the live-migrated disks as well.
>>>>
>>>> Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
>>>> ---
>>>>   PVE/QemuMigrate.pm | 26 ++++++++++----------------
>>>>   1 file changed, 10 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
>>>> index 14f3fcb..eb78537 100644
>>>> --- a/PVE/QemuMigrate.pm
>>>> +++ b/PVE/QemuMigrate.pm
>>>> @@ -838,14 +838,20 @@ sub phase2 {
>>>>           my $source_drive = PVE::QemuServer::parse_drive($drive, $conf->{$drive});
>>>>           my $target_drive = PVE::QemuServer::parse_drive($drive, $target->{drivestr});
>>>> -        my $source_sid = PVE::Storage::Plugin::parse_volume_id($source_drive->{file});
>>>> -        my $target_sid = PVE::Storage::Plugin::parse_volume_id($target_drive->{file});
>>>> +        my $source_volid = $source_drive->{file};
>>>> +        my $target_volid = $target_drive->{file};
>>>> +
>>>> +        my $source_sid = PVE::Storage::Plugin::parse_volume_id($source_volid);
>>>> +        my $target_sid = PVE::Storage::Plugin::parse_volume_id($target_volid);
>>>>           my $bwlimit = PVE::Storage::get_bandwidth_limit('migration', [$source_sid, $target_sid], $opt_bwlimit);
>>>>           my $bitmap = $target->{bitmap};
>>>>           $self->log('info', "$drive: start migration to $nbd_uri");
>>>>           PVE::QemuServer::qemu_drive_mirror($vmid, $drive, $nbd_uri, $vmid, undef, $self->{storage_migration_jobs}, 'skip', undef, $bwlimit, $bitmap);
>>>> +
>>>> +        $self->{volume_map}->{$source_volid} = $target_volid;
>>>> +        $self->log('info', "volume '$source_volid' is '$target_volid' on the target\n");
>>>
>>> potential followup: if we do that before attempting to start the mirror,
>>> which might die, it would make the cleanup handling easier.
>> 
>> You're right. The reason I decided to do it here was to avoid code 
>> duplication, because of the distinction between unix socket/no unix 
>> socket above. Otherwise the drivestrings would need to be parsed just 
>> for this.
>> 
>> I guess instead of saving the drivestring in the target_drive hash, we 
>> only need to save the volume ID. The other information in the 
>> drivestring is not used anymore, because the old settings from the 
>> config are copied over and just the volume ID is updated.
>> 
>> This makes me realize a potential bug with only updating the volume IDs. 
>> If the newly allocated volume has a different format than the original, 
>> but 'format' was explicitly set for the volume in the old config, the 
>> config will be wrong after migration. AFAICT this currently does not 
>> happen, but it might change in the future. Also the size might be 
>> slightly off (e.g. the image was a raw file, but the newly allocated one 
>> is bigger because it's an LVM volume and uses extents).
>> 
> 
> Just tested around a bit. For online migration this does happen, because 
> the format can change when e.g. going from a directory storage with a 
> qcow2 image to an LVM storage.

yes, sorry for missing it during review.

> 
>> For offline migration we (currently) don't have a way to send back the 
>> drivestring, and in general it's desirable to preserve options for a 
>> drive. Does it make sense to add a method to be executed after 
>> migration, which would check/fix such errors, or is doing a rescan enough?
>> 
> 
> Now I think it might be enough to extend rescan to also check+fix the 
> format (the needed information is already present) and call it after the 
> migration.
> Another way would be to just delete any existing 'format' entries from 
> the volume hashes in update_volume_ids. Or is there any case where the 
> format isn't encoded in the volid (of course taking into account the 
> storage type) and the 'format=...' part of the property string is 
> actually needed?

I *think* the format=... part is only needed for the shorthand

qm set/create XXX -scsi0 STORAGE:SIZE

syntax, where you can use STORAGE:SIZE,format=FORMAT to select a format 
if the storage supports multiple ones.

at migration time no such entries should exist, so rescanning or 
dropping the format property should be fine. there are lots of hits for 
'{format}' in qemu-server though, so those need to be checked ;)
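
as a rough, untested sketch of the "check+fix the format after migration" 
idea (loop and placement are hypothetical; it assumes 
PVE::Storage::parse_volname() returns the format as its last element, and 
uses the one-argument print_drive() as in the hunk above):

my $storecfg = PVE::Storage::config();

foreach my $key (keys %$conf) {
    next if !PVE::QemuServer::Drive::is_valid_drivename($key);

    my $drive = PVE::QemuServer::parse_drive($key, $conf->{$key});
    next if !$drive || !$drive->{format};
    next if $drive->{media} && $drive->{media} eq 'cdrom';

    # re-derive the actual format from the (possibly new) volid and
    # overwrite a stale, explicitly set 'format=...' property
    my $actual_format = (PVE::Storage::parse_volname($storecfg, $drive->{file}))[6];
    if ($actual_format && $actual_format ne $drive->{format}) {
	$drive->{format} = $actual_format;
	$conf->{$key} = PVE::QemuServer::print_drive($drive);
    }
}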

alternatively/as a quick fix, we could also go back to using what the 
target node provided us, and skip those keys when passing $conf to 
update_volume_ids. for snapshots this does not really matter, since they 
can't change format and can be updated either way. for disks only used 
in snapshots, it does not matter either because they don't get migrated 
via NBD.

e.g., at line 1148:

if ($self->{volume_map}) {
  # drop the NBD-migrated drives first, so update_volume_ids only touches
  # the remaining references (snapshots, unused volumes, ...)
  foreach my $target_drive (keys %{$self->{target_drive}}) {
    delete $conf->{$target_drive};
  }
  PVE::QemuConfig->update_volume_ids($conf, $self->{volume_map});
  # then put back the drivestr the target node provided, including its format
  foreach my $target_drive (keys %{$self->{target_drive}}) {
    $conf->{$target_drive} = $self->{target_drive}->{$target_drive}->{drivestr};
  }
  PVE::QemuConfig->write_config($vmid, $conf);
}

> 
>>> even better if we would
>>> allocate each disk individually before calling start - then we'd also
>>> avoid the problem of 'third of five disks failed to allocate, but we
>>> haven't told you about the first two yet' ;) but for now, just moving
>>> these two lines two lines up would simplify the last patch of this 
>>> series.
>>>
>>>>       }
>>>>       }
>>>> @@ -1114,13 +1120,6 @@ sub phase3_cleanup {
>>>>       my $tunnel = $self->{tunnel};
>>>> -    # Needs to happen before printing the drives that where migrated via qemu,
>>>> -    # since a new volume on the target might have the same ID as an old volume
>>>> -    # on the source and update_volume_ids relies on matching IDs in the config.
>>>> -    if ($self->{volume_map}) {
>>>> -    PVE::QemuConfig->update_volume_ids($conf, $self->{volume_map});
>>>> -    }
>>>> -
>>>>       if ($self->{storage_migration}) {
>>>>       # finish block-job with block-job-cancel, to disconnect source VM from NBD
>>>>       # to avoid it trying to re-establish it. We are in blockjob ready state,
>>>> @@ -1131,16 +1130,11 @@ sub phase3_cleanup {
>>>>           eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $self->{storage_migration_jobs}) };
>>>>           eval { PVE::QemuMigrate::cleanup_remotedisks($self) };
>>>>           die "Failed to complete storage migration: $err\n";
>>>> -    } else {
>>>> -        foreach my $target_drive (keys %{$self->{target_drive}}) {
>>>> -        my $drive = PVE::QemuServer::parse_drive($target_drive, $self->{target_drive}->{$target_drive}->{drivestr});
>>>> -        $conf->{$target_drive} = PVE::QemuServer::print_drive($drive);
>>>> -        }
>>>>       }
>>>>       }
>>>> -    # only write after successfully completing the migration
>>>> -    if ($self->{volume_map} || $self->{storage_migration}) {
>>>> +    if ($self->{volume_map}) {
>>>> +    PVE::QemuConfig->update_volume_ids($conf, $self->{volume_map});
>>>>       PVE::QemuConfig->write_config($vmid, $conf);
>>>>       }
>>>> -- 
>>>> 2.20.1