[pve-devel] [PATCH container] fix #3030: activate volumes at the right time for restart migration
Fabian Ebner
f.ebner at proxmox.com
Thu Oct 29 09:26:00 CET 2020
On 28.10.20 at 14:15, Fabian Grünbichler wrote:
> On October 15, 2020 12:24 pm, Fabian Ebner wrote:
>> The lxc-pve-poststop-hook deactivates volumes when a container is stopped.
>> To make sure that volumes are active when using the restart mode,
>> move activate_volumes to after the conditional vm_stop. The lxc-stop command
>> used in vm_stop waits for the hook script to complete, so there is no race.
>>
>> Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
>> ---
>>
>> For VMs we don't have restart migration, so no similar bug there.
>>
>> An alternative would be to communicate to the hook script to
>> not deactivate the volumes. That would mean writing the lock=migrate
>> to the config earlier (currently it's being set in phase1) and
>> then checking for the lock in the hookscript.
>
> isn't this still wrong, as it only activates the volumes directly
> referenced by the config, but we storage migrate unused (referenced and
> unreferenced) and snapshot volumes as well? wouldn't it make more sense
> that storage_migrate ensures the passed-in volid is activated before
> accessing it? and then before switching the container over, we ensure
> all volids we passed to storage_migrate get deactivated.. the others
> were already deactivated by the container shutting down anyway.
>
You're right, and for volumes not referenced in the config, the QEMU
code is affected by the same issue.

Should we still keep the current activate_volumes call in prepare? I
imagine one reason it's there is so that we die early if activation
fails. If we move activation into storage_migrate, we'd lose that. But I
can check that the volumes are not required to be active anywhere else
during migration and remove the call in prepare if that's preferred.

Alternatively, we could move the activate_volumes call to just before
the loop where we repeatedly call storage_migrate.

In the long run, we might want to rework storage_migrate a bit, having
it take a list of volumes instead and separating the checks and
activation from the transfer itself. The goal would be to die early for
problems like "target storage doesn't support format" or "activation
failed".
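
To make the long-run idea a bit more concrete, here is a rough,
self-contained sketch of the split between checks+activation and
transfer. Everything in it is hypothetical: check_volume,
activate_volume, deactivate_volume, transfer_volume, prepare_volumes and
migrate_volumes are stand-ins written for this mail, not existing
PVE::Storage functions, and the volume list format (hashes with volid
and format) is an assumption as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins so the sketch runs without a PVE installation.
# None of these are real PVE::Storage functions.
my %active;    # volid => 1 while the volume counts as "activated"

sub check_volume {
    my ($vol, $target_formats) = @_;
    die "target storage doesn't support format '$vol->{format}'\n"
        if !$target_formats->{ $vol->{format} };
}

sub activate_volume   { my ($vol) = @_; $active{ $vol->{volid} } = 1; }
sub deactivate_volume { my ($vol) = @_; delete $active{ $vol->{volid} }; }

sub transfer_volume {
    my ($vol) = @_;
    die "volume '$vol->{volid}' is not active\n" if !$active{ $vol->{volid} };
    return "transferred $vol->{volid}";
}

# Phase 1: run all checks first, then activate, so that a bad volume
# makes us die before anything has been activated or transferred.
sub prepare_volumes {
    my ($vols, $target_formats) = @_;
    check_volume($_, $target_formats) for @$vols;
    activate_volume($_) for @$vols;
}

# Phase 2: transfer every volume, then deactivate all of them again,
# even if a transfer failed, before re-raising the error.
sub migrate_volumes {
    my ($vols, $target_formats) = @_;
    prepare_volumes($vols, $target_formats);

    my @log;
    eval { push @log, transfer_volume($_) for @$vols; };
    my $err = $@;

    deactivate_volume($_) for @$vols;
    die $err if $err;
    return \@log;
}
```

A caller would pass the full list, including unused and snapshot
volumes, e.g. migrate_volumes([{ volid => 'local:100/vm-100-disk-0.raw',
format => 'raw' }], { raw => 1 }), and get the early failure for free
when a format is not supported on the target.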
>>
>> src/PVE/LXC/Migrate.pm | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/PVE/LXC/Migrate.pm b/src/PVE/LXC/Migrate.pm
>> index 90d74b4..5ef16d2 100644
>> --- a/src/PVE/LXC/Migrate.pm
>> +++ b/src/PVE/LXC/Migrate.pm
>> @@ -90,8 +90,6 @@ sub prepare {
>>
>> });
>>
>> - PVE::Storage::activate_volumes($self->{storecfg}, $need_activate);
>> -
>> # todo: test if VM uses local resources
>>
>> # test ssh connection
>> @@ -110,6 +108,8 @@ sub prepare {
>> $running = 0;
>> }
>>
>> + PVE::Storage::activate_volumes($self->{storecfg}, $need_activate);
>> +
>> return $running;
>> }
>>
>> --
>> 2.20.1
>>
>>
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel at lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>
>>
>>