[pve-devel] [PATCH container] fix #3030: activate volumes at the right time for restart migration

Fabian Ebner f.ebner at proxmox.com
Thu Oct 29 09:26:00 CET 2020


On 28.10.20 at 14:15, Fabian Grünbichler wrote:
> On October 15, 2020 12:24 pm, Fabian Ebner wrote:
>> The lxc-pve-poststop-hook deactivates volumes when a container is stopped.
>> To make sure that volumes are active when using the restart mode,
>> move activate_volumes to after the conditional vm_stop. The lxc-stop command
>> used in vm_stop waits for the hook script to complete, so there is no race.
>>
>> Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
>> ---
>>
>> For VMs we don't have restart migration, so no similar bug there.
>>
>> An alternative would be to communicate to the hook script to
>> not deactivate the volumes. That would mean writing the lock=migrate
>> to the config earlier (currently it's being set in phase1) and
>> then checking for the lock in the hookscript.
> 
> isn't this still wrong, as it only activates the volumes directly
> referenced by the config, but we storage migrate unused (referenced and
> unreferenced) and snapshot volumes as well? wouldn't it make more sense
> that storage_migrate ensures the passed-in volid is activated before
> accessing it? and then before switching the container over, we ensure
> all volids we passed to storage_migrate get deactivated.. the others
> were already deactivated by the container shutting down anyway.
> 

You're right, and with volumes not referenced in the config, the QEMU 
code is also affected by the issue.

Should we still keep the current activate_volumes call in prepare? I
imagine one reason it's there is so that we die early if activation
fails. If we move activation into storage_migrate, we'd lose that. But
I can check that the volumes are not required to be active anywhere
else during migration, and remove the call in prepare if that's
preferred.
Alternatively, we could move the activate_volumes call to just before
the loop where we repeatedly call storage_migrate.
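
A rough sketch of that last ordering, only to illustrate the idea. All
subs here are hypothetical stand-ins that model activation state in a
hash; they are not the real PVE::Storage or PVE::LXC functions, and the
split into "config volumes" and "volumes to migrate" is an assumption
based on the discussion above:

```perl
use strict;
use warnings;

# All subs below are hypothetical stand-ins, not the real PVE::Storage or
# PVE::LXC functions; %active just models which volumes are active.
my %active;

sub activate_volumes_stub {
    my ($volids) = @_;
    $active{$_} = 1 for @$volids;
}

# Models vm_stop: the poststop hook deactivates the volumes referenced
# by the config before the stop call returns.
sub stop_container_stub {
    my ($config_volids) = @_;
    delete @active{@$config_volids};
}

# Models a transfer that needs the volume to be active.
sub storage_migrate_stub {
    my ($volid) = @_;
    die "volume '$volid' is not active\n" if !$active{$volid};
    return "migrated $volid";
}

sub migrate_restart_mode {
    my ($config_volids, $migrate_volids) = @_;

    stop_container_stub($config_volids);

    # Activate everything that is about to be transferred, right before
    # the loop, so unused and snapshot volumes are covered as well.
    activate_volumes_stub($migrate_volids);

    my @done;
    push @done, storage_migrate_stub($_) for @$migrate_volids;
    return \@done;
}
```

With this ordering, the stop can no longer undo the activation, and
volumes that are not referenced by the config get activated too.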

In the long run, we might want to rework storage_migrate a bit, having
it take a list of volumes instead, and separating the checks and
activation from the transfer itself, with the goal of being able to die
early for problems like "target storage doesn't support format" or
"activation failed".
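
To make the idea more concrete, here is a minimal sketch of such a
split. The sub names, the volume hashes, and the $target_formats
argument are all made up for illustration and do not reflect the
current storage_migrate interface:

```perl
use strict;
use warnings;

# Hypothetical stand-ins; none of these are existing PVE::Storage functions.
sub check_volume {
    my ($vol, $target_formats) = @_;
    die "target storage doesn't support format '$vol->{format}'\n"
        if !$target_formats->{$vol->{format}};
}

sub activate_volume {
    my ($vol, $log) = @_;
    die "activation failed for '$vol->{volid}'\n" if $vol->{broken};
    push @$log, "activate $vol->{volid}";
}

sub transfer_volume {
    my ($vol, $log) = @_;
    push @$log, "transfer $vol->{volid}";
}

sub migrate_volumes {
    my ($volumes, $target_formats, $log) = @_;

    # Phase 1: run all checks, then activate all volumes. Any failure
    # here happens before a single byte has been transferred.
    check_volume($_, $target_formats) for @$volumes;
    activate_volume($_, $log) for @$volumes;

    # Phase 2: the actual transfers.
    transfer_volume($_, $log) for @$volumes;
}
```

Calling migrate_volumes with one volume in an unsupported format would
then die in the first phase, with nothing activated or transferred yet.
Deactivating already-activated volumes when a later activation fails is
left out of the sketch.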

>>
>>   src/PVE/LXC/Migrate.pm | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/PVE/LXC/Migrate.pm b/src/PVE/LXC/Migrate.pm
>> index 90d74b4..5ef16d2 100644
>> --- a/src/PVE/LXC/Migrate.pm
>> +++ b/src/PVE/LXC/Migrate.pm
>> @@ -90,8 +90,6 @@ sub prepare {
>>   
>>       });
>>   
>> -    PVE::Storage::activate_volumes($self->{storecfg}, $need_activate);
>> -
>>       # todo: test if VM uses local resources
>>   
>>       # test ssh connection
>> @@ -110,6 +108,8 @@ sub prepare {
>>   	$running = 0;
>>       }
>>   
>> +    PVE::Storage::activate_volumes($self->{storecfg}, $need_activate);
>> +
>>       return $running;
>>   }
>>   
>> -- 
>> 2.20.1
>>
>>
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel at lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>
>>
>>