[pve-devel] [PATCH qemu-server v2] Fix #2412: Missing VMs in pools
Thomas Lamprecht
t.lamprecht at proxmox.com
Tue Oct 15 11:33:53 CEST 2019
On 10/15/19 11:23 AM, Dominic Jäger wrote:
> On Mon, Oct 14, 2019 at 02:02:26PM +0200, Thomas Lamprecht wrote:
>> On 10/14/19 1:10 PM, Dominic Jäger wrote:
>>> Between calling vm_destroy and removing the ID from user.cfg (remove_vm_access)
>>> creating a new VM with this ID was possible. VMs could go missing from pools as
>>> a consequence.
>>>
>>> Adding a lock solves this for clones from the same node. Additionally,
>>> unlinking must happen at the very end of the deletion process to avoid that
>>> other nodes use the ID in the meanwhile.
>>>
>>> Co-developed-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
>>> Signed-off-by: Dominic Jäger <d.jaeger at proxmox.com>
>>> ---
>>> v1->v2: Move unlink to the end of the deletion process
>>>
>>> PVE/API2/Qemu.pm | 11 ++++++++---
>>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
>>> index 267a08e..f84aca7 100644
>>> --- a/PVE/API2/Qemu.pm
>>> +++ b/PVE/API2/Qemu.pm
>>> @@ -1492,9 +1492,14 @@ __PACKAGE__->register_method({
>>> my $upid = shift;
>>>
>>> syslog('info', "destroy VM $vmid: $upid\n");
>>> - PVE::QemuServer::vm_destroy($storecfg, $vmid, $skiplock);
>>
>> so skiplock is now a dead parameter? I mean that /could/ be OK, but the
>> commit never mentions anywhere why this is the case. No users, just kept
>> for backward compat? If so I'd at least adapt the params description, so
>> people reading man pages or CLI help output can have an idea that this
>> does nothing.
>> Or is it in use and still required to be handled? If so, address that.
>> Either way, just "silently" removing it is not OK.
>>
> Removing the parameter was an accident. I'll think about it and send a v3.
thanks
>
>>> - PVE::AccessControl::remove_vm_access($vmid);
>>> - PVE::Firewall::remove_vmfw_conf($vmid);
>>> + PVE::QemuConfig->lock_config($vmid, sub {
>>> + die "VM $vmid is running - destroy failed\n"
>>> + if (PVE::QemuServer::check_running($vmid));
>>> + PVE::QemuServer::destroy_vm($storecfg, $vmid, 1, 0);
>>> + PVE::AccessControl::remove_vm_access($vmid);
>>> + PVE::Firewall::remove_vmfw_conf($vmid);
>>> + unlink PVE::QemuConfig->config_file($vmid);
>>
>> unlink does not errors itself if the removal failed (e.g., pmxcfs went just
>> inquorate, so please do a
>> unlink ... or die "final removal of $vmid config failed: $!\n";
>>
>> to tell an user in such a case.
>
> Ok. Is there a reason why we don't do this in PVE::QemuServer::destroy_vm or
> should we actually add the error handling there, too?
No there's no reason, it should be there too.
Also destroy_vm should be modified to only do the $keep_empty_config thing after
the "remove unused" volumes (separate patch/series) and maybe cleanup the double
nested eval there too (also separate patch)..
More information about the pve-devel
mailing list