[pve-devel] Consistency in volume deletion in process of concurrent VM deletion
Fabian Grünbichler
f.gruenbichler at proxmox.com
Wed Oct 22 11:49:30 CEST 2025
On October 21, 2025 5:33 pm, Andrei Perepiolkin via pve-devel wrote:
> Hi Proxmox Community,
>
> There may be a consistency problem with Proxmox VM deletion.
>
> If Proxmox receives multiple concurrent VM deletion requests, where each
> VM has multiple disks located on shared storage, the deletion process
> may fail or hang when attempting to acquire the storage lock
> (https://github.com/proxmox/pve-storage/blob/master/src/PVE/Storage.pm#L1196C1-L1209C7).
>
> ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
> ...
>
> Eventually, the VM configuration files in /etc/pve are removed, but some
> VM disks may remain.
>
> Additionally, the Web UI shows all deletions as successful, even though
> some disks were not deleted.
>
> In my opinion, a VM should either be deleted completely—including all
> dependent resources—or the deletion should fail, leaving the VM
> configuration file with an updated state.
the underlying issue is that the scope of the lock taken for certain
storage operations is very big for shared storages. we could probably
reduce it to a more meaningful level for most such storages:
https://bugzilla.proxmox.com/show_bug.cgi?id=1962
but the error handling might also be lacking in this case, I would
have to double-check.
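
To illustrate what a narrower lock scope would change, here is a minimal
sketch. It is not how PVE implements cfs locks; it uses atomic mkdir as a
stand-in lock primitive. The 'storage-<storeid>' key mirrors the lock name
in the log above, while the per-volume key, the volume names and LOCKDIR are
assumptions made up for this example.

```shell
# Hypothetical lock directory for this sketch only (PVE uses cfs locks).
LOCKDIR=$(mktemp -d)

# Build the lock key for an operation on one volume.
#   storage scope: one key per storage, like 'storage-jdss-Pool-2' in the log
#   volume scope:  assumed finer-grained key, one per volume
lock_key() {
    scope=$1
    storeid=$2
    volname=$3
    case $scope in
        storage) printf 'storage-%s\n' "$storeid" ;;
        volume)  printf 'storage-%s-%s\n' "$storeid" "$volname" ;;
        *)       return 1 ;;
    esac
}

# Non-blocking acquire: mkdir is atomic and fails if the key already exists.
try_lock() {
    key=$(lock_key "$@") || return 1
    mkdir "$LOCKDIR/$key" 2>/dev/null
}

# Release a lock taken with try_lock.
unlock() {
    key=$(lock_key "$@") || return 1
    rmdir "$LOCKDIR/$key"
}
```

With the storage scope, "try_lock storage jdss-Pool-2 vm-402-disk-0" fails
while another deletion holds the lock for vm-401-disk-0, because both map to
the same key. With the volume scope the two requests use different keys and
do not contend. Operations that really need a storage-wide view would still
have to take the wider lock.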
>
> I'm reproducing this with:
>
> for i in `seq 401 420` ; do qm clone 104 $i --name "win-$i" --full \
>     --storage jdss-Pool-2 ; done;
>
> for i in `seq 401 410` ; do qm destroy $i \
>     --destroy-unreferenced-disks 1 --purge 1 & done ;
>
>
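
Since the report says the UI shows success even when disks remain, it may
help to check the storage directly once the background jobs from the loops
above have finished. A minimal sketch, assuming the storage ID from the
report and that volume IDs are printed in the first column of 'pvesm list'
in the usual <storage>:<volname> form. The function names are made up for
this example.

```shell
# Storage ID taken from the report; override via the environment if needed.
STORAGE=${STORAGE:-jdss-Pool-2}

# List volumes on the storage owned by one VMID.
# Kept as a separate wrapper so it can be replaced when testing.
list_volumes() {
    pvesm list "$STORAGE" --vmid "$1"
}

# Print leftover volume IDs for the given VMIDs.
# Returns non-zero if any volume is still present.
check_leftovers() {
    found=0
    for vmid in "$@"; do
        # Keep only lines that start with "<storage>:", which skips any header.
        leftovers=$(list_volumes "$vmid" |
            awk -v p="$STORAGE:" 'index($0, p) == 1 { print $1 }')
        if [ -n "$leftovers" ]; then
            echo "VM $vmid: leftover volumes:"
            echo "$leftovers"
            found=1
        fi
    done
    [ "$found" -eq 0 ]
}
```

For the reproduction above this could be run as "check_leftovers `seq 401 410`"
after a bare "wait" has collected the background jobs.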
> I should note that the SSH session I use to run the 'qm destroy'
> commands gets terminated by Proxmox.
that seems unexpected, are you sure this is caused by PVE?
> I've also filed this as a bug at:
> https://bugzilla.proxmox.com/show_bug.cgi?id=6957
it would be better to either send a mail or file a bug, not both, to
avoid splitting the discussion.
> Is this a bug and will it be addressed in near future?
nobody picked up the work regarding the lock granularity, but it would
be a nice improvement IMHO!
Fabian