[pve-devel] VM locked after failed Snapshot
Dietmar Maurer
dietmar at proxmox.com
Mon Sep 8 17:01:14 CEST 2014
> I see multiple problems here.
>
> 1.) lock state should be removed by PVE in case of a failure.
>
> Currently snapshot_create calls snapshot_prepare to set the lock. And at the end
> snapshot_commit deletes the lock.
>
> But currently in case of $err
>
> if ($err) {
>     warn "snapshot create failed: starting cleanup\n";
>     eval { snapshot_delete($vmid, $snapname, 0, $drivehash); };
>     warn $@ if $@;
>     die $err;
> }
>
> The lock isn't removed.
That is the intention of the lock. We cannot commit, and rollback also fails. So we
keep the lock to indicate the need for operator intervention.
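To make the flow concrete, here is a toy model of it. This is only a sketch: create_snapshot, the in-memory config hash and the two callbacks are hypothetical stand-ins, not the qemu-server code. It also assumes that a rollback which succeeds clears the lock itself; the point illustrated is only that the lock survives when the rollback dies.

```perl
use strict;
use warnings;

# Toy model of the flow discussed above. All names and the in-memory
# "config" hash are hypothetical stand-ins, not the qemu-server code.
sub create_snapshot {
    my ($conf, $do_snapshot, $do_rollback) = @_;

    $conf->{lock} = 'snapshot';            # role of snapshot_prepare
    eval { $do_snapshot->() };
    if (my $err = $@) {
        warn "snapshot create failed: starting cleanup\n";
        # Assumption: a successful rollback clears the lock itself.
        eval { $do_rollback->(); delete $conf->{lock}; };
        warn $@ if $@;
        die $err;                          # lock survives if rollback died
    }
    delete $conf->{lock};                  # role of snapshot_commit
}

my %conf;
eval {
    create_snapshot(\%conf,
        sub { die "snapshot failed\n" },
        sub { die "rbd: error: image still has watchers\n" });
};
print "lock after failed rollback: ", ($conf{lock} // 'none'), "\n";
```

With both the snapshot and the rollback failing, the lock entry is still set afterwards, which is the signal to the operator.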
> What is the correct way to remove a lock in this case?
Manually edit the config. Not sure if there is a better way.
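As an illustration of what the manual edit amounts to, here is a minimal sketch that drops the lock entry from config text. It assumes the lock appears as a top-level "lock:" line in the VM config, before any "[snapshot]" section; strip_lock_line is a hypothetical helper, and it works on a string only, so it never touches a real config file.

```perl
use strict;
use warnings;

# Hypothetical helper: drop a top-level "lock:" line from config text.
# Only the first section (before any "[name]" header) is touched.
sub strip_lock_line {
    my ($text) = @_;
    my @out;
    my $in_main = 1;
    for my $line (split /^/m, $text) {
        $in_main = 0 if $line =~ /^\[/;          # a snapshot section starts
        next if $in_main && $line =~ /^lock:\s/; # skip the lock entry
        push @out, $line;
    }
    return join '', @out;
}

# Made-up sample config, for illustration only.
my $conf = "bootdisk: virtio0\nlock: snapshot\nmemory: 2048\n\n[snap1]\nmemory: 2048\n";
print strip_lock_line($conf);
```

Any leftover entries from the failed snapshot would still need to be checked by hand.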
>
> 2.) in case of an unexpected failure or signal ceph/rbd does not remove its
> watcher from the image. So the snapshot_delete fails in this case.
>
> Output:
>
> image has watchers - not removing
> Removing image: 0% complete...failed.
> rbd: error: image still has watchers
>
> rbd has an automatic timeout after 30s. Should PVE handle this by waiting 30s
> and trying again?
What are 'unexpected failures'? I think such things should be handled inside librbd?
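For reference, the wait-and-retry idea from the question could look roughly like the sketch below. This only illustrates the proposal and is not something PVE does; retry is a hypothetical helper, and the failing code ref stands in for snapshot_delete hitting a stale watcher.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, retrying up to $tries times and
# sleeping $delay seconds between attempts. Rethrows the last error.
sub retry {
    my ($code, $tries, $delay) = @_;
    my $err;
    for my $attempt (1 .. $tries) {
        my $ok = eval { $code->(); 1 };
        return $attempt if $ok;
        $err = $@ || "unknown error\n";
        warn "attempt $attempt failed: $err";
        sleep($delay) if $attempt < $tries;
    }
    die $err;
}

# Stand-in for snapshot_delete: fails once, as if a watcher were still registered.
my $calls = 0;
my $fake_delete = sub {
    die "rbd: error: image still has watchers\n" if ++$calls < 2;
};

# Delay is 0 here to keep the example fast; the question suggests about 30.
my $used = retry($fake_delete, 2, 0);
print "succeeded after $used attempt(s)\n";
```

Whether the retry belongs in PVE or inside librbd is exactly the open question here.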