[pve-devel] VM locked after failed Snapshot
Stefan Priebe
s.priebe at profihost.ag
Mon Sep 8 21:30:27 CEST 2014
On 08.09.2014 17:01, Dietmar Maurer wrote:
>> I see multiple problems here.
>>
>> 1.) lock state should be removed by PVE in case of a failure.
>>
>> Currently snapshot_create calls snapshot_prepare to set the lock. And at the end
>> snapshot_commit deletes the lock.
>>
>> But currently in case of $err
>>
>>     if ($err) {
>>         warn "snapshot create failed: starting cleanup\n";
>>         eval { snapshot_delete($vmid, $snapname, 0, $drivehash); };
>>         warn $@ if $@;
>>         die $err;
>>     }
>>
>> The lock isn't removed.
>
> That is the intention of the lock. We cannot commit, and rollback also fails. So we
> keep the lock to indicate the need for operator intervention.
Hmm, but if a snapshot fails, it fails, and there is an error message. What
is the reason to keep the lock?
>> What is the correct way to remove a lock in this case?
>
> Manually edit the config. Not sure if there is a better way.
>>
>> 2.) In case of an unexpected failure or signal, ceph/rbd does not remove its
>> watcher from the image. So snapshot_delete fails in this case.
>>
>> Output:
>>
>> image has watchers - not removing
>> Removing image: 0% complete...failed.
>> rbd: error: image still has watchers
>>
>> rbd has an automatic timeout after 30s. Should PVE handle this by waiting 30s
>> and trying again?
>
> What are 'unexpected failures'? I think such things should be handled inside librbd?
In this case the rbd command had "received interrupt"? Maybe a restart
of the pveproxy or API daemon? It is handled inside librbd ;-) It's the
30s timeout ;-)
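
A rough sketch of what waiting and retrying could look like in the cleanup path. This is only an illustration: retry_with_delay is a hypothetical helper, not existing PVE code, and the choice of 2 tries and a 30s delay is an assumption based on the watcher timeout mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PVE): run $code and, if it dies,
# retry up to $tries times in total, sleeping $delay seconds in between.
# Re-throws the last error if every attempt fails.
sub retry_with_delay {
    my ($code, $tries, $delay) = @_;

    my $err;
    for my $attempt (1 .. $tries) {
        # eval returns 1 only if $code did not die
        my $ok = eval { $code->(); 1 };
        return 1 if $ok;

        $err = $@ || "unknown error\n";
        last if $attempt == $tries;

        warn "attempt $attempt failed: $err";
        sleep($delay) if $delay;
    }
    die $err;
}

# Possible use inside the existing cleanup block (assumed values):
#
#   eval {
#       retry_with_delay(
#           sub { snapshot_delete($vmid, $snapname, 0, $drivehash) },
#           2, 30);
#   };
#   warn $@ if $@;
```

If the second attempt also fails, the error is still reported and the lock would stay in place as it does today.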
Stefan