[pve-devel] VM locked after failed Snapshot
Stefan Priebe - Profihost AG
s.priebe at profihost.ag
Tue Sep 9 10:34:15 CEST 2014
On 09.09.2014 at 02:11, Alexandre DERUMIER wrote:
>>> I'm still on 2.0 - did it also exist in 2.0?
>
> I'm not sure if that bug was present on 2.0.
>
> Just take one snapshot with vmstate, then a second snapshot with vmstate.
>
>
> (But it doesn't seem related, as you took a snapshot without memory state.)
At least it has a reference counting bug too, but a different one ;-) I
fixed that. I will not switch to QEMU 2.1 before 2.1.2; the first releases
had too many bugs.
Stefan
> ----- Original message -----
>
> From: "Stefan Priebe" <s.priebe at profihost.ag>
> To: "Alexandre DERUMIER" <aderumier at odiso.com>
> Cc: pve-devel at pve.proxmox.com
> Sent: Monday, 8 September 2014 21:32:04
> Subject: Re: [pve-devel] VM locked after failed Snapshot
>
> On 08.09.2014 at 17:18, Alexandre DERUMIER wrote:
>> Hi,
>>
>> my 2 cents, but could it be related to the vmstate bug?
>> https://git.proxmox.com/?p=pve-qemu-kvm.git;a=commit;h=62d638ff1e9fb96ca078be2225426aaac8f909f6
>
> I'm still on 2.0 - did it also exist in 2.0?
>
>> (Is it a VM snapshot with vmstate?)
> No.
>
> Stefan
>
>> ----- Original message -----
>>
>> From: "Stefan Priebe - Profihost AG" <s.priebe at profihost.ag>
>> To: pve-devel at pve.proxmox.com
>> Sent: Monday, 8 September 2014 12:06:48
>> Subject: [pve-devel] VM locked after failed Snapshot
>>
>> Hi,
>>
>> Today I had the following problem.
>>
>> 1.) I wanted to create a snapshot of a VM.
>> 2.) It failed for an unknown reason and I got the following output (PVE
>> web GUI):
>>
>> image has watchers - not removing
>> Removing image: 0% complete...failed.
>> rbd: error: image still has watchers
>> TASK ERROR: received interrupt
>>
>> 3.) The VM was then in a locked state (VM is locked (snapshot))
>>
>> I see multiple problems here.
>>
>> 1.) The lock state should be removed by PVE in case of a failure.
>>
>> Currently snapshot_create calls snapshot_prepare to set the lock, and at
>> the end snapshot_commit deletes the lock.
>>
>> But currently, in the $err case:
>>
>>     if ($err) {
>>         warn "snapshot create failed: starting cleanup\n";
>>         eval { snapshot_delete($vmid, $snapname, 0, $drivehash); };
>>         warn $@ if $@;
>>         die $err;
>>     }
>>
>> The lock isn't removed.
>>
>> What is the correct way to remove a lock in this case?
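
One possible shape for point 1.) is sketched below. This is only a minimal, self-contained illustration, not the actual PVE code: the config is an in-memory hash, snapshot_prepare, snapshot_commit and snapshot_delete are simplified stand-ins for the functions named above, and remove_snapshot_lock is a hypothetical helper that does not exist in PVE under that name.

```perl
use strict;
use warnings;

# In-memory stand-in for the VM configuration (assumption for this sketch;
# the real code works on the VM config file).
our %config = (100 => {});

# Simplified stand-in: set the 'snapshot' lock, refuse if already locked.
sub snapshot_prepare {
    my ($vmid, $snapname) = @_;
    die "VM is locked ($config{$vmid}{lock})\n" if $config{$vmid}{lock};
    $config{$vmid}{lock} = 'snapshot';
}

# Simplified stand-in: the success path removes the lock.
sub snapshot_commit {
    my ($vmid, $snapname) = @_;
    delete $config{$vmid}{lock};
}

# Simplified stand-in: simulate the failing cleanup from the report.
sub snapshot_delete {
    my ($vmid, $snapname) = @_;
    die "rbd: error: image still has watchers\n";
}

# Hypothetical helper: drop the lock, but only if it is the snapshot lock.
sub remove_snapshot_lock {
    my ($vmid) = @_;
    my $lock = $config{$vmid}{lock};
    delete $config{$vmid}{lock} if defined($lock) && $lock eq 'snapshot';
}

sub snapshot_create {
    my ($vmid, $snapname, $do_snapshot) = @_;

    snapshot_prepare($vmid, $snapname);

    eval { $do_snapshot->(); };
    my $err = $@;

    if ($err) {
        warn "snapshot create failed: starting cleanup\n";
        eval { snapshot_delete($vmid, $snapname); };
        warn $@ if $@;
        # Release the lock even if the cleanup itself failed.
        remove_snapshot_lock($vmid);
        die $err;
    }

    snapshot_commit($vmid, $snapname);
}
```

Whether the lock should really be dropped when the cleanup also fails is a design question: it leaves the VM usable, but possibly with a half-created snapshot that still has to be removed by hand.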
>>
>> 2.) In case of an unexpected failure or signal, ceph/rbd does not remove
>> its watcher from the image. So snapshot_delete failed in this case.
>>
>> Output:
>>
>> image has watchers - not removing
>> Removing image: 0% complete...failed.
>> rbd: error: image still has watchers
>>
>> rbd has an automatic timeout after 30s. Should PVE handle this by waiting
>> 30s and trying again?
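
The wait-and-retry idea from point 2.) could look roughly like this. It is a sketch under assumptions, not existing PVE code: retry_on_error and its options are hypothetical names, the 30s default is taken from the watcher timeout mentioned above, and the error pattern is taken from the rbd output quoted in this mail.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code; if it dies with a message matching
# retry_re, wait 'delay' seconds and try again, up to 'attempts' tries.
# Any other error, or the error of the last attempt, is rethrown.
sub retry_on_error {
    my ($code, %opts) = @_;
    my $attempts = $opts{attempts} // 2;
    my $delay    = $opts{delay}    // 30;   # watcher timeout from the mail
    my $retry_re = $opts{retry_re} // qr/image still has watchers/;
    my $sleep    = $opts{sleep}    // sub { sleep($_[0]) };

    for my $try (1 .. $attempts) {
        my $ok = eval { $code->(); 1 };
        return 1 if $ok;

        my $err = $@ || "unknown error\n";
        die $err if $try == $attempts || $err !~ $retry_re;

        warn "attempt $try failed, retrying in ${delay}s: $err";
        $sleep->($delay);
    }
}
```

In the cleanup path this would wrap the existing call, for example retry_on_error(sub { snapshot_delete($vmid, $snapname, 0, $drivehash) }). The sleep option only exists so the waiting can be replaced in tests.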
>>
>> Greets,
>> Stefan
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel at pve.proxmox.com
>> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>