[pve-devel] [PATCH] rbd : free_image : retry if rbd has watchers

Alexandre DERUMIER aderumier at odiso.com
Mon Dec 8 05:07:14 CET 2014


>>We are still experiencing corruption when performing live disk moves, even with this patch in place. 

Are you talking about the patch "rbd : free_image : retry if rbd has watchers"?

We have not applied this patch, because what it addresses was not the problem itself but a consequence of it.

Have you tested the latest qemu-server package from the pvetest repository (>= 3.3-3)?

It should be fixed there.



----- Original message -----
From: "Andrew Thrift" <andrew at networklabs.co.nz>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Monday, 8 December 2014 04:42:23
Subject: Re: [pve-devel] [PATCH] rbd : free_image : retry if rbd has watchers

Hi Alexandre, 
We are still experiencing corruption when performing live disk moves, even with this patch in place. 

Is there anything we can do to help pinpoint the cause of this ? 



On Wed, Nov 12, 2014 at 7:39 PM, Alexandre DERUMIER < aderumier at odiso.com > wrote: 


Ok, Great! 

Thanks for testing. 

(@cc pve-devel) 

----- Forwarded message -----

From: "Andrew Thrift" < andrew at networklabs.co.nz >
To: "Alexandre DERUMIER" < aderumier at odiso.com >
Sent: Wednesday, 12 November 2014 05:10:40
Subject: Re: [pve-devel] [PATCH] rbd : free_image : retry if rbd has watchers


Hi Alexandre, 


Initial testing looks promising. 


I have tested migrating disks that have active writes on them and it worked well. All files had matching md5 sums. 


I will test with 4K writes tomorrow. 


On Fri, Nov 7, 2014 at 10:42 PM, Andrew Thrift < andrew at networklabs.co.nz > wrote: 



Thanks Alexandre, 


I will try these first thing Monday. 


Have a good weekend ! 








On Fri, Nov 7, 2014 at 10:29 PM, Alexandre DERUMIER < aderumier at odiso.com > wrote: 

<blockquote> 
>>Do you know why online Disk Moves could be causing this corruption? We have had to stop using it, because corrupting a customer's DB server would not be a good thing.... :(

Can you try the 2 patches I have sent? I think they should fix the problem.


----- Original message -----

From: "Andrew Thrift" < andrew at networklabs.co.nz >
To: "Alexandre DERUMIER" < aderumier at odiso.com >
Sent: Thursday, 6 November 2014 21:18:19
Subject: Re: [pve-devel] [PATCH] rbd : free_image : retry if rbd has watchers




Hi Alexandre,


This is not related specifically to this patch, but using Disk Move while the VM is online results in corruption for us almost every time we use it.


We are using PVE 3.3 with RBD storage. Typically we are moving from one RBD pool to another. We seem to get corruption whether the block copy completes or fails.


We are primarily running Windows guest OSes with virtio or virtio-scsi disks.


Our Ceph cluster has 84 spinning disks and 7x Intel S3700 journals. Networking to all devices is 2x 10 Gigabit bonded, and performance is generally very good.




Do you know why online Disk Moves could be causing this corruption? We have had to stop using it, because corrupting a customer's DB server would not be a good thing.... :(




On Fri, Nov 7, 2014 at 5:00 AM, Alexandre DERUMIER < aderumier at odiso.com > wrote: 


I'll resend a V2 tomorrow.
----- Original message -----

From: "Dietmar Maurer" < dietmar at proxmox.com >
To: "Alexandre DERUMIER" < aderumier at odiso.com >
Cc: pve-devel at pve.proxmox.com
Sent: Thursday, 6 November 2014 16:08:38
Subject: RE: [pve-devel] [PATCH] rbd : free_image : retry if rbd has watchers



> >>And what happens if we get other errors? 
> 
> Currently it's retrying until $i > ~0
>
> but we could add a die directly if $err !~ image still has watchers

Yes, I think that would be better. 
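
The behaviour agreed on here (keep retrying only while the error says the image still has watchers, and fail immediately on any other error) can be sketched roughly as below. This is an illustrative Python sketch, not the Perl patch discussed in this thread: `free_image_with_retry`, `remove_image`, `max_tries` and `delay` are hypothetical names, and the only detail taken from the thread is the "image still has watchers" error text.

```python
import time

# Error text quoted in the thread; everything else below is an assumption.
WATCHERS_MSG = "image still has watchers"


def free_image_with_retry(remove_image, max_tries=10, delay=1.0, sleep=time.sleep):
    """Call remove_image() until it succeeds, retrying only on watcher errors.

    remove_image is a hypothetical callable that raises RuntimeError on failure.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_tries + 1):
        try:
            remove_image()
            return attempt
        except RuntimeError as err:
            if WATCHERS_MSG not in str(err):
                raise  # any other error: die directly instead of retrying
            if attempt == max_tries:
                raise  # still watched after the last attempt: give up
            sleep(delay)  # wait for the watcher to go away, then retry
```

With a bounded `max_tries`, an image that never loses its watchers still ends in an error instead of looping indefinitely.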
_______________________________________________ 
pve-devel mailing list 
pve-devel at pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 




</blockquote> 





