[PVE-User] HA migration behaviour vs. failures

Joel S. | VOZELIA joel at vozelia.com
Tue Jul 29 17:34:23 CEST 2014


Hi, 

Would you mind sharing that HA control script /usr/local/bin/bascule_rhcluster.pl?
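
In case it helps others reading the thread: judging only from the log quoted below, the script seems to run `clusvcadm -M <service> -m <node>` for each pvevm service, two at a time. Here is a rough Python sketch of that idea. It is not the actual Perl script, which has not been posted. The function names, the parsing of the clustat columns, and the assumption that clusvcadm exits non-zero when a migration fails are all mine.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def services_on(clustat_output, node):
    """Return the pvevm services that clustat lists as owned by `node`.

    Assumes the "name owner state" column layout seen in the quoted excerpt.
    """
    found = []
    for line in clustat_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].startswith("pvevm:") and fields[1] == node:
            found.append(fields[0])
    return found


def run_clusvcadm(service, target):
    """Run `clusvcadm -M <service> -m <target>`.

    Assumption: a zero exit status means the migration succeeded.
    """
    cmd = ["clusvcadm", "-M", service, "-m", target]
    return subprocess.run(cmd, capture_output=True).returncode == 0


def migrate_all(services, target, workers=2, migrate=run_clusvcadm):
    """Migrate services with at most `workers` migrations in flight.

    Returns a dict mapping each service to (succeeded, elapsed_seconds).
    """
    def one(service):
        start = time.monotonic()
        ok = migrate(service, target)
        return service, (ok, time.monotonic() - start)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(one, services))


def failed(results):
    """Return the sorted names of services whose migration did not succeed."""
    return sorted(name for name, (ok, _) in results.items() if not ok)
```

Calling `migrate_all` again on the names returned by `failed` would mirror the manual second run in the quoted log.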


Best regards, 
Joel.

----- Original Message -----
> From: "Dhaussy Alexandre" <ADhaussy at voyages-sncf.com>
> To: "Dietmar Maurer" <dietmar at proxmox.com>, pve-user at pve.proxmox.com
> Sent: Tuesday, July 29, 2014 5:02:39 PM
> Subject: Re: [PVE-User] HA migration behaviour vs. failures
> 
> On 29/07/2014 07:20, Dietmar Maurer wrote:
> > OK, I changed the behavior:
> >
> > https://git.proxmox.com/?p=qemu-server.git;a=commitdiff;h=debe88829e468928271c6d0baf6592b682a70c46
> > https://git.proxmox.com/?p=pve-manager.git;a=commitdiff;h=c0a008a8b3e1a4938b10cbd09f7be403ce17f1cb
> >
> > It would be great if you could test it.
> Thank you! Much appreciated!
> I just applied your patch and rebooted the 3 cluster nodes.
> 
> root@proxmoxt2:~# /usr/local/bin/bascule_rhcluster.pl proxmoxt1
> ===
> === Starting cluster switch to proxmoxt1 (2 threads)
> === Start time : 29/07/2014 - 15:31:27
> ===
> START (29/07/2014 - 15:31:27) : clusvcadm -M pvevm:101 -m proxmoxt1
> START (29/07/2014 - 15:31:27) : clusvcadm -M pvevm:102 -m proxmoxt1
> --> Trying to migrate pvevm:101 to proxmoxt1...Success (28 secs)
> START (29/07/2014 - 15:31:55) : clusvcadm -M pvevm:103 -m proxmoxt1
> --> Trying to migrate pvevm:103 to proxmoxt1...Failed; service running on original owner (9 secs)
> START (29/07/2014 - 15:32:04) : clusvcadm -M pvevm:104 -m proxmoxt1
> --> Trying to migrate pvevm:102 to proxmoxt1...Success (38 secs)
> START (29/07/2014 - 15:32:05) : clusvcadm -M pvevm:105 -m proxmoxt1
> --> Trying to migrate pvevm:105 to proxmoxt1...Failed; service running on original owner (6 secs)
> START (29/07/2014 - 15:32:11) : clusvcadm -M pvevm:106 -m proxmoxt1
> --> Trying to migrate pvevm:104 to proxmoxt1...Failed; service running on original owner (7 secs)
> START (29/07/2014 - 15:32:11) : clusvcadm -M pvevm:107 -m proxmoxt1
> --> Trying to migrate pvevm:106 to proxmoxt1...Failed; service running on original owner (7 secs)
> START (29/07/2014 - 15:32:18) : clusvcadm -M pvevm:108 -m proxmoxt1
> --> Trying to migrate pvevm:107 to proxmoxt1...Failed; service running on original owner (9 secs)
> START (29/07/2014 - 15:32:20) : clusvcadm -M pvevm:109 -m proxmoxt1
> --> Trying to migrate pvevm:108 to proxmoxt1...Success (25 secs)
> START (29/07/2014 - 15:32:43) : clusvcadm -M pvevm:111 -m proxmoxt1
> --> Trying to migrate pvevm:109 to proxmoxt1...Success (29 secs)
> START (29/07/2014 - 15:32:49) : clusvcadm -M pvevm:112 -m proxmoxt1
> --> Trying to migrate pvevm:111 to proxmoxt1...Success (50 secs)
> START (29/07/2014 - 15:33:33) : clusvcadm -M pvevm:113 -m proxmoxt1
> --> Trying to migrate pvevm:112 to proxmoxt1...Success (50 secs)
> START (29/07/2014 - 15:33:39) : clusvcadm -M pvevm:114 -m proxmoxt1
> --> Trying to migrate pvevm:113 to proxmoxt1...Success (48 secs)
> START (29/07/2014 - 15:34:21) : clusvcadm -M pvevm:115 -m proxmoxt1
> --> Trying to migrate pvevm:114 to proxmoxt1...Success (50 secs)
> --> Trying to migrate pvevm:115 to proxmoxt1...Success (34 secs)
> ===
> === End time : 29/07/2014 - 15:34:55
> ===
> 
> root@proxmoxt2:~# clustat | grep 'pvevm.*proxmoxt2'
>   pvevm:103 proxmoxt2                                   started
>   pvevm:104 proxmoxt2                                   started
>   pvevm:105 proxmoxt2                                   started
>   pvevm:106 proxmoxt2                                   started
>   pvevm:107 proxmoxt2                                   started
> 
> Let's try again...
> 
> root@proxmoxt2:~# /usr/local/bin/bascule_rhcluster.pl proxmoxt1
> ....
> --> Trying to migrate pvevm:103 to proxmoxt1...Success (80 secs)
> --> Trying to migrate pvevm:104 to proxmoxt1...Success (106 secs)
> --> Trying to migrate pvevm:105 to proxmoxt1...Success (29 secs)
> --> Trying to migrate pvevm:106 to proxmoxt1...Success (100 secs)
> --> Trying to migrate pvevm:107 to proxmoxt1...Success (107 secs)
> 
> root@proxmoxt2:~# clustat | grep 'pvevm.*proxmoxt2' | wc -l
> 0
> 
> Good. Let's try to live migrate back to the original node...
> 
> root@proxmoxt1:~# perl /usr/local/bin/bascule_rhcluster.pl proxmoxt2
> ....
> --> Trying to migrate pvevm:102 to proxmoxt2...Success (23 secs)
> --> Trying to migrate pvevm:101 to proxmoxt2...Success (23 secs)
> --> Trying to migrate pvevm:103 to proxmoxt2...Success (65 secs)
> --> Trying to migrate pvevm:104 to proxmoxt2...Success (73 secs)
> --> Trying to migrate pvevm:105 to proxmoxt2...Success (55 secs)
> --> Trying to migrate pvevm:106 to proxmoxt2...Failed; service running on original owner (49 secs)
> --> Trying to migrate pvevm:108 to proxmoxt2...Success (21 secs)
> --> Trying to migrate pvevm:109 to proxmoxt2...Success (21 secs)
> --> Trying to migrate pvevm:111 to proxmoxt2...Success (34 secs)
> --> Trying to migrate pvevm:107 to proxmoxt2...Success (97 secs)
> --> Trying to migrate pvevm:112 to proxmoxt2...Success (25 secs)
> --> Trying to migrate pvevm:113 to proxmoxt2...Success (32 secs)
> --> Trying to migrate pvevm:114 to proxmoxt2...Success (30 secs)
> --> Trying to migrate pvevm:115 to proxmoxt2...Success (21 secs)
> 
> root@proxmoxt1:~# perl /usr/local/bin/bascule_rhcluster.pl proxmoxt2
> --> Trying to migrate pvevm:106 to proxmoxt2...Success (50 secs)
> 
> OK! Still some random failures.
> I wonder if they could be related to some kind of latency induced by the
> ongoing self-healing... whatever.
> 
> On the other hand, no more downtime! Sweeet! :)


