[PVE-User] Losing quorum - cluster broken

Kurt Bauer kurt.bauer at univie.ac.at
Thu Apr 23 12:56:44 CEST 2015



Nicolas Costes wrote:
>>> Is your multicast working correctly?
>> How can I check ?

The easiest way I can think of:
1. Have a look at the joined multicast groups, e.g.:
root@vm1:~# netstat -g | grep vmbr14
vmbr14          1      239.192.32.212
vmbr14          1      all-systems.mcast.net
vmbr14          1      ff02::202%6423424
vmbr14          1      ff02::1:ff2b:3324%6423424
vmbr14          1      ip6-allnodes
vmbr14          1      ff01::1%6423424

If you see nothing like the first entry (the 239.192.32.212 line; your
group address will differ), the node has failed to join the multicast
group.
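
If you want to script that check, here is a minimal sketch. The function
name has_mcast_group is made up for illustration; it only parses the
three-column netstat -g output shown above (interface, reference count,
group) and assumes that layout.

```shell
# Hypothetical helper (not part of any tool): reads `netstat -g` output
# on stdin and succeeds if interface $1 has joined multicast group $2.
# Assumes the three-column layout: interface, refcount, group.
has_mcast_group() {
    awk -v ifc="$1" -v grp="$2" '
        $1 == ifc && $3 == grp { found = 1 }
        END { exit !found }
    '
}

# Example call, using the interface and group from the output above.
# -n keeps addresses numeric so the group is not shown as a hostname:
#   netstat -gn | has_mcast_group vmbr14 239.192.32.212 && echo joined
```

Replace the interface and group address with your own values.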

2. Then, on your other cluster members, watch for traffic to that group:
root@vm2:~# tcpdump -i vmbr14 host 239.192.32.212
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vmbr14, link-type EN10MB (Ethernet), capture size 65535 bytes
10:54:12.343763 IP vm1.5404 > 239.192.32.212.5405: UDP, length 119
10:54:12.579970 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 75
10:54:12.581167 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 1473
10:54:12.581184 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 1473
10:54:12.581190 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 1473
10:54:12.581196 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 1473
10:54:12.581201 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 338
10:54:12.583360 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 766
10:54:12.599887 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 1390
10:54:12.600991 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 1389
10:54:12.811843 IP vm1.5404 > 239.192.32.212.5405: UDP, length 119
10:54:12.838775 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 75
10:54:12.839954 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 1473
10:54:12.839970 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 848
10:54:12.841340 IP vm-iec.5404 > 239.192.32.212.5405: UDP, length 1161
10:54:13.052364 IP vm1.5404 > 239.192.32.212.5405: UDP, length 119
10:54:13.818986 IP vm21.5404 > 239.192.32.212.5405: UDP, length 75

You should see a lot of traffic from all your cluster members.
Of course, replace vmbr14 in the examples above with the interface your
cluster communication uses.
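
To see at a glance which members are actually sending, the capture can
be summarised per source. This is only a sketch: count_senders is a
made-up name, and it assumes the default one-line tcpdump format shown
above ("<time> IP <src>.<port> > <dst>.<port>: ...").

```shell
# Hypothetical helper: reads tcpdump lines on stdin and prints one line
# per sender with the number of packets seen from it.
count_senders() {
    awk '
        $2 == "IP" {
            src = $3
            sub(/\.[0-9]+$/, "", src)   # strip the trailing source port
            n[src]++
        }
        END { for (s in n) print s, n[s] }
    ' | LC_ALL=C sort
}

# Example call: capture 200 packets to the group, then summarise.
#   tcpdump -l -n -i vmbr14 -c 200 host 239.192.32.212 | count_senders
```

Every cluster member should show up in the list; a node that never
appears is not getting its multicast traffic through to this host.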

If the groups are joined but you see no traffic, check your firewall
settings.

Best regards,
Kurt


> 
> Ok, I found it, reading https://pve.proxmox.com/wiki/Multicast_notes . On both 
> nodes :
> 
> # omping -m 239.192.205.35 yin hongcha
> 
> 
> [...]
> yin :   unicast, seq=115, size=69 bytes, dist=0, time=0.385ms
> yin :   unicast, seq=116, size=69 bytes, dist=0, time=0.384ms
> yin :   unicast, seq=117, size=69 bytes, dist=0, time=0.390ms
> 
> yin :   unicast, xmt/rcv/%loss = 117/117/0%, min/avg/max/std-dev = 0.312/0.373/0.407/0.022
> yin : multicast, xmt/rcv/%loss = 117/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
> 
> So, no multicast, only unicast, right ?
> 
> 
> 


