[pve-devel] Quorum problems with NICs Intel of 10 Gb/s and VMs turn off

Alexandre DERUMIER aderumier at odiso.com
Sat Jan 3 16:31:11 CET 2015

Alexandre Derumier
Systems and Storage Engineer

Phone: 03 20 68 90 88
Fax: 03 20 68 90 81

45 Bvd du Général Leclerc 59100 Roubaix
12 rue Marivaux 75002 Paris

MonSiteEstLent.com - Blog dedicated to web performance and traffic spike management


From: "Cesar Peschiera" <brain at click.com.py>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Saturday, January 3, 2015 03:41:20
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and VMs turn off

Hi Alexandre 

Many thanks for your reply, which is much appreciated. 

Unfortunately, your suggestion does not work for me, so I will describe the
results.

Along with some comments, this message also contains 7 questions for you, and
I will be very grateful if you can answer them.

Just to be clear about the versions of the software installed on the nodes
that show the strange behaviour (2 of my 6 PVE nodes):
shell> pveversion -v 
proxmox-ve-2.6.32: 3.3-139 (running kernel: 3.10.0-5-pve) 
pve-manager: 3.3-5 (running version: 3.3-5/bfebec03) 
pve-kernel-3.10.0-5-pve: 3.10.0-19 
pve-kernel-2.6.32-34-pve: 2.6.32-139 
lvm2: 2.02.98-pve4 
clvm: 2.02.98-pve4 
corosync-pve: 1.4.7-1 
openais-pve: 1.1.4-3 
libqb0: 0.11.1-2 
redhat-cluster-pve: 3.2.0-2 
resource-agents-pve: 3.9.2-4 
fence-agents-pve: 4.0.10-1 
pve-cluster: 3.0-15 
qemu-server: 3.3-5 <------ special patch created by Alexandre for me
pve-firmware: 1.1-3 
libpve-common-perl: 3.0-19 
libpve-access-control: 3.0-15 
libpve-storage-perl: 3.0-25 
pve-libspice-server1: 0.12.4-3 
vncterm: 1.1-8 
vzctl: 4.0-1pve6 
vzprocps: 2.0.11-2 
vzquota: 3.1-2 
pve-qemu-kvm: 2.2-2 <------ special patch created by Alexandre for me
ksm-control-daemon: 1.1-1 
glusterfs-client: 3.5.2-1 

About a minute after applying the following commands on a single node (pve6),
I lost quorum on two nodes (pve5 and pve6).
The commands, executed only on the pve6 node, were:
echo 1 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping 
echo 0 > /sys/class/net/vmbr0/bridge/multicast_querier 
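
(Note that these values are the opposite of what I normally set via post-up in my
/etc/network/interfaces, shown at the bottom of this thread; to go back to my usual
state I would run:)
echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
echo 1 > /sys/class/net/vmbr0/bridge/multicast_querier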

The error message on the node where I applied the commands (pve6) is this:
Message from syslogd at pve6 at Jan 2 20:58:32 ... 
rgmanager[4912]: #1: Quorum Dissolved 

And as a side effect, since the pve5 node is configured with HA for a VM with a
"failover domain" between the pve5 and pve6 nodes, pve5 also lost quorum and the
VM that is under HA was turned off abruptly.

These are the error messages on the screen of the pve5 node:
[ 61.246002] dlm: rgmanager: send_repeat_remove dir 6 rg="pvevm:112" 
[119373.380111] dlm: closing connection to node 1
[119373.300150] dlm: closing connection to node 2
[119373.380182] dlm: closing connection to node 3
[119373.300205] dlm: closing connection to node 4
[119373.380229] dlm: closing connection to node 6
[119373.300268] dlm: closing connection to node 7
[119373.380319] dlm: closing connection to node 8
[119545.042242] dlm: closing connection to node 3
[119545.042264] dlm: closing connection to node 8
[119545.042281] dlm: closing connection to node 7
[119545.042300] dlm: closing connection to node 2
[119545.042316] dlm: closing connection to node 1
[119545.042331] dlm: closing connection to node 4
[119545.042347] dlm: closing connection to node 5
[119545.042891] dlm: dlm user daemon left 1 lockspaces

So I believe that PVE has a bug and a serious problem, although I am not sure of
that. What I do know is that if the pve6 node turns off abruptly for some reason,
the pve5 node will lose quorum and its HA VM will also turn off, and this
behaviour will give me several problems, because at the moment I do not know what
I must do to start the VM on the node that is still alive.
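
(I imagine that, once quorum is recovered, the manual recovery would be something
like the commands below, using rgmanager's clustat and clusvcadm, but I am not
sure this is correct, so please confirm:)

clustat                          # check the state of the cluster and HA services
clusvcadm -e pvevm:112 -m pve5   # enable/start the HA service of VM 112 on the pve5 node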

So my questions are:
1) Why did the pve5 node lose quorum if I did not apply any change on this node?
(this node always had the multicast snooping filter disabled)
2) Why did the VM that is running on the pve5 node, and that is also configured
for HA, turn off abruptly?
3) If it is a bug, can someone apply a patch to the code?

Moreover, talking about the firewall enabled for the VMs:
I remember that about a month ago I tried to apply a restrictive firewall rule
that blocks access from the cluster-communication IP address to the VMs, without
success; i.e., with a default firewall policy of "allow", each time I enabled this
single restrictive rule on a VM, the VM lost all network communication. Maybe I am
doing something wrong.

So I would like to ask you some things:

4) Can you do a test, and then tell me the results?
5) If the results are positive, can you tell me how to do it?
6) And if the results are negative, can you apply a patch to the code?

Moreover, the last question:
7) As each PVE node has its own "firewall" tab in the PVE GUI, I guess that this
option is meant to apply in/out firewall rules that affect only that node, right?
Or what does this option exist for?



----- Original Message ----- 
From: "Alexandre DERUMIER" <aderumier at odiso.com> 
To: "Cesar Peschiera" <brain at click.com.py> 
Cc: "pve-devel" <pve-devel at pve.proxmox.com> 
Sent: Friday, January 02, 2015 5:40 AM 
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and VMs turn off


Hi, 

>>But as I need the VMs and the PVE host to be accessible from any
>>workstation, the VLAN option is not useful for me.
Ok 


>>And about the cluster communication and the VMs: as I don't want the
>>multicast packets to reach the VMs, I believe I can cut them off from the
>>VMs in two ways:
>>
>>a) Removing the option "post-up echo 0 >
>>/sys/devices/virtual/net/vmbr0/bridge/multicast_snooping" from the NIC
>>configuration of the PVE host, if that gives me a stable behaviour.

Yes, indeed you can enable snooping to filter multicast 

>>b) With the firewall it should be very easy, since I know the source IP
>>address of the cluster communication, but unfortunately the PVE wiki does
>>not clearly show how I can apply it; i.e., I see the "firewall" tab on the
>>datacenter, on the PVE hosts and in the network configuration of the VMs,
>>and the wiki says nothing about this. For me, a global configuration that
>>affects all VMs of the cluster would be wonderful, using an IPSet or some
>>other way that is simple to apply.

I think you can create a security group with a rule which blocks the
multicast address of your PVE cluster.

#pvecm status|grep "Multicast addresses" 

to get your cluster multicast address 

Then add this security group to each vm. 


(Currently, datacenter rules apply only to the hosts' IN|OUT iptables rules,
not to the FORWARD iptables rules which are used by the VMs.)
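
As a rough sketch (untested, and using 239.192.105.237 only as a placeholder for
whatever multicast address pvecm reports), the security group could be defined in
/etc/pve/firewall/cluster.fw:

[group blockcluster]
IN DROP -dest 239.192.105.237 -p udp

and then referenced from each VM's firewall config, /etc/pve/firewall/<vmid>.fw:

[RULES]
GROUP blockcluster

(If I remember correctly, the firewall option must also be enabled on the VM's
network device for the rules to take effect.)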






----- Original Message -----
From: "Cesar Peschiera" <brain at click.com.py>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Friday, January 2, 2015 05:10:08
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and VMs turn off

Hi Alexandre. 

Thanks for your reply. 

But as I need the VMs and the PVE host to be accessible from any workstation,
the VLAN option is not useful for me.

Anyway, I am testing with the I/OAT DMA Engine enabled in the hardware BIOS;
after some days with little activity, the CMAN cluster is stable. Soon I will
test it with a lot of network activity.

And about the cluster communication and the VMs: as I don't want the multicast
packets to reach the VMs, I believe I can cut them off from the VMs in two ways:

a) Removing the option "post-up echo 0 >
/sys/devices/virtual/net/vmbr0/bridge/multicast_snooping" from the NIC
configuration of the PVE host, if that gives me a stable behaviour.

b) With the firewall it should be very easy, since I know the source IP address
of the cluster communication, but unfortunately the PVE wiki does not clearly
show how I can apply it; i.e., I see the "firewall" tab on the datacenter, on the
PVE hosts and in the network configuration of the VMs, and the wiki says nothing
about this. For me, a global configuration that affects all VMs of the cluster
would be wonderful, using an IPSet or some other way that is simple to apply.

Do you have any idea how to prevent the multicast packets from reaching the VMs
in a stable way, and how to apply it?

----- Original Message ----- 
From: "Alexandre DERUMIER" <aderumier at odiso.com> 
To: "Cesar Peschiera" <brain at click.com.py> 
Cc: "pve-devel" <pve-devel at pve.proxmox.com> 
Sent: Wednesday, December 31, 2014 3:33 AM 
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and VMs turn off


Hi Cesar, 

I think I totally forgot that we can't add an IP on an interface that is a
slave of a bridge.

Myself, I'm using a tagged VLAN interface for the cluster communication.

something like: 

auto bond0
iface bond0 inet manual
    slaves eth0 eth2
    bond_miimon 100
    bond_mode 802.3ad
    bond_xmit_hash_policy layer2

auto bond0.100
iface bond0.100 inet static
    address 192.100.100.50
    netmask 255.255.255.0
    gateway 192.100.100.4

auto vmbr0
iface vmbr0 inet manual
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0
    post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping

----- Original Message -----
From: "Cesar Peschiera" <brain at click.com.py>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Wednesday, December 31, 2014 05:01:37
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and VMs turn off

Hi Alexandre 

Today, after a week, a node again lost cluster communication. So I changed the
hardware BIOS configuration to "I/OAT DMA enabled" (which works very well on
other Dell R320 nodes with 1 Gb/s NICs).

Moreover, trying to follow your advice to put the 192.100.100.51 IP address
directly on bond0 and not on vmbr0: when I reboot the node, it is totally
isolated, and I see a message saying that vmbr0 is missing an IP address.
The node is also totally isolated when I apply this IP address to vmbr0:
0.0.0.0/255.255.255.255

In practical terms, can you tell me how I can add an IP address to bond0 and
also have a bridge on these same NICs?

- Now, this is my configuration: 
auto bond0
iface bond0 inet manual
    slaves eth0 eth2
    bond_miimon 100
    bond_mode 802.3ad
    bond_xmit_hash_policy layer2

auto vmbr0
iface vmbr0 inet static
    address 192.100.100.50
    netmask 255.255.255.0
    gateway 192.100.100.4
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0
    post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping


----- Original Message ----- 
From: "Alexandre DERUMIER" <aderumier at odiso.com> 
To: "Cesar Peschiera" <brain at click.com.py> 
Cc: "pve-devel" <pve-devel at pve.proxmox.com> 
Sent: Friday, December 19, 2014 7:59 AM 
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and VMs turn off


Maybe you can try to put the 192.100.100.51 IP address directly on bond0,

to avoid corosync traffic going through vmbr0.

(I remember some old offloading bugs with 10 GbE NICs and the Linux bridge.)
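
If you want to rule out the offloading bugs themselves, something else you could
test (an untested suggestion on my side; eth0 and eth2 are your bond0 slaves,
adjust the names if needed) is disabling the offloads on the 10 GbE NICs with
ethtool, for example:

ethtool -K eth0 gro off gso off tso off
ethtool -K eth2 gro off gso off tso off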


----- Original Message -----
From: "Cesar Peschiera" <brain at click.com.py>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Friday, December 19, 2014 11:08:33
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and VMs turn off

>can you post your /etc/network/interfaces of these 10 Gb/s nodes?

This is my configuration:
Note: the LAN uses 192.100.100.0/24

#Network interfaces
auto lo
iface lo inet loopback

iface eth0 inet manual
iface eth1 inet manual
iface eth2 inet manual
iface eth3 inet manual
iface eth4 inet manual
iface eth5 inet manual
iface eth6 inet manual
iface eth7 inet manual
iface eth8 inet manual
iface eth9 inet manual
iface eth10 inet manual
iface eth11 inet manual

#PVE Cluster and VMs (NICs are of 10 Gb/s):
auto bond0
iface bond0 inet manual
    slaves eth0 eth2
    bond_miimon 100
    bond_mode 802.3ad
    bond_xmit_hash_policy layer2

#PVE Cluster and VMs:
auto vmbr0
iface vmbr0 inet static
    address 192.100.100.51
    netmask 255.255.255.0
    gateway 192.100.100.4
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0
    post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
    post-up echo 1 > /sys/class/net/vmbr0/bridge/multicast_querier

#A link for DRBD (NICs are of 10 Gb/s):
auto bond401
iface bond401 inet static
    address 10.1.1.51
    netmask 255.255.255.0
    slaves eth1 eth3
    bond_miimon 100
    bond_mode balance-rr
    mtu 9000

#Other link for DRBD (NICs are of 10 Gb/s):
auto bond402
iface bond402 inet static
    address 10.2.2.51
    netmask 255.255.255.0
    slaves eth4 eth6
    bond_miimon 100
    bond_mode balance-rr
    mtu 9000

#Other link for DRBD (NICs are of 10 Gb/s):
auto bond403
iface bond403 inet static
    address 10.3.3.51
    netmask 255.255.255.0
    slaves eth5 eth7
    bond_miimon 100
    bond_mode balance-rr
    mtu 9000

#A link for the NFS-Backups (NICs are of 1 Gb/s):
auto bond10
iface bond10 inet static
    address 10.100.100.51
    netmask 255.255.255.0
    slaves eth8 eth10
    bond_miimon 100
    bond_mode balance-rr
    #bond_mode active-backup
    mtu 9000

