[pve-devel] Quorum problems with NICs Intel of 10 Gb/s and VMs turns off

Sat Jan 3 03:41:20 CET 2015

Hi Alexandre

Many thanks for your reply, which is much appreciated.

Unfortunately, your suggestion did not work for me, so I will describe the
results.

Along with some comments, this message also contains 7 questions for you,
and I would be very grateful if you could answer them.

To be clear about the versions of the software installed on the nodes that
behave strangely (2 of 6 PVE nodes):
shell> pveversion -v
proxmox-ve-2.6.32: 3.3-139 (running kernel: 3.10.0-5-pve)
pve-manager: 3.3-5 (running version: 3.3-5/bfebec03)
pve-kernel-3.10.0-5-pve: 3.10.0-19
pve-kernel-2.6.32-34-pve: 2.6.32-139
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-1
pve-cluster: 3.0-15
qemu-server: 3.3-5 <------special patch created by Alexandre for me
pve-firmware: 1.1-3
libpve-common-perl: 3.0-19
libpve-access-control: 3.0-15
libpve-storage-perl: 3.0-25
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.2-2 <------special patch created by Alexandre for me
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1

About a minute after applying the following commands on a single node
(pve6), I lost quorum on two nodes (pve5 and pve6).
The commands executed only on pve6:
echo 1 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
echo 0 > /sys/class/net/vmbr0/bridge/multicast_querier
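
Before changing these flags it may help to record their current values on
each node, so the state before and after can be compared. The following is
only a minimal sketch: the function name and the optional second argument
(an alternative sysfs root, used purely so the function can be tried against
a fake directory tree) are assumptions of mine, not part of any PVE tool.

```shell
# Print the multicast snooping/querier flags of a bridge.
# $1: bridge name (e.g. vmbr0)
# $2: sysfs root, defaults to /sys (hypothetical parameter, for dry runs)
bridge_mcast_state() {
    bridge="$1"
    sysroot="${2:-/sys}"
    dir="$sysroot/class/net/$bridge/bridge"
    if [ ! -d "$dir" ]; then
        echo "$bridge: not a bridge or not present" >&2
        return 1
    fi
    for flag in multicast_snooping multicast_querier; do
        printf '%s %s=%s\n' "$bridge" "$flag" "$(cat "$dir/$flag")"
    done
}

# Only query the real sysfs when the bridge actually exists on this host.
if [ -d /sys/class/net/vmbr0/bridge ]; then
    bridge_mcast_state vmbr0
fi
```

Running it on every node before and after the two echo commands gives a
simple record of which nodes have snooping or the querier enabled.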

The error message on the node where I applied the commands (pve6) is:
Message from syslogd at pve6 at Jan  2 20:58:32 ...
rgmanager[4912]: #1: Quorum Dissolved

As a side effect, since the pve5 node is configured with HA for a VM with a
"failover domain" spanning pve5 and pve6, pve5 also lost quorum and the VM
that is in HA was powered off abruptly.

These are the error messages on the screen of the pve5 node:
[    61.246002] dlm: rgmanager: send_repeat_remove dir 6 rg="pvevm:112"
[119373.380111] dlm: closing connection to node 1
[119373.300150] dlm: closing connection to node 2
[119373.380182] dlm: closing connection to node 3
[119373.300205] dlm: closing connection to node 4
[119373.380229] dlm: closing connection to node 6
[119373.300268] dlm: closing connection to node 7
[119373.380319] dlm: closing connection to node 8
[119545.042242] dlm: closing connection to node 3
[119545.042264] dlm: closing connection to node 8
[119545.042281] dlm: closing connection to node 7
[119545.042300] dlm: closing connection to node 2
[119545.042316] dlm: closing connection to node 1
[119545.042331] dlm: closing connection to node 4
[119545.042347] dlm: closing connection to node 5
[119545.042891] dlm: dlm user daemon left 1 lockspaces
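
To see at a glance which peers dlm dropped, the kernel messages can be
reduced to a list of node IDs. This is only a rough sketch; the function
name is my own, and it assumes the message text is exactly "dlm: closing
connection to node N" as shown above.

```shell
# Read kernel log lines on stdin and print the distinct node IDs for which
# dlm reported "closing connection to node N", in ascending order.
dlm_closed_nodes() {
    sed -n 's/.*dlm: closing connection to node \([0-9][0-9]*\).*/\1/p' |
        sort -un
}

# Usage on a node (not executed here):
#   dmesg | dlm_closed_nodes
```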

So I suspect that PVE has a serious bug, although I am not sure. What I do
know is that if the pve6 node goes down abruptly for any reason, the pve5
node will lose quorum and its HA VM will also be powered off. This behaviour
will cause me serious problems, because at present I do not know what I must
do to start the VM on the node that is still alive.

So my questions are:
1) Why did the pve5 node lose quorum if I did not apply any change on this
node?
(this node always had the multicast snooping filter disabled)
2) Why was the VM that runs on the pve5 node, and is configured in HA,
powered off abruptly?
3) If it is a bug, can someone patch the code?

Moreover, regarding the firewall enabled for the VMs:
I remember that about a month ago I tried, without success, to add a
restrictive firewall rule blocking access from the cluster communication IP
address to the VMs. That is, with a default firewall policy of "allow", each
time I enable this single restrictive rule on a VM, the VM loses all network
communication. Maybe I am doing something wrong.

So I would like to ask you a few things:

4) Can you run a test and then tell me the results?
5) If the results are positive, can you tell me how to do it?
6) And if the results are negative, can you patch the code?

And the last question:
7) Since each PVE node has its own "firewall" tab in the PVE GUI, I guess
that this option is for applying in/out firewall rules that affect only that
node, right? If not, what is this option for?

----- Original Message ----- 
From: "Alexandre DERUMIER" <aderumier at odiso.com>
To: "Cesar Peschiera" <brain at click.com.py>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Friday, January 02, 2015 5:40 AM
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and
VMs turns off

Hi,

>>But as I need the VMs and the PVE host to be accessible from any
>>workstation, the VLAN option is not useful for me.
Ok

>>As for cluster communication and the VMs: since I do not want the
>>multicast packets to reach the VMs, I believe I can block them for the
>>VMs in two ways:
>>
>>a) Removing the option "post-up echo 0 >
>>/sys/devices/virtual/net/vmbr0/bridge/multicast_snooping" from the NIC
>>configuration of the PVE host, if the behaviour remains stable.

Yes, indeed you can enable snooping to filter multicast

>>b) By firewall it should be very easy, since I know the source IP address
>>of the cluster communication. Unfortunately the PVE wiki does not show
>>clearly how to apply it: I see the "firewall" tab in the datacenter, in
>>the PVE hosts and in the network configuration of the VMs, and the wiki
>>says nothing about this. For me, a global configuration that affects all
>>VMs of the cluster would be wonderful, using IPset or some other way that
>>is simple to apply.

I think you can create a security group with a rule that blocks the
multicast address of your PVE cluster.

#pvecm status|grep "Multicast addresses"

to get your cluster multicast address

Then add this security group to each VM.

(Currently datacenter rules apply only to the hosts' IN|OUT iptables rules,
but not to the FORWARD iptables rules, which are used by VMs)
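
As a rough illustration of the steps above, the address can be pulled out of
the pvecm output and turned into a draft security group. Everything specific
here is an assumption to be checked against the firewall documentation: the
exact layout of the "Multicast addresses:" line, the group name "nomcast"
(hypothetical) and the "IN DROP -dest" rule syntax.

```shell
# Extract the address from a "Multicast addresses: <addr>" line on stdin.
# The exact layout of that pvecm status line is an assumption.
mcast_addr() {
    sed -n 's/^Multicast addresses:[[:space:]]*\([0-9.]*\).*/\1/p' | head -n 1
}

# Print a draft security group that drops traffic sent to that address.
# Group name "nomcast" is hypothetical; the rule syntax is assumed.
draft_group() {
    printf '[group nomcast]\nIN DROP -dest %s\n' "$1"
}

# Only runs on a host where pvecm is installed.
if command -v pvecm >/dev/null 2>&1; then
    addr=$(pvecm status | mcast_addr)
    if [ -n "$addr" ]; then
        draft_group "$addr"
    fi
fi
```

The printed section is only a draft to review by hand; it is not written to
any firewall file.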

----- Original Message -----
From: "Cesar Peschiera" <brain at click.com.py>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Friday, January 2, 2015 05:10:08
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and
VMs turns off

Hi Alexandre.

Thanks for your reply.

But as I need the VMs and the PVE host to be accessible from any
workstation, the VLAN option is not useful for me.

Anyway, I am testing with the I/OAT DMA Engine enabled in the hardware BIOS.
After some days with little activity, the CMAN cluster is stable; soon I
will test with a lot of network activity.

As for cluster communication and the VMs: since I do not want the multicast
packets to reach the VMs, I believe I can block them for the VMs in two
ways:

a) Removing the option "post-up echo 0 >
/sys/devices/virtual/net/vmbr0/bridge/multicast_snooping" from the NIC
configuration of the PVE host, if the behaviour remains stable.

b) By firewall it should be very easy, since I know the source IP address of
the cluster communication. Unfortunately the PVE wiki does not show clearly
how to apply it: I see the "firewall" tab in the datacenter, in the PVE
hosts and in the network configuration of the VMs, and the wiki says nothing
about this. For me, a global configuration that affects all VMs of the
cluster would be wonderful, using IPset or some other way that is simple to
apply.

Do you have any idea how to prevent multicast packets from reaching the VMs
in a stable way, and how to apply it?

----- Original Message ----- 
From: "Alexandre DERUMIER" <aderumier at odiso.com>
To: "Cesar Peschiera" <brain at click.com.py>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Wednesday, December 31, 2014 3:33 AM
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and
VMs turns off

Hi Cesar,

I think I totally forgot that we can't add an IP to an interface that is a
slave of a bridge.

Myself, I'm using a tagged VLAN interface for the cluster communication,

something like:

auto bond0
iface bond0 inet manual
slaves eth0 eth2
bond_miimon 100
bond_mode 802.3ad
bond_xmit_hash_policy layer2

auto bond0.100
iface bond0.100 inet static
address 192.100.100.50
netmask 255.255.255.0
gateway 192.100.100.4

auto vmbr0
iface vmbr0 inet manual
bridge_ports bond0
bridge_stp off
bridge_fd 0
post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
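
With a tagged VLAN interface like bond0.100, the names on the "auto" and
"iface" lines have to match exactly, otherwise the address never gets
configured. A small sketch of a consistency check follows; the function name
is my own and it only looks at the first word of each line, so treat it as a
rough aid rather than a parser for the full interfaces(5) syntax.

```shell
# Read an interfaces(5)-style file on stdin and print every interface that
# is marked "auto" but has no "iface" stanza of the same name.
auto_without_iface() {
    awk '
        $1 == "auto"  { for (i = 2; i <= NF; i++) want[$i] = 1 }
        $1 == "iface" { have[$2] = 1 }
        END { for (name in want) if (!(name in have)) print name }
    ' | sort
}

# Usage on a node (not executed here):
#   auto_without_iface < /etc/network/interfaces
```

Empty output means every "auto" interface has a matching stanza.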

----- Original Message -----
From: "Cesar Peschiera" <brain at click.com.py>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Wednesday, December 31, 2014 05:01:37
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and
VMs turns off

Hi Alexandre

Today, after a week, a node lost cluster communication again. So I changed
the hardware BIOS configuration to "I/OAT DMA enabled" (which works very
well on other nodes, Dell R320 with 1 Gb/s NICs).

Moreover, I tried to follow your advice of putting the 192.100.100.51 IP
address directly on bond0 and not on vmbr0. When I reboot the node, it is
totally isolated, and I see a message saying that vmbr0 is missing an IP
address. The node is also totally isolated when I assign this IP address to
vmbr0: 0.0.0.0/255.255.255.255

In practical terms, can you tell me how I can add an IP address to bond0 and
also have a bridge on these same NICs?

- Now, this is my configuration:
auto bond0
iface bond0 inet manual
slaves eth0 eth2
bond_miimon 100
bond_mode 802.3ad
bond_xmit_hash_policy layer2

auto vmbr0
iface vmbr0 inet static
address 192.100.100.50
netmask 255.255.255.0
gateway 192.100.100.4
bridge_ports bond0
bridge_stp off
bridge_fd 0
post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping

----- Original Message ----- 
From: "Alexandre DERUMIER" <aderumier at odiso.com>
To: "Cesar Peschiera" <brain at click.com.py>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Friday, December 19, 2014 7:59 AM
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and
VMs turns off

Maybe you can try to put the 192.100.100.51 IP address directly on bond0,

to avoid corosync traffic going through vmbr0.

(I remember some old offloading bugs with 10GbE NICs and the Linux bridge)

----- Original Message -----
From: "Cesar Peschiera" <brain at click.com.py>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Friday, December 19, 2014 11:08:33
Subject: Re: [pve-devel] Quorum problems with NICs Intel of 10 Gb/s and
VMs turns off

>Can you post the /etc/network/interfaces of these 10 Gb/s nodes?

This is my configuration:
Note: The LAN uses 192.100.100.0/24

#Network interfaces
auto lo
iface lo inet loopback

iface eth0 inet manual
iface eth1 inet manual
iface eth2 inet manual
iface eth3 inet manual
iface eth4 inet manual
iface eth5 inet manual
iface eth6 inet manual
iface eth7 inet manual
iface eth8 inet manual
iface eth9 inet manual
iface eth10 inet manual
iface eth11 inet manual

#PVE Cluster and VMs (NICs are of 10 Gb/s):
auto bond0
iface bond0 inet manual
slaves eth0 eth2
bond_miimon 100
bond_mode 802.3ad
bond_xmit_hash_policy layer2

#PVE Cluster and VMs:
auto vmbr0
iface vmbr0 inet static
address 192.100.100.51
netmask 255.255.255.0
gateway 192.100.100.4
bridge_ports bond0
bridge_stp off
bridge_fd 0
post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
post-up echo 1 > /sys/class/net/vmbr0/bridge/multicast_querier

#A link for DRBD (NICs are of 10 Gb/s):
auto bond401
iface bond401 inet static
address 10.1.1.51
netmask 255.255.255.0
slaves eth1 eth3
bond_miimon 100
bond_mode balance-rr
mtu 9000

#Other link for DRBD (NICs are of 10 Gb/s):
auto bond402
iface bond402 inet static
address 10.2.2.51
netmask 255.255.255.0
slaves eth4 eth6
bond_miimon 100
bond_mode balance-rr
mtu 9000

#Other link for DRBD (NICs are of 10 Gb/s):
auto bond403
iface bond403 inet static
address 10.3.3.51
netmask 255.255.255.0
slaves eth5 eth7
bond_miimon 100
bond_mode balance-rr
mtu 9000

#A link for the NFS-Backups (NICs are of 1 Gb/s):
auto bond10
iface bond10 inet static
address 10.100.100.51
netmask 255.255.255.0
slaves eth8 eth10
bond_miimon 100
bond_mode balance-rr
#bond_mode active-backup
mtu 9000
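
With this many bonds it is easy to list the same NIC under two of them by
mistake. The sketch below is only a rough sanity check under my own
assumptions: the function name is hypothetical and it only understands the
"slaves ethX ethY" form used in this file.

```shell
# Read an interfaces(5)-style file on stdin and print every NIC that
# appears on the "slaves" line of more than one stanza.
duplicate_slaves() {
    awk '
        $1 == "slaves" { for (i = 2; i <= NF; i++) count[$i]++ }
        END { for (nic in count) if (count[nic] > 1) print nic }
    ' | sort
}

# Usage on a node (not executed here):
#   duplicate_slaves < /etc/network/interfaces
```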