[PVE-User] Multicast problems with Intel X540 - 10Gtek network card?
Ronny Aasen
ronny+pve-user at aasen.cx
Tue Dec 4 20:03:17 CET 2018
vmbr10 is a bridge (or a switch by another name).
If you want the switch to work reliably with multicast, you probably need
to enable the multicast querier:

echo 1 > /sys/devices/virtual/net/vmbr0/bridge/multicast_querier

Or you can disable snooping, so that the bridge treats multicast as broadcast:

echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping

This problem with multicast traffic may also lead to unreliable IPv6
neighbour discovery (ND) and router advertisements (RA).
https://pve.proxmox.com/wiki/Multicast_notes has some more notes and
examples around multicast_querier.
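To make either setting persistent across reboots, one sketch (assuming the
bridge is managed by ifupdown via /etc/network/interfaces, as in the
configuration quoted below) is to add a post-up line to the bridge stanza,
for example:

auto vmbr0
iface vmbr0 inet static
    address 192.168.0.201
    netmask 255.255.255.0
    bridge_ports eth4
    bridge_stp off
    bridge_fd 0
    # enable the bridge's own IGMP querier once the bridge is up;
    # write 0 to .../multicast_snooping here instead to disable snooping
    post-up echo 1 > /sys/devices/virtual/net/vmbr0/bridge/multicast_querier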
kind regards
Ronny Aasen
On 04.12.2018 17:54, Eneko Lacunza wrote:
> Hi all,
>
> Seems I found the solution.
>
> eth3 on proxmox1 is a Broadcom 1 Gbit card connected to the HPE switch; it
> is VLAN 10 untagged on the switch end.
>
> I changed the vmbr10 bridge to use eth4.10 on the X540 card, and after
> ifdown/ifup and restarting corosync and pve-cluster, everything now seems
> good; the cluster is stable and omping is happy too after 10 minutes :)
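> For reference, the reworked stanza ends up roughly like this (same options
> as the original vmbr10 configuration further down, with eth4.10 in place
> of eth3):
>
> auto vmbr10
> iface vmbr10 inet static
>     address 192.168.10.201
>     netmask 255.255.255.0
>     bridge_ports eth4.10
>     bridge_stp off
>     bridge_fd 0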
>
> It is strange, because the multicast traffic is on the VLAN 1 network...
>
> Cheers and thanks a lot
> Eneko
>
> On 4/12/18 at 16:18, Eneko Lacunza wrote:
>>
>> hi Marcus,
>>
>> On 4/12/18 at 16:09, Marcus Haarmann wrote:
>>> Hi,
>>>
>>> you did not provide details about your configuration.
>>> How is the network card set up? Bonding?
>>> Send your /etc/network/interfaces details.
>>> If bonding is active, check if the mode is correct in
>>> /proc/net/bonding.
>>> We encountered differences between the setup in /etc/network/interfaces
>>> and the resulting mode.
>>> Also, check your switch configuration, VLAN setup, MTU etc.
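>>> For example, assuming a bond device named bond0, the active mode can be
>>> checked with:
>>>
>>> # prints e.g. "Bonding Mode: IEEE 802.3ad Dynamic link aggregation"
>>> grep 'Bonding Mode' /proc/net/bonding/bond0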
>> Yes, sorry about that. I have double-checked the switch and all 3
>> nodes' SFP+ ports have the same configuration.
>>
>> /etc/network/interfaces in proxmox1 node:
>> auto lo
>> iface lo inet loopback
>> iface eth0 inet manual
>> iface eth1 inet manual
>> iface eth2 inet manual
>> iface eth3 inet manual
>> iface eth4 inet manual
>> iface eth5 inet manual
>>
>> auto vmbr10
>> iface vmbr10 inet static
>>     address 192.168.10.201
>>     netmask 255.255.255.0
>>     bridge_ports eth3
>>     bridge_stp off
>>     bridge_fd 0
>>
>> auto vmbr0
>> iface vmbr0 inet static
>>     address 192.168.0.201
>>     netmask 255.255.255.0
>>     gateway 192.168.0.100
>>     bridge_ports eth4
>>     bridge_stp off
>>     bridge_fd 0
>>
>> auto eth4.100
>> iface eth4.100 inet static
>>     address 10.0.2.1
>>     netmask 255.255.255.0
>>     up ip addr add 10.0.3.1/24 dev eth4.100
>>
>> The cluster is running on the vmbr0 network (192.168.0.0/24).
>>
>> Cheers
>>
>>>
>>> Marcus Haarmann
>>>
>>>
>>> From: "Eneko Lacunza" <elacunza at binovo.es>
>>> To: "pve-user" <pve-user at pve.proxmox.com>
>>> Sent: Tuesday, 4 December 2018 15:57:10
>>> Subject: [PVE-User] Multicast problems with Intel X540 - 10Gtek
>>> network card?
>>>
>>> Hi all,
>>>
>>> We have just updated a 3-node Proxmox cluster from 3.4 to 5.2, Ceph
>>> Hammer to Luminous, and the network from 1 Gbit to 10 Gbit... one of the
>>> three Proxmox nodes is new too :)
>>>
>>> Generally all was good and VMs are working well. :-)
>>>
>>> BUT, we have some problems with the cluster; the proxmox1 node joins and
>>> then drops from the cluster after about 4 minutes.
>>>
>>> All the multicast tests from
>>> https://pve.proxmox.com/wiki/Multicast_notes#Using_omping_to_test_multicast
>>> run fine except the last one:
>>>
>>> *** proxmox1:
>>>
>>> root at proxmox1:~# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>>>
>>> proxmox3 : waiting for response msg
>>>
>>> proxmox4 : waiting for response msg
>>>
>>> proxmox3 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox4 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox3 : given amount of query messages was sent
>>>
>>> proxmox4 : given amount of query messages was sent
>>>
>>> proxmox3 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev
>>> = 0.073/0.184/0.390/0.061
>>>
>>> proxmox3 : multicast, xmt/rcv/%loss = 600/262/56%,
>>> min/avg/max/std-dev = 0.092/0.207/0.421/0.068
>>>
>>> proxmox4 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev
>>> = 0.049/0.167/0.369/0.059
>>>
>>> proxmox4 : multicast, xmt/rcv/%loss = 600/262/56%,
>>> min/avg/max/std-dev = 0.063/0.185/0.386/0.064
>>>
>>>
>>> *** proxmox3:
>>>
>>> root at proxmox3:/etc# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>>>
>>> proxmox1 : waiting for response msg
>>>
>>> proxmox4 : waiting for response msg
>>>
>>> proxmox4 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox1 : waiting for response msg
>>>
>>> proxmox1 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox4 : given amount of query messages was sent
>>>
>>> proxmox1 : given amount of query messages was sent
>>>
>>> proxmox1 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev
>>> = 0.083/0.193/1.030/0.055
>>>
>>> proxmox1 : multicast, xmt/rcv/%loss = 600/600/0%,
>>> min/avg/max/std-dev = 0.102/0.209/1.050/0.054
>>>
>>> proxmox4 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev
>>> = 0.041/0.108/0.172/0.026
>>>
>>> proxmox4 : multicast, xmt/rcv/%loss = 600/600/0%,
>>> min/avg/max/std-dev = 0.048/0.123/0.190/0.030
>>>
>>>
>>> *** root at proxmox4:~# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>>>
>>> proxmox1 : waiting for response msg
>>>
>>> proxmox3 : waiting for response msg
>>>
>>> proxmox1 : waiting for response msg
>>>
>>> proxmox3 : waiting for response msg
>>>
>>> proxmox3 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox1 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox1 : given amount of query messages was sent
>>>
>>> proxmox3 : given amount of query messages was sent
>>>
>>> proxmox1 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev
>>> = 0.085/0.188/0.356/0.040
>>>
>>> proxmox1 : multicast, xmt/rcv/%loss = 600/600/0%,
>>> min/avg/max/std-dev = 0.114/0.208/0.377/0.041
>>>
>>> proxmox3 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev
>>> = 0.048/0.117/0.289/0.023
>>>
>>> proxmox3 : multicast, xmt/rcv/%loss = 600/600/0%,
>>> min/avg/max/std-dev = 0.064/0.134/0.290/0.026
>>>
>>>
>>> OK, so it seems we have a network problem on the proxmox1 node. The network
>>> cards are as follows:
>>>
>>> - proxmox1: Intel X540 (10Gtek)
>>> - proxmox3: Intel X710 (Intel)
>>> - proxmox4: Intel X710 (Intel)
>>>
>>> Switch is Dell N1224T-ON.
>>>
>>> Does anyone have experience with Intel X540-based network cards, the Linux
>>> ixgbe driver, or cards from the manufacturer 10Gtek?
>>>
>>> If we change corosync communication to the 1 Gbit network cards (Broadcom)
>>> connected to an old HPE 1800-24G switch, the cluster is stable...
>>>
>>> We also have a running cluster with a Dell N1224T-ON switch and X710
>>> network cards, without issues.
>>>
>>> Thanks a lot
>>> Eneko
>>>
>>>
>>
>>
>
>