[PVE-User] Proxmox VE 7.0 (beta) released!

Thomas Lamprecht t.lamprecht at proxmox.com
Fri Jul 2 22:57:44 CEST 2021

On 29.06.21 10:05, Mark Schouten wrote:
> Hi,
> On 24-06-2021 at 15:16, Martin Maurer wrote:
>> We are pleased to announce the first beta release of Proxmox Virtual Environment 7.0! The 7.x family is based on the great Debian 11 "Bullseye" and comes with a 5.11 kernel, QEMU 6.0, LXC 4.0, OpenZFS 2.0.4.
> I just upgraded a node in our demo cluster and all seemed fine. Except for non-working cluster network. I was unable to ping the node through the cluster interface, pvecm saw no other nodes and ceph was broken.
> However, if I ran tcpdump, ping started working, but not the rest.
> Interesting situation, which I 'fixed' by disabling vlan-aware-bridge for that interface. After the reboot, everything works (AFAICS).
> If Proxmox wants to debug this, feel free to reach out to me, I can grant you access to this node so you can check it out.

FYI, there was some more investigation regarding this, mostly spearheaded by Wolfgang,
and we found and fixed[0] an actual, rather old (the commit it fixes is from 2014!)
bridge bug in the kernel.

The first few lines of the fix's commit message[0] explain the basics:

> [..] bridges with `vlan_filtering 1` and only 1 auto-port don't
> set IFF_PROMISC for unicast-filtering-capable ports.

Further, we saw all that weird behavior because:
* While this is independent of any specific network driver, those specific drivers
  vary wildly in how they do things, and some thus worked (by luck) while others did
  not

* It can really only happen in the vlan-aware case, as otherwise all ports are set
  promisc no matter what, but depending on the order in which things are done the
  result may still differ even with vlan-aware on

* It did not matter before (i.e., before systemd started to also apply their
  MACAddressPolicy by default onto virtual devices like bridges) because then the
  bridge basically always had a MAC from one of its ports, so the fdb always
  contained the bridge's MAC implicitly and the bug was concealed.
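
For anyone who wants to check whether a host matches the conditions listed above, here
is a rough sketch in Python that reads the standard sysfs network attributes. It is only
a heuristic and not part of the fix: the function name `suspect_bridges` is made up for
this example, it treats every bridge port as an "auto-port", and it cannot tell whether
a port's driver is unicast-filtering capable, since sysfs does not expose that.

```python
from pathlib import Path

IFF_PROMISC = 0x100  # interface flag bit, as defined in <linux/if.h>


def read(path):
    """Return the stripped text content of a sysfs attribute file."""
    return path.read_text().strip()


def suspect_bridges(sysfs_net=Path("/sys/class/net")):
    """Return (bridge, port) pairs that match the conditions described above.

    Hypothetical helper: flags vlan-aware bridges with exactly one port where
    that port is not promiscuous and the bridge MAC differs from the port MAC.
    Assumption: every port counts as an auto-port; driver unicast-filtering
    support is not checked.
    """
    sysfs_net = Path(sysfs_net)
    hits = []
    for dev in sorted(sysfs_net.iterdir()):
        vlan_filtering = dev / "bridge" / "vlan_filtering"
        if not vlan_filtering.is_file() or read(vlan_filtering) != "1":
            continue  # not a bridge, or not vlan-aware
        ports = sorted(p.name for p in (dev / "brif").iterdir())
        if len(ports) != 1:
            continue  # only the single-port case is of interest here
        port = sysfs_net / ports[0]
        promisc = int(read(port / "flags"), 16) & IFF_PROMISC
        same_mac = read(port / "address") == read(dev / "address")
        if not promisc and not same_mac:
            hits.append((dev.name, ports[0]))
    return hits
```

Calling `suspect_bridges()` on an affected node should list the bridge and its single
port; an empty list only means this heuristic found nothing, not that the host is
unaffected.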

So it's quite likely that this rather confusing mix of behaviors would have popped up
in more places where bridges are used in the upcoming months, as that systemd
change slowly rolls into stable distros, so it's actually really nice to find and fix
(*knocks wood*) this during beta!

Anyhow, a newer kernel build is now also available in the bullseye based pvetest
repository, if you want to test and confirm the fix:

pve-kernel-5.11.22-1-pve version 5.11.22-2


[0]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=a019abd80220
