[PVE-User] Proxmox VE 7.0 (beta) released!

Thomas Lamprecht t.lamprecht at proxmox.com
Tue Jun 29 16:14:50 CEST 2021


On 29.06.21 15:31, Stoiko Ivanov wrote:
> On Tue, 29 Jun 2021 14:04:05 +0200
> Mark Schouten <mark at tuxis.nl> wrote:
> 
>> Hi,
>>
>> Op 29-06-2021 om 12:31 schreef Thomas Lamprecht:
>>>> I do not completely understand why that fixes it though.  Commenting out MACAddressPolicy=persistent helps, but why?
>>>>  
>>>
>>> Because duplicate MAC addresses are not ideal, to say the least?  
>>
>> That I understand. :)
>>
>> But, the cluster interface works when bridge_vlan_aware is off, 
>> regardless of the MacAddressPolicy setting.
>>
> 
> We managed to find a reproducer - my current guess is that it might have
> something to do with intel NIC drivers or some changes in ifupdown2 (or
> udev, or in their interaction ;) - Sadly if tcpdump fixes the issues, it
> makes debugging quite hard :)

The issue is that the kernel has always (since close to forever) cleared promisc mode
on the bridge's ports when there was either no port or exactly one port with flood or
learning enabled; this happens in the `br_manage_promisc` function.

Further, on toggling VLAN-aware the aforementioned `br_manage_promisc` is called
from `br_vlan_filter_toggle`.
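
To make that logic a bit more tangible, here is a rough Python model of it. This is
a sketch and not the kernel code: the class and function names are made up for
illustration, and the `set_all` shortcut (all ports go promisc if the bridge itself
is promisc or VLAN filtering is off) is written down from my reading of the bridge
code, so treat it as an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Port:
    name: str
    learning: bool = True
    flood: bool = True
    promisc: bool = False

    def is_auto(self) -> bool:
        # Stand-in for br_auto_port(): learning or flooding is enabled.
        return self.learning or self.flood


@dataclass
class Bridge:
    ports: List[Port] = field(default_factory=list)
    promisc: bool = False
    vlan_filtering: bool = False


def manage_promisc(br: Bridge) -> None:
    # Assumption: without VLAN filtering (or with a promisc bridge)
    # every port is simply put into promisc mode.
    set_all = br.promisc or not br.vlan_filtering
    auto_cnt = sum(1 for p in br.ports if p.is_auto())
    for p in br.ports:
        if set_all:
            p.promisc = True
        elif auto_cnt == 0 or (auto_cnt == 1 and p.is_auto()):
            # The "no port or exactly one flood/learning port" case.
            p.promisc = False
        else:
            p.promisc = True


def vlan_filter_toggle(br: Bridge, enabled: bool) -> None:
    # Toggling VLAN-aware re-evaluates promisc mode for all ports.
    br.vlan_filtering = enabled
    manage_promisc(br)


if __name__ == "__main__":
    br = Bridge(ports=[Port("eno1")])
    vlan_filter_toggle(br, False)
    print("vlan-aware off:", br.ports[0].promisc)
    vlan_filter_toggle(br, True)
    print("vlan-aware on: ", br.ports[0].promisc)
```

In this model a single-port bridge keeps the port in promisc mode while VLAN-aware
is off, and loses it as soon as VLAN-aware is switched on.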

So, why does this break now? I really do not think it's due to some driver-specific
stuff; not impossible, but the following sounds like a better explanation of the
"why now":

Previously the MAC address of the bridge was the same as the one from the single port,
so it didn't matter too much whether promisc was on for the single port itself; the
bridge could accept the packets. But now, with the systemd default MACAddressPolicy
"persistent" also applying to bridges, the bridge gets a different MAC than the
port, which means the disabled promisc on that port matters quite a bit more.
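
As a minimal illustration of why the differing MAC matters, the following Python
sketch models the receive filter of a port. It is deliberately simplified: multicast
and any additional unicast filter entries are ignored, and the MAC addresses and the
`port_accepts` helper are made up for the example.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def port_accepts(dst_mac: str, port_mac: str, promisc: bool) -> bool:
    """Return True if a frame addressed to dst_mac passes the port's filter."""
    if promisc:
        # A promisc port passes every frame up the stack.
        return True
    dst = dst_mac.lower()
    return dst == port_mac.lower() or dst == BROADCAST


# Hypothetical addresses, for illustration only.
PORT_MAC = "02:00:00:00:00:01"
OLD_BRIDGE_MAC = PORT_MAC             # bridge inherited the port's MAC
NEW_BRIDGE_MAC = "02:00:00:00:00:02"  # bridge got its own "persistent" MAC

if __name__ == "__main__":
    print("old MAC, promisc off:", port_accepts(OLD_BRIDGE_MAC, PORT_MAC, False))
    print("new MAC, promisc off:", port_accepts(NEW_BRIDGE_MAC, PORT_MAC, False))
    print("new MAC, promisc on: ", port_accepts(NEW_BRIDGE_MAC, PORT_MAC, True))
```

Under these assumptions, frames addressed to the bridge only get dropped in the
combination "different bridge MAC" plus "promisc off on the port". That would also
fit the observation that running tcpdump makes the problem go away, as tcpdump puts
the interface into promisc mode by default.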

So turning vlan-aware on only "breaks" it incidentally: a `br_manage_promisc` call is
then made at a time where the "clear promisc for port" logic triggers, so it is rather
a side effect than a real cause.

I'm quite tempted to drop the `br_auto_port` special case for the single port case in
the kernel as a fix, but need to think about this - and will probably send that to
LKML first to poke for some comments...




