[pve-devel] BUG in vlan aware bridge
Josef Johansson
josef at oderland.se
Thu Oct 14 07:14:43 CEST 2021
Hi,
I did some more digging and searched for 'bridge-nf-call-iptables
fragmentation'.
Found these forum posts:
https://forum.proxmox.com/threads/net-bridge-bridge-nf-call-iptables-and-friends.64766/
https://forum.proxmox.com/threads/linux-bridge-reassemble-fragmented-packets.96432/
And this patch, which suggests that someone at least TRIED to get it fixed ;)
https://lists.linuxfoundation.org/pipermail/bridge/2019-August/012185.html
Best regards
Josef Johansson
On 10/13/21 16:32, VELARTIS Philipp Dürhammer wrote:
> If you stop the pve firewall service and run echo 0 > /proc/sys/net/bridge/bridge-nf-call-iptables (which disables the netfilter hook),
> then it also works for me with tagged tap devices and a vlan-aware bridge. I think it is a kernel bug.
> What I don't understand is why more people aren't reporting it...
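
A minimal sketch of that test with a way back, so the toggle can be restored afterwards. The function names and the NF_TOGGLE variable are made up for illustration, and the unit name pve-firewall in the comment is my assumption. The path is overridable so the functions can be tried on a scratch file without root.

```shell
#!/bin/sh
# Sketch only: disable the bridge netfilter hook and remember the old value.
# NF_TOGGLE is a hypothetical override; by default it points at the real sysctl file.
NF_TOGGLE="${NF_TOGGLE:-/proc/sys/net/bridge/bridge-nf-call-iptables}"

# Assumption: the firewall service would be stopped first, e.g.
#   systemctl stop pve-firewall
# That step is left out here because it needs root and a Proxmox host.

nf_hook_disable() {
    old=$(cat "$NF_TOGGLE") || return 1
    echo 0 > "$NF_TOGGLE" || return 1
    # Print the previous value so the caller can restore it later.
    echo "$old"
}

nf_hook_restore() {
    echo "$1" > "$NF_TOGGLE"
}
```

Typical use would be prev=$(nf_hook_disable), then the ping test, then nf_hook_restore "$prev".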
>
>
> -----Original Message-----
> From: Josef Johansson <josef at oderland.se>
> Sent: Wednesday, 13 October 2021 16:19
> To: VELARTIS Philipp Dürhammer <p.duerhammer at velartis.at>; 'pve-devel at lists.proxmox.com' <pve-devel at lists.proxmox.com>
> Subject: Re: AW: [pve-devel] BUG in vlan aware bridge
>
> Hi,
>
> I can confirm that s > 12000 does not work on either setup:
>
> Setup A: tap (untagged, mtu 1500) -> vlan-aware bridge (mtu 9000) -> bond (mtu 9000)
> Setup B: tap (tagged, mtu 1500) -> vlan-aware bridge (mtu 9000) -> bond (mtu 9000)
>
> size         Setup A         Setup B
> s > 12000    doesn't work    doesn't work
> s > 8000     works           doesn't work
>
>
> The traffic (a single defragmented packet) is just dropped between the bridge and the tap. I tried my NOTRACK rules and they didn't have any effect.
>
>
> Either we have a bug in my Mellanox cards here or in the kernel. I don't think this is a normal case.
>
> Best regards
> Josef Johansson
>
> On 10/13/21 15:53, VELARTIS Philipp Dürhammer wrote:
>> And what happens if you use a packet size > 9000? This should still
>> work... (because it gets fragmented)
>>
>> -----Original Message-----
>> From: pve-devel <pve-devel-bounces at lists.proxmox.com> On Behalf Of
>> Josef Johansson
>> Sent: Wednesday, 13 October 2021 13:37
>> To: pve-devel at lists.proxmox.com
>> Subject: Re: [pve-devel] BUG in vlan aware bridge
>>
>> Hi,
>>
>> AFAIK it is netfilter that does the defragmenting, so that it can apply firewall rules.
>>
>> If you specify
>>
>> iptables -t raw -I PREROUTING -s 77.244.240.131 -j NOTRACK
>>
>> iptables -t raw -I PREROUTING -s 37.16.72.52 -j NOTRACK
>>
>> you should be able to make it ignore your packets.
>>
>>
>> As a data point, I could ping fine from an MTU 1500 host, over MTU 9000 vlan-aware bridges with firewalls, to another MTU 1500 host.
>>
>> As you would expect, the packet is defragmented over MTU 9000 links and fragmented again over MTU 1500 devices.
>>
>> Best regards
>> Josef Johansson
>>
>> On 10/13/21 11:22, VELARTIS Philipp Dürhammer wrote:
>>> Hi,
>>>
>>>
>>> Yes, I think it has nothing to do with the bonds but with the vlan-aware bridge interface.
>>>
>>> I see this with ping -s 1500
>>>
>>> On tap interface:
>>> 11:19:35.141414 62:47:e0:fe:f9:31 > 54:e0:32:27:6e:50, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 39999, offset 0, flags [+], proto ICMP (1), length 1500)
>>> 37.16.72.52 > 77.244.240.131: ICMP echo request, id 2182, seq 4,
>>> length 1480
>>> 11:19:35.141430 62:47:e0:fe:f9:31 > 54:e0:32:27:6e:50, ethertype IPv4 (0x0800), length 562: (tos 0x0, ttl 64, id 39999, offset 1480, flags [none], proto ICMP (1), length 548)
>>> 37.16.72.52 > 77.244.240.131: ip-proto-1
>>>
>>> On vmbr0:
>>> 11:19:35.141442 62:47:e0:fe:f9:31 > 54:e0:32:27:6e:50, ethertype 802.1Q (0x8100), length 2046: vlan 350, p 0, ethertype IPv4 (0x0800), (tos 0x0, ttl 64, id 39999, offset 0, flags [none], proto ICMP (1), length 2028)
>>> 37.16.72.52 > 77.244.240.131: ICMP echo request, id 2182, seq 4,
>>> length 2008
>>>
>>> On bond0 it's gone....
>>>
>>> But who is normally in charge of fragmenting the packets? The bridge itself? Netfilter?
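
For reference, the fragment sizes to expect can be worked out by hand. A minimal sketch, assuming a 20-byte IPv4 header without options and an 8-byte ICMP header; frag_sizes is a name made up for illustration.

```shell
#!/bin/sh
# frag_sizes ICMP_PAYLOAD MTU
# Prints the IP total length of each fragment, one per line.
# Assumptions: IPv4 header is 20 bytes (no options), ICMP header is 8 bytes.
frag_sizes() {
    payload=$1
    mtu=$2
    ip_hdr=20
    icmp_hdr=8
    remaining=$((payload + icmp_hdr))
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data=$(( (mtu - ip_hdr) / 8 * 8 ))
    while [ "$remaining" -gt "$max_data" ]; do
        echo $((max_data + ip_hdr))
        remaining=$((remaining - max_data))
    done
    echo $((remaining + ip_hdr))
}
```

Running frag_sizes 2000 1500 prints 1500 and 548, which are the two IP lengths seen on the tap interface in the capture above. Their data parts (1480 + 528) add up to the 2008-byte ICMP message seen reassembled on vmbr0.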
>>>
>>> -----Original Message-----
>>> From: pve-devel <pve-devel-bounces at lists.proxmox.com> On Behalf Of
>>> Stoyan Marinov
>>> Sent: Wednesday, 13 October 2021 00:46
>>> To: Proxmox VE development discussion <pve-devel at lists.proxmox.com>
>>> Subject: Re: [pve-devel] BUG in vlan aware bridge
>>>
>>> OK, I have just verified it has nothing to do with bonds. I get the same behavior with a vlan-aware bridge and bridge-nf-call-iptables=1, with a regular eth0 being part of the bridge. Packets arrive fragmented on the tap, are reassembled by netfilter, and are then re-injected into the bridge assembled (full size).
>>>
>>> I did have limited success by setting net.bridge.bridge-nf-filter-vlan-tagged to 1. Now packets seem to get fragmented on the way out and back in, but there are still issues:
>>>
>>> 1. I'm testing with ping -s 2000 (1500 mtu everywhere) to an external box. I do see reply packets arrive on the VM NIC, but ping doesn't see them. I haven't analyzed much further.
>>> 2. While watching with tcpdump (inside the VM) I notice "ip reassembly time exceeded" messages being generated from the VM.
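
When comparing runs like these, it helps to record the state of all bridge netfilter toggles before and after each change. A minimal sketch; bridge_nf_snapshot is a made-up name, and the directory argument exists only so the function can be tried on a scratch directory without root.

```shell
#!/bin/sh
# bridge_nf_snapshot [DIR]
# Prints name=value for every bridge-nf-* toggle found in DIR.
# DIR defaults to /proc/sys/net/bridge, which only exists while
# the bridge netfilter module is loaded.
bridge_nf_snapshot() {
    dir="${1:-/proc/sys/net/bridge}"
    for f in "$dir"/bridge-nf-*; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -r "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

Saving the output of two runs to files and running diff on them shows exactly which toggle differed between a working and a non-working test.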
>>>
>>> I'll try to investigate a bit further tomorrow.
>>>
>>>> On 12 Oct 2021, at 11:26 PM, Stoyan Marinov <stoyan at marinov.us> wrote:
>>>>
>>>> That's an interesting observation. Now that I think about it, it could be caused by bonding and not the underlying device. When I tested this (about a year ago) I was using bonding on the mlx adapters and not on the Intel ones.
>>>>
>>>>> On 12 Oct 2021, at 3:36 PM, VELARTIS Philipp Dürhammer <p.duerhammer at velartis.at> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> we use HP servers with Intel cards or the standard HP NIC (I think
>>>>> also Intel).
>>>>>
>>>>> Also, I see that I made a mistake:
>>>>>
>>>>> Setup working:
>>>>> tapX (UNtagged) <- -> vmbr0 <- - > bond0
>>>>>
>>>>> is correct (before, I had written tagged there as well).
>>>>>
>>>>> It should be:
>>>>>
>>>>> Setup not working:
>>>>> tapX (tagged) <- -> vmbr0 <- - > bond0
>>>>>
>>>>> Setup working:
>>>>> tapX (untagged) <- -> vmbr0 <- - > bond0
>>>>>
>>>>> Setup also working:
>>>>> tapX < - - > vmbr0v350 < -- > bond0.350 < -- > bond0
>>>>>
>>>>> -----Original Message-----
>>>>> From: pve-devel <pve-devel-bounces at lists.proxmox.com> On Behalf Of
>>>>> Stoyan Marinov
>>>>> Sent: Tuesday, 12 October 2021 13:16
>>>>> To: Proxmox VE development discussion <pve-devel at lists.proxmox.com>
>>>>> Subject: Re: [pve-devel] BUG in vlan aware bridge
>>>>>
>>>>> I'm having the very same issue with Mellanox ethernet adapters. I don't see this behavior with Intel NICs. What network cards do you have?
>>>>>
>>>>>> On 12 Oct 2021, at 1:48 PM, VELARTIS Philipp Dürhammer <p.duerhammer at velartis.at> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have been experimenting for days because we have strange packet losses.
>>>>>> Finally I can report the following (Linux 5.11.22-4-pve, Proxmox 7, all devices MTU 1500):
>>>>>>
>>>>>> Packets with sizes > 1500 without VLAN work well, but the moment they are tagged they are dropped by the bond device.
>>>>>> Netfilter (set to 1) always reassembles the packets when they arrive at a bridge. But they don't get fragmented again if they are VLAN tagged, so the bond device drops them. If the bridge is NOT VLAN aware they do get fragmented again and it works well.
>>>>>>
>>>>>> Setup not working:
>>>>>>
>>>>>> tapX (tagged) <- -> vmbr0 <- - > bond0
>>>>>>
>>>>>> Setup working:
>>>>>>
>>>>>> tapX (tagged) <- -> vmbr0 <- - > bond0
>>>>>>
>>>>>> Setup also working:
>>>>>>
>>>>>> tapX < - - > vmbr0v350 < -- > bond0.350 < -- > bond0
>>>>>>
>>>>>> Do you have any idea where to look? I don't understand who is
>>>>>> in charge of fragmenting packets again once they have been reassembled by
>>>>>> netfilter (and why it does not work with vlan-aware bridges).
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> pve-devel mailing list
>>>>>> pve-devel at lists.proxmox.com
>>>>>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>>>>>