[pve-devel] BUG in vlan aware bridge

Josef Johansson josef at oderland.se
Wed Oct 13 15:53:40 CEST 2021


Best regards
Josef Johansson

On 10/13/21 15:47, VELARTIS Philipp Dürhammer wrote:
>>> As a datapoint, I could ping fine from an MTU 1500 host, over MTU 9000 VLAN-aware bridges with firewalls, to another MTU 1500 host.
>>> As you would assume, the packet is reassembled over the MTU 9000 links and fragmented again over the MTU 1500 devices.
> So you did a ping with -s 2000 (or bigger), and the tap device of the VM you ping from is VLAN tagged?
Oh right, I have to test that properly. I have it in a lab and will get
back to you once I've tested it.
> -----Original Message-----
> From: pve-devel <pve-devel-bounces at lists.proxmox.com> On Behalf Of Josef Johansson
> Sent: Wednesday, 13 October 2021 13:37
> To: pve-devel at lists.proxmox.com
> Subject: Re: [pve-devel] BUG in vlan aware bridge
>
> Hi,
>
> AFAIK it's netfilter that does the defragmentation (reassembly), so that it can filter on complete packets.
>
> If you specify
>
> iptables -t raw -I PREROUTING -s 77.244.240.131 -j NOTRACK
>
> iptables -t raw -I PREROUTING -s 37.16.72.52 -j NOTRACK
>
> you should be able to make conntrack ignore those packets.
>
>
> As a datapoint, I could ping fine from an MTU 1500 host, over MTU 9000 VLAN-aware bridges with firewalls, to another MTU 1500 host.
>
> As you would assume, the packet is reassembled over the MTU 9000 links and fragmented again over the MTU 1500 devices.
>
> Best regards
> Josef Johansson
>
> On 10/13/21 11:22, VELARTIS Philipp Dürhammer wrote:
>> Hi,
>>
>>
>> Yes, I think it has nothing to do with the bonds but with the VLAN-aware bridge interface.
>>
>> I see this with ping -s 1500
>>
>> On tap interface: 
>> 11:19:35.141414 62:47:e0:fe:f9:31 > 54:e0:32:27:6e:50, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 39999, offset 0, flags [+], proto ICMP (1), length 1500)
>>     37.16.72.52 > 77.244.240.131: ICMP echo request, id 2182, seq 4, length 1480
>> 11:19:35.141430 62:47:e0:fe:f9:31 > 54:e0:32:27:6e:50, ethertype IPv4 (0x0800), length 562: (tos 0x0, ttl 64, id 39999, offset 1480, flags [none], proto ICMP (1), length 548)
>>     37.16.72.52 > 77.244.240.131: ip-proto-1
>>
>> On vmbr0:
>> 11:19:35.141442 62:47:e0:fe:f9:31 > 54:e0:32:27:6e:50, ethertype 802.1Q (0x8100), length 2046: vlan 350, p 0, ethertype IPv4 (0x0800), (tos 0x0, ttl 64, id 39999, offset 0, flags [none], proto ICMP (1), length 2028)
>>     37.16.72.52 > 77.244.240.131: ICMP echo request, id 2182, seq 4, length 2008
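For reference, the lengths in the captures above are consistent with plain IPv4 fragmentation arithmetic for a 2000-byte ICMP payload. A sketch, assuming standard 20-byte IP headers with no options:

```python
# Fragment a single IPv4 datagram for this path, the way the kernel would.
# Numbers taken from the captures above: an 8-byte ICMP header plus 2000
# data bytes gives 2008 bytes of IP payload; the path MTU is 1500.
MTU = 1500
IP_HDR = 20                               # IPv4 header without options
icmp_len = 8 + 2000                       # ICMP header + payload

# Every fragment except the last must carry a multiple of 8 payload bytes.
frag_payload = (MTU - IP_HDR) // 8 * 8    # 1480

frames = []                               # (fragment offset, on-wire IP total length)
offset, remaining = 0, icmp_len
while remaining > 0:
    chunk = min(frag_payload, remaining)
    frames.append((offset, IP_HDR + chunk))
    offset += chunk
    remaining -= chunk

print(frames)   # [(0, 1500), (1480, 548)]
```

That matches the two frames on the tap interface (lengths 1500 and 548, second fragment at offset 1480), while the reassembled datagram on vmbr0 is 20 + 2008 = 2028 bytes, too big to leave a 1500-MTU device without being fragmented again.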
>>
>> On bond0 it's gone...
>>
>> But who is in charge of fragmenting the packets again, normally? The bridge itself? Netfilter?
>>
>> -----Original Message-----
>> From: pve-devel <pve-devel-bounces at lists.proxmox.com> On Behalf Of Stoyan Marinov
>> Sent: Wednesday, 13 October 2021 00:46
>> To: Proxmox VE development discussion <pve-devel at lists.proxmox.com>
>> Subject: Re: [pve-devel] BUG in vlan aware bridge
>>
>> OK, I have just verified it has nothing to do with bonds. I get the same behavior with a VLAN-aware bridge, bridge-nf-call-iptables=1, and a regular eth0 being part of the bridge. Packets arrive fragmented on the tap, are reassembled by netfilter, and are then re-injected into the bridge assembled (full size).
>>
>> I did have limited success by setting net.bridge.bridge-nf-filter-vlan-tagged to 1. Now packets seem to get fragmented on the way out and back in, but there are still issues:
>>
>> 1. I'm testing with ping -s 2000 (1500 MTU everywhere) to an external box. I do see reply packets arrive on the VM NIC, but ping doesn't see them. Haven't analyzed much further.
>> 2. While watching with tcpdump (inside the VM) I notice "ip reassembly time exceeded" messages being generated by the VM.
>>
>> I'll try to investigate a bit further tomorrow.
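For anyone reproducing this, the two bridge-netfilter knobs discussed in this thread are sysctls provided by the br_netfilter module. A sysctl.conf-style sketch (values as used in this thread):

```
# br_netfilter sysctls discussed in this thread
# (the br_netfilter module must be loaded for these to exist)
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-filter-vlan-tagged = 1
```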
>>
>>> On 12 Oct 2021, at 11:26 PM, Stoyan Marinov <stoyan at marinov.us> wrote:
>>>
>>> That's an interesting observation. Now that I think about it, it could be caused by bonding and not the underlying device. When I tested this (about a year ago) I was using bonding on the Mellanox adapters and not on the Intel ones.
>>>
>>>> On 12 Oct 2021, at 3:36 PM, VELARTIS Philipp Dürhammer <p.duerhammer at velartis.at> wrote:
>>>>
>>>> Hi,
>>>>
>>>> we use HP servers with Intel cards or the standard HP NIC (I think
>>>> also Intel)
>>>>
>>>> Also, I see that I made a mistake:
>>>>
>>>> Setup working:
>>>> tapX (UNtagged) <--> vmbr0 <--> bond0
>>>>
>>>> is correct (before, I had it tagged as well).
>>>>
>>>> It should be:
>>>>
>>>> Setup not working:
>>>> tapX (tagged) <--> vmbr0 <--> bond0
>>>>
>>>> Setup working:
>>>> tapX (untagged) <--> vmbr0 <--> bond0
>>>>
>>>> Setup also working:
>>>> tapX <--> vmbr0v350 <--> bond0.350 <--> bond0
>>>>
>>>> -----Original Message-----
>>>> From: pve-devel <pve-devel-bounces at lists.proxmox.com> On Behalf Of Stoyan Marinov
>>>> Sent: Tuesday, 12 October 2021 13:16
>>>> To: Proxmox VE development discussion <pve-devel at lists.proxmox.com>
>>>> Subject: Re: [pve-devel] BUG in vlan aware bridge
>>>>
>>>> I'm having the very same issue with Mellanox Ethernet adapters. I don't see this behavior with Intel NICs. What network cards do you have?
>>>>
>>>>> On 12 Oct 2021, at 1:48 PM, VELARTIS Philipp Dürhammer <p.duerhammer at velartis.at> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have been experimenting for days because we have strange packet losses.
>>>>> Finally, I can report the following (Linux 5.11.22-4-pve, Proxmox 7, all devices MTU 1500):
>>>>>
>>>>> Packets with sizes > 1500 work fine without VLAN, but as soon as they are tagged they are dropped by the bond device.
>>>>> Netfilter (with bridge-nf-call-iptables set to 1) always reassembles the packets when they arrive at a bridge, but they don't get fragmented again if they are VLAN tagged, so the bond device drops them. If the bridge is NOT VLAN aware they do get fragmented again and it works fine.
>>>>>
>>>>> Setup not working:
>>>>>
>>>>> tapX (tagged) <--> vmbr0 <--> bond0
>>>>>
>>>>> Setup working:
>>>>>
>>>>> tapX (tagged) <--> vmbr0 <--> bond0
>>>>>
>>>>> Setup also working:
>>>>>
>>>>> tapX <--> vmbr0v350 <--> bond0.350 <--> bond0
>>>>>
>>>>> Have you got any idea where to search? I don't understand who is in
>>>>> charge of fragmenting packets again if they get reassembled by
>>>>> netfilter (and why it is not working with VLAN-aware bridges).
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> pve-devel mailing list
>>>>> pve-devel at lists.proxmox.com
>>>>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>>>>
