[pve-devel] [PATCH access-control/cluster/docs/gui-tests/manager/network/proxmox{, -firewall, -ve-rs, -perl-rs, -widget-toolkit} v3 00/75] Add SDN Fabrics

Gabriel Goller g.goller at proxmox.com
Thu Jun 26 09:04:55 CEST 2025


Replying to this, just so that we keep a record on the mailing list.

On 12.06.2025 17:01, Hannes Duerr wrote:
>Tested as follows:
>Created 5 Proxmox VE nodes
>Joined them as a cluster
>Added two interfaces per node; all interfaces are on the same host bridge.
>Assigned the interfaces VLAN tags so that the nodes form a circle:
>   ----1----
>  /         \
> 5           2
>  \         /
>   4-------3
>
>== OSPF ==
>
>Created new OSPF fabric `backbone` with area 0.0.0.0 and ipv4 prefix 
>192.168.0.0/24
>Added all 5 nodes and assigned them the ipv4 addresses 192.168.0.[1-5] 
>(unnumbered)
>Checked routes with vtysh -c 'show ip ospf route' and pinged all ips 
>-> works as expected
>
>Added PtP /31 address to the interfaces (numbered) and reloaded the config
>Checked routes with vtysh -c 'show ip ospf route' and pinged all ips 
>-> works as expected
>Removed nodes 5 and 4
>
>Created additional OSPF fabric `ospf2` with area 1.1.1.1 and ipv4 
>prefix 192.168.1.0/24
>Added nodes 3,4 and 5
>Added PtP /31 address to the interfaces (numbered) and reloaded the config
>┌────────────────────┐    ┌────────────────────┐
>│  Area 0.0.0.0      │    │  Area 1.1.1.1      │
>│                    │    │                    │
>│  F1 <-> F2 <-> F3 <┼────┼> F3 <-> F4 <-> F5  │
>│                    │    │                    │
>└────────────────────┘    └────────────────────┘
>Checked routes with vtysh -c 'show ip route'
>Codes: K - kernel route, C - connected, L - local, S - static,
>       O - OSPF, * - FIB route
>[...]
>O   192.168.0.1/32 [110/10] via 0.0.0.0, dummy_backbone onlink, 
>rmapsrc 192.168.0.1, weight 1, 06:40:38
>O>* 192.168.0.2/32 [110/20] via 192.168.0.2, ens20 onlink, rmapsrc 
>192.168.0.1, weight 1, 06:40:23
>O>* 192.168.0.3/32 [110/30] via 192.168.0.2, ens20 onlink, rmapsrc 
>192.168.0.1, weight 1, 06:40:18
>O   192.168.1.3/32 [110/30] via 192.168.0.2, ens20 onlink, rmapsrc 
>192.168.0.1, weight 1, 06:40:18
>O   192.168.1.4/32 [110/40] via 192.168.0.2, ens20 onlink, rmapsrc 
>192.168.0.1, weight 1, 06:40:14
>O   192.168.1.5/32 [110/50] via 192.168.0.2, ens20 onlink, rmapsrc 
>192.168.0.1, weight 1, 06:40:08
>
>You can see that the OSPF routes are created automatically, but are
>not transferred to the FIB. Accordingly, they are not visible in the
>kernel routing table. The reason for this is the access restriction
>in /etc/frr/frr.conf:
>`access-list pve_ospf_backbone_ips permit 192.168.0.0/24`
>
>We already discussed this off-list and are keeping it like this for now.

This will probably be a future addition, something like "import-subnets"
or even "import-fabrics", where you can select other subnets/fabrics that
are allowed. We currently filter all routes in FRR, so that only routes
to the actual fabric IPs (from the dummy interface) are inserted. This
avoids inserting point-to-point IP addresses into the FIB.
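
To illustrate the effect of that filter on the routes quoted above, here
is a rough sketch in Python. It is a simplified model of prefix matching
and not how FRR implements access lists; the function name
`permitted_routes` is made up for this example.

```python
import ipaddress


def permitted_routes(routes, allowed_prefix):
    """Return the routes whose destination lies inside allowed_prefix.

    Simplified model of a single 'permit <prefix>' rule; everything
    else is implicitly denied.
    """
    allowed = ipaddress.ip_network(allowed_prefix)
    return [r for r in routes if ipaddress.ip_network(r).subnet_of(allowed)]


# Remote destinations taken from the 'show ip route' output quoted above.
ospf_routes = [
    "192.168.0.2/32",
    "192.168.0.3/32",
    "192.168.1.3/32",
    "192.168.1.4/32",
    "192.168.1.5/32",
]

# Prefix from 'access-list pve_ospf_backbone_ips permit 192.168.0.0/24'.
installed = permitted_routes(ospf_routes, "192.168.0.0/24")
print(installed)
```

Under this model only the two `backbone` addresses pass the filter, which
matches the routes marked `>*` in the output; the 192.168.1.x routes
learned from `ospf2` are dropped.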

>== Open Fabric ==
>
>Created new OpenFabric fabric `of1` with ipv6 prefix 
>2a02:ab8:308:3:eff:0:ff00:1/64
>Added all 5 nodes and assigned them the ipv6 
>addresses 2a02:ab8:308:3:eff:0:ff00:[1-5] (unnumbered)
>Checked routes with vtysh -c 'show openfabric route' and pinged all 
>ips -> works as expected
>
>Installed Ceph Cluster on all nodes and initialized 2 OSDs per node
>Took one node down and the routes switched as expected
>Brought the node back up -> the node was no longer pingable and the 
>routes did not come back,
>even after waiting 10 minutes
>
>Already talked to Gabriel about this but we're not yet sure what the 
>issue is here.

The issue here is two-fold:

* IPv6 forwarding was not enabled. We need to enable IPv6 forwarding
   globally because, unlike with IPv4, there is no per-interface switch.
   This is fixed in v4.
* When booting, there is a race between OpenFabric initializing the
   interface (circuit) and the underlying interface coming up. As a
   result, fabricd does not configure the circuit. That's also why an FRR
   restart after the initial boot fixes the issue. This is fixed by
   https://github.com/FRRouting/frr/pull/17083, which is included in
   version 10.3.1, the version shipped with Debian Trixie.
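
For anyone who wants to check the first point on a test node, a small
sketch that reads the global IPv6 forwarding switch (sysctl
net.ipv6.conf.all.forwarding) from procfs. The helper name
`ipv6_forwarding_enabled` is made up for this example, and this is not
what the patch series itself does.

```python
from pathlib import Path

# Global IPv6 forwarding switch (sysctl net.ipv6.conf.all.forwarding).
IPV6_FORWARDING = Path("/proc/sys/net/ipv6/conf/all/forwarding")


def ipv6_forwarding_enabled(path=IPV6_FORWARDING):
    """Return True if the sysctl file at `path` contains '1'."""
    return Path(path).read_text().strip() == "1"


# Only query the real sysctl if it exists (e.g. Linux with IPv6 enabled).
if IPV6_FORWARDING.exists():
    print("IPv6 forwarding enabled:", ipv6_forwarding_enabled())
```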


Thanks a lot for testing!



