[pve-devel] [PATCH ve-rs/firewall/qemu-server/manager v2 00/13] fix #5180: migrate conntrack state on live migration
Stefan Hanreich
s.hanreich at proxmox.com
Fri May 30 13:34:02 CEST 2025
Had some off-list discussion with Christoph today and wanted to jot it
down here (test + review of the code will follow):

We currently do not migrate CT entries related to NATed connections, so
if you run the reproducer inside a VM that uses SNAT, the connection
will still drop on migration (this also happens with the firewall
disabled, so no regression there). We should be able to mark those
entries as well in nftables by matching on iifname in the forward chain
of the host - with iptables it might be a bit trickier, haven't checked
yet.

When migrating, we'd need to rewrite the CT entries to the proper IP
that is used for SNAT, which should be doable with SDN Simple Zones.
Otherwise I don't think there's a sane way to do CT migration for SNAT.

Certainly not a blocker, but we should document this and possibly
follow up with a solution for SDN Simple Zones utilizing SNAT.

Christoph will look into using labels rather than fwmarks, since we
might want to use fwmarks later on for things other than CT migration
(NAT in EVPN zones with overlapping IPs comes to mind - Wireguard can
use fwmarks for routing as well). If we can get away with labels, then
that would imo be preferable.
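
A rough, untested-in-production sketch of what the "rewrite the CT
entries" step could look like, purely for illustration. It assumes
entries are available as text lines in the usual `conntrack -L` layout,
that the VMID is stored as the connmark, and that the SNAT address shows
up as the destination of the reply tuple. All function names and the
sample addresses are made up and are not part of the series.

```python
import re

# Hypothetical helpers, not taken from the patch series.
TUPLE_RE = re.compile(r"\b(src|dst)=(\S+)")
MARK_RE = re.compile(r"\bmark=(\d+)\b")


def entry_mark(line):
    """Return the connmark of a conntrack text line, or None if absent."""
    m = MARK_RE.search(line)
    return int(m.group(1)) if m else None


def rewrite_reply_dst(line, new_ip):
    """Replace the reply tuple's destination (the second dst= field)."""
    seen = 0

    def repl(match):
        nonlocal seen
        if match.group(1) == "dst":
            seen += 1
            if seen == 2:
                return f"dst={new_ip}"
        return match.group(0)

    return TUPLE_RE.sub(repl, line)


def entries_for_migration(lines, vmid, new_snat_ip):
    """Keep entries marked with vmid and point their reply tuple at new_snat_ip."""
    return [
        rewrite_reply_dst(line, new_snat_ip)
        for line in lines
        if entry_mark(line) == vmid
    ]


# Illustrative entry: VM 100 (10.0.0.5) talking to 203.0.113.7 via SNAT 192.0.2.10.
SAMPLE = (
    "tcp 6 431999 ESTABLISHED src=10.0.0.5 dst=203.0.113.7 sport=40000 "
    "dport=443 src=203.0.113.7 dst=192.0.2.10 sport=443 dport=40000 "
    "[ASSURED] mark=100 use=1"
)
```

Calling `entries_for_migration([SAMPLE], 100, "192.0.2.20")` would keep
the entry and swap only the reply-side `dst=`; where the new SNAT
address comes from (e.g. the Simple Zone config) is left open here.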
On 4/24/25 13:19, Christoph Heiss wrote:
> Fixes #5180 [0].
>
> This implements migration of per-VM conntrack state on live-migration.
>
> The core of the implementation is in patches #7 & #8. See there for more
> details.
>
> Patches #1 - #3 implement CONNMARK'ing any VM traffic with their unique
> VMID. This is needed later on to filter conntrack entries for the
> migration. These three patches can be applied independently, since
> CONNMARK'ing traffic does not have any visible impact.
>
> Currently, remote/inter-cluster migration is not supported, which is
> indicated to the user with a warning. See also patch #8 for a bit more
> in-depth explanation.
>
> Needed dependency bumps between packages are indicated in the notes
> appropriately.
>
> [0] https://bugzilla.proxmox.com/show_bug.cgi?id=5180
>
> Testing
> =======
>
> I've primarily tested intra-cluster live-migrations, with both the
> iptables-based and nftables-based firewall, using the reproducer as
> described in #5180. I further verified that the D-Bus servers get
> started as expected and are _always_ stopped, even in the case of some
> migration error.
>
> Finally, I also checked using `conntrack -L -m <vmid>` that the
> conntrack entries are
> a) added/updated on the target node and
> b) removed from the source node afterwards
>
> Also tested was the migration from/to an "old" (unpatched) node, which
> results in the issue as per #5180 & appropriate warnings in the UI.
>
> For remote migrations, only tested that the warning is logged as
> expected.
>
> History
> =======
>
> v1: https://lore.proxmox.com/pve-devel/20250317141152.1247324-1-c.heiss@proxmox.com/
>
> Changes v1 -> v2:
> * rebased as necessary
> * "un-rfc'd" firewall conntrack flushing patches
> * use an instanced systemd service instead of fork+exec for the
> pve-dbus-vmstate helper
>
> Diffstat
> ========
>
> pve-firewall:
>
> Christoph Heiss (2):
> firewall: add connmark rule with VMID to all guest chains
> firewall: helpers: add sub for flushing conntrack entries by mark
>
> debian/control | 3 ++-
> src/PVE/Firewall.pm | 7 +++++--
> src/PVE/Firewall/Helpers.pm | 11 +++++++++++
> 3 files changed, 18 insertions(+), 3 deletions(-)
>
> proxmox-firewall:
>
> Christoph Heiss (1):
> firewall: add connmark rule with VMID to all guest chains
>
> proxmox-firewall/src/firewall.rs | 14 +++-
> .../integration_tests__firewall.snap | 84 +++++++++++++++++++
> proxmox-nftables/src/expression.rs | 9 ++
> proxmox-nftables/src/statement.rs | 10 ++-
> 4 files changed, 114 insertions(+), 3 deletions(-)
>
> proxmox-ve-rs:
>
> Christoph Heiss (1):
> config: guest: allow access to raw Vmid value
>
> proxmox-ve-config/src/guest/types.rs | 4 ++++
> 1 file changed, 4 insertions(+)
>
> qemu-server:
>
> Christoph Heiss (5):
> qmp helpers: allow passing structured args via qemu_objectadd()
> api2: qemu: add module exposing node migration capabilities
> fix #5180: libexec: add QEMU dbus-vmstate daemon for migrating
> conntrack
> fix #5180: migrate: integrate helper for live-migrating conntrack info
> migrate: flush old VM conntrack entries after successful migration
>
> Makefile | 7 +-
> PVE/API2/Qemu.pm | 72 +++++++++++
> PVE/API2/Qemu/Makefile | 2 +-
> PVE/API2/Qemu/Migration.pm | 46 +++++++
> PVE/CLI/qm.pm | 5 +
> PVE/QemuMigrate.pm | 69 ++++++++++
> PVE/QemuServer.pm | 6 +
> PVE/QemuServer/DBusVMState.pm | 120 ++++++++++++++++++
> PVE/QemuServer/Makefile | 1 +
> PVE/QemuServer/QMPHelpers.pm | 4 +-
> dbus-vmstate/Makefile | 7 ++
> dbus-vmstate/dbus-vmstate | 168 +++++++++++++++++++++++++
> dbus-vmstate/org.qemu.VMState1.conf | 11 ++
> dbus-vmstate/pve-dbus-vmstate@.service | 10 ++
> debian/control | 7 +-
> 15 files changed, 530 insertions(+), 5 deletions(-)
> create mode 100644 PVE/API2/Qemu/Migration.pm
> create mode 100644 PVE/QemuServer/DBusVMState.pm
> create mode 100644 dbus-vmstate/Makefile
> create mode 100755 dbus-vmstate/dbus-vmstate
> create mode 100644 dbus-vmstate/org.qemu.VMState1.conf
> create mode 100644 dbus-vmstate/pve-dbus-vmstate@.service
>
> pve-manager:
>
> Christoph Heiss (4):
> api2: capabilities: explicitly import CPU capabilities module
> api2: capabilities: proxy index endpoints to respective nodes
> api2: capabilities: expose new qemu/migration endpoint
> ui: window: Migrate: add checkbox for migrating VM conntrack state
>
> PVE/API2/Capabilities.pm | 9 +++++
> www/manager6/window/Migrate.js | 73 ++++++++++++++++++++++++++++++++--
> 2 files changed, 78 insertions(+), 4 deletions(-)
>