[pve-devel] [PATCH qemu-server v2 7/13] fix #5180: libexec: add QEMU dbus-vmstate daemon for migrating conntrack
Stefan Hanreich
s.hanreich at proxmox.com
Tue Jun 3 11:34:13 CEST 2025
On 4/24/25 13:19, Christoph Heiss wrote:
> First part to fixing #5180 [0].
>
> Adds a simple D-Bus server which implements the `org.qemu.VMState1`
> interface as specified in the QEMU documentation [1].
>
> Using the built-in QEMU VMState machinery saves us from having to worry
> about transfer and convergence of the data and lets QEMU take care of
> it.
>
> Any object on the D-Bus path `/org/qemu/VMState1` implementing that
> interface will be called by QEMU during live-migration, iff the `Id`
> property is registered within the `dbus-vmstate` QEMU object for a
> specific VM.
>
> The actual state saving/restoring is done via the conntrack(8) tool, a
> small tool which already implements the hard parts of interacting with
> the conntrack subsystem via netlink.
>
> Filtering is done on CONNMARK, which is set to the specific VMID for all
> packets by the firewall.
>
> Additionally, a custom `com.proxmox.VMStateHelper` interface is
> implemented by the object, adding a small `Quit` method for cleanly
> shutting down the daemon via the D-Bus API.
>
> For all of this to work, D-Bus needs a policy describing who is allowed
> to access the interface. [2]
>
> Currently, there is a hard limit of 1 MiB of state enforced by QEMU.
> Typical conntrack state entries as dumped by conntrack(8) in the `save`
> output format are plain ASCII text lines, mostly around 150-200
> characters long. That translates to roughly 5200 entries (at 200
> characters each) that can be migrated.
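
As a quick sanity check of that estimate, a minimal sketch of the arithmetic. The helper name `max_entries` is hypothetical and not part of the patch, and the calculation ignores the ` --mark <vmid>` suffix and newline that `Save` appends to every line, so the real number would be slightly lower.

```perl
use strict;
use warnings;

# Size limit as defined in the patch (DBUS_VMSTATE_SIZE_LIMIT).
use constant DBUS_VMSTATE_SIZE_LIMIT => 1024 * 1024;

# Hypothetical helper: number of entries of a given line length that fit
# into the limit, ignoring the appended `--mark` suffix and newline.
sub max_entries {
    my ($line_length) = @_;
    return int(DBUS_VMSTATE_SIZE_LIMIT() / $line_length);
}

# Prints 6990 entries for 150 characters and 5242 for 200 characters.
printf "%d chars/entry -> %d entries\n", $_, max_entries($_) for (150, 200);
```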
>
> Such a typical line looks like:
>
> -A -t 431974 -u SEEN_REPLY,ASSURED -s 10.1.0.1 -d 10.1.1.20 \
> -r 10.1.1.20 -q 10.1.0.1 -p tcp --sport 48550 --dport 22 \
> --reply-port-src 22 --reply-port-dst 48550 --state ESTABLISHED
>
> In the future, compression could be implemented for these before sending
> them to QEMU, which should increase the above number quite a bit - since
> these entries are nicely compressible.
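
To get a feeling for how compressible such dumps are, a rough sketch using `Compress::Zlib` from core Perl. The synthetic entries (addresses, ports, entry count) are made up for illustration, and the patch itself does not compress anything, so this only shows one way the idea could look.

```perl
use strict;
use warnings;
use Compress::Zlib qw(compress uncompress);

# Build a synthetic dump in the `save` format from the commit message.
# All values are invented for illustration only.
sub synthetic_dump {
    my ($count) = @_;
    my $text = '';
    for my $i (1 .. $count) {
        my $sport = 40000 + $i;
        $text .= "-A -t 431974 -u SEEN_REPLY,ASSURED -s 10.1.0.1 -d 10.1.1.20 "
            . "-r 10.1.1.20 -q 10.1.0.1 -p tcp --sport $sport --dport 22 "
            . "--reply-port-src 22 --reply-port-dst $sport --state ESTABLISHED\n";
    }
    return $text;
}

my $text = synthetic_dump(1000);
my $compressed = compress($text);
die "compression failed\n" if !defined($compressed);

printf "raw: %d bytes, compressed: %d bytes\n", length($text), length($compressed);

# The target side would have to reverse this before feeding conntrack(8).
my $restored = uncompress($compressed);
die "round trip mismatch\n" if !defined($restored) || $restored ne $text;
```

Real dumps have more varied addresses and timeouts than this synthetic input, so the ratio printed here is likely optimistic.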
>
> [0] https://bugzilla.proxmox.com/show_bug.cgi?id=5180
> [1] https://www.qemu.org/docs/master/interop/dbus-vmstate.html
> [2] https://dbus.freedesktop.org/doc/dbus-daemon.1.html#configuration_file
>
> Signed-off-by: Christoph Heiss <c.heiss at proxmox.com>
> ---
> Changes v1 -> v2:
> * convert dbus-vmstate to instanced systemd service
> * fix plural for zero entries in migration log
>
> Makefile | 7 +-
> dbus-vmstate/Makefile | 7 ++
> dbus-vmstate/dbus-vmstate | 168 +++++++++++++++++++++++++
> dbus-vmstate/org.qemu.VMState1.conf | 11 ++
> dbus-vmstate/pve-dbus-vmstate@.service | 10 ++
> debian/control | 7 +-
> 6 files changed, 208 insertions(+), 2 deletions(-)
> create mode 100644 dbus-vmstate/Makefile
> create mode 100755 dbus-vmstate/dbus-vmstate
> create mode 100644 dbus-vmstate/org.qemu.VMState1.conf
> create mode 100644 dbus-vmstate/pve-dbus-vmstate@.service
>
> diff --git a/Makefile b/Makefile
> index ed67fe0a..2591c2d0 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -3,7 +3,7 @@ include /usr/share/dpkg/default.mk
> PACKAGE=qemu-server
> BUILDDIR ?= $(PACKAGE)-$(DEB_VERSION_UPSTREAM)
>
> -DESTDIR=
> +export DESTDIR=
> PREFIX=/usr
> SBINDIR=$(PREFIX)/sbin
> LIBDIR=$(PREFIX)/lib/$(PACKAGE)
> @@ -16,6 +16,10 @@ ZSHCOMPLDIR=$(PREFIX)/share/zsh/vendor-completions/
> export PERLDIR=$(PREFIX)/share/perl5
> PERLINCDIR=$(PERLDIR)/asm-x86_64
>
> +export LIBSYSTEMDDIR=$(PREFIX)/lib/systemd
> +export LIBEXECDIR=$(PREFIX)/libexec/$(PACKAGE)
> +export DBUSDIR=$(PREFIX)/share/dbus-1
> +
> GITVERSION:=$(shell git rev-parse HEAD)
>
> DEB=$(PACKAGE)_$(DEB_VERSION_UPSTREAM_REVISION)_$(DEB_BUILD_ARCH).deb
> @@ -68,6 +72,7 @@ install: $(PKGSOURCES)
> $(MAKE) -C query-machine-capabilities install
> $(MAKE) -C qemu-configs install
> $(MAKE) -C vm-network-scripts install
> + $(MAKE) -C dbus-vmstate install
> install -m 0755 qm $(DESTDIR)$(SBINDIR)
> install -m 0755 qmrestore $(DESTDIR)$(SBINDIR)
> install -D -m 0644 modules-load.conf $(DESTDIR)/etc/modules-load.d/qemu-server.conf
> diff --git a/dbus-vmstate/Makefile b/dbus-vmstate/Makefile
> new file mode 100644
> index 00000000..177bbbc1
> --- /dev/null
> +++ b/dbus-vmstate/Makefile
> @@ -0,0 +1,7 @@
> +all:
> +
> +.PHONY: install
> +install:
> + install -D -m 0755 dbus-vmstate $(DESTDIR)/$(LIBEXECDIR)/dbus-vmstate
> + install -D -m 0644 pve-dbus-vmstate@.service $(DESTDIR)/$(LIBSYSTEMDDIR)/system/pve-dbus-vmstate@.service
> + install -D -m 0644 org.qemu.VMState1.conf $(DESTDIR)/$(DBUSDIR)/system.d/org.qemu.VMState1.conf
> diff --git a/dbus-vmstate/dbus-vmstate b/dbus-vmstate/dbus-vmstate
> new file mode 100755
> index 00000000..04a1b53d
> --- /dev/null
> +++ b/dbus-vmstate/dbus-vmstate
> @@ -0,0 +1,168 @@
> +#!/usr/bin/perl
> +
> +# Exports a D-Bus object implementing
> +# https://www.qemu.org/docs/master/interop/dbus-vmstate.html
> +
> +package PVE::QemuServer::DBusVMState;
> +
> +use warnings;
> +use strict;
> +
> +use Carp;
> +use Net::DBus;
> +use Net::DBus::Exporter qw(org.qemu.VMState1);
> +use Net::DBus::Reactor;
> +use PVE::QemuServer::Helpers;
> +use PVE::QemuServer::QMPHelpers qw(qemu_objectadd qemu_objectdel);
> +use PVE::SafeSyslog;
> +use PVE::Tools;
> +
> +use base qw(Net::DBus::Object);
> +
> +use Class::MethodMaker [ scalar => [ qw(Id NumMigratedEntries) ]];
> +dbus_property('Id', 'string', 'read');
> +dbus_property('NumMigratedEntries', 'uint32', 'read', 'com.proxmox.VMStateHelper');
> +
> +sub new {
> + my ($class, $service, $vmid) = @_;
> +
> + my $self = $class->SUPER::new($service, '/org/qemu/VMState1');
> + $self->{vmid} = $vmid;
> + $self->Id("pve-vmstate-$vmid");
> + $self->NumMigratedEntries(0);
> +
> + bless $self, $class;
> + return $self;
> +}
> +
> +sub Load {
> + my ($self, $bytes) = @_;
> +
> + my $len = scalar(@$bytes);
> + return if $len <= 1; # see also the `Save` method
> +
> + my $text = pack('c*', @$bytes);
> +
> + eval {
> + PVE::Tools::run_command(
> + ['conntrack', '--load-file', '-'],
> + input => $text,
> + );
> + };
> + if (my $err = $@) {
nit: could just use $@ directly here? some additional occurrences below
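
A minimal sketch of what is meant, with a hypothetical `risky()` standing in for the failing `run_command` call. Both forms yield the same message as long as nothing resets `$@` between the `eval` and the check.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a failing PVE::Tools::run_command call.
sub risky { die "conntrack exited with code 1\n" }

# Form used in the patch: copy $@ into a lexical first.
sub with_copy {
    eval { risky() };
    if (my $err = $@) {
        return "failed to restore conntrack state: $err";
    }
    return 'ok';
}

# Suggested form: check and interpolate $@ directly.
sub direct {
    eval { risky() };
    if ($@) {
        return "failed to restore conntrack state: $@";
    }
    return 'ok';
}

print with_copy() eq direct() ? "identical\n" : "different\n";
```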
> + syslog('warn', "failed to restore conntrack state: $err\n");
> + } else {
> + syslog('info', "restored $len bytes of conntrack state\n");
> + }
> +}
> +dbus_method('Load', [['array', 'byte']], []);
> +
> +use constant {
> + # From the documentation
> + # (https://www.qemu.org/docs/master/interop/dbus-vmstate.html):
> + # > For now, the data amount to be transferred is arbitrarily limited to 1Mb.
> + #
> + # See also qemu/backends/dbus-vmstate.c:DBUS_VMSTATE_SIZE_LIMIT
> + DBUS_VMSTATE_SIZE_LIMIT => 1024 * 1024,
> +};
> +
> +sub Save {
> + my ($self) = @_;
> +
> + my $text = '';
> + my $truncated = 0;
> + my $num_entries = 0;
> + eval {
> + PVE::Tools::run_command(
> + ['conntrack', '--dump', '--mark', $self->{vmid}, '--output', 'save'],
> + outfunc => sub {
> + my ($line) = @_;
> + return if $truncated;
> +
> + if ((length($text) + length($line)) > DBUS_VMSTATE_SIZE_LIMIT) {
> + syslog('warn', 'conntrack state too large, ignoring further entries');
> + $truncated = 1;
> + return;
> + }
> +
> + # conntrack(8) apparently does not preserve the `--mark` option,
> + # so just add it back ourselves
> + $text .= "$line --mark $self->{vmid}\n";
> + },
> + errfunc => sub {
> + my ($line) = @_;
> +
> + if ($line =~ /(\d+) flow entries/) {
> + syslog('info', "received $1 conntrack entries");
> + # conntrack reports the number of displayed entries on stderr,
> + # which shouldn't be considered an error.
> + $self->NumMigratedEntries($1);
> + return;
> + }
> + syslog('err', $line);
> + }
> + );
> + };
> + if (my $err = $@) {
same here
> + syslog('warn', "failed to save conntrack state: $err\n");
> +
> + # Apparently Net::DBus does not handle zero-sized (byte) arrays
> + # correctly - returning [] yields QEMU failing with
> + #
> + # "kvm: dbus_save_state_proxy: Failed to Save: not a byte array"
> + #
> + # Thus, just return an array with a single element and detect that
> + # appropriately in the `Load`. A valid conntrack state can *never* be
> + # just a single byte, so it is safe to rely on that.
> + return [0];
> + }
> +
> + my @bytes = unpack('c*', $text);
> + my $len = scalar(@bytes);
> +
> + syslog('info', "transferring $len bytes of conntrack state\n");
> +
> + # Same as above w.r.t. returning as single-element array.
> + return $len == 0 ? [0] : \@bytes;
> +}
> +dbus_method('Save', [], [['array', 'byte']]);
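
For reference, the single-element sentinel convention shared by `Save` and `Load` can be sketched in isolation. The helper names `encode_state` and `decode_state` are hypothetical and only mirror the `pack`/`unpack` calls and the length check from the patch, without any D-Bus involvement.

```perl
use strict;
use warnings;

# Encode text as a byte array; an empty text becomes the single-element
# sentinel [0], mirroring what `Save` returns.
sub encode_state {
    my ($text) = @_;
    my @bytes = unpack('c*', $text);
    return scalar(@bytes) == 0 ? [0] : \@bytes;
}

# Decode a byte array; anything with at most one element is treated as
# "no state", mirroring the early return in `Load`.
sub decode_state {
    my ($bytes) = @_;
    return if scalar(@$bytes) <= 1;
    return pack('c*', @$bytes);
}

my $line = "-A -p tcp --sport 48550 --dport 22 --mark 100\n";
my $roundtrip = decode_state(encode_state($line));
print "round trip ok\n" if defined($roundtrip) && $roundtrip eq $line;
print "empty state detected\n" if !defined(decode_state(encode_state('')));
```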
> +
> +# Additional method for cleanly shutting down the service.
> +sub Quit {
> + my ($self) = @_;
> +
> + syslog('info', "shutting down gracefully ..\n");
> +
> + # On the source side, the VM won't exist anymore, so no need to remove
> + # anything.
> + if (PVE::QemuServer::Helpers::vm_running_locally($self->{vmid})) {
> + eval { qemu_objectdel($self->{vmid}, 'pve-vmstate') };
> + if (my $err = $@) {
same here
> + syslog('warn', "failed to remove object: $err\n");
> + }
> + }
> +
> + Net::DBus::Reactor->main()->shutdown();
> +}
> +dbus_method('Quit', [], [], 'com.proxmox.VMStateHelper', { no_return => 1 });
> +
> +my $vmid = shift;
> +
> +my $dbus = Net::DBus->system();
> +my $service = $dbus->export_service('org.qemu.VMState1');
> +my $obj = PVE::QemuServer::DBusVMState->new($service, $vmid);
> +
> +$SIG{TERM} = sub {
> + $obj->Quit();
> +};
> +
> +my $addr = $dbus->get_unique_name();
> +syslog('info', "pve-vmstate-$vmid listening on $addr\n");
> +
> +# Inform QEMU about our running dbus-vmstate helper
> +qemu_objectadd($vmid, 'pve-vmstate', 'dbus-vmstate',
> + addr => 'unix:path=/run/dbus/system_bus_socket',
> + 'id-list' => "pve-vmstate-$vmid",
> +);
> +
> +Net::DBus::Reactor->main()->run();
> diff --git a/dbus-vmstate/org.qemu.VMState1.conf b/dbus-vmstate/org.qemu.VMState1.conf
> new file mode 100644
> index 00000000..cfedcae4
> --- /dev/null
> +++ b/dbus-vmstate/org.qemu.VMState1.conf
> @@ -0,0 +1,11 @@
> +<?xml version="1.0"?>
> +<!DOCTYPE busconfig PUBLIC "-//freedesktop//DTD D-BUS Bus Configuration 1.0//EN"
> + "http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd">
> +<busconfig>
> + <policy user="root">
> + <allow own="org.qemu.VMState1" />
> + <allow send_destination="org.qemu.VMState1" />
> + <allow receive_sender="org.qemu.VMState1" />
> + <allow send_destination="com.proxmox.VMStateHelper" />
> + </policy>
> +</busconfig>
> diff --git a/dbus-vmstate/pve-dbus-vmstate@.service b/dbus-vmstate/pve-dbus-vmstate@.service
> new file mode 100644
> index 00000000..56b4e285
> --- /dev/null
> +++ b/dbus-vmstate/pve-dbus-vmstate@.service
> @@ -0,0 +1,10 @@
> +[Unit]
> +Description=PVE DBus VMState Helper (VM %i)
> +Requires=dbus.socket
> +After=dbus.socket
> +PartOf=%i.scope
> +
> +[Service]
> +Slice=qemu.slice
> +Type=simple
> +ExecStart=/usr/libexec/qemu-server/dbus-vmstate %i
> diff --git a/debian/control b/debian/control
> index d6c20040..ee1ca177 100644
> --- a/debian/control
> +++ b/debian/control
> @@ -3,9 +3,11 @@ Section: admin
> Priority: optional
> Maintainer: Proxmox Support Team <support at proxmox.com>
> Build-Depends: debhelper-compat (= 13),
> + libclass-methodmaker-perl,
> libglib2.0-dev,
> libio-multiplex-perl,
> libjson-c-dev,
> + libnet-dbus-perl,
> libpve-apiclient-perl,
> libpve-cluster-perl,
> libpve-common-perl (>= 8.0.2),
> @@ -28,11 +30,14 @@ Homepage: https://www.proxmox.com
>
> Package: qemu-server
> Architecture: any
> -Depends: dbus,
> +Depends: conntrack,
> + dbus,
> genisoimage,
> + libclass-methodmaker-perl,
> libio-multiplex-perl,
> libjson-perl,
> libjson-xs-perl,
> + libnet-dbus-perl,
> libnet-ssleay-perl,
> libpve-access-control (>= 8.0.0~),
> libpve-apiclient-perl,