[pve-devel] [RFC qemu-server 1/1] partially fix #4501: migration: start vm: move port reservation and usage closer together
Wolfgang Bumiller
w.bumiller at proxmox.com
Wed Nov 15 11:12:45 CET 2023
On Wed, Nov 15, 2023 at 09:55:22AM +0100, Fabian Grünbichler wrote:
> On November 14, 2023 3:02 pm, Fiona Ebner wrote:
> > Currently, volume activation, PCI reservation and resetting systemd
> > scope happen in between and the 5 second expiretime used for port
> > reservation might not be enough.
> >
> > Still not ideal, because entering systemd scope and maybe starting
> > swtpm still happen after reservation before the QEMU binary can be
> > invoked and actually use the port, but the reservation needs to happen
> > outside of the fork, because the result is used there too.
> >
> > Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
>
> Acked-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
>
> we could move the whole statefile handling further down, but then some
> additional side-effects would need to be taken care of/refactored; this
> seems like a minimally invasive version for the uncommon (insecure) case.
>
> > ---
> > PVE/QemuServer.pm | 20 ++++++++++++++------
> > 1 file changed, 14 insertions(+), 6 deletions(-)
> >
> > diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> > index c465fb6f..aeaea8eb 100644
> > --- a/PVE/QemuServer.pm
> > +++ b/PVE/QemuServer.pm
> > @@ -5697,6 +5697,9 @@ sub vm_start_nolock {
> > return $migration_ip;
> > };
> >
> > + # helper to move port reservation and usage closer together to avoid expiry (bug #4501)
> > + my $append_tcp_migration_cmdline;
> > +
> > if ($statefile) {
> > if ($statefile eq 'tcp') {
> > my $migrate = $res->{migrate} = { proto => 'tcp' };
> > @@ -5717,12 +5720,13 @@ sub vm_start_nolock {
> > $migrate->{addr} = "[$migrate->{addr}]" if Net::IP::ip_is_ipv6($migrate->{addr});
> > }
> >
> > - my $pfamily = PVE::Tools::get_host_address_family($nodename);
> > - $migrate->{port} = PVE::Tools::next_migrate_port($pfamily);
> > - $migrate->{uri} = "tcp:$migrate->{addr}:$migrate->{port}";
> > - push @$cmd, '-incoming', $migrate->{uri};
> > - push @$cmd, '-S';
> > -
>
> nit: I'd maybe add another comment here, something like
>
> # delay migration port reservation to prevent expiry before binding

What about adding an option to `next_migrate_port()` to actually return
the open socket to keep the reservation?

Also, did we consider passing the file descriptor through to qemu via
`-incoming fd:$number`?
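
A rough, standalone sketch of the two ideas combined, only to illustrate
the shape. `reserve_migrate_socket()` is a hypothetical helper and not an
existing `PVE::Tools` function, and the address and port range are
illustrative values. Whether QEMU's `-incoming fd:` accepts a listening
socket rather than an already connected one is an assumption that would
need checking first.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);
use IO::Socket::IP;

# Hypothetical helper, not part of PVE::Tools: bind the first free port in
# the given range and return the still-open listening socket with the port.
sub reserve_migrate_socket {
    my ($addr, $first, $last) = @_;
    for my $port ($first .. $last) {
        my $sock = IO::Socket::IP->new(
            LocalHost => $addr,
            LocalPort => $port,
            Listen    => 1,
            ReuseAddr => 1,
        );
        return ($sock, $port) if $sock;
    }
    die "no free migration port in $first..$last\n";
}

# Address and port range are illustrative values only.
my ($sock, $port) = reserve_migrate_socket('127.0.0.1', 60000, 60050);

# Perl sets close-on-exec on descriptors above $^F (normally 2), so clear it,
# otherwise the socket would be closed when the QEMU binary is exec'ed.
my $flags = fcntl($sock, F_GETFD, 0) // die "F_GETFD failed: $!\n";
fcntl($sock, F_SETFD, $flags & ~FD_CLOEXEC) // die "F_SETFD failed: $!\n";

# Assumption: the descriptor number is handed to QEMU on the command line.
my $fd = fileno($sock);
my @incoming = ('-incoming', "fd:$fd", '-S');
print "reserved port $port, would append: @incoming\n";
```

As long as the caller holds `$sock`, the port stays bound, so there is no
expiry window between reservation and use.
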
>
> > + $append_tcp_migration_cmdline = sub {
> > + my $pfamily = PVE::Tools::get_host_address_family($nodename);
> > + $migrate->{port} = PVE::Tools::next_migrate_port($pfamily);
> > + $migrate->{uri} = "tcp:$migrate->{addr}:$migrate->{port}";
> > + push @$cmd, '-incoming', $migrate->{uri};
> > + push @$cmd, '-S';
> > + };
> > } elsif ($statefile eq 'unix') {
> > # should be default for secure migrations as a ssh TCP forward
> > # tunnel is not deterministic reliable ready and fails regurarly
> > @@ -5840,6 +5844,10 @@ sub vm_start_nolock {
> > $systemd_properties{timeout} = 10 if $statefile; # setting up the scope shoul be quick
> >
> > my $run_qemu = sub {
> > + # sets the port+uri for $res->{migrate} which is printed below and part of the result, so
> > + # needs to happen outside of the fork.
> > + $append_tcp_migration_cmdline->() if $append_tcp_migration_cmdline;
> > +
> > PVE::Tools::run_fork sub {
> > PVE::Systemd::enter_systemd_scope($vmid, "Proxmox VE VM $vmid", %systemd_properties);
> >
> > --
> > 2.39.2