[pve-devel] [PATCH v2 qemu-server 2/2] remote-migration: add target-cpu param

DERUMIER, Alexandre alexandre.derumier at groupe-cyllene.com
Thu Apr 27 07:50:10 CEST 2023


Hi,

On Wednesday, 26 April 2023 at 15:14 +0200, Fabian Grünbichler wrote:
> On April 25, 2023 6:52 pm, Alexandre Derumier wrote:
> > This patch adds support for remote migration when the target
> > cpu model is different.
> > 
> > The target vm is restarted after the migration
> 
> so this effectively introduces a new "hybrid" migration mode ;) the
> changes are a bit smaller than I expected (in part thanks to patch
> #1),
> which is good.
> 
> there are semi-frequent requests for another variant (also applicable
> to
> containers) in the form of a two phase migration
> - storage migrate
> - stop guest
> - incremental storage migrate
> - start guest on target
> 

But I'm not sure how to do an incremental storage migration without
storage-level snapshot send|receive (so zfs and rbd could work).

- VM/CT is running
- take a first snapshot and sync it to the target with zfs|rbd send|receive
- stop the guest
- take a second snapshot and do an incremental sync to the target with
zfs|rbd send|receive
- start the guest on the remote
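
For the zfs case, the steps above could be sketched roughly like this.
It is only an illustration: two_phase_zfs_plan, the snapshot names and
the $transport argument (plain ssh here, not the migration tunnel) are
assumptions, and nothing is executed, the commands are only listed.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the ordered steps of a two-phase ZFS sync.
# $transport is whatever carries the stream to the target; using ssh is an
# assumption, whether the migration tunnel can carry it is untested.
sub two_phase_zfs_plan {
    my ($src, $dst, $transport) = @_;
    my ($base, $final) = ("$src\@migrate-base", "$src\@migrate-final");
    return [
        "zfs snapshot $base",                                     # guest still running
        "zfs send $base | $transport zfs receive -F $dst",        # full first sync
        "# stop the guest here",
        "zfs snapshot $final",                                    # consistent final state
        "zfs send -i $base $final | $transport zfs receive $dst", # incremental sync
        "# start the guest on the target here",
    ];
}

# Print the plan for an example dataset (names are placeholders).
print "$_\n"
    for @{ two_phase_zfs_plan('rpool/data/vm-100-disk-0', 'tank/vm-100-disk-0', 'ssh root@target') };
```

Only the second send happens during downtime, so the downtime depends on
how much was written between the two snapshots.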


(Or maybe for VMs, without a snapshot, using a dirty bitmap? But we need
to be able to write the dirty bitmap content to disk somewhere after the
VM stops, and reread it for the last increment.)

- VM is running
- create a dirty bitmap and start the sync with qemu-block-storage
- stop the VM and save the dirty bitmap
- reread the dirty bitmap and do an incremental sync (with the new
qemu-storage-daemon, or by starting the VM paused?)
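
One possible direction for the "save the dirty bitmap" step, as an
untested assumption: QEMU can create a bitmap with persistent=true, which
is then stored in the image when it is closed. This only works for qcow2
images, so it would not cover raw, LVM or rbd volumes. The sketch below
only builds the QMP command; bitmap_add_cmd, the node name and the bitmap
name are hypothetical.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: builds the QMP command that creates a persistent
# dirty bitmap. Node and bitmap names are placeholders.
sub bitmap_add_cmd {
    my ($node, $name) = @_;
    return {
        execute => 'block-dirty-bitmap-add',
        arguments => {
            node => $node,
            name => $name,
            persistent => JSON::PP::true, # kept in the qcow2 image on close
        },
    };
}

# Show the JSON that would be sent over QMP (nothing is sent here).
print JSON::PP->new->canonical->encode(bitmap_add_cmd('drive-scsi0', 'migrate-bitmap')), "\n";
```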


And currently we don't support offline storage migration yet. (BTW,
this also breaks migration with unused disks.)
I don't know whether we can send a send|receive transfer through the
tunnel? (I never tested it.)


> given that it might make sense to safeguard this implementation
> here,
> and maybe switch to a new "mode" parameter?
> 
> online => switching CPU not allowed
> offline or however-we-call-this-new-mode (or in the future, two-
> phase-restart) => switching CPU allowed
> 

Yes, I was thinking about that too.
Maybe not "offline", because we may want to implement a real offline
mode later.
But simply "restart"?
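
As a rough sketch of what such a parameter could look like (nothing here
is in the patch: the 'mode' name, its values and check_migration_mode are
only a proposal), target-cpu would be rejected unless the mode is
"restart":

```perl
use strict;
use warnings;

# Hypothetical schema fragment; 'mode' and its values are only a proposal.
my $mode_schema = {
    type => 'string',
    enum => ['online', 'restart'],
    default => 'online',
    optional => 1,
    description => "Migration mode. 'restart' live-migrates the storage,"
        . " skips the memory migration and restarts the VM on the target.",
};

# Hypothetical check: switching the CPU model is only allowed in 'restart' mode.
sub check_migration_mode {
    my ($param) = @_;
    my $mode = $param->{mode} // $mode_schema->{default};
    die "unknown migration mode '$mode'\n"
        if !grep { $_ eq $mode } @{ $mode_schema->{enum} };
    die "target-cpu requires mode 'restart'\n"
        if defined($param->{'target-cpu'}) && $mode ne 'restart';
    return $mode;
}
```

A later two-phase mode could then be added as another enum value without
changing the meaning of the existing ones.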



> > 
> > Signed-off-by: Alexandre Derumier <aderumier at odiso.com>
> > ---
> >  PVE/API2/Qemu.pm   | 18 ++++++++++++++++++
> >  PVE/CLI/qm.pm      |  6 ++++++
> >  PVE/QemuMigrate.pm | 25 +++++++++++++++++++++++++
> >  3 files changed, 49 insertions(+)
> > 
> > diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> > index 587bb22..6703c87 100644
> > --- a/PVE/API2/Qemu.pm
> > +++ b/PVE/API2/Qemu.pm
> > @@ -4460,6 +4460,12 @@ __PACKAGE__->register_method({
> >                 optional => 1,
> >                 default => 0,
> >             },
> > +           'target-cpu' => {
> > +               optional => 1,
> > +               description => "Target Emulated CPU model. For
> > online migration, the storage is live migrate, but the memory
> > migration is skipped and the target vm is restarted.",
> > +               type => 'string',
> > +               format => 'pve-vm-cpu-conf',
> > +           },
> >             'target-storage' => get_standard_option('pve-
> > targetstorage', {
> >                 completion =>
> > \&PVE::QemuServer::complete_migration_storage,
> >                 optional => 0,
> > @@ -4557,11 +4563,14 @@ __PACKAGE__->register_method({
> >         raise_param_exc({ 'target-bridge' => "failed to parse
> > bridge map: $@" })
> >             if $@;
> >  
> > +       my $target_cpu = extract_param($param, 'target-cpu');
> 
> this is okay
> 
> > +
> >         die "remote migration requires explicit storage mapping!\n"
> >             if $storagemap->{identity};
> >  
> >         $param->{storagemap} = $storagemap;
> >         $param->{bridgemap} = $bridgemap;
> > +       $param->{targetcpu} = $target_cpu;
> 
> but this is a bit confusing with the variable/hash key naming ;)
> 
> >         $param->{remote} = {
> >             conn => $conn_args, # re-use fingerprint for tunnel
> >             client => $api_client,
> > @@ -5604,6 +5613,15 @@ __PACKAGE__->register_method({
> >                     PVE::QemuServer::nbd_stop($state->{vmid});
> >                     return;
> >                 },
> > +               'restart' => sub {
> > +                   PVE::QemuServer::vm_stop(undef, $state->{vmid},
> > 1, 1);
> > +                   my $info = PVE::QemuServer::vm_start_nolock(
> > +                       $state->{storecfg},
> > +                       $state->{vmid},
> > +                       $state->{conf},
> > +                   );
> > +                   return;
> > +               },
> >                 'resume' => sub {
> >                     if
> > (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) {
> >                         PVE::QemuServer::vm_resume($state->{vmid},
> > 1, 1);
> > diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
> > index c3c2982..06c74c1 100755
> > --- a/PVE/CLI/qm.pm
> > +++ b/PVE/CLI/qm.pm
> > @@ -189,6 +189,12 @@ __PACKAGE__->register_method({
> >                 optional => 1,
> >                 default => 0,
> >             },
> > +           'target-cpu' => {
> > +               optional => 1,
> > +               description => "Target Emulated CPU model. For
> > online migration, the storage is live migrate, but the memory
> > migration is skipped and the target vm is restarted.",
> > +               type => 'string',
> > +               format => 'pve-vm-cpu-conf',
> > +           },
> >             'target-storage' => get_standard_option('pve-
> > targetstorage', {
> >                 completion =>
> > \&PVE::QemuServer::complete_migration_storage,
> >                 optional => 0,
> > diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> > index e182415..04f8053 100644
> > --- a/PVE/QemuMigrate.pm
> > +++ b/PVE/QemuMigrate.pm
> > @@ -731,6 +731,11 @@ sub cleanup_bitmaps {
> >  sub live_migration {
> >      my ($self, $vmid, $migrate_uri, $spice_port) = @_;
> >  
> > +    if($self->{opts}->{targetcpu}){
> > +        $self->log('info', "target cpu is different - skip live
> > migration.");
> > +        return;
> > +    }
> > +
> >      my $conf = $self->{vmconf};
> >  
> >      $self->log('info', "starting online/live migration on
> > $migrate_uri");
> > @@ -995,6 +1000,7 @@ sub phase1_remote {
> >      my $remote_conf = PVE::QemuConfig->load_config($vmid);
> >      PVE::QemuConfig->update_volume_ids($remote_conf, $self-
> > >{volume_map});
> >  
> > +    $remote_conf->{cpu} = $self->{opts}->{targetcpu};
> 
> do we need permission checks here (or better, somewhere early on, for
> doing this here)
> 
> >      my $bridges = map_bridges($remote_conf, $self->{opts}-
> > >{bridgemap});
> >      for my $target (keys $bridges->%*) {
> >         for my $nic (keys $bridges->{$target}->%*) {
> > @@ -1354,6 +1360,21 @@ sub phase2 {
> >      live_migration($self, $vmid, $migrate_uri, $spice_port);
> >  
> >      if ($self->{storage_migration}) {
> > +
> > +        #freeze source vm io/s if target cpu is different (no
> > livemigration)
> > +       if ($self->{opts}->{targetcpu}) {
> > +           my $agent_running = $self->{conf}->{agent} &&
> > PVE::QemuServer::qga_check_running($vmid);
> > +           if ($agent_running) {
> > +               print "freeze filesystem\n";
> > +               eval { mon_cmd($vmid, "guest-fsfreeze-freeze"); };
> > +               die $@ if $@;
> 
> die here
> 
> > +           } else {
> > +               print "suspend vm\n";
> > +               eval { PVE::QemuServer::vm_suspend($vmid, 1); };
> > +               warn $@ if $@;
> 
> but warn here?
> 
> I'd like some more rationale for these two variants, what are the
> pros
> and cons? should we make it configurable?
> 
> > +           }
> > +       }
> > +
> >         # finish block-job with block-job-cancel, to disconnect
> > source VM from NBD
> >         # to avoid it trying to re-establish it. We are in blockjob
> > ready state,
> >         # thus, this command changes to it to blockjob complete
> > (see qapi docs)
> > @@ -1608,6 +1629,10 @@ sub phase3_cleanup {
> >      # clear migrate lock
> >      if ($tunnel && $tunnel->{version} >= 2) {
> >         PVE::Tunnel::write_tunnel($tunnel, 10, "unlock");
> > +       if ($self->{opts}->{targetcpu}) {
> > +           $self->log('info', "target cpu is different - restart
> > target vm.");
> > +           PVE::Tunnel::write_tunnel($tunnel, 10, 'restart');
> > +       }
> >  
> >         PVE::Tunnel::finish_tunnel($tunnel);
> >      } else {
> > -- 
> > 2.30.2
> > 
> > 
> > _______________________________________________
> > pve-devel mailing list
> > pve-devel at lists.proxmox.com
> > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


