[pve-devel] [PATCH v2 qemu-server 2/2] remote-migration: add target-cpu param

DERUMIER, Alexandre alexandre.derumier at groupe-cyllene.com
Fri Apr 28 08:43:56 CEST 2023


>> And currently we don't support offline storage migration yet. (BTW,
>> this is also breaking migration with unused disks.)
>> I don't know if we can send a send|receive transfer through the
>> tunnel? (I never tested it.)

> we do, but maybe you tested with RBD, which doesn't support storage
> migration yet? Within a cluster it doesn't need to, since it's a
> shared storage, but between clusters we need to implement it (it's on
> my TODO list and shouldn't be too hard since there is 'rbd
> export/import').
> 
Yes, this was indeed with an unused RBD device.
(Another way could be to use qemu-storage-daemon (never tested it) for
offline sync with any storage, like LVM.)
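
For the RBD case, something like this is what I'd imagine (rough
sketch only; the helper name and the plain ssh transport are just
placeholders, the real implementation would go through the storage
plugin / tunnel):

    use PVE::Tools;

    # stream an RBD image to the remote cluster; run_command accepts a
    # list of commands and connects them as a shell pipeline
    sub rbd_export_import {
        my ($src_image, $dst_host, $dst_image) = @_;
        PVE::Tools::run_command([
            ['rbd', 'export', $src_image, '-'],            # raw stream to stdout
            ['ssh', $dst_host, 'rbd', 'import', '-', $dst_image], # read from stdin
        ]);
    }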

Also, the cloud-init drive seems to be unmigratable currently. (I
wonder if we couldn't simply regenerate it on the target; now that we
have the cloud-init pending section, we can correctly generate the
cloud-init drive from the currently running config.)
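
If we go that route, the target side could probably just call the
existing generator instead of copying the drive contents over
(untested sketch; assumes $conf is the migrated config after volume-id
mapping):

    use PVE::QemuServer::Cloudinit;

    # rebuild the cloud-init drive from the migrated config on the
    # target, rather than transferring the old drive bit-by-bit
    PVE::QemuServer::Cloudinit::generate_cloudinitconfig($conf, $vmid);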



> > > given that it might make sense to save-guard this implementation
> > > here,
> > > and maybe switch to a new "mode" parameter?
> > > 
> > > online => switching CPU not allowed
> > > offline or however-we-call-this-new-mode (or in the future, two-
> > > phase-restart) => switching CPU allowed
> > > 
> > 
> > Yes, I was thinking about that too.
> > Maybe not "offline", because maybe we want to implement a real
> > offline mode later.
> > But simply "restart"?
> 
> no, I meant moving the existing --online switch to a new mode
> parameter, then we'd have "online" and "offline", and then add your
> new mode on top ("however-we-call-this-new-mode"), and then we could
> in the future also add "two-phase-restart" for the sync-twice mode I
> described :)
> 
> target-cpu would of course also be supported for the (existing)
> offline mode, since it just needs to adapt the target-cpu in the
> config.
> 
> the main thing I'd want to avoid is somebody accidentally setting
> "target-cpu", not knowing/noticing that that entails what amounts to
> a reset of the VM as part of the migration..
> 
Yes, that's what I had understood ;)

It was more about the "offline" term, because we don't take the source
VM offline until the disk migration is finished (to reduce downtime).
More like "online-restart" instead of "offline".

Offline, for me, really means: we shut down the VM, then do the disk
migration.
 

> there were a few things down below that might also be worthy of
> discussion. I also wonder whether the two variants of "freeze FS" and
> "suspend without state" are enough - that only ensures that no more
> I/O happens so the volumes are bitwise identical, but shouldn't we
> also at least have the option of doing a clean shutdown at that point
> so that applications can serialize/flush their state properly and
> that gets synced across as well? else this is the equivalent of
> cutting the power cord, which might not be a good fit for all use
> cases ;)
> 
I had tried the clean shutdown in my v1 patch
https://lists.proxmox.com/pipermail/pve-devel/2023-March/056291.html
(without doing the block-job-complete) in phase3, and I got fs
corruption sometimes.
Not sure why exactly (maybe the OS didn't shut down correctly, or
maybe some data was still in the buffers?).
Maybe doing the block-job-complete first would make it safe
(transfer access to the NBD, then do the clean shutdown).

I'll give it a try in the V3.
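
Roughly the ordering I'd try (sketch only, against the current
helpers; $jobs here stands in for the running drive-mirror block jobs
from phase2):

    use PVE::QemuServer;
    use PVE::QemuServer::Monitor qw(mon_cmd);

    # hand disk access over to the NBD target first...
    foreach my $job (sort keys %$jobs) {
        mon_cmd($vmid, 'block-job-complete', device => $job);
    }

    # ...then shut the source VM down cleanly ($shutdown flag set), so
    # the guest can flush application state onto the already-switched
    # disks before we restart it on the target
    PVE::QemuServer::vm_stop($storecfg, $vmid, 0, 0, $timeout, 1);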

I just wonder if we can add a new param, like:

--online --fsfreeze

--online --shutdown

--online --2phase-restart

?
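
Or, following your mode suggestion, a single enum (schema sketch only,
names not final):

    'mode' => {
        type => 'string',
        enum => [qw(online offline online-restart two-phase-restart)],
        default => 'online',
        optional => 1,
        description => "Migration mode. 'online-restart' live-migrates"
            ." the storage, skips the memory migration and restarts"
            ." the VM on the target (required for target-cpu).",
    },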




(I'm currently migrating a lot of VMs from an old Intel cluster to a
new AMD cluster, in a different datacenter, with a different Ceph
cluster, so I can still do real production tests.)




> > > > 
> > > > Signed-off-by: Alexandre Derumier <aderumier at odiso.com>
> > > > ---
> > > >  PVE/API2/Qemu.pm   | 18 ++++++++++++++++++
> > > >  PVE/CLI/qm.pm      |  6 ++++++
> > > >  PVE/QemuMigrate.pm | 25 +++++++++++++++++++++++++
> > > >  3 files changed, 49 insertions(+)
> > > > 
> > > > diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> > > > index 587bb22..6703c87 100644
> > > > --- a/PVE/API2/Qemu.pm
> > > > +++ b/PVE/API2/Qemu.pm
> > > > @@ -4460,6 +4460,12 @@ __PACKAGE__->register_method({
> > > >                 optional => 1,
> > > >                 default => 0,
> > > >             },
> > > > +           'target-cpu' => {
> > > > +               optional => 1,
> > > > +               description => "Target emulated CPU model. For online migration, the storage is live-migrated, but the memory migration is skipped and the target VM is restarted.",
> > > > +               type => 'string',
> > > > +               format => 'pve-vm-cpu-conf',
> > > > +           },
> > > >             'target-storage' => get_standard_option('pve-targetstorage', {
> > > >                 completion => \&PVE::QemuServer::complete_migration_storage,
> > > >                 optional => 0,
> > > > @@ -4557,11 +4563,14 @@ __PACKAGE__->register_method({
> > > >         raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" })
> > > >             if $@;
> > > >  
> > > > +       my $target_cpu = extract_param($param, 'target-cpu');
> > > 
> > > this is okay
> > > 
> > > > +
> > > >         die "remote migration requires explicit storage mapping!\n"
> > > >             if $storagemap->{identity};
> > > >  
> > > >         $param->{storagemap} = $storagemap;
> > > >         $param->{bridgemap} = $bridgemap;
> > > > +       $param->{targetcpu} = $target_cpu;
> > > 
> > > but this is a bit confusing with the variable/hash key naming ;)
> > > 
> > > >         $param->{remote} = {
> > > >             conn => $conn_args, # re-use fingerprint for tunnel
> > > >             client => $api_client,
> > > > @@ -5604,6 +5613,15 @@ __PACKAGE__->register_method({
> > > >                     PVE::QemuServer::nbd_stop($state->{vmid});
> > > >                     return;
> > > >                 },
> > > > +               'restart' => sub {
> > > > +                   PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
> > > > +                   my $info = PVE::QemuServer::vm_start_nolock(
> > > > +                       $state->{storecfg},
> > > > +                       $state->{vmid},
> > > > +                       $state->{conf},
> > > > +                   );
> > > > +                   return;
> > > > +               },
> > > >                 'resume' => sub {
> > > >                     if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) {
> > > >                         PVE::QemuServer::vm_resume($state->{vmid}, 1, 1);
> > > > diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
> > > > index c3c2982..06c74c1 100755
> > > > --- a/PVE/CLI/qm.pm
> > > > +++ b/PVE/CLI/qm.pm
> > > > @@ -189,6 +189,12 @@ __PACKAGE__->register_method({
> > > >                 optional => 1,
> > > >                 default => 0,
> > > >             },
> > > > +           'target-cpu' => {
> > > > +               optional => 1,
> > > > +               description => "Target emulated CPU model. For online migration, the storage is live-migrated, but the memory migration is skipped and the target VM is restarted.",
> > > > +               type => 'string',
> > > > +               format => 'pve-vm-cpu-conf',
> > > > +           },
> > > >             'target-storage' => get_standard_option('pve-targetstorage', {
> > > >                 completion => \&PVE::QemuServer::complete_migration_storage,
> > > >                 optional => 0,
> > > > diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> > > > index e182415..04f8053 100644
> > > > --- a/PVE/QemuMigrate.pm
> > > > +++ b/PVE/QemuMigrate.pm
> > > > @@ -731,6 +731,11 @@ sub cleanup_bitmaps {
> > > >  sub live_migration {
> > > >      my ($self, $vmid, $migrate_uri, $spice_port) = @_;
> > > >  
> > > > +    if ($self->{opts}->{targetcpu}) {
> > > > +        $self->log('info', "target cpu is different - skip live migration.");
> > > > +        return;
> > > > +    }
> > > > +
> > > >      my $conf = $self->{vmconf};
> > > >  
> > > >      $self->log('info', "starting online/live migration on $migrate_uri");
> > > > @@ -995,6 +1000,7 @@ sub phase1_remote {
> > > >      my $remote_conf = PVE::QemuConfig->load_config($vmid);
> > > >      PVE::QemuConfig->update_volume_ids($remote_conf, $self->{volume_map});
> > > >  
> > > > +    $remote_conf->{cpu} = $self->{opts}->{targetcpu};
> > > 
> > > do we need permission checks here (or better, somewhere early on,
> > > for
> > > doing this here)
> > > 
> > > >      my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap});
> > > >      for my $target (keys $bridges->%*) {
> > > >         for my $nic (keys $bridges->{$target}->%*) {
> > > > @@ -1354,6 +1360,21 @@ sub phase2 {
> > > >      live_migration($self, $vmid, $migrate_uri, $spice_port);
> > > >  
> > > >      if ($self->{storage_migration}) {
> > > > +
> > > > +       # freeze source vm i/o if target cpu is different (no live migration)
> > > > +       if ($self->{opts}->{targetcpu}) {
> > > > +           my $agent_running = $self->{conf}->{agent} && PVE::QemuServer::qga_check_running($vmid);
> > > > +           if ($agent_running) {
> > > > +               print "freeze filesystem\n";
> > > > +               eval { mon_cmd($vmid, "guest-fsfreeze-freeze"); };
> > > > +               die $@ if $@;
> > > > +               die $@ if $@;
> > > 
> > > die here
> > > 
> > > > +           } else {
> > > > +               print "suspend vm\n";
> > > > +               eval { PVE::QemuServer::vm_suspend($vmid, 1); };
> > > > +               warn $@ if $@;
> > > 
> > > but warn here?
> > > 
> > > I'd like some more rationale for these two variants, what are
> > > the pros and cons? should we make it configurable?
> > > > +           }
> > > > +       }
> > > > +
> > > >         # finish block-job with block-job-cancel, to disconnect source VM from NBD
> > > >         # to avoid it trying to re-establish it. We are in blockjob ready state,
> > > >         # thus, this command changes it to blockjob complete (see qapi docs)
> > > > @@ -1608,6 +1629,10 @@ sub phase3_cleanup {
> > > >      # clear migrate lock
> > > >      if ($tunnel && $tunnel->{version} >= 2) {
> > > >         PVE::Tunnel::write_tunnel($tunnel, 10, "unlock");
> > > > +       if ($self->{opts}->{targetcpu}) {
> > > > +           $self->log('info', "target cpu is different - restart target vm.");
> > > > +           PVE::Tunnel::write_tunnel($tunnel, 10, 'restart');
> > > > +       }
> > > >  
> > > >         PVE::Tunnel::finish_tunnel($tunnel);
> > > >      } else {
> > > > -- 
> > > > 2.30.2
> > > > 