[pve-devel] [WIP qemu-server 2/3] nvme: make it somewhat work with current version

Aaron Lauterer a.lauterer at proxmox.com
Wed Jul 31 15:40:32 CEST 2024


This patch is not meant to be applied. It captures my efforts to get the
old series working (somewhat) with the current version.

With these changes I was able to hot-plug an NVMe drive on an i440fx VM.

Hot-plugging on a q35 VM did not work in my tests, but I think this is
related to how the NVMe device was attached to the PCI tree of the VM.
By controlling this more directly, we should be able to attach it to a
PCI port/bridge that allows hot-plugging.

For some reason that I still need to investigate, giving the nvme
device the id `nvmeX` causes issues when unplugging. It seems our code
still sees the `drive-nvmeX` drive and adds it to the list of devices
present, so the unplug code concludes that the removal did not succeed.
I worked around that for now by giving the nvme device
`id=nvmecontX` instead of `id=nvmeX`.
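
As an illustration of the naming scheme described above, here is a
minimal sketch. The helper `nvme_qemu_ids` is hypothetical and not part
of qemu-server; it only shows how a config key like `nvme2` would map to
the two QEMU-side IDs used in this patch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of qemu-server): derive the QEMU-side IDs
# for a config key such as "nvme0", following the naming used in this patch.
sub nvme_qemu_ids {
    my ($deviceid) = @_;

    my ($index) = $deviceid =~ m/^nvme(\d+)$/
        or die "not an nvme drive id: $deviceid\n";

    return {
        drive      => "drive-nvme$index", # block backend ID, unchanged
        controller => "nvmecont$index",   # device ID, was "nvme$index" before
    };
}

my $ids = nvme_qemu_ids('nvme2');
print "$ids->{drive} $ids->{controller}\n"; # prints "drive-nvme2 nvmecont2"
```

Keeping the device ID distinct from the `nvmeX` config key is what lets
the unplug path delete and verify the controller separately from the
drive.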

Live migration is still a blocker, as the QEMU NVMe device does not
support it. It fails with the following errors:

2024-07-31 09:31:14 migrate uri => unix:/run/qemu-server/102.migrate failed: VM 102 qmp command 'migrate' failed - State blocked by non-migratable device '0000:00:04.0/nvme'
2024-07-31 09:31:15 ERROR: online migrate failure - VM 102 qmp command 'migrate' failed - State blocked by non-migratable device '0000:00:04.0/nvme'

We need to check with upstream what is missing, and whether there are
plans to implement the missing pieces upstream or whether we could do
it ourselves.

I found the following older thread on the qemu mailing list:
https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg05788.html
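
Until the device supports migration, one option might be to fail early
instead of letting the `migrate` QMP command run into the error. The
following is only a sketch under that assumption: `find_nvme_drives` is
a hypothetical helper, not existing qemu-server code, and the config
values are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical pre-check (not existing qemu-server code): return the names
# of all nvmeN drives in a VM config hash, sorted by name.
sub find_nvme_drives {
    my ($conf) = @_;

    my @found = grep { m/^nvme\d+$/ } keys %$conf;
    return sort @found;
}

# Made-up example config, for illustration only.
my $conf = {
    scsi0 => 'local-lvm:vm-102-disk-0,size=32G',
    nvme0 => 'local-lvm:vm-102-disk-1,size=8G',
};

if (my @nvme = find_nvme_drives($conf)) {
    print "cannot live-migrate, NVMe drives present: @nvme\n";
}
```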

Signed-off-by: Aaron Lauterer <a.lauterer at proxmox.com>
---
 PVE/QemuServer.pm       | 15 +++++++++------
 PVE/QemuServer/Drive.pm |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 7d8d75b..6dc3eb2 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -1436,7 +1436,7 @@ sub print_drivedevice_full {
     } elsif ($drive->{interface} eq 'nvme') {
 	my $path = $drive->{file};
 	$drive->{serial} //= "$drive->{interface}$drive->{index}"; # serial is mandatory for nvme
-	$device = "nvme,drive=drive-$drive->{interface}$drive->{index},id=$drive->{interface}$drive->{index}";
+	$device = "nvme,drive=drive-$drive->{interface}$drive->{index},id=$drive->{interface}cont$drive->{index}";
     } elsif ($drive->{interface} eq 'ide' || $drive->{interface} eq 'sata') {
 	my $maxdev = ($drive->{interface} eq 'sata') ? $PVE::QemuServer::Drive::MAX_SATA_DISKS : 2;
 	my $controller = int($drive->{index} / $maxdev);
@@ -4292,11 +4292,10 @@ sub vm_deviceplug {
 	    warn $@ if $@;
 	    die $err;
         }
-    } elsif ($deviceid =~ m/^(nvme)(\d+)$/) {
-
+    } elsif ($deviceid =~ m/^nvme(\d+)$/) {
 	qemu_driveadd($storecfg, $vmid, $device);
 
-	my $devicefull = print_drivedevice_full($storecfg, $conf, $vmid, $device, $arch, $machine_type);
+	my $devicefull = print_drivedevice_full($storecfg, $conf, $vmid, $device, undef, $arch, $machine_type);
 	eval { qemu_deviceadd($vmid, $devicefull); };
 	if (my $err = $@) {
 	    eval { qemu_drivedel($vmid, $deviceid); };
@@ -4375,8 +4374,12 @@ sub vm_deviceunplug {
 
 	qemu_iothread_del($vmid, "virtioscsi$device->{index}", $device)
 	    if $conf->{scsihw} && ($conf->{scsihw} eq 'virtio-scsi-single');
-    } elsif ($deviceid =~ m/^(nvme)(\d+)$/) {
-	qemu_devicedel($vmid, $deviceid);
+    } elsif ($deviceid =~ m/^nvme(\d+)$/) {
+	my $device = parse_drive($deviceid, $conf->{$deviceid});
+
+	my $nvmecont = "nvmecont$1";
+	qemu_devicedel($vmid, $nvmecont);
+	qemu_devicedelverify($vmid, $nvmecont);
 	qemu_drivedel($vmid, $deviceid);
     } elsif ($deviceid =~ m/^(net)(\d+)$/) {
 	qemu_devicedel($vmid, $deviceid);
diff --git a/PVE/QemuServer/Drive.pm b/PVE/QemuServer/Drive.pm
index f05ad26..b055674 100644
--- a/PVE/QemuServer/Drive.pm
+++ b/PVE/QemuServer/Drive.pm
@@ -532,6 +532,7 @@ for (my $i = 0; $i < $MAX_SCSI_DISKS; $i++)  {
 
 for (my $i = 0; $i < $MAX_NVME_DISKS; $i++)  {
     $drivedesc_hash->{"nvme$i"} = $nvmedesc;
+    $drivedesc_hash_with_alloc->{"nvme$i"} = $desc_with_alloc->('nvme', $nvmedesc);
 }
 
 
-- 
2.39.2




