[pve-devel] [PATCH manager stable-7] pve7to8: add check for nvidia-vgpu-mgr

Dominik Csapak d.csapak at proxmox.com
Mon Jun 12 12:00:53 CEST 2023


Currently the NVIDIA vGPU host driver (15.2) does not support kernels
newer than 6.0 and thus will not work with Bookworm-based releases for
now.

Fail if the service is running, and warn if it exists but is
disabled/stopped (e.g. because a user installed it at some point, did
not need it, and disabled it).

In any case, link to the known issues section in the upgrade guide
(which we can update to contain up-to-date information).

Signed-off-by: Dominik Csapak <d.csapak at proxmox.com>
---
I opted not to parse more specific information about the driver (like
version, etc.), since that would increase the complexity of the check
without any real upside currently. If a future driver version supports
newer kernels, we can update the check to only warn/error for
unsupported versions.
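
For illustration only, a rough sketch of what such a version-aware
variant could look like. Everything here is an assumption and not part
of this patch: the helper names version_cmp and classify_vgpu_service
are hypothetical, the minimum supported version is passed in by the
caller because no such version is known yet, and how the driver version
would actually be obtained is left open.

```perl
use strict;
use warnings;

# Compare two dotted version strings numerically, component by component.
# Returns -1, 0 or 1, like the <=> operator. Missing components count as 0.
sub version_cmp {
    my ($x, $y) = @_;
    my @l = split /\./, $x;
    my @r = split /\./, $y;
    while (@l || @r) {
        my $cmp = (shift(@l) // 0) <=> (shift(@r) // 0);
        return $cmp if $cmp;
    }
    return 0;
}

# Map the systemd unit state plus an optional driver version to a severity.
# $min_supported is a hypothetical "first version known to work" value that
# the caller has to supply; no real release is implied.
sub classify_vgpu_service {
    my ($state, $version, $min_supported) = @_;

    # no unit found at all
    return 'pass' if !$state || $state eq 'unknown';

    # unit exists, but the driver is new enough
    return 'pass'
        if defined($version)
        && defined($min_supported)
        && version_cmp($version, $min_supported) >= 0;

    # unit exists with an unsupported (or unknown) driver version
    return $state eq 'active' ? 'fail' : 'warn';
}
```

With an undefined version or minimum this degrades to the behaviour of
the check below: 'fail' for an active unit, 'warn' for any other
existing unit, 'pass' otherwise.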

I'll add the section to the upgrade guide shortly.

 PVE/CLI/pve7to8.pm | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/PVE/CLI/pve7to8.pm b/PVE/CLI/pve7to8.pm
index 6b51e98e..dbcb87ff 100644
--- a/PVE/CLI/pve7to8.pm
+++ b/PVE/CLI/pve7to8.pm
@@ -1215,6 +1215,27 @@ sub check_apt_repos {
     }
 }
 
+sub check_nvidia_vgpu_service {
+    log_info("Checking for existence of NVIDIA vGPU Manager..");
+
+    my $state = $get_systemd_unit_state->("nvidia-vgpu-mgr.service");
+    if ($state && $state eq 'active') {
+	log_fail(
+	    "Running NVIDIA vGPU Service found, possibly not compatible with newer kernel versions,"
+	    ." check with their documentation and"
+	    ." https://pve.proxmox.com/wiki/Upgrade_from_7_to_8#Known_upgrade_issues."
+	);
+    } elsif ($state && $state ne 'unknown') {
+	log_warn(
+	    "NVIDIA vGPU Service found, possibly not compatible with newer kernel versions,"
+	    ." check with their documentation and"
+	    ." https://pve.proxmox.com/wiki/Upgrade_from_7_to_8#Known_upgrade_issues."
+	);
+    } else {
+	log_pass("No NVIDIA vGPU Service found.");
+    }
+}
+
 sub check_time_sync {
     my $unit_active = sub { return $get_systemd_unit_state->($_[0], 1) eq 'active' ? $_[0] : undef };
 
@@ -1337,6 +1358,7 @@ sub check_misc {
     check_lxcfs_fuse_version();
     check_node_and_guest_configurations();
     check_apt_repos();
+    check_nvidia_vgpu_service();
 }
 
 my sub colored_if {
-- 
2.30.2





