[pve-devel] [PATCH qemu-server v5 2/5] feature #3784: Parameter for guest vIOMMU & machine as property-string

Fiona Ebner f.ebner at proxmox.com
Thu Aug 17 14:29:00 CEST 2023


Am 18.01.23 um 14:57 schrieb Markus Frank:
> vIOMMU enables the option to passthrough pci devices to L2 VMs
> in L1 VMs via Nested Virtualisation.
> 
> QEMU-Parameters:
> https://www.qemu.org/docs/master/system/qemu-manpage.html
> https://wiki.qemu.org/Features/VT-d
> 
> -machine ...,kernel-irqchip=split:
> 
> "split" because of intremap see below.
> 
> -device intel-iommu:
> 
> * caching-mode=on:
> 
> "It is required for -device vfio-pci to work with the VT-d device, because host
> assigned devices requires to setup the DMA mapping on the host before guest DMA
> starts."
> 
> * intremap=on:
> 
> "This enables interrupt remapping feature. It's required to enable complete
> x2apic. Currently it only supports kvm kernel-irqchip modes off or split, while
> full kernel-irqchip is not yet supported."
> 
> Signed-off-by: Markus Frank <m.frank at proxmox.com>
> ---

This one needs a rebase

(...)

> +		    my $q35 = $machine_conf->{type} && ($machine_conf->{type} =~ m/q35/) ? 1 : 0;
> +		    my $kvm = $conf->{kvm};
> +		    $kvm //= 1 if PVE::QemuServer::is_native($arch);
> +		    if ($machine_conf->{viommu} && (!$kvm || !$q35)) {
> +			die "to use vIOMMU please enable kvm and set the machine type to q35\n";
> +		    }
>  
>  		    PVE::QemuConfig->write_config($vmid, $conf);
>  
> @@ -1770,7 +1778,16 @@ my $update_vm_api  = sub {
>  		} elsif ($opt eq 'tags') {
>  		    assert_tag_permissions($vmid, $conf->{$opt}, $param->{$opt}, $rpcenv, $authuser);
>  		    $conf->{pending}->{$opt} = PVE::GuestHelpers::get_unique_tags($param->{$opt});
> -		} else {
> +		} elsif ($opt eq 'machine') {
> +		    my $machine_conf = PVE::QemuServer::parse_machine($param->{$opt});
> +		    my $q35 = $machine_conf->{type} && ($machine_conf->{type} =~ m/q35/) ? 1 : 0;
> +		    my $kvm = $conf->{kvm};
> +		    $kvm //= 1 if PVE::QemuServer::is_native($arch);
> +		    if ($machine_conf->{viommu} && (!$kvm || !$q35)) {
> +			die "to use vIOMMU please enable kvm and set the machine type to q35\n";
> +		    }

Maybe worth adding a helper function taking in the config and the
machine option. It's the very same check as above.

> +		    $conf->{pending}->{$opt} = $param->{$opt};
> +		}else {

Style nit: missing space before else

>  		    $conf->{pending}->{$opt} = $param->{$opt};
>  
>  		    if ($opt eq 'boot') {
> diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
> index 051382c..7c998ef 100644
> --- a/PVE/QemuConfig.pm
> +++ b/PVE/QemuConfig.pm
> @@ -433,7 +433,8 @@ sub __snapshot_rollback_hook {
>  	} else {
>  	    # Note: old code did not store 'machine', so we try to be smart
>  	    # and guess the snapshot was generated with kvm 1.4 (pc-i440fx-1.4).
> -	    $data->{forcemachine} = $conf->{machine} || 'pc-i440fx-1.4';
> +	    my $machine_conf = PVE::QemuServer::parse_machine($conf->{machine});
> +	    $data->{forcemachine} = $machine_conf->{type} || 'pc-i440fx-1.4';
>  
>  	    # we remove the 'machine' configuration if not explicitly specified
>  	    # in the original config.
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 987908d..55c11d5 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -124,6 +124,19 @@ PVE::JSONSchema::register_standard_option('pve-qemu-machine', {
>  	optional => 1,
>  });
>  
> +my $machine_fmt = {
> +    type => get_standard_option('pve-qemu-machine', {

Any reason against changing the standard option itself to be the whole
property string rather than keep the option just being the type? I
noticed that 'runningmachine' still uses only
"get_standard_option('pve-qemu-machine'", but that is wrong after this
patch. Changing the standard option itself would avoid that.

> +	default_key => 1,
> +	format_description => "pve-qemu-machine-type",

That format description is not very telling at all. Usually, this is
used to clarify what exact format the string is, e.g. base64. I don't
think it's needed here and there's already a description of the property
itself.

> +    }),
> +    viommu => {
> +	type => 'boolean',
> +	description => "enable guest vIOMMU (needs kvm to be enabled and q35 to be set as machine)",

Nit: "as machine type" sounds slightly better now that machine is more
than just the type

> +	default => 0,
> +	optional => 1,
> +    },
> +};
> +
>  # FIXME: remove in favor of just using the INotify one, it's cached there exactly the same way
>  my $nodename_cache;
>  sub nodename {
> @@ -626,7 +639,12 @@ EODESCR
>  	pattern => $PVE::QemuServer::CPUConfig::qemu_cmdline_cpu_re,
>  	format_description => 'QEMU -cpu parameter'
>      },
> -    machine => get_standard_option('pve-qemu-machine'),
> +    machine => {
> +	description => "Specifies the Qemu machine type.",

Nit: QEMU should be capitalized

> +	type => 'string',
> +	optional => 1,
> +	format => $machine_fmt,
> +    },
>      arch => {
>  	description => "Virtual processor architecture. Defaults to the host.",
>  	optional => 1,
> @@ -2134,6 +2152,21 @@ sub parse_watchdog {
>      return $res;
>  }
>  
> +sub parse_machine {
> +    my ($value) = @_;
> +
> +    return if !$value;
> +
> +    my $res = eval { parse_property_string($machine_fmt, $value) };
> +    die $@ if $@;

No need for eval if you just die with the exact same error ;)

> +    return $res;
> +}
> +
> +sub print_machine {
> +    my ($machine_conf) = @_;
> +    return PVE::JSONSchema::print_property_string($machine_conf, $machine_fmt);
> +}
> +
>  sub parse_guest_agent {
>      my ($conf) = @_;
>  

(...)

> @@ -4137,6 +4174,15 @@ sub config_to_command {
>      }
>      push @$machineFlags, "type=${machine_type_min}";
>  
> +    if ($machine_conf->{viommu} && (!$kvm || !$q35)) {
> +        die "to use vIOMMU please enable kvm and set the machine type to q35\n";
> +    }
> +
> +    if ($machine_conf->{viommu}) {

Style nit: the check for kvm+q35 and the die from above could be moved
into this block

Or instead re-use the check helper I suggested above

> +	unshift @$devices, '-device', "intel-iommu,intremap=on,caching-mode=on";
> +	push @$machineFlags, 'kernel-irqchip=split';
> +    }
> +
>      push @$cmd, @$devices;
>      push @$cmd, '-rtc', join(',', @$rtcFlags) if scalar(@$rtcFlags);
>      push @$cmd, '-machine', join(',', @$machineFlags) if scalar(@$machineFlags);
> diff --git a/PVE/QemuServer/Machine.pm b/PVE/QemuServer/Machine.pm
> index d9429ed..bfbde59 100644
> --- a/PVE/QemuServer/Machine.pm
> +++ b/PVE/QemuServer/Machine.pm
> @@ -15,7 +15,8 @@ our $PVE_MACHINE_VERSION = {
>  sub machine_type_is_q35 {
>      my ($conf) = @_;
>  
> -    return $conf->{machine} && ($conf->{machine} =~ m/q35/) ? 1 : 0;
> +    my $machine_conf = PVE::QemuServer::parse_machine($conf->{machine});
> +    return $machine_conf->{type} && ($machine_conf->{type} =~ m/q35/) ? 1 : 0;
>  }

The parse and print functions and option definition should really live
inside here, i.e. Machine.pm. The Machine module should not call back
into PVE::QemuServer if we can somehow avoid it.

>  
>  sub current_from_query_machines {
> @@ -120,7 +121,8 @@ sub qemu_machine_pxe {
>  
>      my $machine =  get_current_qemu_machine($vmid);
>  
> -    if ($conf->{machine} && $conf->{machine} =~ m/\.pxe$/) {
> +    my $machine_conf = PVE::QemuServer::parse_machine($conf->{machine});
> +    if ($machine_conf->{type} && $machine_conf->{type} =~ m/\.pxe$/) {
>  	$machine .= '.pxe';
>      }
>  





More information about the pve-devel mailing list