[pve-devel] [PATCH qemu-server 3/4] config: add AMD SEV-SNP support.
Daniel Kral
d.kral at proxmox.com
Fri Feb 7 11:55:59 CET 2025
Hi,
thanks for wanting to contribute and sending a patch for AMD SEV-SNP! We
really appreciate it!
Unfortunately I don't have the hardware to test this and neither have
enough knowledge about the specific details for this, but I added some
higher-level inlined comments as I wanted to take a closer look at this
anyway. Hope those help the process.
On another note, if you haven't already, please make sure that you have
a signed CLA sent to office at proxmox.com (see
https://pve.proxmox.com/wiki/Developer_Documentation for more information).
All else, I'd be happy to see this applied!
Daniel
On 2/7/25 09:51, Philipp Giersfeld wrote:
> This patch is for enabling AMD SEV-SNP support.
>
> Where applicable, it extends support for existing SEV(-ES) variables
> to SEV-SNP.
This could be more specific and tell which parameters are not applicable
to SEV-SNP anymore (i.e. no-key-sharing).
>
> The default policy value is identical to QEMU’s, and the therefore
> required option has been added to configure SMT support.
>
> The code was tested by running a VM without SEV, with SEV, SEV-ES,
> SEV-SNP. Each configuration was tested with and without an EFI disk
> attached. For SEV-enabled configurations it was also verified that the
> kernel actually used the respective feature.
>
> Signed-off-by: Philipp Giersfeld <philipp.giersfeld at canarybit.eu>
> ---
> PVE/API2/Qemu.pm | 9 +++--
> PVE/QemuServer.pm | 57 +++++++++++++++++++++++---------
> PVE/QemuServer/CPUConfig.pm | 66 ++++++++++++++++++++++++++++---------
> 3 files changed, 99 insertions(+), 33 deletions(-)
>
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 295260e7..35c76dc2 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -548,8 +548,13 @@ my sub create_disks : prototype($$$$$$$$$$$) {
> my $volid;
> if ($ds eq 'efidisk0') {
> my $smm = PVE::QemuServer::Machine::machine_type_is_q35($conf);
> - ($volid, $size) = PVE::QemuServer::create_efidisk(
> - $storecfg, $storeid, $vmid, $fmt, $arch, $disk, $smm);
> + my $amd_sev_type = PVE::QemuServer::CPUConfig::get_amd_sev_type($conf);
> + if($amd_sev_type && $amd_sev_type eq 'snp') {
> + die "SEV-SNP only uses consolidated read-only firmware";
it could be just my reader, but it seems like there are three tabs
instead of two tabs + 4 spaces to align it with my $smm.
also nit: this special case could be handled with a die post-if, i.e.
```
die "SEV-SNP only uses consolidated read-only firmware"
if $amd_sev_type && $amd_sev_type eq 'snp';
```
as there's plenty of indentation in this subroutine already.
> + } else {
> + ($volid, $size) = PVE::QemuServer::create_efidisk(
> + $storecfg, $storeid, $vmid, $fmt, $arch, $disk, $smm, $amd_sev_type);
> + }
> } elsif ($ds eq 'tpmstate0') {
> # swtpm can only use raw volumes, and uses a fixed size
> $size = PVE::Tools::convert_size(PVE::QemuServer::Drive::TPMSTATE_DISK_SIZE, 'b' => 'kb');
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 808c0e1c..e46b3434 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -52,7 +52,7 @@ use PVE::QemuConfig;
> use PVE::QemuServer::Helpers qw(config_aware_timeout min_version kvm_user_version windows_version);
> use PVE::QemuServer::Cloudinit;
> use PVE::QemuServer::CGroup;
> -use PVE::QemuServer::CPUConfig qw(print_cpu_device get_cpu_options get_cpu_bitness is_native_arch get_amd_sev_object);
> +use PVE::QemuServer::CPUConfig qw(print_cpu_device get_cpu_options get_cpu_bitness is_native_arch get_amd_sev_object get_amd_sev_type);
> use PVE::QemuServer::Drive qw(is_valid_drivename checked_volume_format drive_is_cloudinit drive_is_cdrom drive_is_read_only parse_drive print_drive);
> use PVE::QemuServer::Machine;
> use PVE::QemuServer::Memory qw(get_current_memory);
> @@ -88,6 +88,13 @@ my $OVMF = {
> "$EDK2_FW_BASE/OVMF_CODE_4M.secboot.fd",
> "$EDK2_FW_BASE/OVMF_VARS_4M.ms.fd",
> ],
> + '4m-sev' => [
> + "$EDK2_FW_BASE/OVMF_CVM_CODE_4M.fd",
> + "$EDK2_FW_BASE/OVMF_CVM_VARS_4M.fd",
> + ],
> + '4m-snp' => [
> + "$EDK2_FW_BASE/OVMF_CVM_4M.fd",
> + ],
> # FIXME: These are legacy 2MB-sized images that modern OVMF doesn't supports to build
> # anymore. how can we deperacate this sanely without breaking existing instances, or using
> # older backups and snapshot?
> @@ -3172,19 +3179,28 @@ sub vga_conf_has_spice {
> return $1 || 1;
> }
>
> -sub get_ovmf_files($$$) {
> - my ($arch, $efidisk, $smm) = @_;
> +sub get_ovmf_files($$$$) {
> + my ($arch, $efidisk, $smm, $amd_sev_type) = @_;
>
> my $types = $OVMF->{$arch}
> or die "no OVMF images known for architecture '$arch'\n";
>
> my $type = 'default';
> if ($arch eq 'x86_64') {
> - if (defined($efidisk->{efitype}) && $efidisk->{efitype} eq '4m') {
> - $type = $smm ? "4m" : "4m-no-smm";
> - $type .= '-ms' if $efidisk->{'pre-enrolled-keys'};
> + if ($amd_sev_type && $amd_sev_type eq 'snp') {
> + $type = "4m-snp";
> + my ($ovmf) = $types->{$type}->@*;
> + die "EFI base image '$ovmf' not found\n" if ! -f $ovmf;
> + return ($ovmf);
> + } elsif ($amd_sev_type) {
> + $type = "4m-sev";
> } else {
> - # TODO: log_warn about use of legacy images for x86_64 with Promxox VE 9
> + if (defined($efidisk->{efitype}) && $efidisk->{efitype} eq '4m') {
> + $type = $smm ? "4m" : "4m-no-smm";
> + $type .= '-ms' if $efidisk->{'pre-enrolled-keys'};
> + } else {
> + # TODO: log_warn about use of legacy images for x86_64 with Promxox VE 9
> + }
nit: this branch could also be folded into an elsif
> }
> }
>
> @@ -3329,7 +3345,8 @@ my sub print_ovmf_drive_commandlines {
>
> my $d = $conf->{efidisk0} ? parse_drive('efidisk0', $conf->{efidisk0}) : undef;
>
> - my ($ovmf_code, $ovmf_vars) = get_ovmf_files($arch, $d, $q35);
> + my $amd_sev_type = get_amd_sev_type($conf);
> + my ($ovmf_code, $ovmf_vars) = get_ovmf_files($arch, $d, $q35, $amd_sev_type);
>
> my $var_drive_str = "if=pflash,unit=1,id=drive-efidisk0";
> if ($d) {
> @@ -3526,10 +3543,17 @@ sub config_to_command {
> die "OVMF (UEFI) BIOS is not supported on 32-bit CPU types\n"
> if !$forcecpu && get_cpu_bitness($conf->{cpu}, $arch) == 32;
>
> - my ($code_drive_str, $var_drive_str) =
> - print_ovmf_drive_commandlines($conf, $storecfg, $vmid, $arch, $q35, $version_guard);
> - push $cmd->@*, '-drive', $code_drive_str;
> - push $cmd->@*, '-drive', $var_drive_str;
> + my $amd_sev_type = get_amd_sev_type($conf);
> + if ($amd_sev_type && $amd_sev_type eq 'snp') {
> + my $arch = PVE::QemuServer::Helpers::get_vm_arch($conf);
> + my $d = $conf->{efidisk0} ? parse_drive('efidisk0', $conf->{efidisk0}) : undef;
> + push $cmd->@*, '-bios', get_ovmf_files($arch, $d, undef, $amd_sev_type);
> + } else {
> + my ($code_drive_str, $var_drive_str) = print_ovmf_drive_commandlines(
> + $conf, $storecfg, $vmid, $arch, $q35, $version_guard);
> + push $cmd->@*, '-drive', $code_drive_str;
> + push $cmd->@*, '-drive', $var_drive_str;
> + }
> }
>
> if ($q35) { # tell QEMU to load q35 config early
> @@ -8337,7 +8361,8 @@ sub get_efivars_size {
> my $arch = PVE::QemuServer::Helpers::get_vm_arch($conf);
> $efidisk //= $conf->{efidisk0} ? parse_drive('efidisk0', $conf->{efidisk0}) : undef;
> my $smm = PVE::QemuServer::Machine::machine_type_is_q35($conf);
> - my (undef, $ovmf_vars) = get_ovmf_files($arch, $efidisk, $smm);
> + my $amd_sev_type = get_amd_sev_type($conf);
> + my (undef, $ovmf_vars) = get_ovmf_files($arch, $efidisk, $smm, $amd_sev_type);
> return -s $ovmf_vars;
> }
>
> @@ -8361,10 +8386,10 @@ sub update_tpmstate_size {
> $conf->{tpmstate0} = print_drive($disk);
> }
>
> -sub create_efidisk($$$$$$$) {
> - my ($storecfg, $storeid, $vmid, $fmt, $arch, $efidisk, $smm) = @_;
> +sub create_efidisk($$$$$$$$) {
> + my ($storecfg, $storeid, $vmid, $fmt, $arch, $efidisk, $smm, $amd_sev_type) = @_;
>
> - my (undef, $ovmf_vars) = get_ovmf_files($arch, $efidisk, $smm);
> + my (undef, $ovmf_vars) = get_ovmf_files($arch, $efidisk, $smm, $amd_sev_type);
>
> my $vars_size_b = -s $ovmf_vars;
> my $vars_size = PVE::Tools::convert_size($vars_size_b, 'b' => 'kb');
> diff --git a/PVE/QemuServer/CPUConfig.pm b/PVE/QemuServer/CPUConfig.pm
> index e65d8c26..dbc31462 100644
> --- a/PVE/QemuServer/CPUConfig.pm
> +++ b/PVE/QemuServer/CPUConfig.pm
> @@ -18,6 +18,7 @@ get_cpu_options
> get_cpu_bitness
> is_native_arch
> get_amd_sev_object
> +get_amd_sev_type
> );
>
> # under certain race-conditions, this module might be loaded before pve-cluster
> @@ -231,25 +232,32 @@ my $cpu_fmt = {
> my $sev_fmt = {
> type => {
> description => "Enable standard SEV with type='std' or enable"
> - ." experimental SEV-ES with the 'es' option.",
> + ." experimental SEV-ES with the 'es' option or enable "
> + ."experimental SEV-SNP with the 'snp option.",
`'snp'` instead of `'snp`
> type => 'string',
> default_key => 1,
> format_description => "sev-type",
> - enum => ['std', 'es'],
> + enum => ['std', 'es', 'snp'],
> maxLength => 3,
> },
> 'no-debug' => {
> - description => "Sets policy bit 0 to 1 to disallow debugging of guest",
> + description => "Sets policy bit to disallow debugging of guest",
> type => 'boolean',
> default => 0,
> optional => 1,
> },
> 'no-key-sharing' => {
> - description => "Sets policy bit 1 to 1 to disallow key sharing with other guests",
> + description => "Sets policy bit to disallow key sharing with other guests",
Good catch to also adapt these!
> type => 'boolean',
> default => 0,
> optional => 1,
> },
> + 'allow-smt' => {
> + description => "Sets policy bit to allow SMT",
This would certainly benefit for a clarification that SMT = Simultaneous
Multi Threading.
> + type => 'boolean',
> + default => 1,
> + optional => 1,
> + },
> "kernel-hashes" => {
> description => "Add kernel hashes to guest firmware for measured linux kernel launch",
> type => 'boolean',
> @@ -823,6 +831,16 @@ sub get_hw_capabilities {
> }
> return $hw_capabilities;
> }
> +sub get_amd_sev_type {
> + my ($conf) = @_;
> +
> + if ($conf->{'amd-sev'}) {
> + return PVE::JSONSchema::parse_property_string($sev_fmt, $conf->{'amd-sev'})->{type};
> + }
> + else {
> + return undef;
> + }
nit: could be a `return undef if !$conf->{'amd-sev'};`
> +}
>
> sub get_amd_sev_object {
> my ($amd_sev, $bios) = @_;
> @@ -836,22 +854,40 @@ sub get_amd_sev_object {
> if ($amd_sev_conf->{type} eq 'es' && !$sev_hw_caps->{'sev-support-es'}) {
> die "Your CPU does not support AMD SEV-ES.\n";
> }
> + if ($amd_sev_conf->{type} eq 'snp' && !$sev_hw_caps->{'sev-support-snp'}) {
> + die "Your CPU does not support AMD SEV-SNP.\n";
> + }
> if (!$bios || $bios ne 'ovmf') {
> die "To use AMD SEV, you need to change the BIOS to OVMF.\n";
> }
>
> - my $sev_mem_object = 'sev-guest,id=sev0';
> - $sev_mem_object .= ',cbitpos='.$sev_hw_caps->{cbitpos};
> - $sev_mem_object .= ',reduced-phys-bits='.$sev_hw_caps->{'reduced-phys-bits'};
> + my $sev_mem_object = '';
> + my $policy;
> + if ($amd_sev_conf->{type} eq 'es' || $amd_sev_conf->{type} eq 'std') {
> + $sev_mem_object .= 'sev-guest,id=sev0';
> + $sev_mem_object .= ',cbitpos='.$sev_hw_caps->{cbitpos};
> + $sev_mem_object .= ',reduced-phys-bits='.$sev_hw_caps->{'reduced-phys-bits'};
> +
> + # guest policy bit calculation as described here:
> + # https://documentation.suse.com/sles/15-SP5/html/SLES-amd-sev/article-amd-sev.html#table-guestpolicy
> + $policy = 0b0000;
> + $policy += 0b0001 if $amd_sev_conf->{'no-debug'};
> + $policy += 0b0010 if $amd_sev_conf->{'no-key-sharing'};
> + $policy += 0b0100 if $amd_sev_conf->{type} eq 'es';
> + # disable migration with bit 3 nosend to prevent amd-sev-migration-attack
> + $policy += 0b1000;
Since the AMD SEV-SNP code below uses hex numbers since they use the
higher number bits 15, 16 and 19, I think it would be more consistent to
convert the above binary numbers (in its own patch before this!) to either
1. hexadecimal numbers too,
2. or maybe even more readable: use shift operators for all of these, as
the official manuals specify them in bits. Also ORing those is more
standard for me than adding binary numbers, but that's just a nit, e.g.
```
$policy = 0;
$policy |= 1 << 0 if $amd_sev_conf->{'no-debug'};
$policy |= 1 << 1 if $amd_sev_conf->{'no-key-sharing'};
$policy |= 1 << 2 if $amd_sev_conf->{type} eq 'es';
# disable migration with bit 3 nosend to prevent amd-sev-migration-attack
$policy |= 1 << 3;
```
> + } elsif ($amd_sev_conf->{type} eq 'snp') {
> + $sev_mem_object .= 'sev-snp-guest,id=sev0';
> + $sev_mem_object .= ',cbitpos='.$sev_hw_caps->{cbitpos};
> + $sev_mem_object .= ',reduced-phys-bits='.$sev_hw_caps->{'reduced-phys-bits'};
> +
> + # guest policy bit calculation as described here:
> + # https://libvirt.org/formatdomain.html#id121
This should be https://libvirt.org/formatdomain.html#launch-security,
and nit: could also directly point to the official AMD specification at
https://www.amd.com/system/files/TechDocs/56860.pdf and then chapter 4.3
Guest Policy settings
> + $policy = 0x20000;
> + $policy += 0x80000 if $amd_sev_conf->{'no-debug'};
I think this is wrong, as the AMD SEV-SNP specification in 4.3 Guest
Policy states:
Bit 19 (Debug): 0 - Debugging is disallowed, 1 - Debugging is allowed
This is probably an oversight, since the AMD **SEV** spec states the
opposite for the debug bit 0.
> + $policy += 0x10000 if $amd_sev_conf->{'allow-smt'};
And if you consider the shift operation + ORing then the above becomes:
```
# Reserved bit must be one
$policy = 1 << 17;
$policy |= 1 << 16 if $amd_sev_conf->{'allow-smt'};
$policy |= 1 << 19 if !$amd_sev_conf->{'no-debug'};
```
> + }
>
> - # guest policy bit calculation as described here:
> - # https://documentation.suse.com/sles/15-SP5/html/SLES-amd-sev/article-amd-sev.html#table-guestpolicy
> - my $policy = 0b0000;
> - $policy += 0b0001 if $amd_sev_conf->{'no-debug'};
> - $policy += 0b0010 if $amd_sev_conf->{'no-key-sharing'};
> - $policy += 0b0100 if $amd_sev_conf->{type} eq 'es';
> - # disable migration with bit 3 nosend to prevent amd-sev-migration-attack
> - $policy += 0b1000;
>
> $sev_mem_object .= ',policy='.sprintf("%#x", $policy);
> $sev_mem_object .= ',kernel-hashes=on' if ($amd_sev_conf->{'kernel-hashes'});
With those comments fixed, consider this:
Reviewed-by: Daniel Kral <d.kral at proxmox.com>
More information about the pve-devel
mailing list