[pve-devel] [PATCH v2 qemu-server] fix #2083: Add hv_tlbflush, hv_ipi, hv_evmcs enlightenments

Stefan Reiter s.reiter at proxmox.com
Mon Jun 24 14:37:14 CEST 2019


> On June 21, 2019 10:53 AM Thomas Lamprecht <t.lamprecht at proxmox.com> wrote:
> 
>  
> On 6/19/19 at 10:23 AM, Stefan Reiter wrote:
> > Kernels 4.18+ (4.17+ for evmcs) support new Hyper-V enlightenments for
> > Windows KVM guests. QEMU supports them since 3.0 (tlbflush) and 3.1 (ipi,
> > evmcs). tlbflush and ipi improve performance on overcommitted systems,
> > evmcs improves nested virtualization.
> > 
> > It's not entirely clear to me whether Win7 already supports these, but
> > since they don't cause any performance penalties (and the guest works fine
> > without crashing, which makes sense either way, because Hyper-V
> > enlightenments are opt-in by the guest OS), enabling them regardless should
> > be fine, as opposed to adding a new if branch for win8+.
> > 
> > Feature explanations to the best of my understanding:
> > 
> > hv_tlbflush allows the guest OS to trigger TLB shootdowns via a
> > hypercall. This allows CPUs to be identified via their vpindex (which
> > makes hv_vpindex a prerequisite to hv_tlbflush, but that is already
> > handled in our code). In overcommitted configurations, where multiple
> > vCPUs reside on one pCPU, this increases the performance of guest TLB
> > flushes by only flushing each pCPU once. It also allows multiple TLB
> > flushes with only one vmexit.
> > 
> > hv_ipi allows sending inter-processor interrupts via vpindex, once again
> > making hv_vpindex a prerequisite. The benefits are much the same as with
> > tlbflush.
> > 
> > hv_evmcs is a VM control structure in L1 guest memory, allowing an L1 guest
> > to modify the L2 VMCS and enter L2 without having the L0 host perform an
> > expensive VMCS update on trapping the nested vmenter.
> > 
> > Signed-off-by: Stefan Reiter <s.reiter at proxmox.com>
> > ---
> > 
> > v1 -> v2:
> >     * Added commit description
> >     * Fixed formatting (sorry)
> >     * Changed hv_ipi and hv_evmcs to QEMU version 3.1 only
> > 
> > The last one was my mistake; I forgot a step in my testing setup for v1.
> > ipi and evmcs are only supported in QEMU 3.1+, although kernel support
> > has been present since 4.18/4.17. Since only 3.0 is rolled out, this is
> > now preparation for the future, I guess.
> > 
> > Live migration, both to newer and to older versions, works fine in my
> > testing, as long as the target system's kernel is 4.18+. As far as I'm
> > aware, CPU feature flags like all of the hv_* ones are only checked on
> > guest bootup. Our code already strips them from the target command line,
> > so QEMU works fine, and KVM already supports the hypercalls.
> > 
> > Migration to systems running older kernels will probably fail.
> > 
> > The Microsoft Hyper-V spec is a good source for deeper information:
> > https://github.com/MicrosoftDocs/Virtualization-Documentation/raw/live/tlfs/Hypervisor%20Top%20Level%20Functional%20Specification%20v5.0C.pdf
> > 
> 
> Looks OK, I'm only waiting on a quick notice from you about the
> additional migration test Dominik suggested off-list/off-line, FYI.
> 

Migration tests ran OK; it works fine in both directions (old <-> new), as
long as the QEMU version stays the same and both systems have a kernel
supporting the hv_tlbflush feature.

> And maybe I'll follow up with moving the tlbflush also into 3.1,
> better to be safe than sorry here :)
> 

Might still be a good idea.
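
For reference, a rough, untested sketch of what that followup could look
like (simply folding the hv_tlbflush push into the 3.1 branch of the hunk
quoted below):

    if (qemu_machine_feature_enabled ($machine_type, $kvmver, 3, 1)) {
        push @$cpuFlags , 'hv_tlbflush';
        push @$cpuFlags , 'hv_ipi';
        push @$cpuFlags , 'hv_evmcs';
    }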

> >  PVE/QemuServer.pm | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> > index 341e0b0..aff0915 100644
> > --- a/PVE/QemuServer.pm
> > +++ b/PVE/QemuServer.pm
> > @@ -7170,6 +7170,15 @@ sub add_hyperv_enlightenments {
> >  	    push @$cpuFlags , 'hv_synic';
> >  	    push @$cpuFlags , 'hv_stimer';
> >  	}
> > +
> > +	if (qemu_machine_feature_enabled ($machine_type, $kvmver, 3, 0)) {
> > +	    push @$cpuFlags , 'hv_tlbflush';
> > +	}
> > +
> > +	if (qemu_machine_feature_enabled ($machine_type, $kvmver, 3, 1)) {
> > +	    push @$cpuFlags , 'hv_ipi';
> > +	    push @$cpuFlags , 'hv_evmcs';

evmcs might be a little bit trickier to get right after all. It seems that
it is only supported on Intel machines. On AMD machines QEMU doesn't start
at all with the flag enabled.

It also disables APICv (posted interrupts); however, SynIC already does that
and is enabled by default, so it shouldn't be an issue.

On the other hand, evmcs improves nesting performance even for Linux guests
(e.g. a Linux L2 on a Linux L1 on a Linux host), so I could look into enabling
it in a different part of the code altogether (the other Hyper-V
enlightenments do nothing for Linux guests, since most of that is already
covered by KVM PV extensions). There I could check for supported (Intel) CPUs,
i.e. vmx in the CPU flags, roughly as in the sketch below.
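
Purely as an untested sketch (the helper name is hypothetical and not part of
this patch), the check could be as simple as looking for the vmx flag in
/proc/cpuinfo:

    # hypothetical helper, not part of this patch: returns true if the host
    # CPU advertises VMX (Intel VT-x), which hv_evmcs depends on
    sub host_cpu_has_vmx {
        open(my $fh, '<', '/proc/cpuinfo') or return 0;
        local $/;            # slurp the whole file at once
        my $cpuinfo = <$fh>;
        close($fh);
        return $cpuinfo =~ /^flags\s*:.*\bvmx\b/m ? 1 : 0;
    }

    # hv_evmcs would then only be pushed when both conditions hold, e.g.:
    # push @$cpuFlags , 'hv_evmcs'
    #     if qemu_machine_feature_enabled ($machine_type, $kvmver, 3, 1)
    #         && host_cpu_has_vmx();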

Maybe remove hv_evmcs in your followup for now and I'll send another patch
series with that?

> > +	}
> >      }
> >  }
> >  
> >
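
Just to illustrate the intended effect (illustrative only; the exact flag set
depends on the configured ostype and CPU type): for a Windows guest on a 3.1
machine type, the new flags would simply be appended to the hv_*
enlightenments we already add to the -cpu argument, roughly along the lines
of:

    -cpu kvm64,...,hv_vpindex,hv_synic,hv_stimer,hv_tlbflush,hv_ipi,hv_evmcs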



