2020-04-28 23:12:31

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 0/2] KVM: nVMX: vmcs.SYSENTER optimization and "fix"

Patch 1 is a "fix" for handling SYSENTER_EIP/ESP in L2 on a 32-bit vCPU.
The primary motivation is to provide consistent behavior after patch 2.

Patch 2 is essentially a re-submission of a nested VMX optimization to
avoid redundant VMREADs to the SYSENTER fields in the nested VM-Exit path.

After patch 2 and without patch 1, KVM would end up with weird behavior
where L1 and L2 would only see 32-bit values for their own SYSENTER_E*P
MSRs, but L1 could see a 64-bit value for L2's MSRs.

Sean Christopherson (2):
KVM: nVMX: Truncate writes to vmcs.SYSENTER_EIP/ESP for 32-bit vCPU
KVM: nVMX: Drop superfluous VMREAD of vmcs02.GUEST_SYSENTER_*

arch/x86/kvm/vmx/nested.c | 4 ----
arch/x86/kvm/vmx/vmx.c | 18 ++++++++++++++++--
2 files changed, 16 insertions(+), 6 deletions(-)

--
2.26.0


2020-04-28 23:12:43

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 2/2] KVM: nVMX: Drop superfluous VMREAD of vmcs02.GUEST_SYSENTER_*

Don't propagate GUEST_SYSENTER_* from vmcs02 to vmcs12 on nested VM-Exit
as the vmcs12 fields are updated in vmx_set_msr(), and writes to the
corresponding MSRs are always intercepted by KVM when running L2.

Dropping the propagation was intended to be done in the same commit that
added vmcs12 writes in vmx_set_msr()[1], but for reasons unknown was
only shuffled around[2][3].

[1] https://patchwork.kernel.org/patch/10933215
[2] https://patchwork.kernel.org/patch/10933215/#22682289
[3] https://lore.kernel.org/patchwork/patch/1088643

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 2c36f3f53108..a54231bac047 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3944,10 +3944,6 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
vmcs12->guest_cs_ar_bytes = vmcs_read32(GUEST_CS_AR_BYTES);
vmcs12->guest_ss_ar_bytes = vmcs_read32(GUEST_SS_AR_BYTES);

- vmcs12->guest_sysenter_cs = vmcs_read32(GUEST_SYSENTER_CS);
- vmcs12->guest_sysenter_esp = vmcs_readl(GUEST_SYSENTER_ESP);
- vmcs12->guest_sysenter_eip = vmcs_readl(GUEST_SYSENTER_EIP);
-
vmcs12->guest_interruptibility_info =
vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);

--
2.26.0

2020-04-28 23:13:10

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 1/2] KVM: nVMX: Truncate writes to vmcs.SYSENTER_EIP/ESP for 32-bit vCPU

Explicitly truncate the data written to vmcs.SYSENTER_EIP/ESP on WRMSR
if the virtual CPU doesn't support 64-bit mode. The SYSENTER address
fields in the VMCS are natural width, i.e. bits 63:32 are dropped if the
CPU doesn't support Intel 64 architectures. This behavior is visible to
the guest after a VM-Exit/VM-Exit roundtrip, e.g. if the guest sets bits
63:32 in the actual MSR.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3ab6ca6062ce..bc91ce499a7a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1936,6 +1936,16 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 0;
}

+static u64 nested_vmx_truncate_sysenter_addr(struct kvm_vcpu *vcpu,
+ u64 data)
+{
+#ifdef CONFIG_X86_64
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_LM))
+ return (u32)data;
+#endif
+ return (unsigned long)data;
+}
+
/*
* Writes msr value into the appropriate "register".
* Returns 0 on success, non-0 otherwise.
@@ -1973,13 +1983,17 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
vmcs_write32(GUEST_SYSENTER_CS, data);
break;
case MSR_IA32_SYSENTER_EIP:
- if (is_guest_mode(vcpu))
+ if (is_guest_mode(vcpu)) {
+ data = nested_vmx_truncate_sysenter_addr(vcpu, data);
get_vmcs12(vcpu)->guest_sysenter_eip = data;
+ }
vmcs_writel(GUEST_SYSENTER_EIP, data);
break;
case MSR_IA32_SYSENTER_ESP:
- if (is_guest_mode(vcpu))
+ if (is_guest_mode(vcpu)) {
+ data = nested_vmx_truncate_sysenter_addr(vcpu, data);
get_vmcs12(vcpu)->guest_sysenter_esp = data;
+ }
vmcs_writel(GUEST_SYSENTER_ESP, data);
break;
case MSR_IA32_DEBUGCTLMSR:
--
2.26.0

2020-04-28 23:50:01

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH 0/2] KVM: nVMX: vmcs.SYSENTER optimization and "fix"

On Tue, Apr 28, 2020 at 4:10 PM Sean Christopherson
<[email protected]> wrote:
>
> Patch 1 is a "fix" for handling SYSENTER_EIP/ESP in L2 on a 32-bit vCPU.
> The primary motivation is to provide consistent behavior after patch 2.
>
> Patch 2 is essentially a re-submission of a nested VMX optimization to
> avoid redundant VMREADs to the SYSENTER fields in the nested VM-Exit path.
>
> After patch 2 and without patch 1, KVM would end up with weird behavior
> where L1 and L2 would only see 32-bit values for their own SYSENTER_E*P
> MSRs, but L1 could see a 64-bit value for L2's MSRs.
>
> Sean Christopherson (2):
> KVM: nVMX: Truncate writes to vmcs.SYSENTER_EIP/ESP for 32-bit vCPU
> KVM: nVMX: Drop superfluous VMREAD of vmcs02.GUEST_SYSENTER_*
>
> arch/x86/kvm/vmx/nested.c | 4 ----
> arch/x86/kvm/vmx/vmx.c | 18 ++++++++++++++++--
> 2 files changed, 16 insertions(+), 6 deletions(-)

It seems like this could be fixed more generally by truncating
natural-width fields on 32-bit vCPUs in handle_vmwrite(). However,
that also would imply that we can't shadow any natural-width fields on
a 32-bit vCPU.

2020-04-29 15:13:43

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 0/2] KVM: nVMX: vmcs.SYSENTER optimization and "fix"

On Tue, Apr 28, 2020 at 04:45:25PM -0700, Jim Mattson wrote:
> On Tue, Apr 28, 2020 at 4:10 PM Sean Christopherson
> <[email protected]> wrote:
> >
> > Patch 1 is a "fix" for handling SYSENTER_EIP/ESP in L2 on a 32-bit vCPU.
> > The primary motivation is to provide consistent behavior after patch 2.
> >
> > Patch 2 is essentially a re-submission of a nested VMX optimization to
> > avoid redundant VMREADs to the SYSENTER fields in the nested VM-Exit path.
> >
> > After patch 2 and without patch 1, KVM would end up with weird behavior
> > where L1 and L2 would only see 32-bit values for their own SYSENTER_E*P
> > MSRs, but L1 could see a 64-bit value for L2's MSRs.
> >
> > Sean Christopherson (2):
> > KVM: nVMX: Truncate writes to vmcs.SYSENTER_EIP/ESP for 32-bit vCPU
> > KVM: nVMX: Drop superfluous VMREAD of vmcs02.GUEST_SYSENTER_*
> >
> > arch/x86/kvm/vmx/nested.c | 4 ----
> > arch/x86/kvm/vmx/vmx.c | 18 ++++++++++++++++--
> > 2 files changed, 16 insertions(+), 6 deletions(-)
>
> It seems like this could be fixed more generally by truncating
> natural-width fields on 32-bit vCPUs in handle_vmwrite(). However,
> that also would imply that we can't shadow any natural-width fields on
> a 32-bit vCPU.

handle_vmwrite() and handle_vmread() already correctly handle truncating
writes/reads when L1 isn't in 64-bit mode.

This path is effectively out-of-band, for lack of a better phrase. The
WRMSR is intercepted and the data is stuffed into vmcs02. Without these
patches, the effective L2 state depends on the underlying hardware
capabilities, e.g. L2 gets 64-bit behavior if L0 is a 64-bit CPU, and
32-bit behavior if L0 is a 32-bit CPU. It's "wrong", but consistent as the
value seen by L2 is the same value that is saved into vmcs12. Of course in
the 64-bit CPU case, L1 can't actually see the full value via VMREAD as the
vCPU is 32-bit, but at least the underlying memory/machinery is consistent.

With just patch 2, the above would still be true for 64-bit L0, but for
32-bit L0 it would result in L2 seeing a 32-bit value while saving a 64-bit
value into vmcs12. Again, L1 wouldn't see the 64-bit value when using
VMREAD, but the value in memory is still wrong-ish.

Truncating the value on WRMSR interception makes the behavior fully
dependent on the vCPU capabilities, i.e. what L2 sees is the same value
that's saved into vmcs12, which is the same value seen by VMREAD in L1,
irrespective of whether L0 is 64-bit or 32-bit.

2020-05-04 18:45:55

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 0/2] KVM: nVMX: vmcs.SYSENTER optimization and "fix"

On 29/04/20 01:10, Sean Christopherson wrote:
> Patch 1 is a "fix" for handling SYSENTER_EIP/ESP in L2 on a 32-bit vCPU.
> The primary motivation is to provide consistent behavior after patch 2.
>
> Patch 2 is essentially a re-submission of a nested VMX optimization to
> avoid redundant VMREADs to the SYSENTER fields in the nested VM-Exit path.
>
> After patch 2 and without patch 1, KVM would end up with weird behavior
> where L1 and L2 would only see 32-bit values for their own SYSENTER_E*P
> MSRs, but L1 could see a 64-bit value for L2's MSRs.
>
> Sean Christopherson (2):
> KVM: nVMX: Truncate writes to vmcs.SYSENTER_EIP/ESP for 32-bit vCPU
> KVM: nVMX: Drop superfluous VMREAD of vmcs02.GUEST_SYSENTER_*
>
> arch/x86/kvm/vmx/nested.c | 4 ----
> arch/x86/kvm/vmx/vmx.c | 18 ++++++++++++++++--
> 2 files changed, 16 insertions(+), 6 deletions(-)
>

Queued, thanks.

Paolo