2023-09-15 21:09:32

by Tom Lendacky

[permalink] [raw]
Subject: [PATCH v2 0/3] SEV-ES TSC_AUX virtualization fix and optimization

This patch series provides fixes to the TSC_AUX virtualization support
and an optimization to reduce the number of WRMSRs to TSC_AUX when
it is virtualized.

---

Changes since v1:
- Move TSC_AUX virtualization support out of init_vmcb_after_set_cpuid()
path and into the vcpu_after_set_cpuid() path
- Add an additional patch to properly set or clear intercepts based
on TSC_AUX virtualization requirements
- Simplify the TSC_AUX virtualization optimization to set the host save
area TSC_AUX value once during svm_hardware_enable().
- Since the TSC_AUX virtualization can't be disabled for an SEV-ES guest,
eliminate the "v_tsc_aux" flag and check against the host feature and
type of guest, directly.

Patches based on https://git.kernel.org/pub/scm/virt/kvm/kvm.git master
and commit:
7c7cce2cf7ee ("Merge tag 'kvmarm-fixes-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD")

Tom Lendacky (3):
KVM: SVM: Fix TSC_AUX virtualization setup
KVM: SVM: Fix TSC_AUX virtualization intercept update logic
KVM: SVM: Do not use user return MSR support for virtualized TSC_AUX

arch/x86/kvm/svm/sev.c | 34 +++++++++++++++++++++++++--------
arch/x86/kvm/svm/svm.c | 43 ++++++++++++++++++++++++++++++++++--------
arch/x86/kvm/svm/svm.h | 1 +
3 files changed, 62 insertions(+), 16 deletions(-)

--
2.41.0


2023-09-16 01:16:35

by Tom Lendacky

[permalink] [raw]
Subject: [PATCH v2 1/3] KVM: SVM: Fix TSC_AUX virtualization setup

The checks for virtualizing TSC_AUX occur during the vCPU reset processing
path. However, at the time of initial vCPU reset processing, when the vCPU
is first created, not all of the guest CPUID information has been set. In
this case the RDTSCP and RDPID feature support for the guest is not in
place and so TSC_AUX virtualization is not established.

This continues for each vCPU created for the guest. On the first boot of
an AP, vCPU reset processing is executed as a result of an APIC INIT
event, this time with all of the guest CPUID information set, resulting
in TSC_AUX virtualization being enabled, but only for the APs. The BSP
always sees a TSC_AUX value of 0 which probably went unnoticed because,
at least for Linux, the BSP TSC_AUX value is 0.

Move the TSC_AUX virtualization enablement out of the init_vmcb() path and
into the vcpu_after_set_cpuid() path to allow for proper initialization of
the support after the guest CPUID information has been set.

Fixes: 296d5a17e793 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/kvm/svm/sev.c | 35 +++++++++++++++++++++++++++--------
arch/x86/kvm/svm/svm.c | 9 ++-------
arch/x86/kvm/svm/svm.h | 1 +
3 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index b9a0a939d59f..4ac01f338903 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2962,6 +2962,33 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in)
count, in);
}

+static void sev_es_vcpu_after_set_cpuid(struct vcpu_svm *svm)
+{
+ struct kvm_vcpu *vcpu = &svm->vcpu;
+
+ if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) &&
+ (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
+ guest_cpuid_has(vcpu, X86_FEATURE_RDPID))) {
+ set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, 1, 1);
+ if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
+ svm_clr_intercept(svm, INTERCEPT_RDTSCP);
+ }
+}
+
+void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm)
+{
+ struct kvm_vcpu *vcpu = &svm->vcpu;
+ struct kvm_cpuid_entry2 *best;
+
+ /* For sev guests, the memory encryption bit is not reserved in CR3. */
+ best = kvm_find_cpuid_entry(vcpu, 0x8000001F);
+ if (best)
+ vcpu->arch.reserved_gpa_bits &= ~(1UL << (best->ebx & 0x3f));
+
+ if (sev_es_guest(svm->vcpu.kvm))
+ sev_es_vcpu_after_set_cpuid(svm);
+}
+
static void sev_es_init_vmcb(struct vcpu_svm *svm)
{
struct vmcb *vmcb = svm->vmcb01.ptr;
@@ -3024,14 +3051,6 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 1, 1);
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 1, 1);
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 1, 1);
-
- if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) &&
- (guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDTSCP) ||
- guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDPID))) {
- set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, 1, 1);
- if (guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDTSCP))
- svm_clr_intercept(svm, INTERCEPT_RDTSCP);
- }
}

void sev_init_vmcb(struct vcpu_svm *svm)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f283eb47f6ac..aef1ddf0b705 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4284,7 +4284,6 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index)
static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
- struct kvm_cpuid_entry2 *best;

/*
* SVM doesn't provide a way to disable just XSAVES in the guest, KVM
@@ -4328,12 +4327,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_FLUSH_CMD, 0,
!!guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D));

- /* For sev guests, the memory encryption bit is not reserved in CR3. */
- if (sev_guest(vcpu->kvm)) {
- best = kvm_find_cpuid_entry(vcpu, 0x8000001F);
- if (best)
- vcpu->arch.reserved_gpa_bits &= ~(1UL << (best->ebx & 0x3f));
- }
+ if (sev_guest(vcpu->kvm))
+ sev_vcpu_after_set_cpuid(svm);

init_vmcb_after_set_cpuid(vcpu);
}
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index f41253958357..be67ab7fdd10 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -684,6 +684,7 @@ void __init sev_hardware_setup(void);
void sev_hardware_unsetup(void);
int sev_cpu_init(struct svm_cpu_data *sd);
void sev_init_vmcb(struct vcpu_svm *svm);
+void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm);
void sev_free_vcpu(struct kvm_vcpu *vcpu);
int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
--
2.41.0

2023-09-16 06:01:39

by Tom Lendacky

[permalink] [raw]
Subject: [PATCH v2 2/3] KVM: SVM: Fix TSC_AUX virtualization intercept update logic

With the TSC_AUX virtualization support now in the vcpu_set_after_cpuid()
path, the intercepts must be either cleared or set based on the guest
CPUID input. Currently the support only clears the intercepts.

Also, vcpu_set_after_cpuid() calls svm_recalc_instruction_intercepts() as
part of the processing, so the setting or clearing of the RDTSCP intercept
can be dropped from the TSC_AUX virtualization support.

Update the support to always set or clear the TSC_AUX MSR intercept based
on the virtualization requirements.

Fixes: 296d5a17e793 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/kvm/svm/sev.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 4ac01f338903..4900c078045a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2966,12 +2966,11 @@ static void sev_es_vcpu_after_set_cpuid(struct vcpu_svm *svm)
{
struct kvm_vcpu *vcpu = &svm->vcpu;

- if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) &&
- (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
- guest_cpuid_has(vcpu, X86_FEATURE_RDPID))) {
- set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, 1, 1);
- if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
- svm_clr_intercept(svm, INTERCEPT_RDTSCP);
+ if (boot_cpu_has(X86_FEATURE_V_TSC_AUX)) {
+ bool v_tsc_aux = guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
+ guest_cpuid_has(vcpu, X86_FEATURE_RDPID);
+
+ set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, v_tsc_aux, v_tsc_aux);
}
}

--
2.41.0

2023-09-16 15:31:08

by Tom Lendacky

[permalink] [raw]
Subject: [PATCH v2 3/3] KVM: SVM: Do not use user return MSR support for virtualized TSC_AUX

When the TSC_AUX MSR is virtualized, the TSC_AUX value is swap type "B"
within the VMSA. This means that the guest value is loaded on VMRUN and
the host value is restored from the host save area on #VMEXIT.

Since the value is restored on #VMEXIT, the KVM user return MSR support
for TSC_AUX can be replaced by populating the host save area with the
current host value of TSC_AUX. And, since TSC_AUX is not changed by Linux
post-boot, the host save area can be set once in svm_hardware_enable().
This eliminates the two WRMSR instructions associated with the user return
MSR support.

Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/kvm/svm/svm.c | 34 +++++++++++++++++++++++++++++++++-
1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index aef1ddf0b705..9507df93f410 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -683,6 +683,21 @@ static int svm_hardware_enable(void)

amd_pmu_enable_virt();

+ /*
+ * If TSC_AUX virtualization is supported, TSC_AUX becomes a swap type
+ * "B" field (see sev_es_prepare_switch_to_guest()) for SEV-ES guests.
+ * Since Linux does not change the value of TSC_AUX once set, prime the
+ * TSC_AUX field now to avoid a RDMSR on every vCPU run.
+ */
+ if (boot_cpu_has(X86_FEATURE_V_TSC_AUX)) {
+ struct sev_es_save_area *hostsa;
+ u32 msr_hi;
+
+ hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
+
+ rdmsr(MSR_TSC_AUX, hostsa->tsc_aux, msr_hi);
+ }
+
return 0;
}

@@ -1532,7 +1547,14 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
if (tsc_scaling)
__svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);

- if (likely(tsc_aux_uret_slot >= 0))
+ /*
+ * TSC_AUX is always virtualized for SEV-ES guests when the feature is
+ * available. The user return MSR support is not required in this case
+ * because TSC_AUX is restored on #VMEXIT from the host save area
+ * (which has been initialized in svm_hardware_enable()).
+ */
+ if (likely(tsc_aux_uret_slot >= 0) &&
+ (!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);

svm->guest_state_loaded = true;
@@ -3086,6 +3108,16 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
svm->sysenter_esp_hi = guest_cpuid_is_intel(vcpu) ? (data >> 32) : 0;
break;
case MSR_TSC_AUX:
+ /*
+ * TSC_AUX is always virtualized for SEV-ES guests when the
+ * feature is available. The user return MSR support is not
+ * required in this case because TSC_AUX is restored on #VMEXIT
+ * from the host save area (which has been initialized in
+ * svm_hardware_enable()).
+ */
+ if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) && sev_es_guest(vcpu->kvm))
+ break;
+
/*
* TSC_AUX is usually changed only during boot and never read
* directly. Intercept TSC_AUX instead of exposing it to the
--
2.41.0

2023-09-22 22:43:08

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] SEV-ES TSC_AUX virtualization fix and optimization

Queued, thanks. The part that stood out in patch 2 is the removal of
svm_clr_intercept(), which also applies when the initialization is done
in the wrong place. Either way, svm_clr_intercept() is always going
to be called by svm_recalc_instruction_intercepts() if guest has the
RDTSC bit in its CPUID.

So I extracted that into a separate patch and squashed the rest of
patch 2 into patch 1.

Paolo


2023-09-25 16:57:17

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] SEV-ES TSC_AUX virtualization fix and optimization

On 9/22/23 16:24, Paolo Bonzini wrote:
> Queued, thanks. The part that stood out in patch 2 is the removal of
> svm_clr_intercept(), which also applies when the initialization is done
> in the wrong place. Either way, svm_clr_intercept() is always going
> to be called by svm_recalc_instruction_intercepts() if guest has the
> RDTSC bit in its CPUID.
>
> So I extracted that into a separate patch and squashed the rest of
> patch 2 into patch 1.

Works for me. Thanks, Paolo!

>
> Paolo
>
>