On Thu, Sep 14, 2023, Tom Lendacky wrote:
> The checks for virtualizing TSC_AUX occur during the vCPU reset processing
> path. However, at the time of initial vCPU reset processing, when the vCPU
> is first created, not all of the guest CPUID information has been set. In
> this case the RDTSCP and RDPID feature support for the guest is not in
> place and so TSC_AUX virtualization is not established.
>
> This continues for each vCPU created for the guest. On the first boot of
> an AP, vCPU reset processing is executed as a result of an APIC INIT
> event, this time with all of the guest CPUID information set, resulting
> in TSC_AUX virtualization being enabled, but only for the APs. The BSP
> always sees a TSC_AUX value of 0 which probably went unnoticed because,
> at least for Linux, the BSP TSC_AUX value is 0.
>
> Move the TSC_AUX virtualization enablement into the vcpu_after_set_cpuid()
> path to allow for proper initialization of the support after the guest
> CPUID information has been set.
>
> Fixes: 296d5a17e793 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
> Signed-off-by: Tom Lendacky <[email protected]>
> ---
> arch/x86/kvm/svm/sev.c | 27 +++++++++++++++++++--------
> arch/x86/kvm/svm/svm.c | 3 +++
> arch/x86/kvm/svm/svm.h | 1 +
> 3 files changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index b9a0a939d59f..565c9de87c6d 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2962,6 +2962,25 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in)
> count, in);
> }
>
> +static void sev_es_init_vmcb_after_set_cpuid(struct vcpu_svm *svm)
I would rather name this sev_es_after_set_cpuid() and call it directly from
svm_vcpu_after_set_cpuid(). Or I suppose bounce through sev_after_set_cpuid(),
but that seems gratuitous.
AFAICT, there's no point in calling this from init_vmcb(); guest_cpuid_has() is
guaranteed to be false when called during vCPU creation and so the intercept
behavior will be correct, and even if SEV-ES called init_vmcb() from
shutdown_interception(), which it doesn't, guest_cpuid_has() wouldn't change,
i.e. the intercepts wouldn't need to be changed.
init_vmcb_after_set_cpuid() is a special snowflake because it handles both SVM's
true defaults *and* guest CPUID updates.
> +{
> + struct kvm_vcpu *vcpu = &svm->vcpu;
> +
> + if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) &&
> + (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
> + guest_cpuid_has(vcpu, X86_FEATURE_RDPID))) {
> + set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, 1, 1);
This needs to toggled interception back on if RDTSCP and RDPID are hidden from
the guest. KVM's wonderful ABI doesn't disallow multiple calls to KVM_SET_CPUID2
before KVM_RUN.
> + if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> + svm_clr_intercept(svm, INTERCEPT_RDTSCP);
Same thing here.
> + }
> +}
On 9/14/23 15:28, Sean Christopherson wrote:
> On Thu, Sep 14, 2023, Tom Lendacky wrote:
>> The checks for virtualizing TSC_AUX occur during the vCPU reset processing
>> path. However, at the time of initial vCPU reset processing, when the vCPU
>> is first created, not all of the guest CPUID information has been set. In
>> this case the RDTSCP and RDPID feature support for the guest is not in
>> place and so TSC_AUX virtualization is not established.
>>
>> This continues for each vCPU created for the guest. On the first boot of
>> an AP, vCPU reset processing is executed as a result of an APIC INIT
>> event, this time with all of the guest CPUID information set, resulting
>> in TSC_AUX virtualization being enabled, but only for the APs. The BSP
>> always sees a TSC_AUX value of 0 which probably went unnoticed because,
>> at least for Linux, the BSP TSC_AUX value is 0.
>>
>> Move the TSC_AUX virtualization enablement into the vcpu_after_set_cpuid()
>> path to allow for proper initialization of the support after the guest
>> CPUID information has been set.
>>
>> Fixes: 296d5a17e793 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
>> Signed-off-by: Tom Lendacky <[email protected]>
>> ---
>> arch/x86/kvm/svm/sev.c | 27 +++++++++++++++++++--------
>> arch/x86/kvm/svm/svm.c | 3 +++
>> arch/x86/kvm/svm/svm.h | 1 +
>> 3 files changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
>> index b9a0a939d59f..565c9de87c6d 100644
>> --- a/arch/x86/kvm/svm/sev.c
>> +++ b/arch/x86/kvm/svm/sev.c
>> @@ -2962,6 +2962,25 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in)
>> count, in);
>> }
>>
>> +static void sev_es_init_vmcb_after_set_cpuid(struct vcpu_svm *svm)
>
> I would rather name this sev_es_after_set_cpuid() and call it directly from
> svm_vcpu_after_set_cpuid(). Or I suppose bounce through sev_after_set_cpuid(),
> but that seems gratuitous.
There is a sev_guest() check in svm_vcpu_after_set_cpuid(), so I can move
that into sev_vcpu_after_set_cpuid() and keep the separate
sev_es_vcpu_after_set_cpuid(). And it looks like you would prefer to not
have "vcpu" in the function name? Might be better search-wise if vcpu
remains part of the name?
>
> AFAICT, there's no point in calling this from init_vmcb(); guest_cpuid_has() is
> guaranteed to be false when called during vCPU creation and so the intercept
> behavior will be correct, and even if SEV-ES called init_vmcb() from
> shutdown_interception(), which it doesn't, guest_cpuid_has() wouldn't change,
> i.e. the intercepts wouldn't need to be changed.
Ok, I thought that's how it worked, but wasn't 100% sure. I'll move it out
of the init_vmcb() path.
>
> init_vmcb_after_set_cpuid() is a special snowflake because it handles both SVM's
> true defaults *and* guest CPUID updates.
>
>> +{
>> + struct kvm_vcpu *vcpu = &svm->vcpu;
>> +
>> + if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) &&
>> + (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
>> + guest_cpuid_has(vcpu, X86_FEATURE_RDPID))) {
>> + set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, 1, 1);
>
> This needs to toggled interception back on if RDTSCP and RDPID are hidden from
> the guest. KVM's wonderful ABI doesn't disallow multiple calls to KVM_SET_CPUID2
> before KVM_RUN.
Do you want that as a separate patch with the first patch purely
addressing the current issue? Or combine them?
>
>> + if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
>> + svm_clr_intercept(svm, INTERCEPT_RDTSCP);
>
> Same thing here.
Will do.
Thanks,
Tom
>
>> + }
>> +}
On 9/14/23 15:48, Tom Lendacky wrote:
> On 9/14/23 15:28, Sean Christopherson wrote:
>> On Thu, Sep 14, 2023, Tom Lendacky wrote:
>
>>
>>> + if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
>>> + svm_clr_intercept(svm, INTERCEPT_RDTSCP);
>>
>> Same thing here.
>
> Will do.
For RDTSCP, svm_recalc_instruction_intercepts() will set/clear the RDTSCP
intercept as part of the svm_vcpu_set_after_cpuid() path, but it will only
do it based on kvm_cpu_cap_has(X86_FEATURE_RDTSCP) being true, which is
very likely.
Do you think that is good enough and we can drop the setting and clearing
of the RDTSCP intercept in the sev_es_vcpu_set_after_cpuid() function and
only deal with the TSC_AUX MSR intercept?
On a side note, it looks like RDTSCP would not be intercepted if the KVM
cap X86_FEATURE_RDTSCP feature is cleared, however unlikely, in
kvm_set_cpu_caps() and RDTSCP is not advertised to the guest (assuming the
guest is ignoring the RDTSCP CPUID bit).
Thanks,
Tom
>
> Thanks,
> Tom
>
>>
>>> + }
>>> +}