2023-11-19 17:35:42

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features

On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> When updating guest CPUID entries to emulate runtime behavior, e.g. when
> the guest enables a CR4-based feature that is tied to a CPUID flag, also
> update the vCPU's cpu_caps accordingly. This will allow replacing all
> usage of guest_cpuid_has() with guest_cpu_cap_has().
>
> Take care not to update guest capabilities when KVM is updating CPUID
> entries that *may* become the vCPU's CPUID, e.g. if userspace tries to set
> bogus CPUID information. No extra call to update cpu_caps is needed as
> the cpu_caps are initialized from the incoming guest CPUID, i.e. will
> automatically get the updated values.
>
> Note, none of the features in question use guest_cpu_cap_has() at this
> time, i.e. aside from settings bits in cpu_caps, this is a glorified nop.
>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 48 +++++++++++++++++++++++++++++++-------------
> 1 file changed, 34 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 36bd04030989..37a991439fe6 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -262,31 +262,48 @@ static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)
> return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
> }
>
> +static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
> + struct kvm_cpuid_entry2 *entry,
> + unsigned int x86_feature,
> + bool has_feature)
> +{
> + if (entry)
> + cpuid_entry_change(entry, x86_feature, has_feature);
> +
> + if (vcpu)
> + guest_cpu_cap_change(vcpu, x86_feature, has_feature);
> +}
> +
> static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> int nent)
> {
> struct kvm_cpuid_entry2 *best;
> + struct kvm_vcpu *caps = vcpu;
> +
> + /*
> + * Don't update vCPU capabilities if KVM is updating CPUID entries that
> + * are coming in from userspace!
> + */
> + if (entries != vcpu->arch.cpuid_entries)
> + caps = NULL;

I think that this should be decided by the caller. Just a boolean will suffice.

Or even better: since the userspace CPUID update is really not important in terms of performance,
why to special case it?

Even if these guest caps are later overwritten, I don't see why we
need to avoid updating them, and in fact introduce a small risk of them not being consistent
with the other cpu caps.

With this we can avoid having the 'cap' variable which is *very* confusing as well.


>
> best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> - if (best) {
> - /* Update OSXSAVE bit */
> - if (boot_cpu_has(X86_FEATURE_XSAVE))
> - cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
> +
> + if (boot_cpu_has(X86_FEATURE_XSAVE))
> + kvm_update_feature_runtime(caps, best, X86_FEATURE_OSXSAVE,
> kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
>
> - cpuid_entry_change(best, X86_FEATURE_APIC,
> - vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> + kvm_update_feature_runtime(caps, best, X86_FEATURE_APIC,
> + vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
>
> - if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> - cpuid_entry_change(best, X86_FEATURE_MWAIT,
> - vcpu->arch.ia32_misc_enable_msr &
> - MSR_IA32_MISC_ENABLE_MWAIT);
> - }
> + if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> + kvm_update_feature_runtime(caps, best, X86_FEATURE_MWAIT,
> + vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT);
>
> best = cpuid_entry2_find(entries, nent, 7, 0);
> - if (best && boot_cpu_has(X86_FEATURE_PKU))
> - cpuid_entry_change(best, X86_FEATURE_OSPKE,
> - kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
> + if (boot_cpu_has(X86_FEATURE_PKU))
> + kvm_update_feature_runtime(caps, best, X86_FEATURE_OSPKE,
> + kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
>
> best = cpuid_entry2_find(entries, nent, 0xD, 0);
> if (best)
> @@ -353,6 +370,9 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> * Reset guest capabilities to userspace's guest CPUID definition, i.e.
> * honor userspace's definition for features that don't require KVM or
> * hardware management/support (or that KVM simply doesn't care about).
> + *
> + * Note, KVM has already done runtime updates on guest CPUID, i.e. this
> + * will also correctly set runtime features in guest CPU capabilities.
> */
> for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
> const struct cpuid_reg cpuid = reverse_cpuid[i];


Best regards,
Maxim Levitsky


2023-11-24 06:35:41

by Xu Yilun

[permalink] [raw]
Subject: Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features

On Sun, Nov 19, 2023 at 07:35:30PM +0200, Maxim Levitsky wrote:
> On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > When updating guest CPUID entries to emulate runtime behavior, e.g. when
> > the guest enables a CR4-based feature that is tied to a CPUID flag, also
> > update the vCPU's cpu_caps accordingly. This will allow replacing all
> > usage of guest_cpuid_has() with guest_cpu_cap_has().
> >
> > Take care not to update guest capabilities when KVM is updating CPUID
> > entries that *may* become the vCPU's CPUID, e.g. if userspace tries to set
> > bogus CPUID information. No extra call to update cpu_caps is needed as
> > the cpu_caps are initialized from the incoming guest CPUID, i.e. will
> > automatically get the updated values.
> >
> > Note, none of the features in question use guest_cpu_cap_has() at this
> > time, i.e. aside from settings bits in cpu_caps, this is a glorified nop.
> >
> > Signed-off-by: Sean Christopherson <[email protected]>
> > ---
> > arch/x86/kvm/cpuid.c | 48 +++++++++++++++++++++++++++++++-------------
> > 1 file changed, 34 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 36bd04030989..37a991439fe6 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -262,31 +262,48 @@ static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)
> > return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
> > }
> >
> > +static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
> > + struct kvm_cpuid_entry2 *entry,
> > + unsigned int x86_feature,
> > + bool has_feature)
> > +{
> > + if (entry)
> > + cpuid_entry_change(entry, x86_feature, has_feature);
> > +
> > + if (vcpu)
> > + guest_cpu_cap_change(vcpu, x86_feature, has_feature);
> > +}
> > +
> > static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> > int nent)
> > {
> > struct kvm_cpuid_entry2 *best;
> > + struct kvm_vcpu *caps = vcpu;
> > +
> > + /*
> > + * Don't update vCPU capabilities if KVM is updating CPUID entries that
> > + * are coming in from userspace!
> > + */
> > + if (entries != vcpu->arch.cpuid_entries)
> > + caps = NULL;
>
> I think that this should be decided by the caller. Just a boolean will suffice.

kvm_set_cpuid() calls this function only to validate/adjust the temporary
"entries" variable. While kvm_update_cpuid_runtime() calls it to do system
level changes.

So I kind of agree to make the caller fully awared, how about adding a
newly named wrapper for kvm_set_cpuid(), like:


static void kvm_adjust_cpuid_entry(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
int nent)

{
WARN_ON(entries == vcpu->arch.cpuid_entries);
__kvm_update_cpuid_runtime(vcpu, entries, nent);
}

>
> Or even better: since the userspace CPUID update is really not important in terms of performance,
> why to special case it?
>
> Even if these guest caps are later overwritten, I don't see why we
> need to avoid updating them, and in fact introduce a small risk of them not being consistent

IIUC, for kvm_set_cpuid() case, KVM may then fail the userspace cpuid setting,
so we can't change guest caps at this phase.

Thanks,
Yilun

> with the other cpu caps.
>
> With this we can avoid having the 'cap' variable which is *very* confusing as well.
>
>
> >
> > best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> > - if (best) {
> > - /* Update OSXSAVE bit */
> > - if (boot_cpu_has(X86_FEATURE_XSAVE))
> > - cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
> > +
> > + if (boot_cpu_has(X86_FEATURE_XSAVE))
> > + kvm_update_feature_runtime(caps, best, X86_FEATURE_OSXSAVE,
> > kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
> >
> > - cpuid_entry_change(best, X86_FEATURE_APIC,
> > - vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> > + kvm_update_feature_runtime(caps, best, X86_FEATURE_APIC,
> > + vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> >
> > - if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> > - cpuid_entry_change(best, X86_FEATURE_MWAIT,
> > - vcpu->arch.ia32_misc_enable_msr &
> > - MSR_IA32_MISC_ENABLE_MWAIT);
> > - }
> > + if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> > + kvm_update_feature_runtime(caps, best, X86_FEATURE_MWAIT,
> > + vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT);
> >
> > best = cpuid_entry2_find(entries, nent, 7, 0);
> > - if (best && boot_cpu_has(X86_FEATURE_PKU))
> > - cpuid_entry_change(best, X86_FEATURE_OSPKE,
> > - kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
> > + if (boot_cpu_has(X86_FEATURE_PKU))
> > + kvm_update_feature_runtime(caps, best, X86_FEATURE_OSPKE,
> > + kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
> >
> > best = cpuid_entry2_find(entries, nent, 0xD, 0);
> > if (best)
> > @@ -353,6 +370,9 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > * Reset guest capabilities to userspace's guest CPUID definition, i.e.
> > * honor userspace's definition for features that don't require KVM or
> > * hardware management/support (or that KVM simply doesn't care about).
> > + *
> > + * Note, KVM has already done runtime updates on guest CPUID, i.e. this
> > + * will also correctly set runtime features in guest CPU capabilities.
> > */
> > for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
> > const struct cpuid_reg cpuid = reverse_cpuid[i];
>
>
> Best regards,
> Maxim Levitsky
>
>

2023-11-28 00:44:00

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features

On Fri, Nov 24, 2023, Xu Yilun wrote:
> On Sun, Nov 19, 2023 at 07:35:30PM +0200, Maxim Levitsky wrote:
> > On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > > static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> > > int nent)
> > > {
> > > struct kvm_cpuid_entry2 *best;
> > > + struct kvm_vcpu *caps = vcpu;
> > > +
> > > + /*
> > > + * Don't update vCPU capabilities if KVM is updating CPUID entries that
> > > + * are coming in from userspace!
> > > + */
> > > + if (entries != vcpu->arch.cpuid_entries)
> > > + caps = NULL;
> >
> > I think that this should be decided by the caller. Just a boolean will suffice.

I strongly disagree. The _only_ time the caps should be updated is if
entries == vcpu->arch.cpuid_entries, and if entries == cpuid_entires than the caps
should _always_ be updated.

> kvm_set_cpuid() calls this function only to validate/adjust the temporary
> "entries" variable. While kvm_update_cpuid_runtime() calls it to do system
> level changes.
>
> So I kind of agree to make the caller fully awared, how about adding a
> newly named wrapper for kvm_set_cpuid(), like:
>
>
> static void kvm_adjust_cpuid_entry(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> int nent)
>
> {
> WARN_ON(entries == vcpu->arch.cpuid_entries);
> __kvm_update_cpuid_runtime(vcpu, entries, nent);

But taking it a step further, we end up with

WARN_ON_ONCE(update_caps != (entries == vcpu->arch.cpuid_entries));

which is silly since any bugs that would result in the WARN firing can be avoided
by doing:

update_caps = entries == vcpu->arch.cpuid_entries;

which eventually distils down to the code I posted.

> > Or even better: since the userspace CPUID update is really not important in
> > terms of performance, why to special case it?
> >
> > Even if these guest caps are later overwritten, I don't see why we need to
> > avoid updating them, and in fact introduce a small risk of them not being
> > consistent
>
> IIUC, for kvm_set_cpuid() case, KVM may then fail the userspace cpuid setting,
> so we can't change guest caps at this phase.

> Or even better: since the userspace CPUID update is really not important in
> terms of performance, why to special case it?

Yep, and sadly __kvm_update_cpuid_runtime() *must* be invoked before kvm_set_cpuid()
is guaranteed to succeed because the whole point is to massage guest CPUID before
checking for divergences.

> > With this we can avoid having the 'cap' variable which is *very* confusing as well.

I agree the "caps" variable is confusing, but it's the least awful option I see.
The alternatives I can think of are:

1. Update a dummy caps array
2. Take a snapshot of the caps and restore them
3. Have separate paths for updated guest CPUID versus guest caps

#1 would require passing a "u32 *" to guest_cpu_cap_change() (or an equivalent),
which I really, really don't want to do. It' also a waste of cycles, and I'm
skeptical that it would be any less confusing than the proposed code.

#2 increases the complexity of kvm_set_cpuid() by introducing recovery paths, i.e.
adds more things that can break, and again is wasteful (though copying ~100 bytes
or so in a slow path isn't a big deal).

#3 would create unnecessary maintenance burden as we'd have to ensure any changes
hit both paths.

2023-11-28 05:16:00

by Xu Yilun

[permalink] [raw]
Subject: Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features

On Mon, Nov 27, 2023 at 04:43:45PM -0800, Sean Christopherson wrote:
> On Fri, Nov 24, 2023, Xu Yilun wrote:
> > On Sun, Nov 19, 2023 at 07:35:30PM +0200, Maxim Levitsky wrote:
> > > On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > > > static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> > > > int nent)
> > > > {
> > > > struct kvm_cpuid_entry2 *best;
> > > > + struct kvm_vcpu *caps = vcpu;
> > > > +
> > > > + /*
> > > > + * Don't update vCPU capabilities if KVM is updating CPUID entries that
> > > > + * are coming in from userspace!
> > > > + */
> > > > + if (entries != vcpu->arch.cpuid_entries)
> > > > + caps = NULL;
> > >
> > > I think that this should be decided by the caller. Just a boolean will suffice.
>
> I strongly disagree. The _only_ time the caps should be updated is if
> entries == vcpu->arch.cpuid_entries, and if entries == cpuid_entires than the caps
> should _always_ be updated.
>
> > kvm_set_cpuid() calls this function only to validate/adjust the temporary
> > "entries" variable. While kvm_update_cpuid_runtime() calls it to do system
> > level changes.
> >
> > So I kind of agree to make the caller fully awared, how about adding a
> > newly named wrapper for kvm_set_cpuid(), like:
> >
> >
> > static void kvm_adjust_cpuid_entry(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> > int nent)
> >
> > {
> > WARN_ON(entries == vcpu->arch.cpuid_entries);
> > __kvm_update_cpuid_runtime(vcpu, entries, nent);
>
> But taking it a step further, we end up with
>
> WARN_ON_ONCE(update_caps != (entries == vcpu->arch.cpuid_entries));
>
> which is silly since any bugs that would result in the WARN firing can be avoided
> by doing:
>
> update_caps = entries == vcpu->arch.cpuid_entries;
>
> which eventually distils down to the code I posted.

OK, I agree with you.

My initial idea is to make developers easier to recognize what is
happening by name, without looking into __kvm_update_cpuid_runtime().
But it seems causing other subtle confuse and wastes cycles. Maybe the
comment is already good enough for developers.

Thanks,
Yilun