Check "this" CPU instead of the boot CPU when querying SVM support so that
the per-CPU checks done during hardware enabling actually function as
intended, i.e. will detect issues where SVM isn't support on all CPUs.
Disable migration for the use from svm_init() mostly so that the standard
accessors for the per-CPU data can be used without getting yelled at by
CONFIG_DEBUG_PREEMPT=y sanity checks. Preventing the "disabled by BIOS"
error message from reporting the wrong CPU is largely a bonus, as ensuring
a stable CPU during module load is a non-goal for KVM.
Link: https://lore.kernel.org/all/[email protected]
Cc: Kai Huang <[email protected]>
Cc: Chao Gao <[email protected]>
Reviewed-by: Kai Huang <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 25 +++++++++++++++++++------
1 file changed, 19 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ff6c769aafb2..9e449167e71b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -518,18 +518,20 @@ static void svm_init_osvw(struct kvm_vcpu *vcpu)
vcpu->arch.osvw.status |= 1;
}
-static bool kvm_is_svm_supported(void)
+static bool __kvm_is_svm_supported(void)
{
- int cpu = raw_smp_processor_id();
+ int cpu = smp_processor_id();
+ struct cpuinfo_x86 *c = &cpu_data(cpu);
+
u64 vm_cr;
- if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
- boot_cpu_data.x86_vendor != X86_VENDOR_HYGON) {
+ if (c->x86_vendor != X86_VENDOR_AMD &&
+ c->x86_vendor != X86_VENDOR_HYGON) {
pr_err("CPU %d isn't AMD or Hygon\n", cpu);
return false;
}
- if (!boot_cpu_has(X86_FEATURE_SVM)) {
+ if (!cpu_has(c, X86_FEATURE_SVM)) {
pr_err("SVM not supported by CPU %d\n", cpu);
return false;
}
@@ -548,9 +550,20 @@ static bool kvm_is_svm_supported(void)
return true;
}
+static bool kvm_is_svm_supported(void)
+{
+ bool supported;
+
+ migrate_disable();
+ supported = __kvm_is_svm_supported();
+ migrate_enable();
+
+ return supported;
+}
+
static int svm_check_processor_compat(void)
{
- if (!kvm_is_svm_supported())
+ if (!__kvm_is_svm_supported())
return -EIO;
return 0;
--
2.41.0.487.g6d72f3e995-goog
On Mon, Jul 24, 2023, Peter Zijlstra wrote:
> On Fri, Jul 21, 2023 at 01:18:54PM -0700, Sean Christopherson wrote:
> > Check "this" CPU instead of the boot CPU when querying SVM support so that
> > the per-CPU checks done during hardware enabling actually function as
> > intended, i.e. will detect issues where SVM isn't support on all CPUs.
>
> Is that a realistic concern?
It's not a concern in the sense that it should never happen, but I know of at
least one example where VMX on Intel completely disappeared[1]. The "compatibility"
checks are really more about the entire VMX/SVM feature set, the base VMX/SVM
support check is just an easy and obvious precursor to the full compatibility
checks.
Of course, SVM doesn't currently have compatibility checks on the full SVM feature
set, but that's more due to lack of a forcing function than a desire to _not_ have
them. Intel CPUs have a pesky habit of bugs, ucode updates, and/or in-field errors
resulting in VMX features randomly appearing or disappearing. E.g. there's an
ongoing buzilla (sorry) issue[2] where a user is only able to load KVM *after* a
suspend+resume cycle, because TSC scaling only shows up on one socket immediately
after boot, which is then somehow resolved by suspend+resume.
[1] 009bce1df0bb ("x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted")
[2] https://bugzilla.kernel.org/show_bug.cgi?id=217574
On Fri, Jul 21, 2023 at 01:18:54PM -0700, Sean Christopherson wrote:
> Check "this" CPU instead of the boot CPU when querying SVM support so that
> the per-CPU checks done during hardware enabling actually function as
> intended, i.e. will detect issues where SVM isn't support on all CPUs.
Is that a realistic concern?
Hi Sean,
On Fri, Jul 21, 2023 at 01:18:54PM -0700, Sean Christopherson wrote:
> +static bool kvm_is_svm_supported(void)
> +{
> + bool supported;
> +
> + migrate_disable();
> + supported = __kvm_is_svm_supported();
> + migrate_enable();
I am typically very wary of the constructs like this, as the value
returned is obsolete the moment migrate_enable() happens. Is value of
"svm was supported at some time in the past but may or may not be
supported right now" useful and if it is then could you add comment why?
Thanks.
--
Dmitry
On Mon, Jul 24, 2023, Dmitry Torokhov wrote:
> Hi Sean,
>
> On Fri, Jul 21, 2023 at 01:18:54PM -0700, Sean Christopherson wrote:
> > +static bool kvm_is_svm_supported(void)
> > +{
> > + bool supported;
> > +
> > + migrate_disable();
> > + supported = __kvm_is_svm_supported();
> > + migrate_enable();
>
> I am typically very wary of the constructs like this, as the value
> returned is obsolete the moment migrate_enable() happens.
Yeah, I don't like this code, but there's no great solution in this case. Or at
least, none that I've found. At some point KVM has to enable migration/preemption
before "is KVM supported?" is ultimately consumed, as the real consumer is
userspace.
> Is value of "svm was supported at some time in the past but may or may not be
> supported right now" useful and if it is then could you add comment why?
No, because barring fatal silicon/ucode/kernel bugs, SVM support isn't expected
to disappear or (re)appear).
KVM defends against the "disappear" case as much as can be reasonably expected.
It's ugly, but functionally ok (not perfect, but ok). KVM doesn't actually care
which CPU does the initial support check, because KVM will do fully protected support
checks on all CPUs before actually letting userspace create VMs. This is why the
changelog states that ensuring a stable CPU is a non-goal, and also why the inner
helpers don't use the raw accessors.
The "(re)appear" case doesn't need to be handled, because userspace could simply
retry if it really wanted to (but that would be quite insane/nonsensical, and
just asking for problems).
I didn't add a comment because VMX uses the exact same pattern, and I didn't
want to copy+paste a non-trivial comment. And this is a single use local helper,
so I'm not terribly concerned about it being misused.
That said, I'll see if I can find a common, intuitive location to document this.
On Mon, Jul 24, 2023 at 02:40:03PM -0700, Sean Christopherson wrote:
> On Mon, Jul 24, 2023, Peter Zijlstra wrote:
> > On Fri, Jul 21, 2023 at 01:18:54PM -0700, Sean Christopherson wrote:
> > > Check "this" CPU instead of the boot CPU when querying SVM support so that
> > > the per-CPU checks done during hardware enabling actually function as
> > > intended, i.e. will detect issues where SVM isn't support on all CPUs.
> >
> > Is that a realistic concern?
>
> It's not a concern in the sense that it should never happen, but I know of at
> least one example where VMX on Intel completely disappeared[1]. The "compatibility"
> checks are really more about the entire VMX/SVM feature set, the base VMX/SVM
> support check is just an easy and obvious precursor to the full compatibility
> checks.
>
> Of course, SVM doesn't currently have compatibility checks on the full SVM feature
> set, but that's more due to lack of a forcing function than a desire to _not_ have
> them. Intel CPUs have a pesky habit of bugs, ucode updates, and/or in-field errors
> resulting in VMX features randomly appearing or disappearing. E.g. there's an
> ongoing buzilla (sorry) issue[2] where a user is only able to load KVM *after* a
> suspend+resume cycle, because TSC scaling only shows up on one socket immediately
> after boot, which is then somehow resolved by suspend+resume.
>
> [1] 009bce1df0bb ("x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted")
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=217574
Is that using late loading of ucode? Anything that changes *any* feature
flag must be early ucode load, there is no other possible way since
einux does feature enumeration early, and features are fixed thereafter.
This is one of the many reasons late loading is a trainwreck.
Doing suspend/resume probably re-loads the firmware and re-does the
feature enumeration -- I didn't check.
Also, OMG don't you just love computers :/
On Tue, Jul 25, 2023, Peter Zijlstra wrote:
> On Mon, Jul 24, 2023 at 02:40:03PM -0700, Sean Christopherson wrote:
> > On Mon, Jul 24, 2023, Peter Zijlstra wrote:
> > > On Fri, Jul 21, 2023 at 01:18:54PM -0700, Sean Christopherson wrote:
> > > > Check "this" CPU instead of the boot CPU when querying SVM support so that
> > > > the per-CPU checks done during hardware enabling actually function as
> > > > intended, i.e. will detect issues where SVM isn't support on all CPUs.
> > >
> > > Is that a realistic concern?
> >
> > It's not a concern in the sense that it should never happen, but I know of at
> > least one example where VMX on Intel completely disappeared[1]. The "compatibility"
> > checks are really more about the entire VMX/SVM feature set, the base VMX/SVM
> > support check is just an easy and obvious precursor to the full compatibility
> > checks.
> >
> > Of course, SVM doesn't currently have compatibility checks on the full SVM feature
> > set, but that's more due to lack of a forcing function than a desire to _not_ have
> > them. Intel CPUs have a pesky habit of bugs, ucode updates, and/or in-field errors
> > resulting in VMX features randomly appearing or disappearing. E.g. there's an
> > ongoing buzilla (sorry) issue[2] where a user is only able to load KVM *after* a
> > suspend+resume cycle, because TSC scaling only shows up on one socket immediately
> > after boot, which is then somehow resolved by suspend+resume.
> >
> > [1] 009bce1df0bb ("x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted")
> > [2] https://bugzilla.kernel.org/show_bug.cgi?id=217574
>
> Is that using late loading of ucode?
Not sure, though I don't think that is relevant for this particular bug.
> Anything that changes *any* feature flag must be early ucode load, there is
> no other possible way since einux does feature enumeration early, and
> features are fixed thereafter.
>
> This is one of the many reasons late loading is a trainwreck.
>
> Doing suspend/resume probably re-loads the firmware
Ya, it does.
> and re-does the feature enumeration -- I didn't check.
The reported ucode revision is the same before and after resume, and is consistent
across all CPUs. KVM does the per-CPU feature enumeration (for sanity checks)
everytime userspace attempts to load KVM (the module), so the timing of the ucode
patch load _shouldn't_ matter.
The user is running quite old ucode for their system, so the current theory is that
old buggy ucode is to blame.