2023-04-26 18:04:27

by Tony Battersby

[permalink] [raw]
Subject: [PATCH v2] x86/cpu: fix SME test in stop_this_cpu()

Check that the CPU supports the desired CPUID leaf before attempting to
read it. On Intel, querying an invalid extended CPUID leaf returns the
values of the maximum basic CPUID leaf. Depending on the CPU, this
could cause the SME test to incorrectly evaluate to true, causing
native_wbinvd() to be executed when it should have been skipped (seen on
a Supermicro X8DTH-6F board with Intel Xeon X5650).

Fixes: 08f253ec3767 ("x86/cpu: Clear SME feature flag when not in use")
Cc: <[email protected]> # 5.18+
Signed-off-by: Tony Battersby <[email protected]>
---

Changes since v1: updated title and description.

arch/x86/kernel/process.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b650cde3f64d..26aa32e8f636 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -754,13 +754,15 @@ bool xen_set_default_idle(void)

void __noreturn stop_this_cpu(void *dummy)
{
+ struct cpuinfo_x86 *c = this_cpu_ptr(&cpu_info);
+
local_irq_disable();
/*
* Remove this CPU:
*/
set_cpu_online(smp_processor_id(), false);
disable_local_APIC();
- mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
+ mcheck_cpu_clear(c);

/*
* Use wbinvd on processors that support SME. This provides support
@@ -774,7 +776,8 @@ void __noreturn stop_this_cpu(void *dummy)
* Test the CPUID bit directly because the machine might've cleared
* X86_FEATURE_SME due to cmdline options.
*/
- if (cpuid_eax(0x8000001f) & BIT(0))
+ if (c->extended_cpuid_level >= 0x8000001f &&
+ (cpuid_eax(0x8000001f) & BIT(0)))
native_wbinvd();
for (;;) {
/*
--
2.25.1


2023-05-22 15:00:33

by Tony Battersby

[permalink] [raw]
Subject: [PATCH v2 RESEND] x86/cpu: fix SME test in stop_this_cpu()

Original thread title: "x86/cpu: fix intermittent lockup on poweroff"

I think we all agreed that my small patch below was correct.  tglx also
had an additional patch to fix the underlying race condition, but the
thread seems to have died out.  Can I get my patch applied at least?

Thanks,
Tony Battersby
Cybernetics

---

Check that the CPU supports the desired CPUID leaf before attempting to
read it. On Intel, querying an invalid extended CPUID leaf returns the
values of the maximum basic CPUID leaf. Depending on the CPU, this
could cause the SME test to incorrectly evaluate to true, causing
native_wbinvd() to be executed when it should have been skipped (seen on
a Supermicro X8DTH-6F board with Intel Xeon X5650).

Fixes: 08f253ec3767 ("x86/cpu: Clear SME feature flag when not in use")
Cc: <[email protected]> # 5.18+
Signed-off-by: Tony Battersby <[email protected]>
---

Changes since v1: updated title and description.

arch/x86/kernel/process.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b650cde3f64d..26aa32e8f636 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -754,13 +754,15 @@ bool xen_set_default_idle(void)

void __noreturn stop_this_cpu(void *dummy)
{
+ struct cpuinfo_x86 *c = this_cpu_ptr(&cpu_info);
+
local_irq_disable();
/*
* Remove this CPU:
*/
set_cpu_online(smp_processor_id(), false);
disable_local_APIC();
- mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
+ mcheck_cpu_clear(c);

/*
* Use wbinvd on processors that support SME. This provides support
@@ -774,7 +776,8 @@ void __noreturn stop_this_cpu(void *dummy)
* Test the CPUID bit directly because the machine might've cleared
* X86_FEATURE_SME due to cmdline options.
*/
- if (cpuid_eax(0x8000001f) & BIT(0))
+ if (c->extended_cpuid_level >= 0x8000001f &&
+ (cpuid_eax(0x8000001f) & BIT(0)))
native_wbinvd();
for (;;) {
/*
-- 2.25.1