LinuxLists.cc - [patch v3 7/7] x86/smp: Put CPUs into INIT on shutdown if possible

2023-06-15 20:44:11

Subject: [patch v3 7/7] x86/smp: Put CPUs into INIT on shutdown if possible

Parking CPUs in a HLT loop is not completely safe vs. kexec() as HLT can
resume execution due to NMI, SMI and MCE, which has the same issue as the
MWAIT loop.

Kicking the secondary CPUs into INIT makes this safe against NMI and SMI.

A broadcast MCE will take the machine down, but a broadcast MCE which makes
HLT resume and execute overwritten text, pagetables or data will end up in
a disaster too.

So choose the lesser of two evils and kick the secondary CPUs into INIT
unless the system has installed special wakeup mechanisms which are not
using INIT.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ashok Raj <[email protected]>
---
V3: Renamed the function to smp_park_other_cpus_in_init() so it can
be reused for crash eventually.
---
arch/x86/include/asm/smp.h | 2 ++
arch/x86/kernel/smp.c | 39 ++++++++++++++++++++++++++++++++-------
arch/x86/kernel/smpboot.c | 19 +++++++++++++++++++
3 files changed, 53 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -139,6 +139,8 @@ void native_send_call_func_ipi(const str
void native_send_call_func_single_ipi(int cpu);
void x86_idle_thread_init(unsigned int cpu, struct task_struct *idle);

+bool smp_park_other_cpus_in_init(void);
+
void smp_store_boot_cpu_info(void);
void smp_store_cpu_info(int id);

--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -131,7 +131,7 @@ static int smp_stop_nmi_callback(unsigne
}

/*
- * this function calls the 'stop' function on all other CPUs in the system.
+ * Disable virtualization, APIC etc. and park the CPU in a HLT loop
*/
DEFINE_IDTENTRY_SYSVEC(sysvec_reboot)
{
@@ -172,13 +172,17 @@ static void native_stop_other_cpus(int w
* 2) Wait for all other CPUs to report that they reached the
* HLT loop in stop_this_cpu()
*
- * 3) If #2 timed out send an NMI to the CPUs which did not
- * yet report
+ * 3) If the system uses INIT/STARTUP for CPU bringup, then
+ * send all present CPUs an INIT vector, which brings them
+ * completely out of the way.
*
- * 4) Wait for all other CPUs to report that they reached the
+ * 4) If #3 is not possible and #2 timed out send an NMI to the
+ * CPUs which did not yet report
+ *
+ * 5) Wait for all other CPUs to report that they reached the
* HLT loop in stop_this_cpu()
*
- * #3 can obviously race against a CPU reaching the HLT loop late.
+ * #4 can obviously race against a CPU reaching the HLT loop late.
* That CPU will have reported already and the "have all CPUs
* reached HLT" condition will be true despite the fact that the
* other CPU is still handling the NMI. Again, there is no
@@ -194,7 +198,7 @@ static void native_stop_other_cpus(int w
/*
* Don't wait longer than a second for IPI completion. The
* wait request is not checked here because that would
- * prevent an NMI shutdown attempt in case that not all
+ * prevent an NMI/INIT shutdown in case that not all
* CPUs reach shutdown state.
*/
timeout = USEC_PER_SEC;
@@ -202,7 +206,27 @@ static void native_stop_other_cpus(int w
udelay(1);
}

- /* if the REBOOT_VECTOR didn't work, try with the NMI */
+ /*
+ * Park all other CPUs in INIT including "offline" CPUs, if
+ * possible. That's a safe place where they can't resume execution
+ * of HLT and then execute the HLT loop from overwritten text or
+ * page tables.
+ *
+ * The only downside is a broadcast MCE, but up to the point where
+ * the kexec() kernel brought all APs online again an MCE will just
+ * make HLT resume and handle the MCE. The machine crashs and burns
+ * due to overwritten text, page tables and data. So there is a
+ * choice between fire and frying pan. The result is pretty much
+ * the same. Chose frying pan until x86 provides a sane mechanism
+ * to park a CPU.
+ */
+ if (smp_park_other_cpus_in_init())
+ goto done;
+
+ /*
+ * If park with INIT was not possible and the REBOOT_VECTOR didn't
+ * take all secondary CPUs offline, try with the NMI.
+ */
if (!cpumask_empty(&cpus_stop_mask)) {
/*
* If NMI IPI is enabled, try to register the stop handler
@@ -234,6 +258,7 @@ static void native_stop_other_cpus(int w
udelay(1);
}

+done:
local_irq_save(flags);
disable_local_APIC();
mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1465,6 +1465,25 @@ void arch_thaw_secondary_cpus_end(void)
 	cache_aps_init();
 }

+bool smp_park_other_cpus_in_init(void)
+{
+	unsigned int cpu, this_cpu = smp_processor_id();
+	unsigned int apicid;
+
+	if (apic->wakeup_secondary_cpu_64 || apic->wakeup_secondary_cpu)
+		return false;
+
+	for_each_present_cpu(cpu) {
+		if (cpu == this_cpu)
+			continue;
+		apicid = apic->cpu_present_to_apicid(cpu);
+		if (apicid == BAD_APICID)
+			continue;
+		send_init_sequence(apicid);
+	}
+	return true;
+}
+
 /*
  * Early setup to make printk work.
  */
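
The fallback order that the patch establishes in native_stop_other_cpus() can be sketched as a small userspace model. This is only an illustration under stated assumptions: `enum park_method`, `struct stop_state` and `choose_park_method()` are hypothetical names that do not exist in the kernel, and the model ignores details such as whether the NMI stop handler could be registered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum park_method {
	PARK_HLT_ONLY,	/* every CPU reached the HLT loop via the reboot IPI */
	PARK_INIT,	/* all other present CPUs are kicked into INIT */
	PARK_NMI,	/* CPUs which did not report in time get an NMI */
};

struct stop_state {
	/* True if no special (non-INIT) wakeup mechanism is installed. */
	bool uses_init_startup;
	/* Number of CPUs which did not report after the one second wait. */
	unsigned int cpus_still_running;
};

static enum park_method choose_park_method(const struct stop_state *s)
{
	assert(s);

	/*
	 * Step 3: INIT is used whenever it is available, regardless of
	 * whether the wait in step 2 timed out or not.
	 */
	if (s->uses_init_startup)
		return PARK_INIT;

	/* Step 4: INIT is not possible, NMI only the CPUs which timed out. */
	if (s->cpus_still_running > 0)
		return PARK_NMI;

	/* Nothing left to do, all CPUs already sit in the HLT loop. */
	return PARK_HLT_ONLY;
}
```

The point the model tries to make visible is that INIT is not a fallback for a timed-out reboot IPI. It is applied unconditionally when the platform brings CPUs up via INIT/STARTUP, and the NMI path is only reached when INIT parking is refused.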

2023-06-20 10:41:49

by Borislav Petkov

Subject: Re: [patch v3 7/7] x86/smp: Put CPUs into INIT on shutdown if possible

On Thu, Jun 15, 2023 at 10:34:00PM +0200, Thomas Gleixner wrote:
> @@ -202,7 +206,27 @@ static void native_stop_other_cpus(int w
> udelay(1);
> }
>
> - /* if the REBOOT_VECTOR didn't work, try with the NMI */
> + /*
> + * Park all other CPUs in INIT including "offline" CPUs, if
> + * possible. That's a safe place where they can't resume execution
> + * of HLT and then execute the HLT loop from overwritten text or
> + * page tables.
> + *
> + * The only downside is a broadcast MCE, but up to the point where
> + * the kexec() kernel brought all APs online again an MCE will just
> + * make HLT resume and handle the MCE. The machine crashs and burns

"crashes"

With that

Reviewed-by: Borislav Petkov (AMD) <[email protected]>

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-06-20 13:18:07

by tip-bot2 for Thomas Gleixner

Subject: [tip: x86/core] x86/smp: Put CPUs into INIT on shutdown if possible

The following commit has been merged into the x86/core branch of tip:

Commit-ID: 45e34c8af58f23db4474e2bfe79183efec09a18b
Gitweb: https://git.kernel.org/tip/45e34c8af58f23db4474e2bfe79183efec09a18b
Author: Thomas Gleixner <[email protected]>
AuthorDate: Thu, 15 Jun 2023 22:34:00 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Tue, 20 Jun 2023 14:51:47 +02:00

x86/smp: Put CPUs into INIT on shutdown if possible

Parking CPUs in a HLT loop is not completely safe vs. kexec() as HLT can
resume execution due to NMI, SMI and MCE, which has the same issue as the
MWAIT loop.

Kicking the secondary CPUs into INIT makes this safe against NMI and SMI.

A broadcast MCE will take the machine down, but a broadcast MCE which makes
HLT resume and execute overwritten text, pagetables or data will end up in
a disaster too.

So choose the lesser of two evils and kick the secondary CPUs into INIT
unless the system has installed special wakeup mechanisms which are not
using INIT.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ashok Raj <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
arch/x86/include/asm/smp.h | 2 ++-
arch/x86/kernel/smp.c | 39 ++++++++++++++++++++++++++++++-------
arch/x86/kernel/smpboot.c | 19 ++++++++++++++++++-
3 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index d4ce5cb..5906aa9 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -139,6 +139,8 @@ void native_send_call_func_ipi(const struct cpumask *mask);
void native_send_call_func_single_ipi(int cpu);
void x86_idle_thread_init(unsigned int cpu, struct task_struct *idle);

+bool smp_park_other_cpus_in_init(void);
+
void smp_store_boot_cpu_info(void);
void smp_store_cpu_info(int id);

diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 174d623..0076932 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -131,7 +131,7 @@ static int smp_stop_nmi_callback(unsigned int val, struct pt_regs *regs)
}

/*
- * this function calls the 'stop' function on all other CPUs in the system.
+ * Disable virtualization, APIC etc. and park the CPU in a HLT loop
*/
DEFINE_IDTENTRY_SYSVEC(sysvec_reboot)
{
@@ -172,13 +172,17 @@ static void native_stop_other_cpus(int wait)
* 2) Wait for all other CPUs to report that they reached the
* HLT loop in stop_this_cpu()
*
- * 3) If #2 timed out send an NMI to the CPUs which did not
- * yet report
+ * 3) If the system uses INIT/STARTUP for CPU bringup, then
+ * send all present CPUs an INIT vector, which brings them
+ * completely out of the way.
*
- * 4) Wait for all other CPUs to report that they reached the
+ * 4) If #3 is not possible and #2 timed out send an NMI to the
+ * CPUs which did not yet report
+ *
+ * 5) Wait for all other CPUs to report that they reached the
* HLT loop in stop_this_cpu()
*
- * #3 can obviously race against a CPU reaching the HLT loop late.
+ * #4 can obviously race against a CPU reaching the HLT loop late.
* That CPU will have reported already and the "have all CPUs
* reached HLT" condition will be true despite the fact that the
* other CPU is still handling the NMI. Again, there is no
@@ -194,7 +198,7 @@ static void native_stop_other_cpus(int wait)
/*
* Don't wait longer than a second for IPI completion. The
* wait request is not checked here because that would
- * prevent an NMI shutdown attempt in case that not all
+ * prevent an NMI/INIT shutdown in case that not all
* CPUs reach shutdown state.
*/
timeout = USEC_PER_SEC;
@@ -202,7 +206,27 @@ static void native_stop_other_cpus(int wait)
udelay(1);
}

- /* if the REBOOT_VECTOR didn't work, try with the NMI */
+ /*
+ * Park all other CPUs in INIT including "offline" CPUs, if
+ * possible. That's a safe place where they can't resume execution
+ * of HLT and then execute the HLT loop from overwritten text or
+ * page tables.
+ *
+ * The only downside is a broadcast MCE, but up to the point where
+ * the kexec() kernel brought all APs online again an MCE will just
+ * make HLT resume and handle the MCE. The machine crashes and burns
+ * due to overwritten text, page tables and data. So there is a
+ * choice between fire and frying pan. The result is pretty much
+ * the same. Chose frying pan until x86 provides a sane mechanism
+ * to park a CPU.
+ */
+ if (smp_park_other_cpus_in_init())
+ goto done;
+
+ /*
+ * If park with INIT was not possible and the REBOOT_VECTOR didn't
+ * take all secondary CPUs offline, try with the NMI.
+ */
if (!cpumask_empty(&cpus_stop_mask)) {
/*
* If NMI IPI is enabled, try to register the stop handler
@@ -225,6 +249,7 @@ static void native_stop_other_cpus(int wait)
udelay(1);
}

+done:
local_irq_save(flags);
disable_local_APIC();
mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index b403ead..4ee4339 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1465,6 +1465,25 @@ void arch_thaw_secondary_cpus_end(void)
 	cache_aps_init();
 }

+bool smp_park_other_cpus_in_init(void)
+{
+	unsigned int cpu, this_cpu = smp_processor_id();
+	unsigned int apicid;
+
+	if (apic->wakeup_secondary_cpu_64 || apic->wakeup_secondary_cpu)
+		return false;
+
+	for_each_present_cpu(cpu) {
+		if (cpu == this_cpu)
+			continue;
+		apicid = apic->cpu_present_to_apicid(cpu);
+		if (apicid == BAD_APICID)
+			continue;
+		send_init_sequence(apicid);
+	}
+	return true;
+}
+
 /*
  * Early setup to make printk work.
  */

2023-07-03 03:52:48

by Baokun Li

Subject: [BUG REPORT] Triggering a panic in an x86 virtual machine does not wait

When I manually trigger a panic in a QEMU x86 VM with

`echo c > /proc/sysrq-trigger`,

the VM will most likely reboot right away, even though PANIC_TIMEOUT is 0.
This prevents us from exporting the vmcore on panic, and even when the
vmcore export succeeds, most of the processes in the vmcore are sitting in
stop_this_cpu(). By bisecting we found the patch that introduced the
behavior change:

45e34c8af58f ("x86/smp: Put CPUs into INIT on shutdown if possible")

Can anyone help to see what is happening?

Thanks!
--
With Best Regards,
Baokun Li
.

2023-07-05 09:15:37

by Thomas Gleixner

Subject: Re: [BUG REPORT] Triggering a panic in an x86 virtual machine does not wait

On Mon, Jul 03 2023 at 11:44, Baokun Li wrote:

> When I manually trigger panic in a QEMU x86 VM with
>
> `echo c > /proc/sysrq-trigger`,
>
> I find that the VM will probably reboot directly, but the
> PANIC_TIMEOUT is 0.
> This prevents us from exporting the vmcore via panic, and even if we succeed
> in panic exporting the vmcore, the processes in the vmcore are mostly
> stop_this_cpu(). By dichotomizing we found the patch that introduced the
> behavior change
>
> 45e34c8af58f ("x86/smp: Put CPUs into INIT on shutdown if possible"),

Bah, I missed that this is used by crash too. So if this happens to be
invoked on an AP, i.e. not on CPU 0, then the INIT will reset the
machine. Fix below.

Thanks,

tglx
---
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ed2d51960a7d..e1aa2cd7734b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1348,6 +1348,14 @@ bool smp_park_other_cpus_in_init(void)
 	if (apic->wakeup_secondary_cpu_64 || apic->wakeup_secondary_cpu)
 		return false;

+	/*
+	 * If this is a crash stop which does not execute on the boot CPU,
+	 * then this cannot use the INIT mechanism because INIT to the boot
+	 * CPU will reset the machine.
+	 */
+	if (this_cpu)
+		return false;
+
 	for_each_present_cpu(cpu) {
 		if (cpu == this_cpu)
 			continue;
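
The effect of the added guard can be illustrated with a minimal userspace model. It is a sketch only: `model_park_in_init()`, `MODEL_MAX_CPUS` and the `guard_boot_cpu` switch are hypothetical and exist purely to compare the behavior with and without the fix. CPU 0 stands for the boot CPU, and the wakeup mechanism and BAD_APICID checks of the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_CPUS 8	/* arbitrary size for this sketch */

/*
 * Hypothetical model of the parking decision. init_sent[cpu] records
 * which CPUs would receive an INIT. Returns true if INIT parking was
 * used, false if the caller has to fall back to HLT/NMI.
 */
static bool model_park_in_init(unsigned int this_cpu, unsigned int ncpus,
			       bool guard_boot_cpu,
			       bool init_sent[MODEL_MAX_CPUS])
{
	unsigned int cpu;

	assert(ncpus <= MODEL_MAX_CPUS && this_cpu < ncpus);

	for (cpu = 0; cpu < MODEL_MAX_CPUS; cpu++)
		init_sent[cpu] = false;

	/* The fix: refuse INIT parking when not running on the boot CPU. */
	if (guard_boot_cpu && this_cpu != 0)
		return false;

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (cpu == this_cpu)
			continue;
		init_sent[cpu] = true;
	}
	return true;
}
```

Without the guard, a crash stop running on CPU 2 marks CPU 0 as an INIT target, which is the case that resets the machine. With the guard, the same call returns false and no CPU is marked, so the caller falls back to the HLT and NMI paths.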

2023-07-06 07:09:57

by Baokun Li

Subject: Re: [BUG REPORT] Triggering a panic in an x86 virtual machine does not wait

On 2023/7/5 16:59, Thomas Gleixner wrote:
> On Mon, Jul 03 2023 at 11:44, Baokun Li wrote:
>
>> When I manually trigger panic in a QEMU x86 VM with
>>
>>        `echo c > /proc/sysrq-trigger`,
>>
>> I find that the VM will probably reboot directly, but the
>> PANIC_TIMEOUT is 0.
>> This prevents us from exporting the vmcore via panic, and even if we succeed
>> in panic exporting the vmcore, the processes in the vmcore are mostly
>> stop_this_cpu(). By dichotomizing we found the patch that introduced the
>> behavior change
>>
>>    45e34c8af58f ("x86/smp: Put CPUs into INIT on shutdown if possible"),
> Bah, I missed that this is used by crash too. So if this happens to be
> invoked on an AP, i.e. not on CPU 0, then the INIT will reset the
> machine. Fix below.
>
> Thanks,
>
> tglx
> ---
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index ed2d51960a7d..e1aa2cd7734b 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1348,6 +1348,14 @@ bool smp_park_other_cpus_in_init(void)
> if (apic->wakeup_secondary_cpu_64 || apic->wakeup_secondary_cpu)
> return false;
>
> + /*
> + * If this is a crash stop which does not execute on the boot CPU,
> + * then this cannot use the INIT mechanism because INIT to the boot
> + * CPU will reset the machine.
> + */
> + if (this_cpu)
> + return false;
> +
> for_each_present_cpu(cpu) {
> if (cpu == this_cpu)
> continue;

This patch does fix the problem of rebooting at panic, but the exported
stack stays at stop_this_cpu() like below, instead of showing what the
corresponding process is doing as before.

PID: 681      TASK: ffff9ac2429d3080 CPU: 2    COMMAND: "fsstress"
#0 [ffffb00200184fd0] stop_this_cpu at ffffffff89a4ffd8
#1 [ffffb00200184fe8] __sysvec_reboot at ffffffff89a94213
#2 [ffffb00200184ff0] sysvec_reboot at ffffffff8aee7491
--- <IRQ stack> ---
    RIP: 0000000000000010 RSP: 0000000000000018 RFLAGS: ffffb00200f8bd08
    RAX: ffff9ac256fda9d8 RBX: 0000000009973a85 RCX: ffff9ac256fda078
    RDX: ffff9ac24416e300 RSI: ffff9ac256fda9e0 RDI: ffffffffffffffff
    RBP: ffff9ac2443a5f88   R8: 0000000000000000   R9: ffff9ac2422eeea0
    R10: ffff9ac256fda9d8 R11: 0000000000549921 R12: ffff9ac2422eeea0
    R13: ffff9ac251cd23c8 R14: ffff9ac24269a800 R15: ffff9ac251cd2150
    ORIG_RAX: ffffffff8a1719e4 CS: 0206 SS: ffffffff8a1719c8
bt: WARNING: possibly bogus exception frame

Do you know how this happened? I would be grateful if you could fix it.

Thanks!
--
With Best Regards,
Baokun Li
.

2023-07-07 11:06:36

by Thomas Gleixner

Subject: Re: [BUG REPORT] Triggering a panic in an x86 virtual machine does not wait

On Thu, Jul 06 2023 at 14:44, Baokun Li wrote:
> On 2023/7/5 16:59, Thomas Gleixner wrote:
>> + /*
>> + * If this is a crash stop which does not execute on the boot CPU,
>> + * then this cannot use the INIT mechanism because INIT to the boot
>> + * CPU will reset the machine.
>> + */
>> + if (this_cpu)
>> + return false;

> This patch does fix the problem of rebooting at panic, but the
> exported stack stays at stop_this_cpu() like below, instead of showing
> what the corresponding process is doing as before.
>
> PID: 681      TASK: ffff9ac2429d3080 CPU: 2    COMMAND: "fsstress"
> #0 [ffffb00200184fd0] stop_this_cpu at ffffffff89a4ffd8
> #1 [ffffb00200184fe8] __sysvec_reboot at ffffffff89a94213
> #2 [ffffb00200184ff0] sysvec_reboot at ffffffff8aee7491
> --- <IRQ stack> ---
>     RIP: 0000000000000010 RSP: 0000000000000018 RFLAGS: ffffb00200f8bd08
>     RAX: ffff9ac256fda9d8 RBX: 0000000009973a85 RCX: ffff9ac256fda078
>     RDX: ffff9ac24416e300 RSI: ffff9ac256fda9e0 RDI: ffffffffffffffff
>     RBP: ffff9ac2443a5f88   R8: 0000000000000000   R9: ffff9ac2422eeea0
>     R10: ffff9ac256fda9d8 R11: 0000000000549921 R12: ffff9ac2422eeea0
>     R13: ffff9ac251cd23c8 R14: ffff9ac24269a800 R15: ffff9ac251cd2150
>     ORIG_RAX: ffffffff8a1719e4 CS: 0206 SS: ffffffff8a1719c8
> bt: WARNING: possibly bogus exception frame
>
> Do you know how this happened? I would be grateful if you could fix it.

No, I don't. But there is clearly a hint:

> bt: WARNING: possibly bogus exception frame

So the exception frame seems to be corrupted. I have no idea why.

The question is, whether this goes away when you revert that commit or not.
I can't oracle that out from your report.

Can you please revert 45e34c8af58f on top of Linus tree and verify that
it makes the issue go away?

Thanks,

tglx

2023-07-07 12:53:06

by Baokun Li

Subject: Re: [BUG REPORT] Triggering a panic in an x86 virtual machine does not wait

On 2023/7/7 18:18, Thomas Gleixner wrote:
> On Thu, Jul 06 2023 at 14:44, Baokun Li wrote:
>> On 2023/7/5 16:59, Thomas Gleixner wrote:
>>> + /*
>>> + * If this is a crash stop which does not execute on the boot CPU,
>>> + * then this cannot use the INIT mechanism because INIT to the boot
>>> + * CPU will reset the machine.
>>> + */
>>> + if (this_cpu)
>>> + return false;

This does solve the problem of x86 VMs not waiting when they panic, so

Reported-and-tested-by: Baokun Li <[email protected]>

>> This patch does fix the problem of rebooting at panic, but the
>> exported stack stays at stop_this_cpu() like below, instead of showing
>> what the corresponding process is doing as before.
>>
>> PID: 681      TASK: ffff9ac2429d3080 CPU: 2    COMMAND: "fsstress"
>> #0 [ffffb00200184fd0] stop_this_cpu at ffffffff89a4ffd8
>> #1 [ffffb00200184fe8] __sysvec_reboot at ffffffff89a94213
>> #2 [ffffb00200184ff0] sysvec_reboot at ffffffff8aee7491
>> --- <IRQ stack> ---
>>     RIP: 0000000000000010 RSP: 0000000000000018 RFLAGS: ffffb00200f8bd08
>>     RAX: ffff9ac256fda9d8 RBX: 0000000009973a85 RCX: ffff9ac256fda078
>>     RDX: ffff9ac24416e300 RSI: ffff9ac256fda9e0 RDI: ffffffffffffffff
>>     RBP: ffff9ac2443a5f88   R8: 0000000000000000   R9: ffff9ac2422eeea0
>>     R10: ffff9ac256fda9d8 R11: 0000000000549921 R12: ffff9ac2422eeea0
>>     R13: ffff9ac251cd23c8 R14: ffff9ac24269a800 R15: ffff9ac251cd2150
>>     ORIG_RAX: ffffffff8a1719e4 CS: 0206 SS: ffffffff8a1719c8
>> bt: WARNING: possibly bogus exception frame
>>
>> Do you know how this happened? I would be grateful if you could fix it.
> No, I don't. But there is clearly a hint:
>
>> bt: WARNING: possibly bogus exception frame
> So the exception frame seems to be corrupted. I have no idea why.
>
> The question is, whether this goes away when you revert that commit or not.
> I can't oracle that out from your report.
>
> Can you please revert 45e34c8af58f on top of Linus tree and verify that
> it makes the issue go away?
>
> Thanks,
>
> tglx

Yes, the stop_this_cpu() issue persisted after I reverted 45e34c8af58f,
so it has nothing to do with your patch. I will try to bisect to find out
which patch introduced the issue.

Thank you very much for helping locate and rectify the problem that the x86
VM panic does not wait!

Cheers!
--
With Best Regards,
Baokun Li
.

2023-07-07 13:52:12

by tip-bot2 for Thomas Gleixner

Subject: [tip: x86/core] x86/smp: Don't send INIT to boot CPU

The following commit has been merged into the x86/core branch of tip:

Commit-ID: b1472a60a584694875a05cf8bcba8bdf0dc1cd3a
Gitweb: https://git.kernel.org/tip/b1472a60a584694875a05cf8bcba8bdf0dc1cd3a
Author: Thomas Gleixner <[email protected]>
AuthorDate: Wed, 05 Jul 2023 10:59:23 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Fri, 07 Jul 2023 15:42:31 +02:00

x86/smp: Don't send INIT to boot CPU

Parking CPUs in INIT works well, except for the crash case when the CPU
which invokes smp_park_other_cpus_in_init() is not the boot CPU. Sending
INIT to the boot CPU resets the whole machine.

Prevent this by validating that this runs on the boot CPU. If not fall back
and let CPUs hang in HLT.

Fixes: 45e34c8af58f ("x86/smp: Put CPUs into INIT on shutdown if possible")
Reported-by: Baokun Li <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Baokun Li <[email protected]>
Link: https://lore.kernel.org/r/87ttui91jo.ffs@tglx
---
arch/x86/kernel/smpboot.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 4ee4339..7417d9b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1473,6 +1473,14 @@ bool smp_park_other_cpus_in_init(void)
 	if (apic->wakeup_secondary_cpu_64 || apic->wakeup_secondary_cpu)
 		return false;

+	/*
+	 * If this is a crash stop which does not execute on the boot CPU,
+	 * then this cannot use the INIT mechanism because INIT to the boot
+	 * CPU will reset the machine.
+	 */
+	if (this_cpu)
+		return false;
+
 	for_each_present_cpu(cpu) {
 		if (cpu == this_cpu)
 			continue;