2021-10-16 18:08:20

by Nathan Lynch

[permalink] [raw]
Subject: [PATCH] powerpc/smp: do not decrement idle task preempt count in CPU offline

With PREEMPT_COUNT=y, when a CPU is offlined and then onlined again, we
get:

BUG: scheduling while atomic: swapper/1/0/0x00000000
no locks held by swapper/1/0.
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.0-rc2+ #100
Call Trace:
dump_stack_lvl+0xac/0x108
__schedule_bug+0xac/0xe0
__schedule+0xcf8/0x10d0
schedule_idle+0x3c/0x70
do_idle+0x2d8/0x4a0
cpu_startup_entry+0x38/0x40
start_secondary+0x2ec/0x3a0
start_secondary_prolog+0x10/0x14

This is because powerpc's arch_cpu_idle_dead() decrements the idle task's
preempt count, for reasons explained in commit a7c2bb8279d2 ("powerpc:
Re-enable preemption before cpu_die()"), specifically "start_secondary()
expects a preempt_count() of 0."

However, since commit 2c669ef6979c ("powerpc/preempt: Don't touch the idle
task's preempt_count during hotplug") and commit f1a0a376ca0c ("sched/core:
Initialize the idle task with preemption disabled"), that justification no
longer holds.

The idle task isn't supposed to re-enable preemption, so remove the
vestigial preempt_enable() from the CPU offline path.

Tested with pseries and powernv in qemu, and pseries on PowerVM.

Fixes: 2c669ef6979c ("powerpc/preempt: Don't touch the idle task's preempt_count during hotplug")
Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
Signed-off-by: Nathan Lynch <[email protected]>
---
arch/powerpc/kernel/smp.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 9cc7d3dbf439..605bab448f84 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1730,8 +1730,6 @@ void __cpu_die(unsigned int cpu)

void arch_cpu_idle_dead(void)
{
- sched_preempt_enable_no_resched();
-
/*
* Disable on the down path. This will be re-enabled by
* start_secondary() via start_secondary_resume() below
--
2.31.1


2021-10-16 22:14:34

by Valentin Schneider

[permalink] [raw]
Subject: Re: [PATCH] powerpc/smp: do not decrement idle task preempt count in CPU offline

On 15/10/21 09:55, Nathan Lynch wrote:
> With PREEMPT_COUNT=y, when a CPU is offlined and then onlined again, we
> get:
>
> BUG: scheduling while atomic: swapper/1/0/0x00000000
> no locks held by swapper/1/0.
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.0-rc2+ #100
> Call Trace:
> dump_stack_lvl+0xac/0x108
> __schedule_bug+0xac/0xe0
> __schedule+0xcf8/0x10d0
> schedule_idle+0x3c/0x70
> do_idle+0x2d8/0x4a0
> cpu_startup_entry+0x38/0x40
> start_secondary+0x2ec/0x3a0
> start_secondary_prolog+0x10/0x14
>
> This is because powerpc's arch_cpu_idle_dead() decrements the idle task's
> preempt count, for reasons explained in commit a7c2bb8279d2 ("powerpc:
> Re-enable preemption before cpu_die()"), specifically "start_secondary()
> expects a preempt_count() of 0."
>
> However, since commit 2c669ef6979c ("powerpc/preempt: Don't touch the idle
> task's preempt_count during hotplug") and commit f1a0a376ca0c ("sched/core:
> Initialize the idle task with preemption disabled"), that justification no
> longer holds.
>
> The idle task isn't supposed to re-enable preemption, so remove the
> vestigial preempt_enable() from the CPU offline path.
>

Humph, I got confused because 2c669ef6979c explicitly mentions hotplug,
but that's the hotplug machinery which is already involved for bringing up
the secondaries at boot time.

IIUC your issue here is the preempt_count being messed up when
hot-unplugging a CPU, which leads to fireworks during hotplug (IOW I didn't
test my last patch against hotplug - my bad!)

Reviewed-by: Valentin Schneider <[email protected]>

> Tested with pseries and powernv in qemu, and pseries on PowerVM.
>
> Fixes: 2c669ef6979c ("powerpc/preempt: Don't touch the idle task's preempt_count during hotplug")
> Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")

I think only the first Fixes: is needed.

> Signed-off-by: Nathan Lynch <[email protected]>
> ---
> arch/powerpc/kernel/smp.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 9cc7d3dbf439..605bab448f84 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1730,8 +1730,6 @@ void __cpu_die(unsigned int cpu)
>
> void arch_cpu_idle_dead(void)
> {
> - sched_preempt_enable_no_resched();
> -
> /*
> * Disable on the down path. This will be re-enabled by
> * start_secondary() via start_secondary_resume() below
> --
> 2.31.1

2021-10-17 11:22:45

by Nathan Lynch

[permalink] [raw]
Subject: Re: [PATCH] powerpc/smp: do not decrement idle task preempt count in CPU offline

Valentin Schneider <[email protected]> writes:

> On 15/10/21 09:55, Nathan Lynch wrote:
>> With PREEMPT_COUNT=y, when a CPU is offlined and then onlined again, we
>> get:
>>
>> BUG: scheduling while atomic: swapper/1/0/0x00000000
>> no locks held by swapper/1/0.
>> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.0-rc2+ #100
>> Call Trace:
>> dump_stack_lvl+0xac/0x108
>> __schedule_bug+0xac/0xe0
>> __schedule+0xcf8/0x10d0
>> schedule_idle+0x3c/0x70
>> do_idle+0x2d8/0x4a0
>> cpu_startup_entry+0x38/0x40
>> start_secondary+0x2ec/0x3a0
>> start_secondary_prolog+0x10/0x14
>>
>> This is because powerpc's arch_cpu_idle_dead() decrements the idle task's
>> preempt count, for reasons explained in commit a7c2bb8279d2 ("powerpc:
>> Re-enable preemption before cpu_die()"), specifically "start_secondary()
>> expects a preempt_count() of 0."
>>
>> However, since commit 2c669ef6979c ("powerpc/preempt: Don't touch the idle
>> task's preempt_count during hotplug") and commit f1a0a376ca0c ("sched/core:
>> Initialize the idle task with preemption disabled"), that justification no
>> longer holds.
>>
>> The idle task isn't supposed to re-enable preemption, so remove the
>> vestigial preempt_enable() from the CPU offline path.
>>
>
> Humph, I got confused because 2c669ef6979c explicitly mentions hotplug,
> but that's the hotplug machinery which is already involved for bringing up
> the secondaries at boot time.
>
> IIUC your issue here is the preempt_count being messed up when
> hot-unplugging a CPU, which leads to fireworks during hotplug

That's right.

> (IOW I didn't
> test my last patch against hotplug - my bad!)
>
> Reviewed-by: Valentin Schneider <[email protected]>

No worries and thank you for reviewing.


>> Tested with pseries and powernv in qemu, and pseries on PowerVM.
>>
>> Fixes: 2c669ef6979c ("powerpc/preempt: Don't touch the idle task's preempt_count during hotplug")
>> Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
>
> I think only the first Fixes: is needed.

OK, I'll re-send with that changed as well as your r-b. Thanks.