2022-02-25 15:55:53

by Steven Price

Subject: [PATCH] cpu/hotplug: Set st->cpu earlier

Setting the 'cpu' member of struct cpuhp_cpu_state in cpuhp_create() is
too late as other callbacks can be made before that point. In particular,
if one of the earlier callbacks fails and triggers a rollback, that
rollback will be done with st->cpu==0, causing CPU0 to be erroneously set
to be dying. The scheduler then gets mightily confused and throws its
toys out of the pram.

Move the assignment earlier before any callbacks have a chance to run.

Signed-off-by: Steven Price <[email protected]>
CC: Dietmar Eggemann <[email protected]>
---
This was initially triggered by a VM which didn't have enough memory for
its VCPUs, but an easier way of triggering it is to make a change like
below in __smpboot_create_thread (as suggested by Dietmar Eggemann) to
pretend the memory allocation fails for a particular CPU:

 	td = kzalloc_node(sizeof(*td), GFP_KERNEL, cpu_to_node(cpu));
-	if (!td)
+	if (!td || cpu == 1)
 		return -ENOMEM;

I'm not entirely sure quite where the best place to set st->cpu is, so
please do let me know if there's a better place to do the assignment.
---
kernel/cpu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 407a2568f35e..49c3ef6067e5 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -720,7 +720,6 @@ static void cpuhp_create(unsigned int cpu)

 	init_completion(&st->done_up);
 	init_completion(&st->done_down);
-	st->cpu = cpu;
 }

static int cpuhp_should_run(unsigned int cpu)
@@ -1333,6 +1332,8 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
 		goto out;
 	}
 
+	st->cpu = cpu;
+
 	/*
 	 * The caller of cpu_up() might have raced with another
 	 * caller. Nothing to do.
--
2.25.1
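
For readers following the thread, the failure mode described in the
commit message can be modelled in user space. The sketch below is not
kernel code: struct model_state, model_rollback(), model_create_thread(),
model_cpu_up() and model_failed_bringup() are all hypothetical names made
up for illustration, and the "dying" array merely stands in for the
kernel's dying-CPU tracking. It assumes only what the message states: the
per-CPU state starts out zeroed, the first callback can fail, and the
rollback acts on whatever st->cpu holds at that moment.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for struct cpuhp_cpu_state; only 'cpu' matters. */
struct model_state {
	unsigned int cpu;	/* zero until somebody assigns it */
};

static struct model_state states[MODEL_NR_CPUS];
static bool dying[MODEL_NR_CPUS];

/* Rollback marks the CPU named by st->cpu as dying. */
static void model_rollback(struct model_state *st)
{
	dying[st->cpu] = true;
}

/* First callback; fails for CPU 1, like the "cpu == 1" test hack. */
static int model_create_thread(unsigned int cpu)
{
	return cpu == 1 ? -1 : 0;
}

/*
 * Bring-up model. With set_early, st->cpu is assigned before any
 * callback runs; otherwise only after the first callback succeeded.
 */
static int model_cpu_up(unsigned int cpu, bool set_early)
{
	struct model_state *st = &states[cpu];

	if (set_early)
		st->cpu = cpu;

	if (model_create_thread(cpu) != 0) {
		model_rollback(st);
		return -1;
	}

	st->cpu = cpu;		/* the late assignment */
	return 0;
}

/*
 * Reset the model, attempt to bring up CPU 1 (which fails), and return
 * the index of the CPU that ended up marked dying, or -1 if none.
 */
static int model_failed_bringup(bool set_early)
{
	int i;

	memset(states, 0, sizeof(states));
	memset(dying, 0, sizeof(dying));
	model_cpu_up(1, set_early);

	for (i = 0; i < MODEL_NR_CPUS; i++)
		if (dying[i])
			return i;
	return -1;
}
```

In this model, model_failed_bringup(false) reports CPU 0 as the one
marked dying, while model_failed_bringup(true) reports CPU 1, which is
the behaviour the patch is after.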


2022-03-08 06:33:03

by Vincent Donnefort

Subject: Re: [PATCH] cpu/hotplug: Set st->cpu earlier

On 25/02/2022 13:49, Steven Price wrote:
> Setting the 'cpu' member of struct cpuhp_cpu_state in cpuhp_create() is
> too late as other callbacks can be made before that point. In particular
> if one of the earlier callbacks fails and triggers a rollback that
> rollback will be done with st->cpu==0 causing CPU0 to be erroneously set

st->cpu is even needed before any cpuhp_step callback has been run
(cpuhp_set_state() in _cpu_up()). So despite CPUHP_CREATE_THREADS being
the first step, this is indeed not early enough.

> to be dying, causing the scheduler to get mightily confused and throw
> its toys out of the pram.
>
> Move the assignment earlier before any callbacks have a chance to run.

Probably needs a

Fixes: 2ea46c6fc945 ("cpumask/hotplug: Fix cpu_dying() state tracking")

>
> Signed-off-by: Steven Price <[email protected]>
> CC: Dietmar Eggemann <[email protected]>
> ---
> This was initially triggered by a VM which didn't have enough memory for
> its VCPUs, but an easier way of triggering it is to make a change like
> below in __smpboot_create_thread (as suggested by Dietmar Eggemann) to
> pretend the memory allocation fails for a particular CPU:
>
>  	td = kzalloc_node(sizeof(*td), GFP_KERNEL, cpu_to_node(cpu));
> -	if (!td)
> +	if (!td || cpu == 1)
>  		return -ENOMEM;
>
> I'm not entirely sure quite where the best place to set st->cpu is, so
> please do let me know if there's a better place to do the assignment.
> ---
> kernel/cpu.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 407a2568f35e..49c3ef6067e5 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -720,7 +720,6 @@ static void cpuhp_create(unsigned int cpu)
>
>  	init_completion(&st->done_up);
>  	init_completion(&st->done_down);
> -	st->cpu = cpu;
>  }
>
> static int cpuhp_should_run(unsigned int cpu)
> @@ -1333,6 +1332,8 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
>  		goto out;
>  	}
>
> +	st->cpu = cpu;
> +

Could possibly go just before cpuhp_set_state() in the same function,
as this seems to be the first user of st->cpu.

>  	/*
>  	 * The caller of cpu_up() might have raced with another
>  	 * caller. Nothing to do.