2022-11-17 16:45:18

by Phil Auld

Subject: [PATCH v5 0/2] cpuhp: fix some st->target issues

Fix a few cpuhp related issues.

The first prevents target_store() from calling cpu_down() when
target == state which prevents the cpu being incorrectly marked
as dying. The second just makes the boot cpu have a valid cpuhp
target rather than 0 (CPU_OFFLINE) while being in state
CPU_ONLINE.

v3: Added code to make sure st->target == target in the nop case.

v4: Use WARN_ON in the case where state == target but st->target
does not match.

v5: Fixed lowercase on first patch title and cleaned up cover letter.
Rebased on v6.1-rc5.

Phil Auld (2):
cpuhp: Make target_store() a nop when target == state
cpuhp: Set cpuhp target for boot cpu

cpu.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--
2.31.1



2022-11-17 17:17:23

by Phil Auld

Subject: [PATCH v5 1/2] cpuhp: Make target_store() a nop when target == state

Writing the current state back in hotplug/target calls cpu_down()
which will set cpu dying even when it isn't and then nothing will
ever clear it. A stress test that reads values and writes them back
for all cpu device files in sysfs will trigger the BUG() in
select_fallback_rq once all cpus are marked as dying.

  kernel/cpu.c::target_store()
	...
	if (st->state < target)
		ret = cpu_up(dev->id, target);
	else
		ret = cpu_down(dev->id, target);

  cpu_down() -> cpu_set_state()
	bool bringup = st->state < target;
	...
	if (cpu_dying(cpu) != !bringup)
		set_cpu_dying(cpu, !bringup);

Fix this by letting state==target fall through in the target_store()
conditional. Also make sure st->target == target in that case.

Signed-off-by: Phil Auld <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Fixes: 757c989b9994 ("cpu/hotplug: Make target state writeable")
Cc: Thomas Gleixner <[email protected]>
Cc: "Peter Zijlstra (Intel)" <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
---
kernel/cpu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index bbad5e375d3b..979de993f853 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2326,8 +2326,10 @@ static ssize_t target_store(struct device *dev, struct device_attribute *attr,

 	if (st->state < target)
 		ret = cpu_up(dev->id, target);
-	else
+	else if (st->state > target)
 		ret = cpu_down(dev->id, target);
+	else if (WARN_ON(st->target != target))
+		st->target = target;
 out:
 	unlock_device_hotplug();
 	return ret ? ret : count;
--
2.31.1


Subject: [tip: smp/core] cpuhp: Make target_store() a nop when target == state

The following commit has been merged into the smp/core branch of tip:

Commit-ID: 0fa5abb6b7d85bf5688b2e11113f50317fb0121c
Gitweb: https://git.kernel.org/tip/0fa5abb6b7d85bf5688b2e11113f50317fb0121c
Author: Phil Auld <[email protected]>
AuthorDate: Thu, 17 Nov 2022 11:23:28 -05:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Thu, 01 Dec 2022 12:35:08 +01:00

cpuhp: Make target_store() a nop when target == state

Writing the current state back in hotplug/target calls cpu_down()
which will set cpu dying even when it isn't and then nothing will
ever clear it. A stress test that reads values and writes them back
for all cpu device files in sysfs will trigger the BUG() in
select_fallback_rq once all cpus are marked as dying.

  kernel/cpu.c::target_store()
	...
	if (st->state < target)
		ret = cpu_up(dev->id, target);
	else
		ret = cpu_down(dev->id, target);

  cpu_down() -> cpu_set_state()
	bool bringup = st->state < target;
	...
	if (cpu_dying(cpu) != !bringup)
		set_cpu_dying(cpu, !bringup);

Fix this by letting state==target fall through in the target_store()
conditional. Also make sure st->target == target in that case.

Fixes: 757c989b9994 ("cpu/hotplug: Make target state writeable")
Signed-off-by: Phil Auld <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
kernel/cpu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index bbad5e3..979de99 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2326,8 +2326,10 @@ static ssize_t target_store(struct device *dev, struct device_attribute *attr,

 	if (st->state < target)
 		ret = cpu_up(dev->id, target);
-	else
+	else if (st->state > target)
 		ret = cpu_down(dev->id, target);
+	else if (WARN_ON(st->target != target))
+		st->target = target;
 out:
 	unlock_device_hotplug();
 	return ret ? ret : count;

2022-12-01 13:00:02

by Phil Auld

Subject: Re: [PATCH v5 0/2] cpuhp: fix some st->target issues

On Thu, Nov 17, 2022 at 11:23:27AM -0500 Phil Auld wrote:
> Fix a few cpuhp related issues.
>
> The first prevents target_store() from calling cpu_down() when
> target == state which prevents the cpu being incorrectly marked
> as dying. The second just makes the boot cpu have a valid cpuhp
> target rather than 0 (CPU_OFFLINE) while being in state
> CPU_ONLINE.
>
> v3: Added code to make sure st->target == target in the nop case.
>
> v4: Use WARN_ON in the case where state == target but st->target
> does not match.
>
> v5: Fixed lowercase on first patch title and cleaned up cover letter.
> Rebased on v6.1-rc5.
>
> Phil Auld (2):
> cpuhp: Make target_store() a nop when target == state
> cpuhp: Set cpuhp target for boot cpu
>
> cpu.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>

Thanks for picking these up, Thomas!


Cheers,
Phil


> --
> 2.31.1
>

--

Subject: [tip: smp/core] cpu/hotplug: Make target_store() a nop when target == state

The following commit has been merged into the smp/core branch of tip:

Commit-ID: 64ea6e44f85b9b75925ebe1ba0e6e8430cc4e06f
Gitweb: https://git.kernel.org/tip/64ea6e44f85b9b75925ebe1ba0e6e8430cc4e06f
Author: Phil Auld <[email protected]>
AuthorDate: Thu, 17 Nov 2022 11:23:28 -05:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Fri, 02 Dec 2022 12:43:02 +01:00

cpu/hotplug: Make target_store() a nop when target == state

Writing the current state back in hotplug/target calls cpu_down()
which will set cpu dying even when it isn't and then nothing will
ever clear it. A stress test that reads values and writes them back
for all cpu device files in sysfs will trigger the BUG() in
select_fallback_rq once all cpus are marked as dying.

  kernel/cpu.c::target_store()
	...
	if (st->state < target)
		ret = cpu_up(dev->id, target);
	else
		ret = cpu_down(dev->id, target);

  cpu_down() -> cpu_set_state()
	bool bringup = st->state < target;
	...
	if (cpu_dying(cpu) != !bringup)
		set_cpu_dying(cpu, !bringup);

Fix this by letting state==target fall through in the target_store()
conditional. Also make sure st->target == target in that case.

Fixes: 757c989b9994 ("cpu/hotplug: Make target state writeable")
Signed-off-by: Phil Auld <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


---
kernel/cpu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index bbad5e3..979de99 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2326,8 +2326,10 @@ static ssize_t target_store(struct device *dev, struct device_attribute *attr,

 	if (st->state < target)
 		ret = cpu_up(dev->id, target);
-	else
+	else if (st->state > target)
 		ret = cpu_down(dev->id, target);
+	else if (WARN_ON(st->target != target))
+		st->target = target;
 out:
 	unlock_device_hotplug();
 	return ret ? ret : count;