2022-07-14 02:09:47

by Zhouyi Zhou

Subject: [PATCH linux-next] powerpc: use raw_smp_processor_id in arch_touch_nmi_watchdog

Use raw_smp_processor_id() in arch_touch_nmi_watchdog()
because, when it is called from watchdog(), the task is preemptible.

Signed-off-by: Zhouyi Zhou <[email protected]>
---
Dear PPC developers,

I found this bug while running rcutorture tests in a ppc VM at the
Open Source Lab of Oregon State University.

qemu-system-ppc64 -nographic -smp cores=4,threads=1 -net none -M pseries -nodefaults -device spapr-vscsi -serial file:/tmp/console.log -m 2G -kernel /home/ubuntu/linux-next/tools/testing/selftests/rcutorture/res/2022.07.08-22.36.11-torture/results-rcuscale-kvfree/TREE/vmlinux -append "debug_boot_weak_hash panic=-1 console=ttyS0 rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot rcuscale.shutdown=1 rcuscale.verbose=0"

tail /tmp/console.log
[ 1232.433552][ T41] BUG: using smp_processor_id() in preemptible [00000000] code: khungtaskd/41
[ 1232.439751][ T41] caller is arch_touch_nmi_watchdog+0x34/0xd0
[ 1232.440934][ T41] CPU: 3 PID: 41 Comm: khungtaskd Not tainted 5.19.0-rc5-next-20220708-dirty #106
[ 1232.442684][ T41] Call Trace:
[ 1232.443343][ T41] [c0000000029cbbb0] [c0000000006df360] dump_stack_lvl+0x74/0xa8 (unreliable)
[ 1232.445237][ T41] [c0000000029cbbf0] [c000000000d04f30] check_preemption_disabled+0x150/0x160
[ 1232.446926][ T41] [c0000000029cbc80] [c000000000035584] arch_touch_nmi_watchdog+0x34/0xd0
[ 1232.448532][ T41] [c0000000029cbcb0] [c0000000002068ac] watchdog+0x40c/0x5b0
[ 1232.451449][ T41] [c0000000029cbdc0] [c000000000139df4] kthread+0x144/0x170
[ 1232.452896][ T41] [c0000000029cbe10] [c00000000000cd54] ret_from_kernel_thread+0x5c/0x64

After this fix, "BUG: using smp_processor_id() in preemptible [00000000] code: khungtaskd/41" no longer
appears.

I also examined the other places in watchdog.c where smp_processor_id() is used; all of them already run with
preemption disabled.

Kind Regards
Zhouyi
--
arch/powerpc/kernel/watchdog.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 7d28b9553654..ab6b84e00311 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -450,7 +450,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
void arch_touch_nmi_watchdog(void)
{
unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000;
- int cpu = smp_processor_id();
+ int cpu = raw_smp_processor_id();
u64 tb;

if (!cpumask_test_cpu(cpu, &watchdog_cpumask))
--
2.25.1


2022-07-14 09:30:43

by John Ogness

Subject: Re: [PATCH linux-next] powerpc: use raw_smp_processor_id in arch_touch_nmi_watchdog

On 2022-07-14, Zhouyi Zhou <[email protected]> wrote:
> Use raw_smp_processor_id() in arch_touch_nmi_watchdog()
> because, when it is called from watchdog(), the task is preemptible.

I would expect the correct solution is to make it a non-migration
section. Something like the below (untested) patch.

John Ogness

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index bfc27496fe7e..9d34aa809241 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -450,17 +450,23 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
void arch_touch_nmi_watchdog(void)
{
unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000;
- int cpu = smp_processor_id();
+ int cpu;
u64 tb;

- if (!cpumask_test_cpu(cpu, &watchdog_cpumask))
+ cpu = get_cpu();
+
+ if (!cpumask_test_cpu(cpu, &watchdog_cpumask)) {
+ goto out;
return;
+ }

tb = get_tb();
if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) {
per_cpu(wd_timer_tb, cpu) = tb;
wd_smp_clear_cpu_pending(cpu);
}
+out:
+ put_cpu();
}
EXPORT_SYMBOL(arch_touch_nmi_watchdog);
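
Independent of the locking question, the body of the function only refreshes the per-CPU timestamp once a full watchdog period has elapsed in timebase ticks. The following is a minimal user-space sketch of that check, not kernel code. The constants are assumed values chosen for illustration, and the names `period_ticks` and `touch` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, for illustration only (not taken from the thread). */
#define TB_TICKS_PER_USEC   512ULL
#define WD_TIMER_PERIOD_MS  4000ULL

/* Mirrors: ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000 */
static uint64_t period_ticks(void)
{
	return TB_TICKS_PER_USEC * WD_TIMER_PERIOD_MS * 1000ULL;
}

/*
 * Model of the timestamp refresh: update *last only when at least one
 * full period has elapsed. The unsigned subtraction keeps the comparison
 * valid even if the tick counter wraps around.
 */
static bool touch(uint64_t *last, uint64_t now)
{
	if (now - *last >= period_ticks()) {
		*last = now;
		return true;
	}
	return false;
}
```

Calling `touch()` before a period has elapsed leaves the stored timestamp alone, which is why the per-CPU write in the real function is comparatively rare.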

2022-07-14 10:14:45

by Zhouyi Zhou

Subject: Re: [PATCH linux-next] powerpc: use raw_smp_processor_id in arch_touch_nmi_watchdog

Thanks, John, for correcting me ;-)

On Thu, Jul 14, 2022 at 5:25 PM John Ogness <[email protected]> wrote:
>
> On 2022-07-14, Zhouyi Zhou <[email protected]> wrote:
> > Use raw_smp_processor_id() in arch_touch_nmi_watchdog()
> > because, when it is called from watchdog(), the task is preemptible.
>
> I would expect the correct solution is to make it a non-migration
> section. Something like the below (untested) patch.
I applied your patch (with one tiny modification: removing the
unreachable return statement after "goto out;"), and it
passed the test in the ppc VM at the Open Source Lab of Oregon State University.

Tested-by: Zhouyi Zhou <[email protected]>

Many Thanks
Kind Regards
Zhouyi
>
> John Ogness
>
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index bfc27496fe7e..9d34aa809241 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -450,17 +450,23 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> void arch_touch_nmi_watchdog(void)
> {
> unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000;
> - int cpu = smp_processor_id();
> + int cpu;
> u64 tb;
>
> - if (!cpumask_test_cpu(cpu, &watchdog_cpumask))
> + cpu = get_cpu();
> +
> + if (!cpumask_test_cpu(cpu, &watchdog_cpumask)) {
> + goto out;
> return;
I think we should remove the return statement here.
> + }
>
> tb = get_tb();
> if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) {
> per_cpu(wd_timer_tb, cpu) = tb;
> wd_smp_clear_cpu_pending(cpu);
> }
> +out:
> + put_cpu();
> }
> EXPORT_SYMBOL(arch_touch_nmi_watchdog);

2022-07-14 12:10:23

by John Ogness

Subject: Re: [PATCH linux-next] powerpc: use raw_smp_processor_id in arch_touch_nmi_watchdog

On 2022-07-14, Zhouyi Zhou <[email protected]> wrote:
> Thanks, John, for correcting me ;-)

After looking more closely, I do not think disabling migration is the
correct fix either.

The per-cpu variable @wd_timer_tb is written from 2 functions:

- watchdog_timer_interrupt() <-- irq handler
- arch_touch_nmi_watchdog() <-- called from preemptible

Since watchdog_timer_interrupt() is called from irq context, I expect
that interrupts need to be disabled for the update in
arch_touch_nmi_watchdog(). Perhaps using a per-cpu local_lock_t with
local_lock_irqsave() to protect write access to @wd_timer_tb?

John Ogness
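
To illustrate the concern raised above, here is a small user-space simulation of a single CPU, not kernel code and not a proposed patch. It shows how keeping interrupts off across the read-compare-write of the timestamp defers the timer handler until the update is complete. Every name in it (`sim_irq_save`, `sim_irq_restore`, `raise_timer_irq`, `timer_irq_handler`, `touch`) is a hypothetical stand-in; in the kernel the protected section would use the real primitives, such as the local_lock_irqsave() suggested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated state of ONE CPU. All names are hypothetical stand-ins. */
static bool irqs_enabled = true;
static bool irq_pending;
static uint64_t wd_timer_tb;	/* stands in for the per-CPU @wd_timer_tb */
static uint64_t now_tb;		/* stands in for the value of get_tb() */

/* Stand-in for the irq handler that also writes the timestamp. */
static void timer_irq_handler(void)
{
	wd_timer_tb = now_tb;
}

/* The simulated hardware raises the timer interrupt. */
static void raise_timer_irq(void)
{
	if (irqs_enabled)
		timer_irq_handler();
	else
		irq_pending = true;	/* held until interrupts are back on */
}

/* Simulated "disable interrupts and remember the previous state". */
static bool sim_irq_save(void)
{
	bool was_enabled = irqs_enabled;

	irqs_enabled = false;
	return was_enabled;
}

/* Simulated "restore the previous state and deliver what was held". */
static void sim_irq_restore(bool was_enabled)
{
	irqs_enabled = was_enabled;
	if (irqs_enabled && irq_pending) {
		irq_pending = false;
		timer_irq_handler();
	}
}

/*
 * Model of the timestamp update with interrupts off. The timer is made
 * to fire between the read and the write. Returns true if the handler
 * did NOT modify the timestamp inside the protected section.
 */
static bool touch(uint64_t ticks)
{
	bool was_enabled = sim_irq_save();
	uint64_t last = wd_timer_tb;		/* read */
	bool undisturbed;

	raise_timer_irq();			/* timer fires mid-update */
	undisturbed = (wd_timer_tb == last);	/* handler has not run yet */

	if (now_tb - last >= ticks)
		wd_timer_tb = now_tb;		/* write */

	sim_irq_restore(was_enabled);		/* deferred handler runs here */
	return undisturbed;
}
```

In this model the handler only runs from `sim_irq_restore()`, after the write, so the two writers never interleave. The simulation says nothing about which kernel primitive is the right choice for the real fix.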