2010-11-11 02:20:51

by DDD

[permalink] [raw]
Subject: [PATCH] x86: avoid calling arch_trigger_all_cpu_backtrace() at the same time on SMP

The spin_lock_debug/rcu_cpu_stall detector uses
trigger_all_cpu_backtrace() to dump cpu backtrace.
Therefore it is possible that trigger_all_cpu_backtrace()
could be called at the same time on different CPUs, which
triggers and 'unknown reason NMI' warning. The following case
illustrates the problem:

CPU1 CPU2 ... CPU N
trigger_all_cpu_backtrace()
set "backtrace_mask" to cpu mask
|
generate NMI interrupts generate NMI interrupts ...
\ | /
\ | /
The "backtrace_mask" will be cleaned by the first NMI interrupt
at nmi_watchdog_tick(), then the following NMI interrupts generated
by other cpus's arch_trigger_all_cpu_backtrace() will be took as
unknown reason NMI interrupts.

This patch uses a lock to avoid the problem, and stop the
arch_trigger_all_cpu_backtrace() calling to avoid dumping double cpu
backtrace info when there is already a trigger_all_cpu_backtrace()
in progress.

Signed-off-by: Dongdong Deng <[email protected]>
Reviewed-by: Bruce Ashfield <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: [email protected]
CC: [email protected]
Signed-off-by: Don Zickus <[email protected]>
---
arch/x86/kernel/apic/hw_nmi.c | 23 +++++++++++++++++++++++
arch/x86/kernel/apic/nmi.c | 23 +++++++++++++++++++++++
2 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index cefd694..6084c3b 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -26,9 +26,27 @@ u64 hw_nmi_get_sample_period(void)
}

#ifdef ARCH_HAS_NMI_WATCHDOG
+/* "in progress" flag of arch_trigger_all_cpu_backtrace */
+static int backtrace_flag;
+
void arch_trigger_all_cpu_backtrace(void)
{
int i;
+ unsigned long flags;
+
+ /*
+ * Have to disable irq here, as the
+ * arch_trigger_all_cpu_backtrace() could be
+ * triggered by "spin_lock()" with irqs on.
+ */
+ local_irq_save(flags);
+
+ if (cmpxchg(&backtrace_flag, 0, 1) != 0)
+ /*
+ * If there is already a trigger_all_cpu_backtrace() in progress
+ * (backtrace_flag == 1), don't output double cpu dump infos.
+ */
+ goto out_restore_irq;

cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);

@@ -41,6 +59,11 @@ void arch_trigger_all_cpu_backtrace(void)
break;
mdelay(1);
}
+
+ cmpxchg(&backtrace_flag, 1, 0);
+
+out_restore_irq:
+ local_irq_restore(flags);
}

static int __kprobes
diff --git a/arch/x86/kernel/apic/nmi.c b/arch/x86/kernel/apic/nmi.c
index c90041c..2d4b3a1 100644
--- a/arch/x86/kernel/apic/nmi.c
+++ b/arch/x86/kernel/apic/nmi.c
@@ -549,9 +549,27 @@ int do_nmi_callback(struct pt_regs *regs, int cpu)
return 0;
}

+/* "in progress" flag of arch_trigger_all_cpu_backtrace */
+static int backtrace_flag;
+
void arch_trigger_all_cpu_backtrace(void)
{
int i;
+ unsigned long flags;
+
+ /*
+ * Have to disable irq here, as the
+ * arch_trigger_all_cpu_backtrace() could be
+ * triggered by "spin_lock()" with irqs on.
+ */
+ local_irq_save(flags);
+
+ if (cmpxchg(&backtrace_flag, 0, 1) != 0)
+ /*
+ * If there is already a trigger_all_cpu_backtrace() in progress
+ * (backtrace_flag == 1), don't output double cpu dump infos.
+ */
+ goto out_restore_irq;

cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);

@@ -564,4 +582,9 @@ void arch_trigger_all_cpu_backtrace(void)
break;
mdelay(1);
}
+
+ cmpxchg(&backtrace_flag, 1, 0);
+
+out_restore_irq:
+ local_irq_restore(flags);
}
--
1.6.0.4


2010-11-11 09:23:17

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86: avoid calling arch_trigger_all_cpu_backtrace() at the same time on SMP


* Dongdong Deng <[email protected]> wrote:

> +static int backtrace_flag;

> + if (cmpxchg(&backtrace_flag, 0, 1) != 0)

Sorry to be a PITA, but i asked for test_and_set() because that's the simplest
primitive. cmpxchg() semantics is not nearly as obvious and people regularly get it
wrong :-/

Also, variables that cmpxchg or test_and_set operates on need to be long, not int.

Ingo

2010-11-11 09:51:38

by DDD

[permalink] [raw]
Subject: Re: [PATCH] x86: avoid calling arch_trigger_all_cpu_backtrace() at the same time on SMP

Ingo Molnar wrote:
> * Dongdong Deng <[email protected]> wrote:
>
>> +static int backtrace_flag;
>
>> + if (cmpxchg(&backtrace_flag, 0, 1) != 0)
>
> Sorry to be a PITA, but i asked for test_and_set() because that's the simplest
> primitive. cmpxchg() semantics is not nearly as obvious and people regularly get it
> wrong :-/

As the 'backtrace_flag' could be accessed by multi-cpus on SMP at the
same time, I use cmpxchg() for getting a atomic/memory barrier operation
for 'backtrace_flag' variable.

If we use test_and_set, maybe we need smp_wmb() after test_and_set. (If
I wrong, please correct me, thanks. :-) )

Should we still need to use test_and_set?
If need, I will use test_and_set at my next patch.

>
> Also, variables that cmpxchg or test_and_set operates on need to be long, not int.

Got it, I will change it to 'unsigned long' type.

Thanks for your teaching.

Dongdong


>
> Ingo
>

2010-11-11 09:51:54

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH] x86: avoid calling arch_trigger_all_cpu_backtrace() at the same time on SMP

Le jeudi 11 novembre 2010 à 10:23 +0100, Ingo Molnar a écrit :

> Also, variables that cmpxchg or test_and_set operates on need to be long, not int.

Hmm, ok for test_and_set(), it operates on a long.

cmpxchg() is ok on an int AFAIK. If not we have to make some changes :(

btrfs_orphan_cleanup() for example does this :

if (cmpxchg(&root->orphan_cleanup_state, 0, ORPHAN_CLEANUP_STARTED))
...


Same in build_ehash_secret() (net/ipv4/af_inet.c)

cmpxchg(&inet_ehash_secret, 0, rnd);

2010-11-11 09:58:02

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86: avoid calling arch_trigger_all_cpu_backtrace() at the same time on SMP


* Eric Dumazet <[email protected]> wrote:

> Le jeudi 11 novembre 2010 ? 10:23 +0100, Ingo Molnar a ?crit :
>
> > Also, variables that cmpxchg or test_and_set operates on need to be long, not int.
>
> Hmm, ok for test_and_set(), it operates on a long.
>
> cmpxchg() is ok on an int AFAIK. If not we have to make some changes :(
>
> btrfs_orphan_cleanup() for example does this :
>
> if (cmpxchg(&root->orphan_cleanup_state, 0, ORPHAN_CLEANUP_STARTED))
> ...
>
>
> Same in build_ehash_secret() (net/ipv4/af_inet.c)
>
> cmpxchg(&inet_ehash_secret, 0, rnd);

You are right - cmpxchg() auto-detects the word size and thus should work on int
too.

Thanks,

Ingo

2010-11-11 10:01:19

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86: avoid calling arch_trigger_all_cpu_backtrace() at the same time on SMP


* DDD <[email protected]> wrote:

> Ingo Molnar wrote:
> >* Dongdong Deng <[email protected]> wrote:
> >
> >>+static int backtrace_flag;
> >
> >>+ if (cmpxchg(&backtrace_flag, 0, 1) != 0)
> >
> >Sorry to be a PITA, but i asked for test_and_set() because that's
> >the simplest primitive. cmpxchg() semantics is not nearly as
> >obvious and people regularly get it wrong :-/
>
> As the 'backtrace_flag' could be accessed by multi-cpus on SMP at
> the same time, I use cmpxchg() for getting a atomic/memory barrier
> operation for 'backtrace_flag' variable.
>
> If we use test_and_set, maybe we need smp_wmb() after test_and_set.
> (If I wrong, please correct me, thanks. :-) )

No, test_and_set_bit() is SMP safe and is an implicit barrier as well - so no
smp_wmb() or other barriers are needed.

Thanks,

Ingo

2010-11-11 11:00:34

by DDD

[permalink] [raw]
Subject: Re: [PATCH] x86: avoid calling arch_trigger_all_cpu_backtrace() at the same time on SMP

Ingo Molnar wrote:
> * DDD <[email protected]> wrote:
>
>> Ingo Molnar wrote:
>>> * Dongdong Deng <[email protected]> wrote:
>>>
>>>> +static int backtrace_flag;
>>>> + if (cmpxchg(&backtrace_flag, 0, 1) != 0)
>>> Sorry to be a PITA, but i asked for test_and_set() because that's
>>> the simplest primitive. cmpxchg() semantics is not nearly as
>>> obvious and people regularly get it wrong :-/
>> As the 'backtrace_flag' could be accessed by multi-cpus on SMP at
>> the same time, I use cmpxchg() for getting a atomic/memory barrier
>> operation for 'backtrace_flag' variable.
>>
>> If we use test_and_set, maybe we need smp_wmb() after test_and_set.
>> (If I wrong, please correct me, thanks. :-) )
>
> No, test_and_set_bit() is SMP safe and is an implicit barrier as well - so no
> smp_wmb() or other barriers are needed.

Yep, the spin_lock of test_and_set_bit() could make sure that.

Thank you very much,
I will send out the new patch quickly. :-)

Dongdong

>
> Thanks,
>
> Ingo
>