2018-01-22 22:01:41

by Tejun Heo

[permalink] [raw]
Subject: [PATCH] lockdep: Avoid triggering hardlockup from debug_show_all_locks()

debug_show_all_locks() iterates all tasks and print held locks whole
holding tasklist_lock. This can take a while on a slow console device
and may end up triggering NMI hardlockup detector if someone else ends
up waiting for tasklist_lock.

Touch the NMI watchdog while printing the held locks to avoid
spuriously triggering the hardlockup detector.

Signed-off-by: Tejun Heo <[email protected]>
---
kernel/locking/lockdep.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5fa1324..5216590 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -49,6 +49,7 @@
#include <linux/gfp.h>
#include <linux/random.h>
#include <linux/jhash.h>
+#include <linux/nmi.h>

#include <asm/sections.h>

@@ -4490,6 +4491,7 @@ void debug_show_all_locks(void)
if (!unlock)
if (read_trylock(&tasklist_lock))
unlock = 1;
+ touch_nmi_watchdog();
} while_each_thread(g, p);

pr_warn("\n");


2018-01-23 19:07:06

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Avoid triggering hardlockup from debug_show_all_locks()

On Mon, 2018-01-22 at 14:00 -0800, Tejun Heo wrote:
> debug_show_all_locks() iterates all tasks and print held locks whole
> holding tasklist_lock. This can take a while on a slow console
> device
> and may end up triggering NMI hardlockup detector if someone else
> ends
> up waiting for tasklist_lock.
>
> Touch the NMI watchdog while printing the held locks to avoid
> spuriously triggering the hardlockup detector.
>
> Signed-off-by: Tejun Heo <[email protected]>

On this patch:

Acked-by: Rik van Riel <[email protected]>


However, it seems like we run into things like
this on a fairly regular (though not very frequent)
basis. Would it make sense to go through the code
and add sprinkle around a few more touch_nmi_watchdog()
calls?

After all, there are maybe a few dozen places where
we print out a lot of debugging information.

2018-01-23 20:57:47

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Avoid triggering hardlockup from debug_show_all_locks()

Hello,

(cc'ing Steven, Sergey and Petr who are working on printk)

On Tue, Jan 23, 2018 at 02:03:57PM -0500, Rik van Riel wrote:
> On Mon, 2018-01-22 at 14:00 -0800, Tejun Heo wrote:
> > debug_show_all_locks() iterates all tasks and print held locks whole
> > holding tasklist_lock. This can take a while on a slow console
> > device
> > and may end up triggering NMI hardlockup detector if someone else
> > ends
> > up waiting for tasklist_lock.
> >
> > Touch the NMI watchdog while printing the held locks to avoid
> > spuriously triggering the hardlockup detector.
> >
> > Signed-off-by: Tejun Heo <[email protected]>
>
> On this patch:
>
> Acked-by: Rik van Riel <[email protected]>
>
>
> However, it seems like we run into things like
> this on a fairly regular (though not very frequent)
> basis. Would it make sense to go through the code
> and add sprinkle around a few more touch_nmi_watchdog()
> calls?
>
> After all, there are maybe a few dozen places where
> we print out a lot of debugging information.

Yeah, it's ridiculous how often printk ends up escalating otherwise
recoverable situations into system crashes. I don't know what the
right answer is. For spurious NMI hardlockups, maybe auditing debug
paths and adding touch_nmi_watchdog() would be enough but that also is
a pretty leaky approach.

Thanks.

--
tejun

2018-01-23 21:01:40

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Avoid triggering hardlockup from debug_show_all_locks()

On Tue, 23 Jan 2018 12:57:06 -0800
Tejun Heo <[email protected]> wrote:

> Yeah, it's ridiculous how often printk ends up escalating otherwise
> recoverable situations into system crashes. I don't know what the
> right answer is. For spurious NMI hardlockups, maybe auditing debug
> paths and adding touch_nmi_watchdog() would be enough but that also is
> a pretty leaky approach.

What about if every printk were to touch NMI watchdog?

NMI watchdog is really there for when the system locks up. If the
system is locked up doing printk, at least we see what is happening,
and not a total freeze.

-- Steve

2018-01-23 21:13:39

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Avoid triggering hardlockup from debug_show_all_locks()

Hello, Steven.

On Tue, Jan 23, 2018 at 04:00:54PM -0500, Steven Rostedt wrote:
> On Tue, 23 Jan 2018 12:57:06 -0800
> Tejun Heo <[email protected]> wrote:
>
> > Yeah, it's ridiculous how often printk ends up escalating otherwise
> > recoverable situations into system crashes. I don't know what the
> > right answer is. For spurious NMI hardlockups, maybe auditing debug
> > paths and adding touch_nmi_watchdog() would be enough but that also is
> > a pretty leaky approach.
>
> What about if every printk were to touch NMI watchdog?
>
> NMI watchdog is really there for when the system locks up. If the
> system is locked up doing printk, at least we see what is happening,
> and not a total freeze.

Yeah, that would definitely be a solution. The downside is that when
the system completely locks up from printk storm while holding
critical locks (say, tasklist_lock), the watchdog won't be able to
reset the system. I guess the judgement would depend on what one
expects of the NMI watchdog, but I personally would be happier with
printk touching NMI automatically.

Thanks.

--
tejun

2018-01-24 02:50:43

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Avoid triggering hardlockup from debug_show_all_locks()

Hello,

On (01/23/18 13:11), Tejun Heo wrote:
[..]
> > What about if every printk were to touch NMI watchdog?
> >
> > NMI watchdog is really there for when the system locks up. If the
> > system is locked up doing printk, at least we see what is happening,
> > and not a total freeze.
>
> Yeah, that would definitely be a solution. The downside is that when
> the system completely locks up from printk storm while holding
> critical locks (say, tasklist_lock), the watchdog won't be able to
> reset the system.

Agreed.

It's not only NMI watchdog. RCU also might get stalled by printk.

> I guess the judgement would depend on what one expects of the NMI watchdog,
> but I personally would be happier with printk touching NMI automatically.

In the long term I think I'd rather move printk to a batched mode: printk
for X seconds (depending on watchdog threshold) tops and offload, don't stay
in the same context.

It seems, sometimes, that "offloading will ruin printk" thing might be a
bit exaggerated. IMHO.

-ss

P.S.
Another problem, and I mentioned it somewhere in another email, is that
upstream printk people don't receive enough [if any at all] feedback from
guys who face printk issues. That's why every time printk_kthread re-surfaces
the reaction is "this is not a real problem, no one is seeing printk issues
like these, you idiot!". It'd be great to have more "we need ABC, because of
XYZ, but printk crashes the system. Here is the backtrace, fix it" reports.
As of now, those things mostly are not reported, that's why people are not
convinced. Just my 5 cents.

2018-01-24 02:55:38

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Avoid triggering hardlockup from debug_show_all_locks()

On Wed, 24 Jan 2018 11:49:55 +0900
Sergey Senozhatsky <[email protected]> wrote:

> Another problem, and I mentioned it somewhere in another email, is that
> upstream printk people don't receive enough [if any at all] feedback from
> guys who face printk issues. That's why every time printk_kthread re-surfaces
> the reaction is "this is not a real problem, no one is seeing printk issues
> like these, you idiot!". It'd be great to have more "we need ABC, because of
> XYZ, but printk crashes the system. Here is the backtrace, fix it" reports.
> As of now, those things mostly are not reported, that's why people are not
> convinced. Just my 5 cents.

If you are seeing these issues, have whoever is reporting them to Cc
LKML, and those of us that would care to listen.

-- Steve

2018-01-24 05:01:52

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Avoid triggering hardlockup from debug_show_all_locks()

On (01/23/18 21:54), Steven Rostedt wrote:
>
> > Another problem, and I mentioned it somewhere in another email, is that
> > upstream printk people don't receive enough [if any at all] feedback from
> > guys who face printk issues. That's why every time printk_kthread re-surfaces
> > the reaction is "this is not a real problem, no one is seeing printk issues
> > like these, you idiot!". It'd be great to have more "we need ABC, because of
> > XYZ, but printk crashes the system. Here is the backtrace, fix it" reports.
> > As of now, those things mostly are not reported, that's why people are not
> > convinced. Just my 5 cents.
>
> If you are seeing these issues, have whoever is reporting them to Cc
> LKML, and those of us that would care to listen.

OK. The lack of reports can also signify that we need to change the way we
handle those reports. If we are going to reply "yes, your system crashed
while doing completely innocent printout, but if we fix it then we can
increase by 0.0001% chances of not getting any printouts at all in that
corner case when your system is in recursive double panic over the latest
bitcoin price and your keyboard is on fire" then I don't think people will
care to report anything.

Maybe Tejun will be kind enough to shed some light on how often FB fleet
suffer from printk related issues, or what are the most common scenarios,
etc. [sensitive information can be reported "off list"]

-ss

Subject: [tip:locking/urgent] locking/lockdep: Avoid triggering hardlockup from debug_show_all_locks()

Commit-ID: 88f1c87de11a86d839f4ce5313e552d96709b990
Gitweb: https://git.kernel.org/tip/88f1c87de11a86d839f4ce5313e552d96709b990
Author: Tejun Heo <[email protected]>
AuthorDate: Mon, 22 Jan 2018 14:00:55 -0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 24 Jan 2018 10:00:09 +0100

locking/lockdep: Avoid triggering hardlockup from debug_show_all_locks()

debug_show_all_locks() iterates all tasks and print held locks whole
holding tasklist_lock. This can take a while on a slow console device
and may end up triggering NMI hardlockup detector if someone else ends
up waiting for tasklist_lock.

Touch the NMI watchdog while printing the held locks to avoid
spuriously triggering the hardlockup detector.

Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/locking/lockdep.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5fa1324..5216590 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -49,6 +49,7 @@
#include <linux/gfp.h>
#include <linux/random.h>
#include <linux/jhash.h>
+#include <linux/nmi.h>

#include <asm/sections.h>

@@ -4490,6 +4491,7 @@ retry:
if (!unlock)
if (read_trylock(&tasklist_lock))
unlock = 1;
+ touch_nmi_watchdog();
} while_each_thread(g, p);

pr_warn("\n");

2018-01-24 19:11:59

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Avoid triggering hardlockup from debug_show_all_locks()

Hello, Sergey, Steven.

On Wed, Jan 24, 2018 at 02:00:35PM +0900, Sergey Senozhatsky wrote:
> On (01/23/18 21:54), Steven Rostedt wrote:
> >
> > > Another problem, and I mentioned it somewhere in another email, is that
> > > upstream printk people don't receive enough [if any at all] feedback from
> > > guys who face printk issues. That's why every time printk_kthread re-surfaces
> > > the reaction is "this is not a real problem, no one is seeing printk issues
> > > like these, you idiot!". It'd be great to have more "we need ABC, because of
> > > XYZ, but printk crashes the system. Here is the backtrace, fix it" reports.
> > > As of now, those things mostly are not reported, that's why people are not
> > > convinced. Just my 5 cents.
> >
> > If you are seeing these issues, have whoever is reporting them to Cc
> > LKML, and those of us that would care to listen.
>
> OK. The lack of reports can also signify that we need to change the way we
> handle those reports. If we are going to reply "yes, your system crashed
> while doing completely innocent printout, but if we fix it then we can
> increase by 0.0001% chances of not getting any printouts at all in that
> corner case when your system is in recursive double panic over the latest
> bitcoin price and your keyboard is on fire" then I don't think people will
> care to report anything.
>
> Maybe Tejun will be kind enough to shed some light on how often FB fleet
> suffer from printk related issues, or what are the most common scenarios,
> etc. [sensitive information can be reported "off list"]

There are efforts to automatically scrub and share kernel splats
publicly, so hopefully we'd be able to provide a better visibility
into the problems we encounter in the future. For now, there are some
security implications and I'm not very sure how liberal I can share.

In terms of frequency, it isn't catastrophic. I think Chris Mason
described it pretty eloquently - "more often than hardware failures,
not so often that we turn off serial console". One painful part is
that it adds noise to signal by escalating an unrelated problem to RCU
stalls or hard lockups just by occupying an unlucky context for too
long.

The relevance of these messages fall pretty rapidly as the number of
consecutive lines increases, so it is frustrating to pay for them this
way. After 15s of flushing along with a number of "X printk messages
dropped" warnings, we aren't really doing anyone any service by trying
to uphold "transmit the message as close to the printking context as
possible".

Thanks.

--
tejun