2014-12-16 22:25:55

by Eric Sandeen

Subject: [PATCH] check for stack overflow in ___might_sleep

Sometimes a "BUG: sleeping function called from invalid context"
message is not indicative of locking problems, but is the result
of a stack overflow corrupting the thread info.

Witness http://oss.sgi.com/archives/xfs/2014-02/msg00325.html
for example, which took a few go-rounds to sort out.

If we're printing the warning, things are wonky already, and
it'd be informative to check for the stack end corruption at this
point, too.

Signed-off-by: Eric Sandeen <[email protected]>
---
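
Note (not part of the commit message): for readers unfamiliar with the
stack-end check, here is a minimal userspace sketch of the idea, under
stated assumptions. The kernel places a magic word (STACK_END_MAGIC,
0x57AC6E9D) at the far end of each task's stack, and
task_stack_end_corrupted() reports whether that word has been
overwritten. Everything prefixed toy_ or TOY_ below is hypothetical and
exists only for illustration; the layout is a simplified model, not the
kernel's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value the kernel uses for its stack-end canary. */
#define STACK_END_MAGIC 0x57AC6E9DUL

/* Hypothetical toy layout, lowest address first:
 *   words[0]   stand-in for thread info (e.g. a preempt count)
 *   words[1]   stack-end magic
 *   words[2..] usable stack, consumed from the top downward
 */
#define TOY_STACK_WORDS 64
#define TOY_INFO_IDX    0
#define TOY_MAGIC_IDX   1

struct toy_stack {
	unsigned long words[TOY_STACK_WORDS];
};

static void toy_stack_init(struct toy_stack *s)
{
	for (size_t i = 0; i < TOY_STACK_WORDS; i++)
		s->words[i] = 0;
	s->words[TOY_MAGIC_IDX] = STACK_END_MAGIC;
}

/* Toy analogue of task_stack_end_corrupted(): has the magic word changed? */
static bool toy_stack_end_corrupted(const struct toy_stack *s)
{
	return s->words[TOY_MAGIC_IDX] != STACK_END_MAGIC;
}

/* Simulate stack usage: scribble over `depth` words, starting at the top
 * and moving toward lower addresses. Clamped to the array, so an
 * "overflow" here overwrites the magic word and then the info word
 * without ever leaving the buffer.
 */
static void toy_use_stack(struct toy_stack *s, size_t depth)
{
	for (size_t i = 0; i < depth && i < TOY_STACK_WORDS; i++)
		s->words[TOY_STACK_WORDS - 1 - i] = 0xA5A5A5A5UL;
}

/* Small self-check showing both the healthy and the overrun case. */
static void toy_demo(void)
{
	struct toy_stack s;

	toy_stack_init(&s);
	toy_use_stack(&s, 10);			/* well within the usable area */
	assert(!toy_stack_end_corrupted(&s));

	toy_use_stack(&s, TOY_STACK_WORDS);	/* runs over magic and info */
	assert(toy_stack_end_corrupted(&s));
	assert(s.words[TOY_INFO_IDX] != 0);	/* "thread info" is now junk */
}
```

In this model, once the info word holds junk, any check that reads it
(such as an in-atomic test) can misfire, which is the sort of misleading
warning described above; checking the magic word at the same time points
at the likelier cause.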

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b5797b7..4ef726c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7328,6 +7328,9 @@ void ___might_sleep(const char *file, int line, int preempt_offset)
 			in_atomic(), irqs_disabled(),
 			current->pid, current->comm);
 
+	if (task_stack_end_corrupted(current))
+		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
+
 	debug_show_held_locks(current);
 	if (irqs_disabled())
 		print_irqtrace_events(current);