2009-01-26 15:17:41

by Alexey Zaytsev

Subject: Re: next-20090107: WARNING: at kernel/sched.c:4435


On Mon, Jan 26, 2009 at 18:09, Ingo Molnar <[email protected]> wrote:
>
> * Alexey Zaytsev <[email protected]> wrote:
>
>> On Mon, Jan 26, 2009 at 17:43, Ingo Molnar <[email protected]> wrote:
>> >
>> > * Alexey Zaytsev <[email protected]> wrote:
>> >
>> >> On Wed, Jan 14, 2009 at 05:00, Nick Piggin <[email protected]> wrote:
>> >> > On Sun, Jan 11, 2009 at 03:49:45AM +0100, Ingo Molnar wrote:
>> >> >>
>> >> >> * Alexey Zaytsev <[email protected]> wrote:
>> >> >>
>> >> >> > One more instance of http://marc.info/?l=linux-kernel&m=123134586202636&w=2
>> >> >> > Added Ingo Molnar to CC.
>> >> >>
>> >> >> added Nick on Cc:. Nick, it's about:
>> >> >>
>> >> >> > commit 7317d7b87edb41a9135e30be1ec3f7ef817c53dd
>> >> >> > Author: Nick Piggin <[email protected]>
>> >> >> > Date: Tue Sep 30 20:50:27 2008 +1000
>> >> >> >
>> >> >> > sched: improve preempt debugging
>> >> >>
>> >> >> causing a seemingly spurious warning.
>> >> >
>> >> > I don't know how it is spurious... Presumably the sequence _would_ have
>> >> > caused preempt count to go negative if the bkl were not held...
>> >> >
>> >> > __do_softirq does a __local_bh_disable on entry, and it seems like the
>> >> > _local_bh_enable on exit is what causes this warning. So something is
>> >> > unbalanced somehow. Or is it some weird thing we do in early boot that
>> >> > I am missing?
>> >> >
>> >> > Can you put in some printks around these functions in early boot to
>> >> > get an idea of what preempt_count is doing?
>> >> >
>> >>
>> >> Hi again.
>> >>
>> >> Finally got to debug this. The preempt count on the first __do_softirq
>> >> entry ever is 0, as it is set in irq_ctx_init(). The interrupted swapper
>> >> thread happens to be in the kernel_locked() state at the moment, hence
>> >> the warning.
>> >>
>> >> I don't understand why the softirq preempt count is initialized to 0.
>> >> Shouldn't it be SOFTIRQ_OFFSET instead?
>> >
>> > hm, indeed. So this triggers on irqstacks, if an irq happens to hit
>> > the first time a softirq executes (ever)? After that point the
>> > preempt_count in the irq-stack ought to stay elevated.
>>
>> No, this happens on the first softirq, which is run after an irq. An irq
>> interrupts the swapper thread while it is holding the BKL. It is
>> executed on the hard irq stack, and the corresponding
>> thread_info.preempt_count is set correctly by irq_ctx_init(), so nothing
>> happens. After the hard IRQ is over, a softirq is run on the soft irq
>> stack, but irq_ctx_init() set its preempt_count to zero. So after the
>> first softirq is over, sub_preempt_count() discovers that the preempt
>> count is going back to zero while the BKL is held (by the interrupted
>> thread), and refuses to decrease the count. So the softirq preempt_count
>> stays SOFTIRQ_OFFSET, which is now correct, so no further warnings are
>> triggered.
>
> yeah. So we need to fix the initial softirq-stack preempt_count value.

Like this? ;)

From: Alexey Zaytsev <[email protected]>
Subject: [PATCH] Set the initial softirq preempt count to SOFTIRQ_OFFSET

Does not change the preemption semantics, as
softirqs obviously can't be preempted, but fixes
a spurious warning in sub_preempt_count(), which
happens when the preempt count returns to
zero while the interrupted thread is holding the
BKL.

Signed-off-by: Alexey Zaytsev <[email protected]>
---

arch/x86/kernel/irq_32.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c
index 74b9ff7..8d99de6 100644
--- a/arch/x86/kernel/irq_32.c
+++ b/arch/x86/kernel/irq_32.c
@@ -141,7 +141,7 @@ void __cpuinit irq_ctx_init(int cpu)
 	irqctx->tinfo.task = NULL;
 	irqctx->tinfo.exec_domain = NULL;
 	irqctx->tinfo.cpu = cpu;
-	irqctx->tinfo.preempt_count = 0;
+	irqctx->tinfo.preempt_count = SOFTIRQ_OFFSET;
 	irqctx->tinfo.addr_limit = MAKE_MM_SEG(0);

 	softirq_ctx[cpu] = irqctx;

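For illustration, here is a small user-space sketch (not kernel code) of the
counter arithmetic described above. The SOFTIRQ_OFFSET value (1 << 8) and the
kernel_locked() slack in the underflow check are modelled on this era's
include/linux/hardirq.h and on Nick's sub_preempt_count() patch quoted in the
next message; treat it as a model of the arithmetic, not as the kernel
implementation:

	#include <stdio.h>

	#define SOFTIRQ_OFFSET (1 << 8)	/* softirq nesting lives in bits 8-15 */

	/* Models the underflow check from sub_preempt_count(): warn if
	 * subtracting val would go below the one level of slack that a
	 * BKL-holding (interrupted) task is allowed. */
	static int would_warn(int val, int count, int bkl_held)
	{
		return val > count - !!bkl_held;
	}

	int main(void)
	{
		int bkl_held = 1;	/* interrupted swapper thread holds the BKL */
		int count;

		/* Unfixed: irq_ctx_init() seeds the softirq stack with 0.
		 * __local_bh_disable() on __do_softirq() entry raises it to
		 * 0x100; _local_bh_enable() on exit tries to drop it to 0. */
		count = 0 + SOFTIRQ_OFFSET;
		printf("seeded with 0:              %s\n",
		       would_warn(SOFTIRQ_OFFSET, count, bkl_held) ? "WARN" : "ok");

		/* Fixed: the stack is seeded with SOFTIRQ_OFFSET, so the exit
		 * path only drops the count back to SOFTIRQ_OFFSET, never 0. */
		count = SOFTIRQ_OFFSET + SOFTIRQ_OFFSET;
		printf("seeded with SOFTIRQ_OFFSET: %s\n",
		       would_warn(SOFTIRQ_OFFSET, count, bkl_held) ? "WARN" : "ok");

		return 0;
	}

Compiled with a stock C compiler, the first case prints WARN (the spurious
warning this patch silences) and the second prints ok.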

2009-01-26 15:22:34

by Ingo Molnar

Subject: [PATCH] sched: improve preempt debugging


* Alexey Zaytsev <[email protected]> wrote:

> From: Alexey Zaytsev <[email protected]>
> Subject: [PATCH] Set the initial softirq preempt count to SOFTIRQ_OFFSET
>
> Does not change the preemption semantics, as
> softirqs obviously can't be preempted, but fixes
> a spurious warning in sub_preempt_count(), which
> happens when the preempt count returns to
> zero while the interrupted thread is holding the
> BKL.
>
> Signed-off-by: Alexey Zaytsev <[email protected]>

applied to tip/sched/urgent, thanks Alexey.

I also re-added Nick's original patch on top of that (see it below) - this
way we'll see whether it's all sorted out for good.

Thanks,

Ingo

------------------>
From ed094285f3495ad5393fc424011c926f1b6138c1 Mon Sep 17 00:00:00 2001
From: Nick Piggin <[email protected]>
Date: Tue, 30 Sep 2008 20:50:27 +1000
Subject: [PATCH] sched: improve preempt debugging

This patch helped me out with a problem I recently had....

Basically, when the kernel lock is held, a preempt_count underflow does not
get detected until the lock is released, which may be a long time later (and
at fairly arbitrary points, e.g. the task may have been rescheduled in
between). If the BKL is released at schedule(), the resulting output is
actually fairly cryptic...

With any other lock that elevates preempt_count, it is illegal to schedule
while holding it (so an underflow would get found pretty quickly). The BKL,
however, allows scheduling with preempt_count elevated, which makes
underflows hard to debug.

Signed-off-by: Ingo Molnar <[email protected]>

[ This is a re-try of this commit:

    7317d7b: sched: improve preempt debugging

  which was reverted in:

    01e3eb8: Revert "sched: improve preempt debugging"

  because this commit:

    d04d99d: x86: set the initial softirq preempt count to SOFTIRQ_OFFSET

  fixes the real cause of the spurious warning. ]
---
kernel/sched.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 52bbf1c..8c2be1e 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4440,7 +4440,7 @@ void __kprobes sub_preempt_count(int val)
 	/*
 	 * Underflow?
 	 */
-	if (DEBUG_LOCKS_WARN_ON(val > preempt_count()))
+	if (DEBUG_LOCKS_WARN_ON(val > preempt_count() - (!!kernel_locked())))
 		return;
 	/*
 	 * Is the spinlock portion underflowing?