2002-09-12 20:30:32

by Robert Love

[permalink] [raw]
Subject: [PATCH] kernel BUG at sched.c:944! only with CONFIG_PREEMPT=y]

On Thu, 2002-09-12 at 16:20, Steven Cole wrote:

> Yes. Sorry, didn't catch this reply until now.
> I backed out Changeset 1.606 which Linus did to kernel/sched.c did this:
>
> - if (unlikely(in_interrupt()))
> + if (unlikely(in_atomic()))
>
> and 2.5.34-mm2 was able to boot with CONFIG_PREEMPT=y.
>
> As I said in a response to myself on lkml, I know this isn't a fix,
> it just shows there is a problem somewhere with preempt.

No, there is not a problem in preempt... what this change does is BUG()
out if schedule() is called while being in any way non-atomic.

While this sounds like a great debugging check, it is not useful in
general since we surely have some bad code that calls schedule() with
locks held. Further, since the atomic accounting only includes locks if
CONFIG_PREEMPT is set, you only see this with kernel preemption enabled.

Linus, please back this out... attached patch is against current BK.

Yeah, I know we can change the BUG() to a show_stack() ... but I still
think it will be too much and just deter people from using kernel
preemption which is the opposite of what I want.

Robert Love

diff -urN linux-2.5.34/kernel/sched.c linux/kernel/sched.c
--- linux-2.5.34/kernel/sched.c Thu Sep 12 16:26:23 2002
+++ linux/kernel/sched.c Thu Sep 12 16:30:22 2002
@@ -940,8 +940,7 @@
struct list_head *queue;
int idx;

- if (unlikely(in_atomic()))
- BUG();
+ BUG_ON(in_interrupt());

#if CONFIG_DEBUG_HIGHMEM
check_highmem_ptes();



2002-09-12 20:33:47

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] kernel BUG at sched.c:944! only with CONFIG_PREEMPT=y]


On 12 Sep 2002, Robert Love wrote:

> While this sounds like a great debugging check, it is not useful in
> general since we surely have some bad code that calls schedule() with
> locks held. Further, since the atomic accounting only includes locks if
> CONFIG_PREEMPT is set, you only see this with kernel preemption enabled.

it *is* a great debugging check, at zero added cost. Scheduling from an
atomic region *is* a critical bug that can and will cause problems in 99%
of the cases. Rather fix the asserts that got triggered instead of backing
out useful debugging checks ...

Ingo

2002-09-12 20:40:57

by Robert Love

[permalink] [raw]
Subject: Re: [PATCH] kernel BUG at sched.c:944! only with CONFIG_PREEMPT=y]

On Thu, 2002-09-12 at 16:44, Ingo Molnar wrote:

> it *is* a great debugging check, at zero added cost. Scheduling from an
> atomic region *is* a critical bug that can and will cause problems in 99%
> of the cases. Rather fix the asserts that got triggered instead of backing
> out useful debugging checks ...

There are a lot of shitty drivers that this is going to catch. Yes,
that is great... but we cannot BUG(). There really are a LOT of them.
In the least, we need to show_trace().

Anyhow, this currently will catch _all_ kernel preemptions because the
PREEMPT_ACTIVE flag is set.

We need to do something like the attached (untested), at the very
least...

I would prefer to make this a kernel debugging check, however. Or, make
using kernel preemption the default. Using the "free" abilities of
kernel preemption is great, but not at the expense of its users.

Robert Love

diff -urN linux-2.5.34/kernel/sched.c linux/kernel/sched.c
--- linux-2.5.34/kernel/sched.c Thu Sep 12 16:26:23 2002
+++ linux/kernel/sched.c Thu Sep 12 16:42:59 2002
@@ -940,8 +940,10 @@
struct list_head *queue;
int idx;

- if (unlikely(in_atomic()))
- BUG();
+ if (unlikely(in_atomic() && preempt_count() != PREEMPT_ACTIVE)) {
+ printk(KERN_ERROR "schedule() called while non-atomic!\n");
+ show_stack(NULL);
+ }

#if CONFIG_DEBUG_HIGHMEM
check_highmem_ptes();
@@ -959,7 +961,7 @@
* if entering off of a kernel preemption go straight
* to picking the next task.
*/
- if (unlikely(preempt_count() & PREEMPT_ACTIVE))
+ if (unlikely(preempt_count() == PREEMPT_ACTIVE))
goto pick_next_task;

switch (prev->state) {

2002-09-12 20:57:15

by Steven Cole

[permalink] [raw]
Subject: Re: [PATCH] kernel BUG at sched.c:944! only with CONFIG_PREEMPT=y]

On Thu, 2002-09-12 at 14:45, Robert Love wrote:
> On Thu, 2002-09-12 at 16:44, Ingo Molnar wrote:
>
> > it *is* a great debugging check, at zero added cost. Scheduling from an
> > atomic region *is* a critical bug that can and will cause problems in 99%
> > of the cases. Rather fix the asserts that got triggered instead of backing
> > out useful debugging checks ...
>
> There are a lot of shitty drivers that this is going to catch. Yes,
> that is great... but we cannot BUG(). There really are a LOT of them.
> In the least, we need to show_trace().
>
> Anyhow, this currently will catch _all_ kernel preemptions because the
> PREEMPT_ACTIVE flag is set.
>
> We need to do something like the attached (untested), at the very
> least...
>
> I would prefer to make this a kernel debugging check, however. Or, make
> using kernel preemption the default. Using the "free" abilities of
> kernel preemption is great, but not at the expense of its users.
>
> Robert Love
>
> diff -urN linux-2.5.34/kernel/sched.c linux/kernel/sched.c
> --- linux-2.5.34/kernel/sched.c Thu Sep 12 16:26:23 2002
> +++ linux/kernel/sched.c Thu Sep 12 16:42:59 2002
> @@ -940,8 +940,10 @@
> struct list_head *queue;
> int idx;
>
> - if (unlikely(in_atomic()))
> - BUG();
> + if (unlikely(in_atomic() && preempt_count() != PREEMPT_ACTIVE)) {
> + printk(KERN_ERROR "schedule() called while non-atomic!\n");
> + show_stack(NULL);
> + }
>
> #if CONFIG_DEBUG_HIGHMEM
> check_highmem_ptes();
> @@ -959,7 +961,7 @@
> * if entering off of a kernel preemption go straight
> * to picking the next task.
> */
> - if (unlikely(preempt_count() & PREEMPT_ACTIVE))
> + if (unlikely(preempt_count() == PREEMPT_ACTIVE))
> goto pick_next_task;
>
> switch (prev->state) {
>

OK, that patch is tested now. I had to s/KERN_ERROR/KERN_ERR/, but
I was able to get the system to spew out some
"schedule() called while non-atomic!" messages along
with the traces. I can type those into a file and run it through
ksymoops if that would be helpful.

Steven

2002-09-12 21:03:45

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] kernel BUG at sched.c:944! only with CONFIG_PREEMPT=y]

Ingo Molnar wrote:
>
> On 12 Sep 2002, Robert Love wrote:
>
> > While this sounds like a great debugging check, it is not useful in
> > general since we surely have some bad code that calls schedule() with
> > locks held. Further, since the atomic accounting only includes locks if
> > CONFIG_PREEMPT is set, you only see this with kernel preemption enabled.
>
> it *is* a great debugging check, at zero added cost. Scheduling from an
> atomic region *is* a critical bug that can and will cause problems in 99%
> of the cases. Rather fix the asserts that got triggered instead of backing
> out useful debugging checks ...
>

The problem here is that some random piece of code has bumped
the preemption counter, and we've lost all trace of that at
the site where the problem is detected.

So... In do_initcalls() we'd need:

do {
(*call)();
+ if (in_atomic())
+ printk("initcall at %p is buggy\n", call);
call++;
} while (call < &__initcall_end);

and to diagnose this particular problem I guess we need to
add

if (in_atomic())
printk("goofed at %d\n", __LINE__);

to twenty or so places in start_kernel().

2002-09-13 07:08:26

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] kernel BUG at sched.c:944! only with CONFIG_PREEMPT=y]


On 12 Sep 2002, Robert Love wrote:

> > it *is* a great debugging check, at zero added cost. Scheduling from an
> > atomic region *is* a critical bug that can and will cause problems in 99%
> > of the cases. Rather fix the asserts that got triggered instead of backing
> > out useful debugging checks ...
>
> There are a lot of shitty drivers that this is going to catch. [...]

of course. And your point in making it in_interrupt() had what purpose -
hiding that tons of code breaks preemption? [and tons of code breaks on
SMP.] Your patch was removing precisely the tool that can be used to
improve SMP quality on UP boxes as well.

> [...] Yes, that is great... but we cannot BUG(). There really are a LOT
> of them. In the least, we need to show_trace().

yes. And we also need kallsyms and kksymoops in the kernel, so that people
can send in meaningful traces.

Ingo

2002-09-13 07:31:40

by Robert Love

[permalink] [raw]
Subject: Re: [PATCH] kernel BUG at sched.c:944! only with CONFIG_PREEMPT=y]

On Fri, 2002-09-13 at 03:19, Ingo Molnar wrote:

> of course. And your point in making it in_interrupt() had what purpose -
> hiding that tons of code breaks preemption? [and tons of code breaks on
> SMP.] Your patch was removing precisely the tool that can be used to
> improve SMP quality on UP boxes as well.

Ingo, Attached is the latest patch I sent Linus. I never doubted the
usefulness, just the practicality of having endusers parse so much
information.

At any rate, the approach in Linus's BK is wrong because (a) the debug
is hit way too often to just BUG() and (b) we have to take into account
PREEMPT_ACTIVE.

> yes. And we also need kallsyms and kksymoops in the kernel, so that people
> can send in meaningful traces.

Agree 100% here.

Robert Love

iff -urN linux-2.5.34/kernel/sched.c linux/kernel/sched.c
--- linux-2.5.34/kernel/sched.c Thu Sep 12 16:26:23 2002
+++ linux/kernel/sched.c Thu Sep 12 16:42:59 2002
@@ -940,8 +940,10 @@
struct list_head *queue;
int idx;

- if (unlikely(in_atomic()))
- BUG();
+ if (unlikely(in_atomic() && preempt_count() != PREEMPT_ACTIVE)) {
+ printk(KERN_ERROR "schedule() called while non-atomic!\n");
+ show_stack(NULL);
+ }

#if CONFIG_DEBUG_HIGHMEM
check_highmem_ptes();
@@ -959,7 +961,7 @@
* if entering off of a kernel preemption go straight
* to picking the next task.
*/
- if (unlikely(preempt_count() & PREEMPT_ACTIVE))
+ if (unlikely(preempt_count() == PREEMPT_ACTIVE))
goto pick_next_task;

switch (prev->state) {

2002-09-13 07:35:34

by Robert Love

[permalink] [raw]
Subject: Re: [PATCH] kernel BUG at sched.c:944! only with CONFIG_PREEMPT=y]

On Fri, 2002-09-13 at 03:36, Robert Love wrote:

> - if (unlikely(in_atomic()))
> - BUG();
> + if (unlikely(in_atomic() && preempt_count() != PREEMPT_ACTIVE)) {
> + printk(KERN_ERROR "schedule() called while non-atomic!\n");
> + show_stack(NULL);
> + }

Actually, looking at this again, we probably want to still BUG() if
in_interrupt() but _not_ if in_atomic().

Robert Love

2002-09-13 11:51:11

by Thunder from the hill

[permalink] [raw]
Subject: Re: [PATCH] kernel BUG at sched.c:944! only with CONFIG_PREEMPT=y]

Hi,

On 13 Sep 2002, Robert Love wrote:
> On Fri, 2002-09-13 at 03:36, Robert Love wrote:
> Actually, looking at this again, we probably want to still BUG() if
> in_interrupt() but _not_ if in_atomic().

- if (unlikely(in_atomic()))
- BUG();
+ if (unlikely((in_interrupt() || (!in_atomic())) && preempt_count() != PREEMPT_ACTIVE)) {
+ printk(KERN_ERROR "schedule() called while non-atomic, or in interrupt!\n");
+ show_stack(NULL);
+ }

?

Thunder
--
--./../...-/. -.--/---/..-/.-./..././.-../..-. .---/..-/.../- .-
--/../-./..-/-/./--..-- ../.----./.-../.-.. --./../...-/. -.--/---/..-
.- -/---/--/---/.-./.-./---/.--/.-.-.-
--./.-/-.../.-./.././.-../.-.-.-