Hi,
While running kernbench with 2.6.23-git8, the following oops is produced:
Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
[<ffffffff8033f347>] __rb_rotate_left+0x7/0x70
PGD 31f7ad067 PUD 31f14d067 PMD 0
Oops: 0000 [1] SMP
CPU 8
Modules linked in: loop dm_mod md_mod sg
Pid: 6923, comm: slpd Not tainted 2.6.23-git8-autokern1 #1
RIP: 0010:[<ffffffff8033f347>] [<ffffffff8033f347>] __rb_rotate_left+0x7/0x70
RSP: 0018:ffff81031d083e90 EFLAGS: 00010086
RAX: ffff8106147550d0 RBX: ffff81033007b650 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff810330080808 RDI: ffff8106147550d0
RBP: ffff8106147550d0 R08: ffff81033007b650 R09: ffff81033007b650
R10: ffff8103300807e0 R11: 0000000000000000 R12: ffff8106147550d0
R13: ffff810330080808 R14: ffff810330080780 R15: 0000000000000008
FS: 00002ab70eae80a0(0000) GS:ffff8106146b5440(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 000000031d08f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process slpd (pid: 6923, threadinfo ffff81031d082000, task ffff81031f31f100)
Stack: ffffffff8033f4aa ffff8106147550c0 ffff81031d083ed0 ffff8103300807e0
ffff81031f31f300 ffffffff8022bc59 0000000000000008 0000000000000384
ffff81031d083f70 ffffffff804b6dec ffff81031d61b0c0 0000000000001000
Call Trace:
[<ffffffff8033f4aa>] rb_insert_color+0x8a/0xf0
[<ffffffff8022bc59>] put_prev_task_fair+0x49/0x60
[<ffffffff804b6dec>] schedule+0xec/0x1d1
[<ffffffff80284685>] vfs_read+0xc5/0x160
[<ffffffff80284b63>] sys_read+0x53/0x90
[<ffffffff8020bb78>] sysret_careful+0xd/0x10
Code: 48 8b 51 10 49 83 e0 fc 48 85 d2 48 89 57 08 74 0c 48 8b 02
RIP [<ffffffff8033f347>] __rb_rotate_left+0x7/0x70
RSP <ffff81031d083e90>
CR2: 0000000000000010
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
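As a rough cross-check of the oops above: CR2 is 0x10, RCX is 0, and the first Code bytes (48 8b 51 10) decode to a 64-bit load from 0x10(%rcx). That would be consistent with __rb_rotate_left() reading ->rb_left through a NULL ->rb_right pointer. The sketch below only checks the offset arithmetic. struct rb_node_model is a stand-in that assumes the three-word struct rb_node layout of kernels from this period and an LP64 ABI, and rb_left_fault_address() is a made-up helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, modelled on struct rb_node from include/linux/rbtree.h. */
struct rb_node_model {
    unsigned long rb_parent_color;     /* parent pointer, colour in the low bits */
    struct rb_node_model *rb_right;
    struct rb_node_model *rb_left;
};

/* The offsets below only hold when long and pointers are both 8 bytes. */
static_assert(sizeof(void *) == 8 && sizeof(unsigned long) == 8,
              "sketch assumes an LP64 ABI");

/* Address touched when ->rb_left is read through the given base pointer. */
static uintptr_t rb_left_fault_address(uintptr_t base)
{
    return base + offsetof(struct rb_node_model, rb_left);
}
```

With a NULL base, rb_left_fault_address(0) evaluates to 0x10, which matches the CR2 value in the report.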
* Kamalesh Babulal <[email protected]> wrote:
> While running kernbench with the 2.6.23-git8 following oops is
> produced
>
> Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
> [<ffffffff8033f347>] __rb_rotate_left+0x7/0x70
that looks nasty ...
and -git8 should have the v2.6.23 scheduler code in essence.
Ingo
* Ingo Molnar <[email protected]> wrote:
>
> * Kamalesh Babulal <[email protected]> wrote:
>
> > While running kernbench with the 2.6.23-git8 following oops is
> > produced
> >
> > Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
> > [<ffffffff8033f347>] __rb_rotate_left+0x7/0x70
>
> that looks nasty ...
>
> and -git8 should have the v2.6.23 scheduler code in essence.
ah, this is likely with the recent scheduler commits included as well,
right?
if it's reproducible, could you try with group scheduling disabled, i.e.
with:
# CONFIG_FAIR_GROUP_SCHED is not set
as that would be the main suspect for such type of a crash.
Ingo
On Tue, Oct 16, 2007 at 11:10:12AM +0200, Ingo Molnar wrote:
>
> * Kamalesh Babulal <[email protected]> wrote:
>
> > While running kernbench with the 2.6.23-git8 following oops is
> > produced
> >
> > Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
> > [<ffffffff8033f347>] __rb_rotate_left+0x7/0x70
>
> that looks nasty ...
>
> and -git8 should have the v2.6.23 scheduler code in essence.
To fill in a few details: this was triggered in the middle of a
kernbench run on the machine. A job containing only dbench runs ran to
completion. The machine is a 4-node NUMA x86_64 system.
Seems that most scheduler options are on:
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_FAIR_USER_SCHED=y
At the moment we don't have any historical jobs back from 2.6.23 so I
cannot be more specific as to when it arrived in mainline. The x86/x86_64
merge broke our build process; a bad assumption here, not a problem with
the merge.
-apw
On Tue, Oct 16, 2007 at 11:46:31AM +0200, Ingo Molnar wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> >
> > * Kamalesh Babulal <[email protected]> wrote:
> >
> > > While running kernbench with the 2.6.23-git8 following oops is
> > > produced
> > >
> > > Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
> > > [<ffffffff8033f347>] __rb_rotate_left+0x7/0x70
> >
> > that looks nasty ...
> >
> > and -git8 should have the v2.6.23 scheduler code in essence.
>
> ah, this is likely with the recent scheduler commits included as well,
> right?
>
> if it's reproducible, could you try with group scheduling disabled, i.e.
> with:
>
> # CONFIG_FAIR_GROUP_SCHED is not set
>
> as that would be the main suspect for such type of a crash.
Ok, preliminary results with this disabled seem to get us past
kernbench. I am rerunning with it enabled to confirm it blammo's.
-apw
On Tue, Oct 16, 2007 at 07:00:37PM +0100, Andy Whitcroft wrote:
> > ah, this is likely with the recent scheduler commits included as well,
> > right?
> >
> > if it's reproducible, could you try with group scheduling disabled, i.e.
> > with:
> >
> > # CONFIG_FAIR_GROUP_SCHED is not set
> >
> > as that would be the main suspect for such type of a crash.
>
> Ok, preliminary results with this disabled seem to get us past
> kernbench. I am rerunning with it enabled to confirm it blammo's.
Andy,
I have got details from Kamalesh on the machine where this is
recreatable (we couldn't recreate on other machines). Will start looking
at this tomorrow morning first thing ..
--
Regards,
vatsa
On Tue, Oct 16, 2007 at 07:00:37PM +0100, Andy Whitcroft wrote:
> On Tue, Oct 16, 2007 at 11:46:31AM +0200, Ingo Molnar wrote:
> >
> > * Ingo Molnar <[email protected]> wrote:
> >
> > >
> > > * Kamalesh Babulal <[email protected]> wrote:
> > >
> > > > While running kernbench with the 2.6.23-git8 following oops is
> > > > produced
> > > >
> > > > Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
> > > > [<ffffffff8033f347>] __rb_rotate_left+0x7/0x70
> > >
> > > that looks nasty ...
> > >
> > > and -git8 should have the v2.6.23 scheduler code in essence.
> >
> > ah, this is likely with the recent scheduler commits included as well,
> > right?
> >
> > if it's reproducible, could you try with group scheduling disabled, i.e.
> > with:
> >
> > # CONFIG_FAIR_GROUP_SCHED is not set
> >
> > as that would be the main suspect for such type of a crash.
>
> Ok, preliminary results with this disabled seem to get us past
> kernbench. I am rerunning with it enabled to confirm it blammo's.
A rerun with it enabled however ran to completion. So I think we have
to assume this is at best not 100% reproducible. Seems that more
directed testing is going on now.
-apw
* Kamalesh Babulal <[email protected]> wrote:
> While running kernbench with the 2.6.23-git8 following oops is
> produced
Dmitry found something that might explain the crash: could you check
whether the patch below fixes it?
Ingo
---------------------->
Subject: sched: fix new task startup crash
From: Ingo Molnar <[email protected]>
this should fix the put_prev_task crashes that were reported,
Dmitry Adamushko noticed that it's not valid to call into
task_new_fair() if this_cpu != task_cpu(p).
Reported-by: Kamalesh Babulal <[email protected]>
Reported-by: Andy Whitcroft <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -1712,7 +1712,8 @@ void fastcall wake_up_new_task(struct ta
p->prio = effective_prio(p);
- if (!p->sched_class->task_new || !current->se.on_rq || !rq->cfs.curr) {
+ if (!p->sched_class->task_new || smp_processor_id() != task_cpu(p) ||
+ !current->se.on_rq || !rq->cfs.curr) {
activate_task(rq, p, 0);
} else {
/*
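The effect of the extra test in the patch above can be modelled as a plain predicate. This is only a sketch: struct wake_state and uses_plain_activate() are hypothetical names, and each field stands in for the kernel expression named in its comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of what the patched condition looks at. */
struct wake_state {
    bool class_has_task_new;   /* p->sched_class->task_new != NULL */
    int  this_cpu;             /* smp_processor_id() */
    int  task_cpu;             /* task_cpu(p) */
    bool parent_on_rq;         /* current->se.on_rq */
    bool rq_has_curr;          /* rq->cfs.curr != NULL */
};

/* True when the patched wake_up_new_task() would call activate_task()
 * instead of going through the class's task_new() hook. */
static bool uses_plain_activate(const struct wake_state *s)
{
    return !s->class_has_task_new || s->this_cpu != s->task_cpu ||
           !s->parent_on_rq || !s->rq_has_curr;
}
```

In this model a child placed on another cpu always takes the activate_task() branch, regardless of the other flags.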
On Wed, Oct 17, 2007 at 04:21:40PM +0200, Ingo Molnar wrote:
>
> * Kamalesh Babulal <[email protected]> wrote:
>
> > While running kernbench with the 2.6.23-git8 following oops is
> > produced
>
> Dmitry found something that might explain the crash: could you check
> whether the patch below fixes it?
>
> Ingo
>
> ---------------------->
> Subject: sched: fix new task startup crash
> From: Ingo Molnar <[email protected]>
>
> this should fix the put_prev_task crashes that were reported,
> Dmitry Adamushko noticed that it's not valid to call into
> task_new_fair() if this_cpu != task_cpu(p).
>
> Reported-by: Kamalesh Babulal <[email protected]>
> Reported-by: Andy Whitcroft <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>
I have submitted jobs with this patch applied against a couple of the
releases which showed this problem. They will take a while, as there are
other tests running at the moment. Will let you know when they are done.
-apw
On Wed, Oct 17, 2007 at 04:21:40PM +0200, Ingo Molnar wrote:
> * Kamalesh Babulal <[email protected]> wrote:
>
> > While running kernbench with the 2.6.23-git8 following oops is
> > produced
>
> Dmitry found something that might explain the crash: could you check
> whether the patch below fixes it?
> this should fix the put_prev_task crashes that were reported,
> Dmitry Adamushko noticed that it's not valid to call into
> task_new_fair() if this_cpu != task_cpu(p).
I don't see a fundamental reason why it would be invalid to call
task_new_fair() when this_cpu != task_cpu(p). Besides, calling
activate_task->enqueue_task->enqueue_task_fair() on a newborn task (as
is being done in the patch you have sent) is slightly buggy, in the sense
that its p->se.vruntime is not properly calculated (because we pass the
wakeup argument as 0).
We (myself, Kamalesh and Dhaval) have tested the patch below, w/o being
able to recreate the problem. The patch allows for task_new_fair() to be
called even for the case when child is being added to another cpu's
runqueue.
--
The child task may be added on a different cpu than the one on which the
parent is running. In that case, task_new_fair() should check whether the
newborn task's parent entity should be added to the cfs_rq as well.
The patch below fixes the problem in task_new_fair().
Signed-off-by: Srivatsa Vaddagiri <[email protected]>
---
kernel/sched.c | 2 +-
kernel/sched_fair.c | 6 +-----
2 files changed, 2 insertions(+), 6 deletions(-)
Index: current/kernel/sched.c
===================================================================
--- current.orig/kernel/sched.c
+++ current/kernel/sched.c
@@ -1712,7 +1712,7 @@ void fastcall wake_up_new_task(struct ta
p->prio = effective_prio(p);
- if (!p->sched_class->task_new || !current->se.on_rq || !rq->cfs.curr) {
+ if (!p->sched_class->task_new || !current->se.on_rq) {
activate_task(rq, p, 0);
} else {
/*
Index: current/kernel/sched_fair.c
===================================================================
--- current.orig/kernel/sched_fair.c
+++ current/kernel/sched_fair.c
@@ -1031,12 +1031,8 @@ static void task_new_fair(struct rq *rq,
swap(curr->vruntime, se->vruntime);
}
- update_stats_enqueue(cfs_rq, se);
- check_spread(cfs_rq, se);
- check_spread(cfs_rq, curr);
- __enqueue_entity(cfs_rq, se);
- account_entity_enqueue(cfs_rq, se);
se->peer_preempt = 0;
+ enqueue_task_fair(rq, p, 0);
resched_task(rq->curr);
}
--
Regards,
vatsa
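The sched_fair.c hunk above replaces an open-coded enqueue of the child's own entity with a call to enqueue_task_fair(). With group scheduling, that function is understood to walk up from the task's entity and queue each ancestor that is not yet on a runqueue, which is the check the changelog asks for. A simplified model, with hypothetical names (struct entity_model, enqueue_leaf_only(), enqueue_hierarchy()) and none of the real accounting:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down scheduling entity: a task or a group. */
struct entity_model {
    struct entity_model *parent;   /* group entity above this one, or NULL */
    bool on_rq;                    /* already queued on its cfs_rq? */
};

/* Old shape: queue only the task's own entity. */
static void enqueue_leaf_only(struct entity_model *se)
{
    se->on_rq = true;
}

/* New shape: walk upward and queue every ancestor not yet on a runqueue.
 * Returns how many entities were queued. */
static int enqueue_hierarchy(struct entity_model *se)
{
    int queued = 0;

    for (; se != NULL; se = se->parent) {
        if (se->on_rq)
            break;                 /* the rest of the chain is already queued */
        se->on_rq = true;
        queued++;
    }
    return queued;
}
```

When the child lands on a cpu where its group entity is not yet queued, the leaf-only variant leaves the group off the runqueue, while the hierarchy walk queues both.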
* Srivatsa Vaddagiri <[email protected]> wrote:
> On Wed, Oct 17, 2007 at 04:21:40PM +0200, Ingo Molnar wrote:
> > * Kamalesh Babulal <[email protected]> wrote:
> >
> > > While running kernbench with the 2.6.23-git8 following oops is
> > > produced
> >
> > Dmitry found something that might explain the crash: could you check
> > whether the patch below fixes it?
>
> > this should fix the put_prev_task crashes that were reported,
> > Dmitry Adamushko noticed that it's not valid to call into
> > task_new_fair() if this_cpu != task_cpu(p).
>
> I don't see a fundamental reason why it would be invalid to call
> task_new_fair() when this_cpu != task_cpu(p). Besides, calling
> activate_task->enqueue_task->enqueue_task_fair() on a new born task
> (as is being done in the patch you have sent) is slightly buggy in the
> sense that its p->se.vruntime is not properly calculated (because we
> set wakeup argument as 0).
yes - i pointed this out in a separate mail.
> We (myself, Kamalesh and Dhaval) have tested the patch below, w/o
> being able to recreate the problem. The patch allows for
> task_new_fair() to be called even for the case when child is being
> added to another cpu's runqueue.
yes, and your fix is the better one, it goes into the next batch of
fixes.
Ingo
On Wed, Oct 17, 2007 at 04:52:26PM +0200, Ingo Molnar wrote:
> > We (myself, Kamalesh and Dhaval) have tested the patch below, w/o
> > being able to recreate the problem. The patch allows for
> > task_new_fair() to be called even for the case when child is being
> > added to another cpu's runqueue.
>
> yes, and your fix is the better one, it goes into the next batch of
> fixes.
I have submitted jobs with that patch too. We should have some more
results at some point.
-apw