2006-09-28 20:25:52

by Bryce Harrington

Subject: [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work

Apologies if this has already been reported; I didn't spot it on the
list. We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
-git9, but not -git7:

mptbase: Initiating ioc0 recovery
Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
[<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
PGD 0
Oops: 0000 [1] PREEMPT SMP
CPU 0
Modules linked in:
Pid: 8, comm: events/0 Not tainted 2.6.18-git8 #1
RIP: 0010:[<ffffffff80489aa2>] [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
RSP: 0000:ffff81003ec65e40 EFLAGS: 00010282
RAX: 0000000000000002 RBX: ffff81003e86f640 RCX: 000000000000001e
RDX: 0000000000000001 RSI: 0000000000000213 RDI: 000000000003e86f
RBP: 0000000000000500 R08: ffff81003ec64000 R09: ffff81003ed0cf40
R10: ffff81003e86f640 R11: ffff81003ed0cf40 R12: ffff81003ed0cf40
R13: 0000000000000213 R14: ffff81003e86f640 R15: ffffffff80489a96
FS: 0000000000000000(0000) GS:ffffffff80779000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
Stack: ffff81003ec65ef8 ffff81003e86f640 ffff81003e86f648 ffffffff8023f1bd
ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
00000000fffffffc ffffffff80593ffd 0000000000000000 ffffffff8023f300
Call Trace:
[<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
[<ffffffff8023f204>] worker_thread+0x0/0x12e
[<ffffffff8023f300>] worker_thread+0xfc/0x12e
[<ffffffff80229f62>] default_wake_function+0x0/0xe
[<ffffffff80229f62>] default_wake_function+0x0/0xe
[<ffffffff80242433>] kthread+0xc8/0xf1
[<ffffffff8020a3f8>] child_rip+0xa/0x12
[<ffffffff8024236b>] kthread+0x0/0xf1
[<ffffffff8020a3ee>] child_rip+0x0/0x12


Code: 48 8b 45 00 31 f6 48 8b b8 50 01 00 00 e8 5c 4d fe ff 48 85
RIP [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
RSP <ffff81003ec65e40>
CR2: 0000000000000500
<6>mptbase: Initiating ioc0 recovery

Full console logs showing the above oops are here:
-git7: ok http://crucible.osdl.org/runs/2223/sysinfo/amd01.console
-git8: Oops http://crucible.osdl.org/runs/2233/sysinfo/amd01.console
-git9: Oops http://crucible.osdl.org/runs/2241/sysinfo/amd01.console

Reference information about the machine this is run on:
http://crucible.osdl.org/runs/2223/sysinfo/amd01.1/

Config files:
-git7: http://crucible.osdl.org/runs/2223/sysinfo/amd01.config
-git8: http://crucible.osdl.org/runs/2233/sysinfo/amd01.config

Bryce


2006-09-28 20:35:00

by Bryce Harrington

Subject: Re: [Eng] [OOPS] -git8, 9: NULL pointer dereference in mptspi_dv_renegotiate_work

Just checked against latest -git10, same oops:

http://crucible.osdl.org/runs/2256/sysinfo/amd01.console

However, it is not occurring on our ia64, x86, or x86_64 systems
running the same kernels.

Bryce


2006-09-28 21:51:27

by Andrew Morton

Subject: Re: [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work


(cc's added)

On Thu, 28 Sep 2006 13:25:48 -0700
Bryce Harrington <[email protected]> wrote:

> Apologies if this has already been reported;

It has not.

> I didn't spot it on the
> list. We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> -git9, but not -git7:
>
> mptbase: Initiating ioc0 recovery
> Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
> [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> PGD 0
> Oops: 0000 [1] PREEMPT SMP
> CPU 0
> Modules linked in:
> Pid: 8, comm: events/0 Not tainted 2.6.18-git8 #1
> RIP: 0010:[<ffffffff80489aa2>] [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> RSP: 0000:ffff81003ec65e40 EFLAGS: 00010282
> RAX: 0000000000000002 RBX: ffff81003e86f640 RCX: 000000000000001e
> RDX: 0000000000000001 RSI: 0000000000000213 RDI: 000000000003e86f
> RBP: 0000000000000500 R08: ffff81003ec64000 R09: ffff81003ed0cf40
> R10: ffff81003e86f640 R11: ffff81003ed0cf40 R12: ffff81003ed0cf40
> R13: 0000000000000213 R14: ffff81003e86f640 R15: ffffffff80489a96
> FS: 0000000000000000(0000) GS:ffffffff80779000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
> Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
> Stack: ffff81003ec65ef8 ffff81003e86f640 ffff81003e86f648 ffffffff8023f1bd
> ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
> 00000000fffffffc ffffffff80593ffd 0000000000000000 ffffffff8023f300
> Call Trace:
> [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
> [<ffffffff8023f204>] worker_thread+0x0/0x12e
> [<ffffffff8023f300>] worker_thread+0xfc/0x12e
> [<ffffffff80229f62>] default_wake_function+0x0/0xe
> [<ffffffff80229f62>] default_wake_function+0x0/0xe
> [<ffffffff80242433>] kthread+0xc8/0xf1
> [<ffffffff8020a3f8>] child_rip+0xa/0x12
> [<ffffffff8024236b>] kthread+0x0/0xf1
> [<ffffffff8020a3ee>] child_rip+0x0/0x12
>
>
> Code: 48 8b 45 00 31 f6 48 8b b8 50 01 00 00 e8 5c 4d fe ff 48 85
> RIP [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> RSP <ffff81003ec65e40>
> CR2: 0000000000000500
> <6>mptbase: Initiating ioc0 recovery

That's very clever. With gcc-4.0.2 and your .config I get

(gdb) x/20i mptspi_dv_renegotiate_work
0xffffffff8048475e <mptspi_dv_renegotiate_work>: push %rbp
0xffffffff8048475f <mptspi_dv_renegotiate_work+1>: push %rbx
0xffffffff80484760 <mptspi_dv_renegotiate_work+2>: push %rbp
0xffffffff80484761 <mptspi_dv_renegotiate_work+3>: mov 0x60(%rdi),%rbp
0xffffffff80484765 <mptspi_dv_renegotiate_work+7>: callq 0xffffffff8026df58 <kfree>
0xffffffff8048476a <mptspi_dv_renegotiate_work+12>: mov 0x0(%rbp),%rax
0xffffffff8048476e <mptspi_dv_renegotiate_work+16>: xor %esi,%esi
0xffffffff80484770 <mptspi_dv_renegotiate_work+18>: mov 0x150(%rax),%rdi

So on entry to this function, wqw->hd is 0x500.

Or kfree() somehow scrogged your %rbp register.
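
The register reasoning above can be checked with plain arithmetic. The sketch below is a userspace illustration, not kernel code, and the helper names (symbol_start, load_address, fault_matches_load) are made up for this note. It assumes, as the gdb listing suggests, that the instruction at +12 is `mov 0x0(%rbp),%rax`, so the address touched is simply the value held in %rbp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Start of a function, given a "symbol+0xOFF" location from an oops. */
uint64_t symbol_start(uint64_t rip, uint64_t offset)
{
	assert(rip >= offset);
	return rip - offset;
}

/* Address touched by a load of the form `mov disp(%base),%reg`. */
uint64_t load_address(uint64_t base, int64_t disp)
{
	return base + (uint64_t)disp;
}

/* True if the reported fault address (CR2) is what that load would touch. */
bool fault_matches_load(uint64_t cr2, uint64_t base, int64_t disp)
{
	return load_address(base, disp) == cr2;
}
```

With the values from the first oops, symbol_start(0xffffffff80489aa2, 0xc) gives 0xffffffff80489a96, and fault_matches_load(0x500, 0x500, 0) holds: CR2 equals RBP, which is consistent with hd being 0x500 at the point it is dereferenced.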


> Full console logs showing the above oops are here:
> -git7: ok http://crucible.osdl.org/runs/2223/sysinfo/amd01.console
> -git8: Oops http://crucible.osdl.org/runs/2233/sysinfo/amd01.console
> -git9: Oops http://crucible.osdl.org/runs/2241/sysinfo/amd01.console
>
> Reference information about the machine this is run on:
> http://crucible.osdl.org/runs/2223/sysinfo/amd01.1/
>
> Config files:
> -git7: http://crucible.osdl.org/runs/2223/sysinfo/amd01.config
> -git8: http://crucible.osdl.org/runs/2233/sysinfo/amd01.config

> ...

> Just checked against latest -git10, same oops:
>
> http://crucible.osdl.org/runs/2256/sysinfo/amd01.console
>
> However, it is not occurring on our ia64, x86, or x86_64 systems
> running the same kernels.
>

I'd be suspecting a miscompile, or something horrid in kfree().

Does it change anything if you move that kfree() down a bit?

--- a/drivers/message/fusion/mptspi.c~a
+++ a/drivers/message/fusion/mptspi.c
@@ -790,10 +790,9 @@ mptspi_dv_renegotiate_work(void *data)
 	struct _MPT_SCSI_HOST *hd = wqw->hd;
 	struct scsi_device *sdev;

-	kfree(wqw);
-
 	shost_for_each_device(sdev, hd->ioc->sh)
 		mptspi_dv_device(hd, sdev);
+	kfree(wqw);
 }

static void
_
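
The patch above only changes when the work item is freed. As a rough userspace sketch of the two orderings, the code below uses hypothetical stand-in types (struct host, struct work_wrapper) and malloc/free in place of the kernel allocator; none of these names come from the driver. Because hd is copied into a local before the wrapper is freed, both orderings should return the same result when the pointer stored in the wrapper is valid, which is why the experiment mainly probes for a miscompile or allocator problem.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's types; not the real definitions. */
struct host {
	int devices;
};

struct work_wrapper {
	struct host *hd;
};

/* Allocate a work item pointing at hd; returns NULL on allocation failure. */
struct work_wrapper *queue_item(struct host *hd)
{
	struct work_wrapper *wqw = malloc(sizeof(*wqw));

	if (wqw)
		wqw->hd = hd;
	return wqw;
}

/* Original ordering: copy hd out, free the wrapper, then use the copy. */
int work_free_first(struct work_wrapper *wqw)
{
	struct host *hd;

	assert(wqw != NULL);
	hd = wqw->hd;
	free(wqw);		/* wqw is dead from here on; hd is not */
	return hd->devices;
}

/* Test ordering: use hd first, free the wrapper last. */
int work_free_last(struct work_wrapper *wqw)
{
	struct host *hd;
	int n;

	assert(wqw != NULL);
	hd = wqw->hd;
	n = hd->devices;
	free(wqw);
	return n;
}
```

If the stored hd is already junk when the item is queued, neither ordering helps, since both dereference the same bad value.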

2006-09-28 22:54:34

by Bryce Harrington

Subject: Re: [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work

On Thu, Sep 28, 2006 at 02:51:21PM -0700, Andrew Morton wrote:
> On Thu, 28 Sep 2006 13:25:48 -0700
> Bryce Harrington <[email protected]> wrote:
>
> > Apologies if this has already been reported;
>
> It has not.
>
> > I didn't spot it on the
> > list. We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> > -git9, but not -git7:
> >
> > mptbase: Initiating ioc0 recovery
> > Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
> > [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> > PGD 0
> > Oops: 0000 [1] PREEMPT SMP
>

> That's very clever.
>
> I'd be suspecting a miscompile, or something horrid in kfree().
>
> Does it change anything if you move that kfree() down a bit?
>

Got essentially the same oops, although the addresses have changed a
little:

mptbase: Initiating ioc0 recovery
Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
[<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
PGD 0
Oops: 0000 [1] PREEMPT SMP
CPU 0
Modules linked in:
Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
RIP: 0010:[<ffffffff80489aa3>] [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
RSP: 0000:ffff81003ec65e40 EFLAGS: 00010246
RAX: ffff81003ec65ef8 RBX: ffff81003eff6640 RCX: ffff81003ec65ef8
RDX: ffff81003ed0cf58 RSI: 0000000000000000 RDI: ffff81003eff6640
RBP: 0000000000000500 R08: ffff81003ec64000 R09: 00000000ffffffff
R10: 00000000ffffffff R11: ffff81003ed0cf40 R12: ffff81003eff6640
R13: 0000000000000213 R14: ffff81003eff6640 R15: ffffffff80489a96
FS: 0000000000000000(0000) GS:ffffffff8077a000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
Stack: ffff81003eff6640 ffff81003eff6648 ffff81003ed0cf40 ffffffff8023f1bd
ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
00000000fffffffc ffffffff8059457d 0000000000000000 ffffffff8023f30
Call Trace:
[<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
[<ffffffff8023f204>] worker_thread+0x0/0x12e
[<ffffffff8023f300>] worker_thread+0xfc/0x12e
[<ffffffff80229f62>] default_wake_function+0x0/0xe
[<ffffffff80229f62>] default_wake_function+0x0/0xe
[<ffffffff80242433>] kthread+0xc8/0xf1
[<ffffffff8020a3f8>] child_rip+0xa/0x12
[<ffffffff8024236b>] kthread+0x0/0xf1
[<ffffffff8020a3ee>] child_rip+0x0/0x12


Code: 48 8b 45 00 48 8b b8 50 01 00 00 e8 5d 4d fe ff 48 85 c0 48
RIP [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
RSP <ffff81003ec65e40>
CR2: 0000000000000500
<6>mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
mptbase: Initiating ioc0 recovery
scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=185
target0:0:0: dma_alloc_coherent for parameters failed
mptscsih: ioc0: attempting task abort! (sc=ffff81003e840c80)
scsi 0:0:0:0:
command: cdb[0]=0x12: 12 00 00 00 24 00
mptbase: Initiating ioc0 recovery

Bryce


2006-09-29 00:27:06

by Andrew Morton

Subject: Re: [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work

On Thu, 28 Sep 2006 15:54:26 -0700
Bryce Harrington <[email protected]> wrote:

> On Thu, Sep 28, 2006 at 02:51:21PM -0700, Andrew Morton wrote:
> > On Thu, 28 Sep 2006 13:25:48 -0700
> > Bryce Harrington <[email protected]> wrote:
> >
> > > Apologies if this has already been reported;
> >
> > It has not.
> >
> > > I didn't spot it on the
> > > list. We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> > > -git9, but not -git7:
> > >
> > > mptbase: Initiating ioc0 recovery
> > > Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
> > > [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> > > PGD 0
> > > Oops: 0000 [1] PREEMPT SMP
> >
>
> > That's very clever.
> >
> > I'd be suspecting a miscompile, or something horrid in kfree().
> >
> > Does it change anything if you move that kfree() down a bit?
> >
>
> Got essentially the same oops, although the addresses have changed a
> little:
>
> mptbase: Initiating ioc0 recovery
> Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
> [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> PGD 0
> Oops: 0000 [1] PREEMPT SMP
> CPU 0
> Modules linked in:
> Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
> RIP: 0010:[<ffffffff80489aa3>] [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> RSP: 0000:ffff81003ec65e40 EFLAGS: 00010246
> RAX: ffff81003ec65ef8 RBX: ffff81003eff6640 RCX: ffff81003ec65ef8
> RDX: ffff81003ed0cf58 RSI: 0000000000000000 RDI: ffff81003eff6640
> RBP: 0000000000000500 R08: ffff81003ec64000 R09: 00000000ffffffff
> R10: 00000000ffffffff R11: ffff81003ed0cf40 R12: ffff81003eff6640
> R13: 0000000000000213 R14: ffff81003eff6640 R15: ffffffff80489a96
> FS: 0000000000000000(0000) GS:ffffffff8077a000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
> Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
> Stack: ffff81003eff6640 ffff81003eff6648 ffff81003ed0cf40 ffffffff8023f1bd
> ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
> 00000000fffffffc ffffffff8059457d 0000000000000000 ffffffff8023f30
> Call Trace:
> [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
> [<ffffffff8023f204>] worker_thread+0x0/0x12e
> [<ffffffff8023f300>] worker_thread+0xfc/0x12e
> [<ffffffff80229f62>] default_wake_function+0x0/0xe
> [<ffffffff80229f62>] default_wake_function+0x0/0xe
> [<ffffffff80242433>] kthread+0xc8/0xf1
> [<ffffffff8020a3f8>] child_rip+0xa/0x12
> [<ffffffff8024236b>] kthread+0x0/0xf1
> [<ffffffff8020a3ee>] child_rip+0x0/0x12
>
>
> Code: 48 8b 45 00 48 8b b8 50 01 00 00 e8 5d 4d fe ff 48 85 c0 48
> RIP [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> RSP <ffff81003ec65e40>
> CR2: 0000000000000500
> <6>mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> mptbase: Initiating ioc0 recovery
> scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=185
> target0:0:0: dma_alloc_coherent for parameters failed
> mptscsih: ioc0: attempting task abort! (sc=ffff81003e840c80)
> scsi 0:0:0:0:
> command: cdb[0]=0x12: 12 00 00 00 24 00
> mptbase: Initiating ioc0 recovery
>

Ah. Maybe we're simply being passed a junk pointer. This, please:

--- a/drivers/message/fusion/mptspi.c~a
+++ a/drivers/message/fusion/mptspi.c
@@ -804,6 +804,9 @@ mptspi_dv_renegotiate(struct _MPT_SCSI_H
 	if (!wqw)
 		return;

+	printk("%p\n", hd);
+	if ((unsigned long)hd < 4000UL)
+		dump_stack();
 	INIT_WORK(&wqw->work, mptspi_dv_renegotiate_work, wqw);
 	wqw->hd = hd;

_
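
The check in the debug patch treats any pointer value below decimal 4000 (0xfa0) as junk. A minimal userspace sketch of the same test follows; the names JUNK_LIMIT and looks_like_junk are invented for illustration and are not part of the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same cut-off as the debug patch: decimal 4000, i.e. 0xfa0. */
#define JUNK_LIMIT 4000UL

/* True if p is numerically too small to be a plausible kernel pointer. */
bool looks_like_junk(const void *p)
{
	return (uintptr_t)p < JUNK_LIMIT;
}
```

The value seen in the oops, 0x500, is 1280 in decimal and so falls under the limit, while an address such as 0xffff81003e86f640 does not.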

2006-09-29 17:17:38

by Bryce Harrington

Subject: Re: [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work

On Thu, Sep 28, 2006 at 05:26:52PM -0700, Andrew Morton wrote:
> On Thu, 28 Sep 2006 15:54:26 -0700
> Bryce Harrington <[email protected]> wrote:
>
> > On Thu, Sep 28, 2006 at 02:51:21PM -0700, Andrew Morton wrote:
> > > On Thu, 28 Sep 2006 13:25:48 -0700
> > > Bryce Harrington <[email protected]> wrote:
> > >
> > > > Apologies if this has already been reported;
> > >
> > > It has not.
> > >
> > > > I didn't spot it on the
> > > > list. We've noticed an Oops on AMD64 when running linux-2.6.18-git8 and
> > > > -git9, but not -git7:
> > > >
> > > > mptbase: Initiating ioc0 recovery
> > > > Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
> > > > [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
> > > > PGD 0
> > > > Oops: 0000 [1] PREEMPT SMP
> > >
> >
> > > That's very clever.
> > >
> > > I'd be suspecting a miscompile, or something horrid in kfree().
> > >
> > > Does it change anything if you move that kfree() down a bit?
> > >
> >
> > Got essentially the same oops, although the addresses have changed a
> > little:
> >
> > mptbase: Initiating ioc0 recovery
> > Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
> > [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> > PGD 0
> > Oops: 0000 [1] PREEMPT SMP
> > CPU 0
> > Modules linked in:
> > Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
> > RIP: 0010:[<ffffffff80489aa3>] [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> > RSP: 0000:ffff81003ec65e40 EFLAGS: 00010246
> > RAX: ffff81003ec65ef8 RBX: ffff81003eff6640 RCX: ffff81003ec65ef8
> > RDX: ffff81003ed0cf58 RSI: 0000000000000000 RDI: ffff81003eff6640
> > RBP: 0000000000000500 R08: ffff81003ec64000 R09: 00000000ffffffff
> > R10: 00000000ffffffff R11: ffff81003ed0cf40 R12: ffff81003eff6640
> > R13: 0000000000000213 R14: ffff81003eff6640 R15: ffffffff80489a96
> > FS: 0000000000000000(0000) GS:ffffffff8077a000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
> > Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
> > Stack: ffff81003eff6640 ffff81003eff6648 ffff81003ed0cf40 ffffffff8023f1bd
> > ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
> > 00000000fffffffc ffffffff8059457d 0000000000000000 ffffffff8023f30
> > Call Trace:
> > [<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
> > [<ffffffff8023f204>] worker_thread+0x0/0x12e
> > [<ffffffff8023f300>] worker_thread+0xfc/0x12e
> > [<ffffffff80229f62>] default_wake_function+0x0/0xe
> > [<ffffffff80229f62>] default_wake_function+0x0/0xe
> > [<ffffffff80242433>] kthread+0xc8/0xf1
> > [<ffffffff8020a3f8>] child_rip+0xa/0x12
> > [<ffffffff8024236b>] kthread+0x0/0xf1
> > [<ffffffff8020a3ee>] child_rip+0x0/0x12
> >
> >
> > Code: 48 8b 45 00 48 8b b8 50 01 00 00 e8 5d 4d fe ff 48 85 c0 48
> > RIP [<ffffffff80489aa3>] mptspi_dv_renegotiate_work+0xd/0x4c
> > RSP <ffff81003ec65e40>
> > CR2: 0000000000000500
> > <6>mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > mptbase: Initiating ioc0 recovery
> > scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=185
> > target0:0:0: dma_alloc_coherent for parameters failed
> > mptscsih: ioc0: attempting task abort! (sc=ffff81003e840c80)
> > scsi 0:0:0:0:
> > command: cdb[0]=0x12: 12 00 00 00 24 00
> > mptbase: Initiating ioc0 recovery
> >
>
> Ah. Maybe we're simply being passed a junk pointer. This, please:
>
> --- a/drivers/message/fusion/mptspi.c~a
> +++ a/drivers/message/fusion/mptspi.c
> @@ -804,6 +804,9 @@ mptspi_dv_renegotiate(struct _MPT_SCSI_H
>  	if (!wqw)
>  		return;
>
> +	printk("%p\n", hd);
> +	if ((unsigned long)hd < 4000UL)
> +		dump_stack();
>  	INIT_WORK(&wqw->work, mptspi_dv_renegotiate_work, wqw);
>  	wqw->hd = hd;
>
> _

Here's the stack dump:

mptbase: Initiating ioc0 recovery
0000000000000500

Call Trace:
<IRQ> [<ffffffff803e0d37>] vgacon_cursor+0x0/0x1a7
[<ffffffff80489b19>] mptspi_dv_renegotiate+0x3e/0x79
[<ffffffff80489b7b>] mptspi_ioc_reset+0x27/0x2e
[<ffffffff804846ea>] mpt_do_ioc_recovery+0x115f/0x11dd
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff80238dd4>] do_timer+0x2f6/0x574
[<ffffffff8023851c>] lock_timer_base+0x1b/0x3c
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8022797a>] task_rq_lock+0x3d/0x6f
[<ffffffff80227d03>] resched_task+0x4e/0x71
[<ffffffff8022847f>] try_to_wake_up+0x3d4/0x3e6
[<ffffffff80461794>] atapi_output_bytes+0x21/0x5c
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8022797a>] task_rq_lock+0x3d/0x6f
[<ffffffff80227d03>] resched_task+0x4e/0x71
[<ffffffff8022847f>] try_to_wake_up+0x3d4/0x3e6
[<ffffffff8020cf66>] main_timer_handler+0x1e6/0x3a6
[<ffffffff80228f29>] find_busiest_group+0x21f/0x66f
[<ffffffff80484f94>] mpt_HardResetHandler+0xb4/0x12c
[<ffffffff8048500c>] mpt_timer_expired+0x0/0x24
[<ffffffff80485017>] mpt_timer_expired+0xb/0x24
[<ffffffff80238a07>] run_timer_softirq+0x156/0x1b2
[<ffffffff802357a1>] __do_softirq+0x46/0xb1
[<ffffffff8020a76c>] call_softirq+0x1c/0x28
[<ffffffff8020bb5f>] do_softirq+0x2c/0x7d
[<ffffffff80207c37>] default_idle+0x0/0x47
[<ffffffff8023583f>] irq_exit+0x33/0x3e
[<ffffffff8020a216>] apic_timer_interrupt+0x66/0x70
<EOI> [<ffffffff80207c60>] default_idle+0x29/0x47
[<ffffffff80207e3e>] cpu_idle+0x87/0xbe
[<ffffffff80794704>] start_kernel+0x203/0x205
[<ffffffff80794179>] _sinittext+0x179/0x17d

Unable to handle kernel NULL pointer dereference at 0000000000000500 RIP:
[<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
PGD 0
Oops: 0000 [1] PREEMPT SMP
CPU 0
Modules linked in:
Pid: 8, comm: events/0 Not tainted 2.6.18-git10 #1
RIP: 0010:[<ffffffff80489aa2>] [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
RSP: 0000:ffff81003ec65e40 EFLAGS: 00010282
RAX: 0000000000000004 RBX: ffff81003eff3640 RCX: 000000000000001e
RDX: 0000000000000003 RSI: 0000000000000213 RDI: 000000000003eff3
RBP: 0000000000000500 R08: ffff81003ed0cf88 R09: ffff81003ed0cf40
R10: ffff81003eff3640 R11: ffff81003ed0cf40 R12: ffff81003ed0cf40
R13: 0000000000000213 R14: ffff81003eff3640 R15: ffffffff80489a96
FS: 0000000000knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000500 CR3: 0000000000201000 CR4: 00000000000006e0
Process events/0 (pid: 8, threadinfo ffff81003ec64000, task ffff81007f180740)
Stack: 0000000000000000 ffff81003eff3640 ffff81003eff3648 ffffffff8023f1bd
ffff81003ed0cf40 ffff81003ed0cf40 ffffffff8023f204 ffff8100016dfd70
00000000fffffffc fd 0000000000000000 ffffffff8023f300
Call Trace:
[<ffffffff8023f1bd>] run_workqueue+0x9a/0xe1
[<ffffffff8023f204>] worker_thread+0x0/0x12e
[<ffffffff8023f300>] worker_thread+0xfc/0x12e
[<ffffffff80229f62>] default_wake_function+0x0/0xe
[<ffffffff80229f62>] default_wake_function+0x0/0xe
[<ffffffff80242433>] kthread+0xc8/0xf1
[<ffffffff8020a3f8>] child_rip+0xa/0x12
[<ffffffff8024236b>] kthread+0x0/0xf1
[<ffffffff8020a3ee>] child_rip+0x0/0x12


Code: 48 8b 45 00 31 f6 48 8b b8 50 01 00 00 e8 5c 4d fe ff 48 85
RIP [<ffffffff80489aa2>] mptspi_dv_renegotiate_work+0xc/0x45
RSP <ffff81003ec65e40>
CR2: 0000000000000500
<6>mptbase: Initiating ioc0 recovery
0000000000000500

Call Trace:
<IRQ> [<ffffffff803e0d37>] vgacon_cursor+0x0/0x1a7
[<ffffffff80489b19>] mptspi_dv_renegotiate+0x3e/0x79
[<ffffffff80489b7b>] mptspi_ioc_reset+0x27/0x2e
[<ffffffff804846ea>] mpt_do_ioc_recovery+0x115f/0x11dd
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff80238dd4>] do_timer+0x2f6/0x574
[<ffffffff8023851c>] lock_timer_base+0x1b/0x3c
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8022797a>] task_rq_lock+0x3d/0x6f
[<ffffffff80227d03>] resched_task+0x4e/0x71
[<ffffffff8022847f>] try_to_wake_up+0x3d4/0x3e6
[<ffffffff80461794>] atapi_output_bytes+0x21/0x5c
[<ffffffff802387d7>] current_tick_length+0x5/0x26
[<ffffffff80238dd4>] do_timer+0x2f6/0x574
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff8020cf66>] main_timer_handler+0x1e6/0x3a6
[<ffffffff80228f29>] find_busiest_group+0x21f/0x66f
[<ffffffff8055ebff>] _spin_lock+0xe/0x5e
[<ffffffff80484f94>] mpt_HardResetHandler+0xb4/0x12c
[<ffffffff8048500c>] mpt_timer_expired+0x0/0x24
[<ffffffff80485017>] mpt_timer_expired+0xb/0x24
[<ffffffff80238a07>] run_timer_softirq+0x156/0x1b2
[<ffffffff802357a1>] __do_softirq+0x46/0xb1
[<ffffffff8020a76c>] call_softirq+0x1c/0x28
[<ffffffff8020bb5f>] do_softirq+0x2c/0x7d
[<ffffffff80207c37>] default_idle+0x0/0x47
[<ffffffff8023583f>] irq_exit+0x33/0x3e
[<ffffffff8020a216>] apic_timer_interrupt+0x66/0x70
<EOI> [<ffffffff80207c60>] default_idle+0x29/0x47
[<ffffffff80207e3e>] cpu_idle+0x87/0xbe
[<ffffffff80794704>] start_kernel+0x203/0x205
[<ffffffff80794179>] _sinittext+0x179/0x17d


Bryce

2006-09-29 18:30:00

by Eric Moore

Subject: RE: [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work

On Friday, September 29, 2006 11:18 AM, Bryce Harrington wrote:

> [<ffffffff80484f94>] mpt_HardResetHandler+0xb4/0x12c
> [<ffffffff8048500c>] mpt_timer_expired+0x0/0x24

mpt_timer_expired most likely means we timed out sending a
request for a config page from the firmware. The timeout results
in a host reset, which results in domain validation being called.
Perhaps the config page requests failed before we allocated memory for hd.
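
To make that hypothesis concrete, here is a small userspace sketch of a reset callback that only queues domain validation once the host data exists. Everything in it is hypothetical: struct scsi_host_priv, struct adapter, and reset_done are illustrative stand-ins, not the driver's real structures or functions, and this is not a proposed fix.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real driver structures are much larger. */
struct scsi_host_priv {		/* plays the role of hd */
	int id;
};

struct adapter {		/* plays the role of the ioc */
	struct scsi_host_priv *hd;	/* NULL until host setup completes */
	int dv_requests;		/* how many renegotiations were queued */
};

/* Queue domain validation only when there is a host to validate. */
bool reset_done(struct adapter *ioc)
{
	if (ioc == NULL || ioc->hd == NULL)
		return false;	/* reset arrived before setup finished */

	ioc->dv_requests++;
	return true;
}
```

Note that the value reported in this thread is 0x500 rather than NULL, so a plain NULL test like this one would not have caught it; the sketch only illustrates the ordering problem being suggested.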

Can you enable debug messages in the driver Makefile, on
the line for MPT_DEBUG_CONFIG? That way we can find out which
config page failed.

There were some changes in scsi_transport_spi.c that occurred
between 2.6.18-git1 and 2.6.18-git2. I doubt these changes
would have affected this. Can you determine between which
git releases this problem began occurring?

Also, can you describe your configuration? For example, which
kind of devices are you using: are they U320 devices,
or are there older ones, such as U160?

Eric

2006-09-29 21:41:16

by Bryce Harrington

Subject: Re: [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work

On Fri, Sep 29, 2006 at 12:29:55PM -0600, Moore, Eric wrote:
> On Friday, September 29, 2006 11:18 AM, Bryce Harrington wrote:
>
> > [<ffffffff80484f94>] mpt_HardResetHandler+0xb4/0x12c
> > [<ffffffff8048500c>] mpt_timer_expired+0x0/0x24
>
> mpt_timer_expired most likely means we timed out sending a
> request for a config page from the firmware. The timeout results
> in a host reset, which results in domain validation being called.
> Perhaps the config page requests failed before we allocated memory for hd.
>
> Can you enable debug messages in the driver Makefile, on
> the line for MPT_DEBUG_CONFIG? That way we can find out which
> config page failed.

Sure; not sure what the interesting part is, but here's the full log
from this:

http://crucible.osdl.org/runs/2265/sysinfo/amd01.2.console

> There were some changes in scsi_transport_spi.c that occurred
> between 2.6.18-git1 and 2.6.18-git2. I doubt these changes
> would have affected this. Can you determine between which
> git releases this problem began occurring?

I found that the problem did not occur with -git7, but did occur with
-git8, 9, 10, 11, and 12. I didn't check kernels prior to that but
could if you think it would help.
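
Narrowing down where a regression starts in an ordered series of snapshots is a binary search. The sketch below is a generic illustration with an invented helper name (first_bad); it assumes the results are ordered "all good, then all bad", and it only makes sense for snapshots that have actually been tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * results[i] is true if snapshot i oopsed. Assumes every good snapshot
 * comes before every bad one. Returns the index of the first bad
 * snapshot, or n if none is bad.
 */
size_t first_bad(const bool *results, size_t n)
{
	size_t lo = 0, hi = n;

	assert(results != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (results[mid])
			hi = mid;	/* first bad is at mid or earlier */
		else
			lo = mid + 1;	/* first bad is after mid */
	}
	return lo;
}
```

Fed the results reported here for -git7 through -git12 (ok, then five oopses), with index 0 standing for -git7, the function returns index 1, i.e. -git8.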

> Also, can you describe your configuration? For example, which
> kind of devices are you using: are they U320 devices,
> or are there older ones, such as U160?

Sure. Yes, there are two U320 SCSI hd's.

Host: amd01
Kernel: 2.6.12-gentoo-r10
Distribution: gentoo 1.6.14
Memory: 2053852 kB
Arch: x86_64
CPU(s): 2x AMD Opteron(tm) Processor 242

SCSI:
*-pci:1
description: PCI bridge
product: AMD-8131 PCI-X Bridge
vendor: Advanced Micro Devices [AMD]
physical id: 2
bus info: pci@00:02.0
version: 12
width: 32 bits
clock: 66MHz
capabilities: pci normal_decode bus_master cap_list
*-scsi:0
description: SCSI storage controller
product: 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI
vendor: LSI Logic / Symbios Logic
physical id: 1
bus info: pci@02:01.0
version: 07
width: 64 bits
clock: 33MHz
capabilities: scsi bus_master cap_list
configuration: driver=mptbase
resources: ioport:c400-c4ff iomemory:fe980000-fe98ffff iomemory:fe970000-fe97ffff irq:185
*-scsi:1
description: SCSI storage controller
product: 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI
vendor: LSI Logic / Symbios Logic
physical id: 1.1
bus info: pci@02:01.1
version: 07
width: 64 bits
clock: 33MHz
capabilities: scsi bus_master cap_list
configuration: driver=mptbase
resources: ioport:c800-c8ff iomemory:fe9f0000-fe9fffff iomemory:fe9e0000-fe9effff irq:193


PCI:

00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
00:01.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])
00:02.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
00:02.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])
00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07) (prog-if 00 [Normal decode])
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-8111 IDE (rev 03) (prog-if 8a [Master SecP PriP])
00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b) (prog-if 10 [OHCI])
01:00.1 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b) (prog-if 10 [OHCI])
01:04.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) (prog-if 00 [VGA])
02:01.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
02:01.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
02:03.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
02:03.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)

More info about this machine can be found here (for a different testrun).
The INFO directory has the full output from lshw:

http://crucible.osdl.org/runs/2284/sysinfo/amd01.1/

Bryce

2006-09-30 00:11:05

by Eric Moore

Subject: RE: [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work

On Friday, September 29, 2006 3:41 PM, Bryce Harrington wrote:
> > Can you enable debug messages in the driver Makefile, for
> > the line called MPT_DEBUG_CONFIG; that way we can find out which
> > config page failed.
>
> Sure; not sure what the interesting part is, but here's the full log
> from this:
>
> http://crucible.osdl.org/runs/2265/sysinfo/amd01.2.console
>


Thanks. It appears you enabled MPT_DEBUG instead of MPT_DEBUG_CONFIG.
All the "WaitForDoorbell" debugs are from that. Can you recheck your
Makefile?

Thanks,
Eric

2006-09-30 00:28:03

by Bryce Harrington

Subject: Re: [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work

On Fri, Sep 29, 2006 at 06:10:42PM -0600, Moore, Eric wrote:
> On Friday, September 29, 2006 3:41 PM, Bryce Harrington wrote:
> > > Can you enable debug messages in the driver Makefile, for
> > > the line called MPT_DEBUG_CONFIG; that way we can find out which
> > > config page failed.
> >
> > Sure; not sure what the interesting part is, but here's the full log
> > from this:
> >
> > http://crucible.osdl.org/runs/2265/sysinfo/amd01.2.console
> >
>
>
> Thanks. It appears you enabled MPT_DEBUG instead of MPT_DEBUG_CONFIG.
> All the "WaitForDoorbell" debugs are from that. Can you recheck your
> Makefile?

Does this look better?

http://crucible.osdl.org/runs/2265/sysinfo/amd01.3.console

Bryce

2006-09-30 21:55:50

by Bryce Harrington

Subject: Re: [OOPS] -git8,9: NULL pointer dereference in mptspi_dv_renegotiate_work

On Fri, Sep 29, 2006 at 10:22:50PM -0600, Moore, Eric wrote:
> On Fri 9/29/2006 6:27 PM, Bryce Harrington wrote:
>
> > Does this look better?
> >
> > http://crucible.osdl.org/runs/2265/sysinfo/amd01.3.console
>
>
> It appears that the problem is that we're not receiving interrupts.
> The first command after interrupts are enabled is not getting a
> response back from firmware, and thus times out. I noticed the log
> says the interrupt is at 185, but apparently the INT line is not
> getting raised.
>
> In addition, I understand why it panics. You've compiled the drivers
> into the kernel instead of as modules. If you had compiled them as
> modules, mptspi wouldn't have been called while mptbase was still
> loading, as in your case. I guess we need to add a sanity check for
> that case. I usually test with modules.

Ah, that could explain it; when doing testing we do compile everything
in. So it sounds like we could eliminate the panic by compiling as a
module. However, is it intended that the driver should work when
compiled in as well? If so, I'd be happy to do additional testing to
verify any fixes worth trying out.
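
For illustration only, the kind of sanity check discussed above might
look roughly like the userspace sketch below. The struct and function
names (fake_ioc, fake_host_data, dv_renegotiate_work) are hypothetical
stand-ins, not the real mptspi types or the actual fix; the sketch only
shows a work handler refusing to run when the state it depends on has
not been set up yet, instead of dereferencing a NULL pointer.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for driver state; NOT the real mptspi structures. */
struct fake_ioc {
    const char *name;
    int renegotiated;
};

struct fake_host_data {
    struct fake_ioc *ioc;   /* assumed to stay NULL until base setup is done */
};

/*
 * Work handler with a guard: returns -1 and does nothing if the host
 * data (or the adapter it points to) is missing; returns 0 on success.
 */
int dv_renegotiate_work(struct fake_host_data *hd)
{
    if (hd == NULL || hd->ioc == NULL) {
        fprintf(stderr, "renegotiate: host not initialized, skipping\n");
        return -1;
    }
    hd->ioc->renegotiated = 1;
    return 0;
}
```

Whether a guard like this belongs in the work handler itself or in the
code that queues the work is a design decision for the driver
maintainers; the sketch takes no position on that.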

> Besides, we need to understand why your interrupt controller is not
> generating interrupts.

We typically boot every -mm, -git, -rc and mainline kernel on this
machine, but it's only been relatively recently that this particular
behavior has occurred. Could this suggest that there was a regression
due to recent changes?
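
One possible way to check whether the interrupt is being raised at all
is to compare the counter for the controller's IRQ in /proc/interrupts
before and after the driver issues its first command. The helper below
is a sketch under assumptions: irq_total_from_line is a hypothetical
name, and it assumes the usual layout of an IRQ number, a colon, one
count per CPU, then the controller and device names.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Parse one line in /proc/interrupts style and return the sum of the
 * per-CPU counts if the line belongs to `irq`. Returns -1 for the
 * header line, for named rows such as "NMI:", and for other IRQs.
 */
long long irq_total_from_line(const char *line, int irq)
{
    char *end;
    long n;
    long long total = 0;

    while (isspace((unsigned char)*line))
        line++;
    if (!isdigit((unsigned char)*line))
        return -1;
    n = strtol(line, &end, 10);
    if (*end != ':' || n != irq)
        return -1;
    line = end + 1;

    for (;;) {
        unsigned long long v;

        while (*line == ' ' || *line == '\t')
            line++;
        if (!isdigit((unsigned char)*line))
            break;              /* reached the controller/device names */
        v = strtoull(line, &end, 10);
        if (*end != '\0' && !isspace((unsigned char)*end))
            break;              /* digits that begin a name, not a count */
        total += (long long)v;
        line = end;
    }
    return total;
}
```

A total that stays at zero for the line matching IRQ 185 while the
driver is waiting on a command would be consistent with the INT line
not being raised.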

Thanks,
Bryce