2008-08-21 05:56:00

by Randy Dunlap

Subject: BUG: in 2.6.23-rc3-git7 in do_cciss_intr

on x86_64, 4 proc, 8 GB RAM:

calling cciss_init+0x0/0x2e [cciss]
HP CISS Driver (v 3.6.20)
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 54
cciss 0000:42:08.0: PCI INT A -> Link[LNKA] -> GSI 54 (level, high) -> IRQ 54
cciss0: <0x3238> at PCI 0000:42:08.0 IRQ 503 using DAC
BUG: unable to handle kernel NULL pointer dereference at 0000000000000248
IP: [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss]
PGD 17e422067 PUD 17e423067 PMD 0
Oops: 0002 [1] SMP
CPU 2
Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss]
RSP: 0018:ffff88027f66fee8 EFLAGS: 00010007
RAX: 0000000000000000 RBX: ffff88007f840270 RCX: 000000000000000c
RDX: 0000000000000000 RSI: ffff88027e5c0000 RDI: ffff88027e5c0000
RBP: ffff88027f66ff18 R08: 0000000000000000 R09: ffff88017fa95e88
R10: 0000000000000000 R11: ffff88027f66ff48 R12: ffff88027e5c0000
R13: 0000000000000000 R14: 00000000000001f7 R15: 0000000000000086
FS: 0000000000680850(0000) GS:ffff88017fc02c80(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000248 CR3: 000000017e425000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff88017fa94000, task ffff88027f63d340)
Stack: ffff88027f66fee0 ffff88017e9b4800 0000000000000000 0000000000000000
00000000000001f7 0000000000000000 ffff88027f66ff48 ffffffff8026757e
ffffffff80719000 00000000000001f7 ffff88017e9b4800 ffffffff80719050
Call Trace:
<IRQ> [<ffffffff8026757e>] handle_IRQ_event+0x27/0x57
[<ffffffff80268d08>] handle_edge_irq+0xed/0x12e
[<ffffffff8020eaab>] do_IRQ+0xf6/0x167
[<ffffffff8020c471>] ret_from_intr+0x0/0xa
<EOI> [<ffffffff802122b1>] ? default_idle+0x2b/0x40
[<ffffffff802124bf>] ? c1e_idle+0xd4/0xdb
[<ffffffff8055677d>] ? atomic_notifier_call_chain+0xf/0x11
[<ffffffff8020ac6c>] ? cpu_idle+0x71/0x8f
[<ffffffff8054e752>] ? start_secondary+0x157/0x15c


Code: 8b 83 48 02 00 00 48 39 d8 74 37 49 39 9c 24 c0 00 01 00 75 08 49 89 84 24 c0 00 01 00 48 8b 83 40 02 00 00 48 8b 93 48 02 00 00 <48> 89 90 48 02 00 00 48 8b 93 48 02 00 00 48 89 82 40 02 00 00
RIP [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss]
RSP <ffff88027f66fee8>
CR2: 0000000000000248
---[ end trace 902dc79a9e72d3ed ]---
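
Note on reading the oops header above: the "Oops: 0002" value is the x86 page-fault error code, and CR2 holds the faulting address. The sketch below decodes the low three bits of that code. The struct and function names are illustrative only and are not kernel APIs. For 0002 it reports a write, from kernel mode, to a page that was not present, which is consistent with a store through a NULL pointer plus a 0x248 offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low three bits of the x86 page-fault error code.
 * Names here are hypothetical, chosen for this example. */
struct pf_info {
    bool protection;  /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;       /* bit 1: 1 = write access, 0 = read access */
    bool user;        /* bit 2: 1 = fault raised in user mode, 0 = kernel mode */
};

/* Decode the error code printed after "Oops:" in the log. */
static struct pf_info decode_pf_error(uint32_t code)
{
    struct pf_info info = {
        .protection = (code & 0x1) != 0,
        .write      = (code & 0x2) != 0,
        .user       = (code & 0x4) != 0,
    };
    return info;
}
```

Calling decode_pf_error(0x0002) yields write set and the other two flags clear.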


2008-08-21 07:16:32

by Andrew Morton

Subject: Re: BUG: in 2.6.23-rc3-git7 in do_cciss_intr

On Wed, 20 Aug 2008 22:52:50 -0700 rdunlap <[email protected]> wrote:

> on x86_64, 4 proc, 8 GB RAM:
>
> calling cciss_init+0x0/0x2e [cciss]
> HP CISS Driver (v 3.6.20)
> ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 54
> cciss 0000:42:08.0: PCI INT A -> Link[LNKA] -> GSI 54 (level, high) -> IRQ 54
> cciss0: <0x3238> at PCI 0000:42:08.0 IRQ 503 using DAC
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000248
> IP: [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss]
> PGD 17e422067 PUD 17e423067 PMD 0
> Oops: 0002 [1] SMP
> CPU 2
> Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss]
> RSP: 0018:ffff88027f66fee8 EFLAGS: 00010007
> RAX: 0000000000000000 RBX: ffff88007f840270 RCX: 000000000000000c
> RDX: 0000000000000000 RSI: ffff88027e5c0000 RDI: ffff88027e5c0000
> RBP: ffff88027f66ff18 R08: 0000000000000000 R09: ffff88017fa95e88
> R10: 0000000000000000 R11: ffff88027f66ff48 R12: ffff88027e5c0000
> R13: 0000000000000000 R14: 00000000000001f7 R15: 0000000000000086
> FS: 0000000000680850(0000) GS:ffff88017fc02c80(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000248 CR3: 000000017e425000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffff88017fa94000, task ffff88027f63d340)
> Stack: ffff88027f66fee0 ffff88017e9b4800 0000000000000000 0000000000000000
> 00000000000001f7 0000000000000000 ffff88027f66ff48 ffffffff8026757e
> ffffffff80719000 00000000000001f7 ffff88017e9b4800 ffffffff80719050
> Call Trace:
> <IRQ> [<ffffffff8026757e>] handle_IRQ_event+0x27/0x57
> [<ffffffff80268d08>] handle_edge_irq+0xed/0x12e
> [<ffffffff8020eaab>] do_IRQ+0xf6/0x167
> [<ffffffff8020c471>] ret_from_intr+0x0/0xa
> <EOI> [<ffffffff802122b1>] ? default_idle+0x2b/0x40
> [<ffffffff802124bf>] ? c1e_idle+0xd4/0xdb
> [<ffffffff8055677d>] ? atomic_notifier_call_chain+0xf/0x11
> [<ffffffff8020ac6c>] ? cpu_idle+0x71/0x8f
> [<ffffffff8054e752>] ? start_secondary+0x157/0x15c
>
>
> Code: 8b 83 48 02 00 00 48 39 d8 74 37 49 39 9c 24 c0 00 01 00 75 08 49 89 84 24 c0 00 01 00 48 8b 83 40 02 00 00 48 8b 93 48 02 00 00 <48> 89 90 48 02 00 00 48 8b 93 48 02 00 00 48 89 82 40 02 00 00
> RIP [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss]
> RSP <ffff88027f66fee8>
> CR2: 0000000000000248

Is it repeatable?

Is it a regression?

Any chance of finding the offending code with gdb?

2008-08-21 14:27:03

by Mike Miller

Subject: RE: in 2.6.23-rc3-git7 in do_cciss_intr



> -----Original Message-----
> From: rdunlap [mailto:[email protected]]
> Sent: Thursday, August 21, 2008 12:53 AM
> To: lkml; scsi; Miller, Mike (OS Dev)
> Subject: BUG: in 2.6.23-rc3-git7 in do_cciss_intr
>
> on x86_64, 4 proc, 8 GB RAM:
>
> calling cciss_init+0x0/0x2e [cciss]
> HP CISS Driver (v 3.6.20)
> ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 54 cciss
> 0000:42:08.0: PCI INT A -> Link[LNKA] -> GSI 54 (level, high)
> -> IRQ 54
> cciss0: <0x3238> at PCI 0000:42:08.0 IRQ 503 using DAC
> BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000248
> IP: [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss]
> PGD 17e422067 PUD 17e423067 PMD 0
> Oops: 0002 [1] SMP
> CPU 2
> Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> do_cciss_intr+0x627/0xa6c [cciss]
> RSP: 0018:ffff88027f66fee8 EFLAGS: 00010007
> RAX: 0000000000000000 RBX: ffff88007f840270 RCX: 000000000000000c
> RDX: 0000000000000000 RSI: ffff88027e5c0000 RDI: ffff88027e5c0000
> RBP: ffff88027f66ff18 R08: 0000000000000000 R09: ffff88017fa95e88
> R10: 0000000000000000 R11: ffff88027f66ff48 R12: ffff88027e5c0000
> R13: 0000000000000000 R14: 00000000000001f7 R15: 0000000000000086
> FS: 0000000000680850(0000) GS:ffff88017fc02c80(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000248 CR3: 000000017e425000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400 Process swapper (pid: 0, threadinfo
> ffff88017fa94000, task ffff88027f63d340)
> Stack: ffff88027f66fee0 ffff88017e9b4800 0000000000000000
> 0000000000000000
> 00000000000001f7 0000000000000000 ffff88027f66ff48
> ffffffff8026757e ffffffff80719000 00000000000001f7
> ffff88017e9b4800 ffffffff80719050 Call Trace:
> <IRQ> [<ffffffff8026757e>] handle_IRQ_event+0x27/0x57
> [<ffffffff80268d08>] handle_edge_irq+0xed/0x12e
> [<ffffffff8020eaab>] do_IRQ+0xf6/0x167 [<ffffffff8020c471>]
> ret_from_intr+0x0/0xa <EOI> [<ffffffff802122b1>] ?
> default_idle+0x2b/0x40 [<ffffffff802124bf>] ?
> c1e_idle+0xd4/0xdb [<ffffffff8055677d>] ?
> atomic_notifier_call_chain+0xf/0x11
> [<ffffffff8020ac6c>] ? cpu_idle+0x71/0x8f
> [<ffffffff8054e752>] ? start_secondary+0x157/0x15c
>
>
> Code: 8b 83 48 02 00 00 48 39 d8 74 37 49 39 9c 24 c0 00 01
> 00 75 08 49 89 84 24 c0 00 01 00 48 8b 83 40 02 00 00 48 8b
> 93 48 02 00 00 <48> 89 90 48 02 00 00 48 8b 93 48 02 00 00 48
> 89 82 40 02 00 00 RIP [<ffffffffa001bb68>]
> do_cciss_intr+0x627/0xa6c [cciss] RSP <ffff88027f66fee8>
> CR2: 0000000000000248
> ---[ end trace 902dc79a9e72d3ed ]---
>

Randy,
Sorry I haven't replied sooner. I saw your earlier mail, just been busy breaking stuff internally. Did this happen during driver init or runtime?

-- mikem

2008-08-21 15:44:00

by Randy Dunlap

Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr

On Thu, 21 Aug 2008 14:26:06 +0000 Miller, Mike (OS Dev) wrote:

>
>
> > -----Original Message-----
> > From: rdunlap [mailto:[email protected]]
> > Sent: Thursday, August 21, 2008 12:53 AM
> > To: lkml; scsi; Miller, Mike (OS Dev)
> > Subject: BUG: in 2.6.23-rc3-git7 in do_cciss_intr
> >
> > on x86_64, 4 proc, 8 GB RAM:
> >
> > calling cciss_init+0x0/0x2e [cciss]
> > HP CISS Driver (v 3.6.20)
> > ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 54 cciss
> > 0000:42:08.0: PCI INT A -> Link[LNKA] -> GSI 54 (level, high)
> > -> IRQ 54
> > cciss0: <0x3238> at PCI 0000:42:08.0 IRQ 503 using DAC
> > BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000248
> > IP: [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss]
> > PGD 17e422067 PUD 17e423067 PMD 0
> > Oops: 0002 [1] SMP
> > CPU 2
> > Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> > Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> > RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> > do_cciss_intr+0x627/0xa6c [cciss]
> > RSP: 0018:ffff88027f66fee8 EFLAGS: 00010007
> > RAX: 0000000000000000 RBX: ffff88007f840270 RCX: 000000000000000c
> > RDX: 0000000000000000 RSI: ffff88027e5c0000 RDI: ffff88027e5c0000
> > RBP: ffff88027f66ff18 R08: 0000000000000000 R09: ffff88017fa95e88
> > R10: 0000000000000000 R11: ffff88027f66ff48 R12: ffff88027e5c0000
> > R13: 0000000000000000 R14: 00000000000001f7 R15: 0000000000000086
> > FS: 0000000000680850(0000) GS:ffff88017fc02c80(0000)
> > knlGS:0000000000000000
> > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 0000000000000248 CR3: 000000017e425000 CR4: 00000000000006e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > 0000000000000400 Process swapper (pid: 0, threadinfo
> > ffff88017fa94000, task ffff88027f63d340)
> > Stack: ffff88027f66fee0 ffff88017e9b4800 0000000000000000
> > 0000000000000000
> > 00000000000001f7 0000000000000000 ffff88027f66ff48
> > ffffffff8026757e ffffffff80719000 00000000000001f7
> > ffff88017e9b4800 ffffffff80719050 Call Trace:
> > <IRQ> [<ffffffff8026757e>] handle_IRQ_event+0x27/0x57
> > [<ffffffff80268d08>] handle_edge_irq+0xed/0x12e
> > [<ffffffff8020eaab>] do_IRQ+0xf6/0x167 [<ffffffff8020c471>]
> > ret_from_intr+0x0/0xa <EOI> [<ffffffff802122b1>] ?
> > default_idle+0x2b/0x40 [<ffffffff802124bf>] ?
> > c1e_idle+0xd4/0xdb [<ffffffff8055677d>] ?
> > atomic_notifier_call_chain+0xf/0x11
> > [<ffffffff8020ac6c>] ? cpu_idle+0x71/0x8f
> > [<ffffffff8054e752>] ? start_secondary+0x157/0x15c
> >
> >
> > Code: 8b 83 48 02 00 00 48 39 d8 74 37 49 39 9c 24 c0 00 01
> > 00 75 08 49 89 84 24 c0 00 01 00 48 8b 83 40 02 00 00 48 8b
> > 93 48 02 00 00 <48> 89 90 48 02 00 00 48 8b 93 48 02 00 00 48
> > 89 82 40 02 00 00 RIP [<ffffffffa001bb68>]
> > do_cciss_intr+0x627/0xa6c [cciss] RSP <ffff88027f66fee8>
> > CR2: 0000000000000248
> > ---[ end trace 902dc79a9e72d3ed ]---
> >
>
> Randy,
> Sorry I haven't replied sooner. I saw your earlier mail, just been busy breaking stuff internally. Did this happen during driver init or runtime?

Hi Mike,

It's very much during driver init.

Full boot log and .config are attached.

Andrew: I'll rerun the test ASAP. Machine is busy atm.

---
~Randy
Linux Plumbers Conference, 17-19 September 2008, Portland, Oregon USA
http://linuxplumbersconf.org/


Attachments:
netcon-4883.log (70.08 kB)
kconfig.~1 (46.55 kB)

2008-08-21 15:49:47

by Mike Miller

Subject: RE: in 2.6.23-rc3-git7 in do_cciss_intr



> -----Original Message-----
> From: Randy Dunlap [mailto:[email protected]]
> Sent: Thursday, August 21, 2008 10:44 AM
> To: Miller, Mike (OS Dev)
> Cc: lkml; scsi; akpm
> Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr
>
> On Thu, 21 Aug 2008 14:26:06 +0000 Miller, Mike (OS Dev) wrote:
>
> >
> >
> > > -----Original Message-----
> > > From: rdunlap [mailto:[email protected]]
> > > Sent: Thursday, August 21, 2008 12:53 AM
> > > To: lkml; scsi; Miller, Mike (OS Dev)
> > > Subject: BUG: in 2.6.23-rc3-git7 in do_cciss_intr
> > >
> > > on x86_64, 4 proc, 8 GB RAM:
> > >
> > > calling cciss_init+0x0/0x2e [cciss] HP CISS Driver (v 3.6.20)
> > > ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 54 cciss
> > > 0000:42:08.0: PCI INT A -> Link[LNKA] -> GSI 54 (level, high)
> > > -> IRQ 54
> > > cciss0: <0x3238> at PCI 0000:42:08.0 IRQ 503 using DAC
> > > BUG: unable to handle kernel NULL pointer dereference at
> > > 0000000000000248
> > > IP: [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss] PGD
> > > 17e422067 PUD 17e423067 PMD 0
> > > Oops: 0002 [1] SMP
> > > CPU 2
> > > Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> > > Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> > > RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> > > do_cciss_intr+0x627/0xa6c [cciss]
> > > RSP: 0018:ffff88027f66fee8 EFLAGS: 00010007
> > > RAX: 0000000000000000 RBX: ffff88007f840270 RCX: 000000000000000c
> > > RDX: 0000000000000000 RSI: ffff88027e5c0000 RDI: ffff88027e5c0000
> > > RBP: ffff88027f66ff18 R08: 0000000000000000 R09: ffff88017fa95e88
> > > R10: 0000000000000000 R11: ffff88027f66ff48 R12: ffff88027e5c0000
> > > R13: 0000000000000000 R14: 00000000000001f7 R15: 0000000000000086
> > > FS: 0000000000680850(0000) GS:ffff88017fc02c80(0000)
> > > knlGS:0000000000000000
> > > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > CR2: 0000000000000248 CR3: 000000017e425000 CR4: 00000000000006e0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > > 0000000000000400 Process swapper (pid: 0, threadinfo
> > > ffff88017fa94000, task ffff88027f63d340)
> > > Stack: ffff88027f66fee0 ffff88017e9b4800 0000000000000000
> > > 0000000000000000
> > > 00000000000001f7 0000000000000000 ffff88027f66ff48
> ffffffff8026757e
> > > ffffffff80719000 00000000000001f7 ffff88017e9b4800
> ffffffff80719050
> > > Call Trace:
> > > <IRQ> [<ffffffff8026757e>] handle_IRQ_event+0x27/0x57
> > > [<ffffffff80268d08>] handle_edge_irq+0xed/0x12e
> [<ffffffff8020eaab>]
> > > do_IRQ+0xf6/0x167 [<ffffffff8020c471>]
> ret_from_intr+0x0/0xa <EOI>
> > > [<ffffffff802122b1>] ?
> > > default_idle+0x2b/0x40 [<ffffffff802124bf>] ?
> > > c1e_idle+0xd4/0xdb [<ffffffff8055677d>] ?
> > > atomic_notifier_call_chain+0xf/0x11
> > > [<ffffffff8020ac6c>] ? cpu_idle+0x71/0x8f [<ffffffff8054e752>] ?
> > > start_secondary+0x157/0x15c
> > >
> > >
> > > Code: 8b 83 48 02 00 00 48 39 d8 74 37 49 39 9c 24 c0 00
> 01 00 75 08
> > > 49 89 84 24 c0 00 01 00 48 8b 83 40 02 00 00 48 8b
> > > 93 48 02 00 00 <48> 89 90 48 02 00 00 48 8b 93 48 02 00 00 48
> > > 89 82 40 02 00 00 RIP [<ffffffffa001bb68>]
> > > do_cciss_intr+0x627/0xa6c [cciss] RSP <ffff88027f66fee8>
> > > CR2: 0000000000000248
> > > ---[ end trace 902dc79a9e72d3ed ]---
> > >
> >
> > Randy,
> > Sorry I haven't replied sooner. I saw your earlier mail,
> just been busy breaking stuff internally. Did this happen
> during driver init or runtime?
>
> Hi Mike,
>
> It's very much during driver init.
>
> Full boot log and .config are attached.
>
> Andrew: I'll rerun the test ASAP. Machine is busy atm.

Randy,
We know of a race condition in cciss_init_one. It's fixed in 2.6.26 I believe. Here's the patch:

http://groups.google.com/group/linux.kernel/browse_thread/thread/7b39f2b77622ab03/4f5f45c008655ca1?hl=en&lnk=gst&q=cciss#4f5f45c008655ca1

-- mikem


>
> ---
> ~Randy
> Linux Plumbers Conference, 17-19 September 2008, Portland,
> Oregon USA http://linuxplumbersconf.org/
>

2008-08-21 16:15:33

by Randy Dunlap

Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr

On Thu, 21 Aug 2008 15:48:35 +0000 Miller, Mike (OS Dev) wrote:

>
>
> > -----Original Message-----
> > From: Randy Dunlap [mailto:[email protected]]
> > Sent: Thursday, August 21, 2008 10:44 AM
> > To: Miller, Mike (OS Dev)
> > Cc: lkml; scsi; akpm
> > Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr
> >
> > On Thu, 21 Aug 2008 14:26:06 +0000 Miller, Mike (OS Dev) wrote:
> >
> > >
> > >
> > > > -----Original Message-----
> > > > From: rdunlap [mailto:[email protected]]
> > > > Sent: Thursday, August 21, 2008 12:53 AM
> > > > To: lkml; scsi; Miller, Mike (OS Dev)
> > > > Subject: BUG: in 2.6.23-rc3-git7 in do_cciss_intr
> > > >
> > > > on x86_64, 4 proc, 8 GB RAM:
> > > >
> > > > calling cciss_init+0x0/0x2e [cciss] HP CISS Driver (v 3.6.20)
> > > > ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 54 cciss
> > > > 0000:42:08.0: PCI INT A -> Link[LNKA] -> GSI 54 (level, high)
> > > > -> IRQ 54
> > > > cciss0: <0x3238> at PCI 0000:42:08.0 IRQ 503 using DAC
> > > > BUG: unable to handle kernel NULL pointer dereference at
> > > > 0000000000000248
> > > > IP: [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss] PGD
> > > > 17e422067 PUD 17e423067 PMD 0
> > > > Oops: 0002 [1] SMP
> > > > CPU 2
> > > > Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> > > > Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> > > > RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> > > > do_cciss_intr+0x627/0xa6c [cciss]
> > > > RSP: 0018:ffff88027f66fee8 EFLAGS: 00010007
> > > > RAX: 0000000000000000 RBX: ffff88007f840270 RCX: 000000000000000c
> > > > RDX: 0000000000000000 RSI: ffff88027e5c0000 RDI: ffff88027e5c0000
> > > > RBP: ffff88027f66ff18 R08: 0000000000000000 R09: ffff88017fa95e88
> > > > R10: 0000000000000000 R11: ffff88027f66ff48 R12: ffff88027e5c0000
> > > > R13: 0000000000000000 R14: 00000000000001f7 R15: 0000000000000086
> > > > FS: 0000000000680850(0000) GS:ffff88017fc02c80(0000)
> > > > knlGS:0000000000000000
> > > > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > > CR2: 0000000000000248 CR3: 000000017e425000 CR4: 00000000000006e0
> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > > > 0000000000000400 Process swapper (pid: 0, threadinfo
> > > > ffff88017fa94000, task ffff88027f63d340)
> > > > Stack: ffff88027f66fee0 ffff88017e9b4800 0000000000000000
> > > > 0000000000000000
> > > > 00000000000001f7 0000000000000000 ffff88027f66ff48
> > ffffffff8026757e
> > > > ffffffff80719000 00000000000001f7 ffff88017e9b4800
> > ffffffff80719050
> > > > Call Trace:
> > > > <IRQ> [<ffffffff8026757e>] handle_IRQ_event+0x27/0x57
> > > > [<ffffffff80268d08>] handle_edge_irq+0xed/0x12e
> > [<ffffffff8020eaab>]
> > > > do_IRQ+0xf6/0x167 [<ffffffff8020c471>]
> > ret_from_intr+0x0/0xa <EOI>
> > > > [<ffffffff802122b1>] ?
> > > > default_idle+0x2b/0x40 [<ffffffff802124bf>] ?
> > > > c1e_idle+0xd4/0xdb [<ffffffff8055677d>] ?
> > > > atomic_notifier_call_chain+0xf/0x11
> > > > [<ffffffff8020ac6c>] ? cpu_idle+0x71/0x8f [<ffffffff8054e752>] ?
> > > > start_secondary+0x157/0x15c
> > > >
> > > >
> > > > Code: 8b 83 48 02 00 00 48 39 d8 74 37 49 39 9c 24 c0 00
> > 01 00 75 08
> > > > 49 89 84 24 c0 00 01 00 48 8b 83 40 02 00 00 48 8b
> > > > 93 48 02 00 00 <48> 89 90 48 02 00 00 48 8b 93 48 02 00 00 48
> > > > 89 82 40 02 00 00 RIP [<ffffffffa001bb68>]
> > > > do_cciss_intr+0x627/0xa6c [cciss] RSP <ffff88027f66fee8>
> > > > CR2: 0000000000000248
> > > > ---[ end trace 902dc79a9e72d3ed ]---
> > > >
> > >
> > > Randy,
> > > Sorry I haven't replied sooner. I saw your earlier mail,
> > just been busy breaking stuff internally. Did this happen
> > during driver init or runtime?
> >
> > Hi Mike,
> >
> > It's very much during driver init.
> >
> > Full boot log and .config are attached.
> >
> > Andrew: I'll rerun the test ASAP. Machine is busy atm.
>
> Randy,
> We know of a race condition in cciss_init_one. It's fixed in 2.6.26 I believe. Here's the patch:
>
> http://groups.google.com/group/linux.kernel/browse_thread/thread/7b39f2b77622ab03/4f5f45c008655ca1?hl=en&lnk=gst&q=cciss#4f5f45c008655ca1


Mike,
Sorry, but my fingers have typoed the $subject. My bad.
Kernel is 2.6.27-rc3-git7 (from above):

> > > > Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> > > > Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> > > > RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> > > > do_cciss_intr+0x627/0xa6c [cciss]


---
~Randy
Linux Plumbers Conference, 17-19 September 2008, Portland, Oregon USA
http://linuxplumbersconf.org/

2008-08-21 16:26:35

by Mike Miller

Subject: RE: in 2.6.23-rc3-git7 in do_cciss_intr

> >
> > Randy,
> > We know of a race condition in cciss_init_one. It's fixed
> in 2.6.26 I believe. Here's the patch:
> >
> >
> > http://groups.google.com/group/linux.kernel/browse_thread/thread/7b39f2b77622ab03/4f5f45c008655ca1?hl=en&lnk=gst&q=cciss#4f5f45c008655ca1
>
>
> Mike,
> Sorry, but my fingers have typoed the $subject. My bad.
> Kernel is 2.6.27-rc3-git7 (from above):
>
> > > > > Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> > > > > Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> > > > > RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> > > > > do_cciss_intr+0x627/0xa6c [cciss]
>
Hmmmmm, let me know what happens from your retest. I'll look at this as soon as I finish what I'm doing now. We're trying to spin for our test teams but I have something hopelessly broken. :(

-- mikem

2008-08-22 00:27:12

by Randy Dunlap

Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr

On Thu, 21 Aug 2008 16:25:24 +0000 Miller, Mike (OS Dev) wrote:

> > >
> > > Randy,
> > > We know of a race condition in cciss_init_one. It's fixed
> > in 2.6.26 I believe. Here's the patch:
> > >
> > >
> > > http://groups.google.com/group/linux.kernel/browse_thread/thread/7b39f2b77622ab03/4f5f45c008655ca1?hl=en&lnk=gst&q=cciss#4f5f45c008655ca1
> >
> >
> > Mike,
> > Sorry, but my fingers have typoed the $subject. My bad.
> > Kernel is 2.6.27-rc3-git7 (from above):
> >
> > > > > > Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> > > > > > Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> > > > > > RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> > > > > > do_cciss_intr+0x627/0xa6c [cciss]
> >
> Hmmmmm, let me know what happens from your retest. I'll look at this as soon as I finish what I'm doing now. We're trying to spin for our test teams but I have something hopelessly broken. :(

It didn't BUG in the retest. That just means that it's more difficult
to find/fix, right?

---
~Randy
Linux Plumbers Conference, 17-19 September 2008, Portland, Oregon USA
http://linuxplumbersconf.org/

2008-08-22 15:49:45

by Mike Miller

Subject: RE: in 2.6.23-rc3-git7 in do_cciss_intr



> -----Original Message-----
> From: Randy Dunlap [mailto:[email protected]]
> Sent: Thursday, August 21, 2008 7:27 PM
> To: Miller, Mike (OS Dev)
> Cc: lkml; scsi; akpm
> Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr
>
> On Thu, 21 Aug 2008 16:25:24 +0000 Miller, Mike (OS Dev) wrote:
>
> > > >
> > > > Randy,
> > > > We know of a race condition in cciss_init_one. It's fixed
> > > in 2.6.26 I believe. Here's the patch:
> > > >
> > > >
> > >
> > > > http://groups.google.com/group/linux.kernel/browse_thread/thread/7b39f2b77622ab03/4f5f45c008655ca1?hl=en&lnk=gst&q=cciss#4f5f45c008655ca1
> > >
> > >
> > > Mike,
> > > Sorry, but my fingers have typoed the $subject. My bad.
> > > Kernel is 2.6.27-rc3-git7 (from above):
> > >
> > > > > > > Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> > > > > > > Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> > > > > > > RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> > > > > > > do_cciss_intr+0x627/0xa6c [cciss]
> > >
> > Hmmmmm, let me know what happens from your retest. I'll
> look at this
> > as soon as I finish what I'm doing now. We're trying to spin
> for our test
> > teams but I have something hopelessly broken. :(
>
> It didn't BUG in the retest. That just means that it's more
> difficult to find/fix, right?

Yup.

2008-08-22 15:55:01

by James Bottomley

Subject: RE: in 2.6.23-rc3-git7 in do_cciss_intr

On Fri, 2008-08-22 at 15:48 +0000, Miller, Mike (OS Dev) wrote:
>
> > -----Original Message-----
> > From: Randy Dunlap [mailto:[email protected]]
> > Sent: Thursday, August 21, 2008 7:27 PM
> > To: Miller, Mike (OS Dev)
> > Cc: lkml; scsi; akpm
> > Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr
> >
> > On Thu, 21 Aug 2008 16:25:24 +0000 Miller, Mike (OS Dev) wrote:
> >
> > > > >
> > > > > Randy,
> > > > > We know of a race condition in cciss_init_one. It's fixed
> > > > in 2.6.26 I believe. Here's the patch:
> > > > >
> > > > >
> > > >
> > > > > http://groups.google.com/group/linux.kernel/browse_thread/thread/7b39f2b77622ab03/4f5f45c008655ca1?hl=en&lnk=gst&q=cciss#4f5f45c008655ca1
> > > >
> > > >
> > > > Mike,
> > > > Sorry, but my fingers have typoed the $subject. My bad.
> > > > Kernel is 2.6.27-rc3-git7 (from above):
> > > >
> > > > > > > > Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> > > > > > > > Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> > > > > > > > RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> > > > > > > > do_cciss_intr+0x627/0xa6c [cciss]
> > > >
> > > Hmmmmm, let me know what happens from your retest. I'll
> > look at this
> > > as soon as I finish what I'm doing now. We're trying to spin
> > for our test
> > > teams but I have something hopelessly broken. :(
> >
> > It didn't BUG in the retest. That just means that it's more
> > difficult to find/fix, right?
>
> Yup.

Randy,

If you can't reproduce it, could you use the debug information or gdb to
tell us what line in the source code this:

do_cciss_intr+0x627

corresponds to? That might help isolate the problem.

Thanks,

James

2008-08-22 16:51:39

by Randy Dunlap

Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr

James Bottomley wrote:
> On Fri, 2008-08-22 at 15:48 +0000, Miller, Mike (OS Dev) wrote:
>>> -----Original Message-----
>>> From: Randy Dunlap [mailto:[email protected]]
>>> Sent: Thursday, August 21, 2008 7:27 PM
>>> To: Miller, Mike (OS Dev)
>>> Cc: lkml; scsi; akpm
>>> Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr
>>>
>>> On Thu, 21 Aug 2008 16:25:24 +0000 Miller, Mike (OS Dev) wrote:
>>>
>>>>>> Randy,
>>>>>> We know of a race condition in cciss_init_one. It's fixed
>>>>> in 2.6.26 I believe. Here's the patch:
>>>>>>
>>>>>> http://groups.google.com/group/linux.kernel/browse_thread/thread/7b39f2b77622ab03/4f5f45c008655ca1?hl=en&lnk=gst&q=cciss#4f5f45c008655ca1
>>>>>
>>>>> Mike,
>>>>> Sorry, but my fingers have typoed the $subject. My bad.
>>>>> Kernel is 2.6.27-rc3-git7 (from above):
>>>>>
>>>>>>>>> Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
>>>>>>>>> Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
>>>>>>>>> RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
>>>>>>>>> do_cciss_intr+0x627/0xa6c [cciss]
>>>> Hmmmmm, let me know what happens from your retest. I'll
>>> look at this
>>>> as soon as I finish what I'm doing now. We're trying to spin
>>> for our test
>>>> teams but I have something hopelessly broken. :(
>>> It didn't BUG in the retest. That just means that it's more
>>> difficult to find/fix, right?
>> Yup.
>
> Randy,
>
> If you can't reproduce it, could you use the debug information or gdb to
> tell us what line in the source code this:
>
> do_cciss_intr+0x627
>
> corresponds to? That might help isolating the problem.


Sure, here's an attempt at that. Please let me know if you want it
differently or some other info.


(gdb) x/20i do_cciss_intr+0x627
0x3b68 <do_cciss_intr+1575>: mov %rdx,0x248(%rax)
0x3b6f <do_cciss_intr+1582>: mov 0x248(%rbx),%rdx
0x3b76 <do_cciss_intr+1589>: mov %rax,0x240(%rdx)
0x3b7d <do_cciss_intr+1596>: jmp 0x3b8b <do_cciss_intr+1610>
0x3b7f <do_cciss_intr+1598>: movq $0x0,0x100c0(%r12)
0x3b8b <do_cciss_intr+1610>: mov 0x234(%rbx),%eax
0x3b91 <do_cciss_intr+1616>: test %eax,%eax
0x3b93 <do_cciss_intr+1618>: jne 0x3f27 <do_cciss_intr+2534>
0x3b99 <do_cciss_intr+1624>: mov 0x250(%rbx),%r14
0x3ba0 <do_cciss_intr+1631>: movl $0x0,0xcc(%r14)
0x3bab <do_cciss_intr+1642>: mov 0x228(%rbx),%r8
0x3bb2 <do_cciss_intr+1649>: mov 0x2(%r8),%dx
0x3bb7 <do_cciss_intr+1654>: test %dx,%dx
0x3bba <do_cciss_intr+1657>: je 0x3f0e <do_cciss_intr+2509>


$ addr2line -e cciss.o -f do_cciss_intr+0x627
SA5_fifo_full
/home/rdunlap/linsrc/linux-2.6.27-rc3-git7/drivers/block/cciss.h:206



$ ../../scripts/decodecode < cciss.code
Code: 8b 83 48 02 00 00 48 39 d8 74 37 49 39 9c 24 c0 00 01 00 75 08 49 89 84 24 c0 00 01 00 48 8b 83 40 02 00 00 48 8b 93 48 02 00 00 <48> 89 90 48 02 00 00 48 8b 93 48 02 00 00 48 89 82 40 02 00 00

/tmp/tmp.HbrjP23089.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
0: 8b 83 48 02 00 00 mov 0x248(%rbx),%eax
6: 48 39 d8 cmp %rbx,%rax
9: 74 37 je 0x42
b: 49 39 9c 24 c0 00 01 cmp %rbx,0x100c0(%r12)
12: 00
13: 75 08 jne 0x1d
15: 49 89 84 24 c0 00 01 mov %rax,0x100c0(%r12)
1c: 00
1d: 48 8b 83 40 02 00 00 mov 0x240(%rbx),%rax
24: 48 8b 93 48 02 00 00 mov 0x248(%rbx),%rdx

/tmp/tmp.HbrjP23089.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
0: 48 89 90 48 02 00 00 mov %rdx,0x248(%rax)
7: 48 8b 93 48 02 00 00 mov 0x248(%rbx),%rdx
e: 48 89 82 40 02 00 00 mov %rax,0x240(%rdx)


~Randy
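
A possible reading of the disassembly in the message above, offered as a sketch rather than a confirmed diagnosis: the loads and stores through offsets 0x240 and 0x248 look like an unlink from a doubly linked queue, and the faulting store "mov %rdx,0x248(%rax)" ran with RAX = 0 according to the register dump. The struct layout, the names cmd and unlink_cmd, which offset is prev and which is next, and the added NULL check are all assumptions made for illustration. None of them is taken from the cciss source.

```c
#include <stddef.h>

/* Hypothetical stand-in for a queued command: only the two queue links
 * matter. The 0x240 bytes of padding are an assumption chosen so that the
 * links sit at the offsets seen in the disassembly on an LP64 build. */
struct cmd {
    char payload[0x240];
    struct cmd *prev;   /* offset 0x240 on LP64 */
    struct cmd *next;   /* offset 0x248 on LP64 */
};

/* Unlink c from a circular doubly linked queue headed by *head.
 * Returns 0 on success, -1 if c is NULL or one of its links is NULL.
 * Without the link check, "c->prev->next = c->next" with c->prev == NULL
 * becomes a store to address 0x248, matching CR2 in the oops. */
static int unlink_cmd(struct cmd **head, struct cmd *c)
{
    if (c == NULL)
        return -1;
    if (c->next == c) {            /* c is the only element */
        *head = NULL;
        return 0;
    }
    if (c->prev == NULL || c->next == NULL)
        return -1;                 /* c was never queued, or was already removed */
    if (*head == c)
        *head = c->next;
    c->prev->next = c->next;       /* corresponds to the faulting store */
    c->next->prev = c->prev;
    return 0;
}
```

Under this reading, RAX and RDX both being zero in the dump would mean that both links of the command at RBX were NULL, that is, the interrupt handler tried to remove a command that was not on the queue.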

2008-08-22 17:02:34

by James Bottomley

Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr

On Fri, 2008-08-22 at 09:49 -0700, Randy Dunlap wrote:
> James Bottomley wrote:
> > On Fri, 2008-08-22 at 15:48 +0000, Miller, Mike (OS Dev) wrote:
> >>> -----Original Message-----
> >>> From: Randy Dunlap [mailto:[email protected]]
> >>> Sent: Thursday, August 21, 2008 7:27 PM
> >>> To: Miller, Mike (OS Dev)
> >>> Cc: lkml; scsi; akpm
> >>> Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr
> >>>
> >>> On Thu, 21 Aug 2008 16:25:24 +0000 Miller, Mike (OS Dev) wrote:
> >>>
> >>>>>> Randy,
> >>>>>> We know of a race condition in cciss_init_one. It's fixed
> >>>>> in 2.6.26 I believe. Here's the patch:
> >>>>>>
> >>>>>> http://groups.google.com/group/linux.kernel/browse_thread/thread/7b39f2b77622ab03/4f5f45c008655ca1?hl=en&lnk=gst&q=cciss#4f5f45c008655ca1
> >>>>>
> >>>>> Mike,
> >>>>> Sorry, but my fingers have typoed the $subject. My bad.
> >>>>> Kernel is 2.6.27-rc3-git7 (from above):
> >>>>>
> >>>>>>>>> Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> >>>>>>>>> Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> >>>>>>>>> RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> >>>>>>>>> do_cciss_intr+0x627/0xa6c [cciss]
> >>>> Hmmmmm, let me know what happens from your retest. I'll
> >>> look at this
> >>>> as soon as I finish what I'm doing now. We're trying to spin
> >>> for our test
> >>>> teams but I have something hopelessly broken. :(
> >>> It didn't BUG in the retest. That just means that it's more
> >>> difficult to find/fix, right?
> >> Yup.
> >
> > Randy,
> >
> > If you can't reproduce it, could you use the debug information or gdb to
> > tell us what line in the source code this:
> >
> > do_cciss_intr+0x627
> >
> > corresponds to? That might help isolating the problem.
>
>
> Sure, here's an attempt at that. Please let me know if you want it
> differently or some other info.

> (gdb) x/20i do_cciss_intr+0x627
> 0x3b68 <do_cciss_intr+1575>: mov %rdx,0x248(%rax)
> 0x3b6f <do_cciss_intr+1582>: mov 0x248(%rbx),%rdx
> 0x3b76 <do_cciss_intr+1589>: mov %rax,0x240(%rdx)
> 0x3b7d <do_cciss_intr+1596>: jmp 0x3b8b <do_cciss_intr+1610>
> 0x3b7f <do_cciss_intr+1598>: movq $0x0,0x100c0(%r12)
> 0x3b8b <do_cciss_intr+1610>: mov 0x234(%rbx),%eax
> 0x3b91 <do_cciss_intr+1616>: test %eax,%eax
> 0x3b93 <do_cciss_intr+1618>: jne 0x3f27 <do_cciss_intr+2534>
> 0x3b99 <do_cciss_intr+1624>: mov 0x250(%rbx),%r14
> 0x3ba0 <do_cciss_intr+1631>: movl $0x0,0xcc(%r14)
> 0x3bab <do_cciss_intr+1642>: mov 0x228(%rbx),%r8
> 0x3bb2 <do_cciss_intr+1649>: mov 0x2(%r8),%dx
> 0x3bb7 <do_cciss_intr+1654>: test %dx,%dx
> 0x3bba <do_cciss_intr+1657>: je 0x3f0e <do_cciss_intr+2509>
>
>
> $ addr2line -e cciss.o -f do_cciss_intr+0x627
> SA5_fifo_full
> /home/rdunlap/linsrc/linux-2.6.27-rc3-git7/drivers/block/cciss.h:206

OK ...that's confusing. It seems to be saying that ctrlr_info_t * was
NULL. However, I can't see a way of getting into the fifo_full callback
from do_cciss_intr .. especially not with a NULL host.

James

2008-08-22 18:26:38

by Mike Miller

Subject: RE: in 2.6.23-rc3-git7 in do_cciss_intr



> -----Original Message-----
> From: James Bottomley [mailto:[email protected]]
> Sent: Friday, August 22, 2008 12:02 PM
> To: Randy Dunlap
> Cc: Miller, Mike (OS Dev); lkml; scsi; akpm
> Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr
>
> On Fri, 2008-08-22 at 09:49 -0700, Randy Dunlap wrote:
> > James Bottomley wrote:
> > > On Fri, 2008-08-22 at 15:48 +0000, Miller, Mike (OS Dev) wrote:
> > >>> -----Original Message-----
> > >>> From: Randy Dunlap [mailto:[email protected]]
> > >>> Sent: Thursday, August 21, 2008 7:27 PM
> > >>> To: Miller, Mike (OS Dev)
> > >>> Cc: lkml; scsi; akpm
> > >>> Subject: Re: in 2.6.23-rc3-git7 in do_cciss_intr
> > >>>
> > >>> On Thu, 21 Aug 2008 16:25:24 +0000 Miller, Mike (OS Dev) wrote:
> > >>>
> > >>>>>> Randy,
> > > >>>>>> We know of a race condition in cciss_init_one. It's fixed in 2.6.26 I believe. Here's the patch:
> > > >>>>>>
> > > >>>>>> http://groups.google.com/group/linux.kernel/browse_thread/thread/7b39f2b77622ab03/4f5f45c008655ca1?hl=en&lnk=gst&q=cciss#4f5f45c008655ca1
> > >>>>>
> > >>>>> Mike,
> > >>>>> Sorry, but my fingers have typoed the $subject. My bad.
> > >>>>> Kernel is 2.6.27-rc3-git7 (from above):
> > >>>>>
> > >>>>>>>>> Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
> > >>>>>>>>> Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
> > >>>>>>>>> RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>]
> > >>>>>>>>> do_cciss_intr+0x627/0xa6c [cciss]
> > > >>>> Hmmmmm, let me know what happens from your retest. I'll look at this
> > > >>>> as soon as I finish what I'm doing now. We're trying to spin for our test
> > > >>>> teams but I have something hopelessly broken. :(
> > >>> It didn't BUG in the retest. That just means that it's more
> > >>> difficult to find/fix, right?
> > >> Yup.
> > >
> > > Randy,
> > >
> > > If you can't reproduce it, could you use the debug information or
> > > gdb to tell us what line in the source code this:
> > >
> > > do_cciss_intr+0x627
> > >
> > > corresponds to? That might help isolating the problem.
> >
> >
> > Sure, here's an attempt at that. Please let me know if you want it
> > differently or some other info.
>
> > (gdb) x/20i do_cciss_intr+0x627
> > 0x3b68 <do_cciss_intr+1575>: mov %rdx,0x248(%rax)
> > 0x3b6f <do_cciss_intr+1582>: mov 0x248(%rbx),%rdx
> > 0x3b76 <do_cciss_intr+1589>: mov %rax,0x240(%rdx)
> > 0x3b7d <do_cciss_intr+1596>: jmp 0x3b8b <do_cciss_intr+1610>
> > 0x3b7f <do_cciss_intr+1598>: movq $0x0,0x100c0(%r12)
> > 0x3b8b <do_cciss_intr+1610>: mov 0x234(%rbx),%eax
> > 0x3b91 <do_cciss_intr+1616>: test %eax,%eax
> > 0x3b93 <do_cciss_intr+1618>: jne 0x3f27 <do_cciss_intr+2534>
> > 0x3b99 <do_cciss_intr+1624>: mov 0x250(%rbx),%r14
> > 0x3ba0 <do_cciss_intr+1631>: movl $0x0,0xcc(%r14)
> > 0x3bab <do_cciss_intr+1642>: mov 0x228(%rbx),%r8
> > 0x3bb2 <do_cciss_intr+1649>: mov 0x2(%r8),%dx
> > 0x3bb7 <do_cciss_intr+1654>: test %dx,%dx
> > 0x3bba <do_cciss_intr+1657>: je 0x3f0e <do_cciss_intr+2509>
> >
> >
> > > $ addr2line -e cciss.o -f do_cciss_intr+0x627
> > > SA5_fifo_full
> > /home/rdunlap/linsrc/linux-2.6.27-rc3-git7/drivers/block/cciss.h:206
>
> OK ...that's confusing. It seems to be saying that
> ctrlr_info_t * was NULL. However, I can't see a way of
> getting into the fifo_full callback from do_cciss_intr ..
> especially not with a NULL host.
>
> James

That is weird. Even if we could get there, fifo_full doesn't do anything but wait for a bit.

mikem