2007-09-24 06:33:49

by Paul Rolland

Subject: 2.6.23-rc7 - _random_ IRQ23 : nobody cared

Hello,

I already reported that 2.6.23-rcX kernels warn "irq X: nobody cared", and
it seemed to have been fixed in 2.6.23-rc6... Unfortunately, on rebooting
into my 2.6.23-rc7 the warning appeared again, though the previous boot was
just fine, and I didn't change or recompile my kernel in between.

So, what changed? I've compiled two modules, qc-usb-messenger and
hsf-modem, to make sure all my hardware is fully supported.

And now, I have :
....
scsi 3:0:1:0: Direct-Access ATA ST3500641AS 3.AA PQ: 0 ANSI: 5
sd 3:0:1:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
sd 3:0:1:0: [sdd] Write Protect is off
sd 3:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 3:0:1:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
sd 3:0:1:0: [sdd] Write Protect is off
sd 3:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
irq 23: nobody cared (try booting with the "irqpoll" option)

Call Trace:
<IRQ> [<ffffffff8105d21b>] __report_bad_irq+0x30/0x72
[<ffffffff8105d46c>] note_interrupt+0x20f/0x253
[<ffffffff8105dd38>] handle_fasteoi_irq+0xa9/0xd1
[<ffffffff8100ec65>] do_IRQ+0xf1/0x160
[<ffffffff8100b25b>] mwait_idle+0x0/0x45
[<ffffffff8100c431>] ret_from_intr+0x0/0xa
<EOI> [<ffffffff8100b29d>] mwait_idle+0x42/0x45
[<ffffffff8100b1f3>] cpu_idle+0xbd/0xe0
[<ffffffff8175ca8e>] start_kernel+0x2bb/0x2c7
[<ffffffff8175c140>] _sinittext+0x140/0x144

handlers:
[<ffffffff81307485>] (ata_interrupt+0x0/0x1d3)
Disabling IRQ #23
sdd:<3>ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata4.01: cmd c8/00:08:00:00:00/00:00:00:00:00/f0 tag 0 cdb 0x0 data 4096 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4: soft resetting port
ata4.01: qc timeout (cmd 0x27)
ata4.01: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (976773168)
ata4.00: failed to IDENTIFY (I/O error, err_mask=0x40)
ata4: failed to recover some devices, retrying in 5 secs
ata4: soft resetting port
ata4.01: qc timeout (cmd 0x27)

....


Booting with irqpoll is Ok, and I have :

2 [14:55] rol@donald:~> cat /proc/interrupts
           CPU0       CPU1
  0:      31263          0   IO-APIC-edge     timer
  1:        201          0   IO-APIC-edge     i8042
  4:        219          0   IO-APIC-edge     serial
  6:          5          0   IO-APIC-edge     floppy
  8:          1          0   IO-APIC-edge     rtc
  9:          0          0   IO-APIC-fasteoi  acpi
 12:        129       1124   IO-APIC-edge     i8042
 14:       7010       1235   IO-APIC-edge     libata
 15:          0          0   IO-APIC-edge     libata
 17:          0          0   IO-APIC-fasteoi  uhci_hcd:usb3
 18:          0          0   IO-APIC-fasteoi  uhci_hcd:usb4
 19:       4250          0   IO-APIC-fasteoi  uhci_hcd:usb5, HDA Intel
 20:       1251         44   IO-APIC-fasteoi  ehci_hcd:usb1, uhci_hcd:usb2
 21:        422        640   IO-APIC-fasteoi  pata_pdc2027x, firewire_ohci
 23:      24023          0   IO-APIC-fasteoi  libata, hsfpcibasic2
378:         93        108   PCI-MSI-edge     eth0
379:          1          0   PCI-MSI-edge     eth1
NMI:          0          0
LOC:      31000      30803
ERR:          0

Hell, IRQ 23 is shared between libata and my modem !!!
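
A shared line like this one can also be spotted mechanically rather than by eye. The following is only a rough sketch and not something from the thread: the function name irq_handler_count() is made up for illustration, and it assumes the layout shown above, i.e. a numeric IRQ label, per-CPU counters, one controller-type token, then handler names separated by commas.

```c
#include <assert.h>
#include <ctype.h>

/* Skip leading blanks and tabs. */
static const char *skip_ws(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/*
 * Count the handler names on one line of /proc/interrupts.
 * Returns 0 for lines without a numeric IRQ label (NMI, LOC, ERR)
 * or without handlers; a result above 1 means the line is shared.
 * (Hypothetical helper, assuming comma-separated handler names.)
 */
int irq_handler_count(const char *line)
{
    const char *p;
    int n;

    assert(line != 0);
    p = skip_ws(line);

    /* IRQ label: digits followed by ':' */
    if (!isdigit((unsigned char)*p))
        return 0;
    while (isdigit((unsigned char)*p))
        p++;
    if (*p != ':')
        return 0;
    p++;

    /* Per-CPU counters: tokens made of digits only. */
    for (;;) {
        const char *q;

        p = skip_ws(p);
        q = p;
        while (isdigit((unsigned char)*q))
            q++;
        if (q == p || (*q != ' ' && *q != '\t' && *q != '\n' && *q != '\0'))
            break;              /* not a pure-digit token */
        p = q;
    }

    /* Controller type, e.g. "IO-APIC-fasteoi". */
    while (*p && *p != ' ' && *p != '\t' && *p != '\n')
        p++;
    p = skip_ws(p);
    if (*p == '\0' || *p == '\n')
        return 0;

    /* What remains is the handler list; commas separate the names. */
    n = 1;
    for (; *p && *p != '\n'; p++)
        if (*p == ',')
            n++;
    return n;
}
```

Fed the lines of /proc/interrupts one at a time (for instance from fgets()), any line for which this returns more than 1 is a shared IRQ and worth a second look when a "nobody cared" report names it.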

OK, just reboot, and see what happens.... Cool, now it is booting fine,
no more complaint, even without irqpoll, IRQ 23 still shared...
Another one, and Ok again... So, this looks really random :(

Is there anything I could do to collect useful traces?

Paul


2007-09-24 14:27:17

by David Newall

Subject: Re: 2.6.23-rc7 - _random_ IRQ23 : nobody cared

Paul Rolland wrote:
> Hell, IRQ 23 is shared between libata and my modem !!!
>

Tried using the modem?

2007-09-25 07:01:19

by Paul Rolland

Subject: Re: 2.6.23-rc7 - _random_ IRQ23 : nobody cared

Hi David,

On Mon, 24 Sep 2007 23:56:59 +0930
David Newall wrote:

> Paul Rolland wrote:
> > Hell, IRQ 23 is shared between libata and my modem !!!
> >
>
> Tried using the modem?

When no problem is reported, both libata and the modem work fine.
When the problem is reported, only libata is handling IRQ 23 at that time
(the modem is a WinModem, and its driver is an out-of-tree module). This
is still kernel boot time, and disabling the IRQ leaves my machine
unable to complete the boot process (too many disk timeouts).

It would be good to be able to delay the disabling of an IRQ for long
enough to allow all the modules to be loaded...

Paul

2007-09-27 04:19:29

by Tejun Heo

Subject: Re: 2.6.23-rc7 - _random_ IRQ23 : nobody cared

Paul Rolland wrote:
> Hi David,
>
> On Mon, 24 Sep 2007 23:56:59 +0930
> David Newall wrote:
>
>> Paul Rolland wrote:
>>> Hell, IRQ 23 is shared between libata and my modem !!!
>>>
>> Tried using the modem?
>
> When no problem is reported, both libata and the modem work fine.
> When the problem is reported, only libata is handling IRQ 23 at that time
> (the modem is a WinModem, and its driver is an out-of-tree module). This
> is still kernel boot time, and disabling the IRQ leaves my machine
> unable to complete the boot process (too many disk timeouts).
>
> It would be good to be able to delay the disabling of an IRQ for long
> enough to allow all the modules to be loaded...

Can you change driver load order such that the driver for the modem is
loaded first?

--
tejun

2007-09-27 06:06:19

by Paul Rolland

Subject: Re: 2.6.23-rc7 - _random_ IRQ23 : nobody cared

Hi Tejun,

On Thu, 27 Sep 2007 09:55:22 +0900
Tejun Heo wrote:

> Paul Rolland wrote:
> > Hi David,
> >
> > On Mon, 24 Sep 2007 23:56:59 +0930
> > David Newall wrote:
> >
> >> Paul Rolland wrote:
> >>> Hell, IRQ 23 is shared between libata and my modem !!!
> >>>
> >> Tried using the modem?
> >
> Can you change driver load order such that the driver for the modem is
> loaded first?

As I said, it's not possible, because:
- the modem driver is an out-of-tree one, so I have to wait for the end of
  the boot process before it can be loaded,
- libata on IRQ 23 is the one taking care of my disks, and I suspect it would
  be quite hard to install a modem driver before the disk driver is
  installed.

I was thinking of delaying the disabling of the IRQ, which is basically the
other part of the problem (the first part being the spurious IRQ from the
modem). If it is possible to do that long enough for the modem driver to be
loaded, then the "IRQ xx : nobody cared" becomes an informational message
during the boot process, and then it vanishes, leaving a perfectly working
machine.
I suspect something like this in note_interrupt would do (totally
untested, just thinking out loud):

	/*
	 * Allow some delay to complete the boot process before
	 * killing an IRQ. This allows some modules to be loaded
	 * before we decide the IRQ will not be handled.
	 *
	 * jiffies does not start at zero (see INITIAL_JIFFIES), so
	 * compare against that offset with the wrap-safe time_after().
	 */
	if (time_after(jiffies, INITIAL_JIFFIES + 120 * HZ)) {
		/*
		 * Now kill the IRQ
		 */
		printk(KERN_EMERG "Disabling IRQ #%d\n", irq);
		desc->status |= IRQ_DISABLED;
		desc->depth = 1;
		desc->chip->disable(irq);
	}

I'll try that this week-end, but if someone has an opinion about it, I'll
be glad to know :)
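
To get a feel for what such a grace period would change, here is a small userspace model of the counting logic. It is only a sketch under assumptions: it takes note_interrupt() to evaluate a window of 100000 interrupts and to disable the line when more than 99900 of them went unhandled, and struct irq_model, model_note_interrupt() and the grace_s parameter are hypothetical names used for illustration, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed thresholds, modelled on the spurious-IRQ check. */
#define WINDOW     100000u  /* interrupts per evaluation window        */
#define BAD_LIMIT   99900u  /* unhandled count above which we give up  */

struct irq_model {
    unsigned int irq_count;       /* interrupts seen in this window */
    unsigned int irqs_unhandled;  /* of those, claimed by no handler */
    bool disabled;                /* line has been shut off */
};

/*
 * Record one interrupt on the modelled line.
 *   handled  - whether any registered handler claimed it
 *   uptime_s - seconds since boot
 *   grace_s  - hypothetical boot grace period; while uptime_s is not
 *              past it, a bad window is forgiven instead of disabling
 * Returns true if this call disabled the line.
 */
bool model_note_interrupt(struct irq_model *m, bool handled,
                          unsigned int uptime_s, unsigned int grace_s)
{
    bool bad_window;

    assert(m != 0);
    if (m->disabled)
        return false;

    if (!handled)
        m->irqs_unhandled++;

    /* Only evaluate once a full window has been collected. */
    if (++m->irq_count < WINDOW)
        return false;

    m->irq_count = 0;
    bad_window = m->irqs_unhandled > BAD_LIMIT;
    m->irqs_unhandled = 0;

    if (bad_window && uptime_s > grace_s) {
        m->disabled = true;
        return true;
    }
    return false;
}
```

In this model, a storm of unclaimed interrupts 10 seconds after boot with grace_s set to 120 is forgiven and the counters start over, whereas the same storm at 130 seconds disables the line. If a second handler has registered in the meantime and claims the interrupts, the later windows never trip, which is the outcome hoped for above.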

Regards,
Paul

2007-09-27 09:04:46

by Benjamin Herrenschmidt

Subject: Re: 2.6.23-rc7 - _random_ IRQ23 : nobody cared

Let me guess... this is a T61 or X61 ?

There's a problem with these that we don't fully understand yet: we're
getting stale interrupts like these all over the range.

I wonder if it could be a bug with the ICH8 chipset...

If yours is one of these, it's being dealt with (or at least we're
attempting to deal with it) at

http://bugzilla.kernel.org/show_bug.cgi?id=8853

Ben.


2007-09-27 10:06:22

by Paul Rolland

Subject: Re: 2.6.23-rc7 - _random_ IRQ23 : nobody cared

Hello,

On Thu, 27 Sep 2007 19:04:11 +1000
Benjamin Herrenschmidt wrote:

> Let me guess... this is a T61 or X61 ?
Bad luck ;)

This is an Asus P5W-DH Deluxe motherboard, with a Core2 6400 CPU,
a bunch of disks (2 IDE, 3 SATA, 1 CDRW and 1 DVDRW-DL), and a damned
Olitec PCI V92 V2 modem.

Paul

2007-09-27 22:28:42

by Benjamin Herrenschmidt

Subject: Re: 2.6.23-rc7 - _random_ IRQ23 : nobody cared


On Thu, 2007-09-27 at 10:05 +0000, Paul Rolland wrote:
> Hello,
>
> On Thu, 27 Sep 2007 19:04:11 +1000
> Benjamin Herrenschmidt wrote:
>
> > Let me guess... this is a T61 or X61 ?
> Bad luck ;)
>
> This is an Asus P5W-DH Deluxe motherboard, with a Core2 6400 CPU,
> a bunch of disks (2 IDE, 3 SATA, 1 CDRW and 1 DVDRW-DL), and a damned
> Olitec PCI V92 V2 modem.

What chipset ? 965gm ?

Ben.


2007-09-28 06:31:03

by Paul Rolland

Subject: Re: 2.6.23-rc7 - _random_ IRQ23 : nobody cared

Hi,

On Fri, 28 Sep 2007 08:27:58 +1000
Benjamin Herrenschmidt wrote:

> > This is an Asus P5W-DH Deluxe motherboard, with a Core2 6400 CPU,
> > a bunch of disks (2 IDE, 3 SATA, 1 CDRW and 1 DVDRW-DL), and a damned
> > Olitec PCI V92 V2 modem.
>
> What chipset ? 965gm ?

975x

Paul

--
Paul Rolland E-Mail : rol(at)witbe.net
Witbe.net SA Tel. +33 (0)1 47 67 77 77
Les Collines de l'Arche Fax. +33 (0)1 47 67 77 99
F-92057 Paris La Defense RIPE : PR12-RIPE

Please no HTML, I'm not a browser - Pas d'HTML, je ne suis pas un navigateur
"Some people dream of success... while others wake up and work hard at it"

"I worry about my child and the Internet all the time, even though she's too
young to have logged on yet. Here's what I worry about. I worry that 10 or 15
years from now, she will come to me and say 'Daddy, where were you when they
took freedom of the press away from the Internet?'"
--Mike Godwin, Electronic Frontier Foundation

2007-09-28 09:57:24

by Tejun Heo

Subject: Re: 2.6.23-rc7 - _random_ IRQ23 : nobody cared

Paul Rolland wrote:
>> Can you change driver load order such that the driver for the modem is
>> loaded first?
>
> As I said, it's not possible, because:
> - the modem driver is an out-of-tree one, so I have to wait for the end of
>   the boot process before it can be loaded,
> - libata on IRQ 23 is the one taking care of my disks, and I suspect it would
>   be quite hard to install a modem driver before the disk driver is
>   installed.

You can do both by...

1. Building the modem driver into the kernel. char drivers are linked in
before ATA ones, so it will attach first.

2. Using a custom initrd with an emergency shell. The initrd is loaded by
the BIOS, so no driver is involved. I don't actually know how to do this tho.

3. Putting in an extra disk controller and booting from it with both drivers
compiled as modules.

--
tejun