2003-06-03 11:02:02

by Greg Norris

Subject: lost interrupts with 2.4.21-rc6 and i875p chipset

I recently installed Debian on a new i875P chipset machine, and I'm
seeing frequent "hdX: lost interrupt" messages at the console under
2.4.21-rc6. The IDE system appears to stall for 5 seconds or so
whenever this occurs (I assume that a reset/resync is occurring), but
then seems to recover. It's pretty easy to reproduce... any
significant disk activity will trigger the problem. In particular,
running fsck or copying files off a cdrom will expose the problem
within seconds.

This issue does not occur under 2.4.20. Both kernels were compiled
using gcc 2.95.4, and no non-kernel modules are in use in either case
(no nvidia module, for instance). I'd be happy to provide additional
information, if someone can point out what would be helpful.
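
One thing that may be worth collecting is how often each drive reports the error. A minimal sketch, assuming the messages were saved in the "hdX: lost interrupt" form quoted above. The function name count_lost_interrupts is made up for illustration, and grep -o is a GNU/BSD extension rather than POSIX:

```shell
# Hypothetical helper: count "hdX: lost interrupt" messages per drive.
# Reads the named log file(s), or stdin when no file is given.
count_lost_interrupts() {
    grep -ohE 'hd[a-z]: lost interrupt' "$@" |
        cut -d: -f1 |                # keep only the drive name (hda, hdc, ...)
        sort |
        uniq -c |                    # count occurrences per drive
        awk '{ print $2, $1 }'       # print "drive count"
}
```

Something like `dmesg | count_lost_interrupts` would then summarize whatever is still in the kernel ring buffer.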


root@glitch[~]# lspci -i ~adric/pci.ids
00:00.0 Host bridge: Intel Corp. 82875P Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corp. 82875P Processor to AGP Controller (rev 02)
00:1d.0 USB Controller: Intel Corp. 82801EB USB (rev 02)
00:1d.1 USB Controller: Intel Corp. 82801EB USB (rev 02)
00:1d.2 USB Controller: Intel Corp. 82801EB USB (rev 02)
00:1d.3 USB Controller: Intel Corp. 82801EB USB (rev 02)
00:1d.7 USB Controller: Intel Corp. 82801EB USB2 (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corp. 82801EB LPC Interface Controller (rev 02)
00:1f.1 IDE interface: Intel Corp. 82801EB Ultra ATA Storage Controller (rev 02)
00:1f.2 IDE interface: Intel Corp. 82801EB Ultra ATA Storage Controller (rev 02)
00:1f.3 SMBus: Intel Corp. 82801EB SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation NV18 [GeForce4 MX 440 AGP 8x] (rev a2)
02:02.0 Multimedia audio controller: Creative Labs [SB Live! Value] EMU10k1X
02:02.1 Input device controller: Creative Labs [SB Live! Value] Input device controller
02:08.0 Ethernet controller: Intel Corp.: Unknown device 1050 (rev 02)


2003-06-03 13:08:16

by Alan

Subject: Re: lost interrupts with 2.4.21-rc6 and i875p chipset

On Tue, 2003-06-03 at 12:15, Greg Norris wrote:
> I recently installed Debian on a new i875P chipset machine, and I'm
> seeing frequent "hdX: lost interrupt" messages at the console under
> 2.4.21-rc6. The IDE system appears to stall for 5 seconds or so
> whenever this occurs (I assume that a reset/resync is occurring), but
> then seems to recover. It's pretty easy to reproduce... any
> significant disk activity will trigger the problem. In particular,
> running fsck or copying files off a cdrom will expose the problem
> within seconds.

Does this occur if you build the kernel without ACPI and without APIC
support?
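
Before rebuilding, it can help to see which of these options the current build has set. A rough sketch, assuming the usual 2.4 x86 option names (CONFIG_ACPI, CONFIG_X86_UP_APIC, CONFIG_X86_UP_IOAPIC, CONFIG_X86_LOCAL_APIC, CONFIG_X86_IO_APIC). The function name and the default path are hypothetical:

```shell
# Hypothetical helper: show the ACPI/APIC options in a kernel .config.
# $1 is the config file; defaults to ./.config (an assumed location).
show_apic_acpi_config() {
    # Match both "CONFIG_FOO=y" and "# CONFIG_FOO is not set", but not
    # longer names such as CONFIG_ACPI_BUTTON.
    grep -E '^(# )?CONFIG_(ACPI|X86_UP_APIC|X86_UP_IOAPIC|X86_LOCAL_APIC|X86_IO_APIC)[= ]' \
        "${1:-.config}"
}
```

Run from the top of the kernel source tree, `show_apic_acpi_config` with no argument would read the .config used for the build.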

2003-06-03 15:04:53

by dmeyer

Subject: Re: lost interrupts with 2.4.21-rc6 and i875p chipset

In article <[email protected]> you write:
> I recently installed Debian on a new i875P chipset machine, and I'm
> seeing frequent "hdX: lost interrupt" messages at the console under
> 2.4.21-rc6. The IDE system appears to stall for 5 seconds or so
> whenever this occurs (I assume that a reset/resync is occurring), but
> then seems to recover. It's pretty easy to reproduce... any
> significant disk activity will trigger the problem. In particular,
> running fsck or copying files off a cdrom will expose the problem
> within seconds.

I see the same thing with my machine:

$ /sbin/lspci
00:00.0 Host bridge: Intel Corp. 82845G/GL [Brookdale-G] Chipset Host Bridge (rev 03)
00:02.0 VGA compatible controller: Intel Corp. 82845G/GL [Brookdale-G] Chipset Integrated Graphics Device (rev 03)
00:1d.0 USB Controller: Intel Corp. 82801DB USB (Hub #1) (rev 02)
00:1d.1 USB Controller: Intel Corp. 82801DB USB (Hub #2) (rev 02)
00:1d.2 USB Controller: Intel Corp. 82801DB USB (Hub #3) (rev 02)
00:1d.7 USB Controller: Intel Corp. 82801DB USB EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB PCI Bridge (rev 82)
00:1f.0 ISA bridge: Intel Corp. 82801DB ISA Bridge (LPC) (rev 02)
00:1f.1 IDE interface: Intel Corp. 82801DB ICH4 IDE (rev 02)
00:1f.3 SMBus: Intel Corp. 82801DB SMBus (rev 02)
00:1f.5 Multimedia audio controller: Intel Corp. 82801DB AC'97 Audio (rev 02)
01:04.0 SCSI storage controller: Adaptec AHA-2940U/UW/D / AIC-7881U
01:09.0 Ethernet controller: Broadcom Corporation: Unknown device 4401 (rev 01)

though for me it's more likely to be hitting inn really hard than the
cdrom drive. Booting with "noapic" fixes it, though obviously at the
cost of losing whatever advantages the APIC provides.
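
To confirm what "noapic" actually changes, the IDE lines in /proc/interrupts can be compared between the two boots. A small sketch, assuming the usual layout of that file (IRQ number, one count column per CPU, controller name such as XT-PIC or IO-APIC-edge, then the device names). The function name ide_irq_routing is invented for illustration:

```shell
# Hypothetical helper: print "IRQ: controller device(s)" for IDE interrupts.
# $1 is a file in /proc/interrupts format; defaults to /proc/interrupts.
ide_irq_routing() {
    awk '/ide[0-9]/ {
        # Skip the per-CPU count columns; the first non-numeric field
        # after the IRQ number is taken to be the controller name.
        for (i = 2; i <= NF; i++) if ($i !~ /^[0-9]+$/) break
        printf "%s %s", $1, $i
        for (j = i + 1; j <= NF; j++) printf " %s", $j
        printf "\n"
    }' "${1:-/proc/interrupts}"
}
```

Under these assumptions, an XT-PIC entry for ide0 after a "noapic" boot, versus an IO-APIC entry without it, would show that the option took effect.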

--
Dave Meyer
[email protected]

2003-06-07 19:06:35

by dmeyer

Subject: Re: lost interrupts with 2.4.21-rc6 and i875p chipset

In article <[email protected]> you write:
> I see the same thing with my machine:
>
> $ /sbin/lspci
> 00:00.0 Host bridge: Intel Corp. 82845G/GL [Brookdale-G] Chipset Host Bridge (rev 03)
> 00:02.0 VGA compatible controller: Intel Corp. 82845G/GL [Brookdale-G] Chipset Integrated Graphics Device (rev 03)
> 00:1d.0 USB Controller: Intel Corp. 82801DB USB (Hub #1) (rev 02)
> 00:1d.1 USB Controller: Intel Corp. 82801DB USB (Hub #2) (rev 02)
> 00:1d.2 USB Controller: Intel Corp. 82801DB USB (Hub #3) (rev 02)
> 00:1d.7 USB Controller: Intel Corp. 82801DB USB EHCI Controller (rev 02)
> 00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB PCI Bridge (rev 82)
> 00:1f.0 ISA bridge: Intel Corp. 82801DB ISA Bridge (LPC) (rev 02)
> 00:1f.1 IDE interface: Intel Corp. 82801DB ICH4 IDE (rev 02)
> 00:1f.3 SMBus: Intel Corp. 82801DB SMBus (rev 02)
> 00:1f.5 Multimedia audio controller: Intel Corp. 82801DB AC'97 Audio (rev 02)
> 01:04.0 SCSI storage controller: Adaptec AHA-2940U/UW/D / AIC-7881U
> 01:09.0 Ethernet controller: Broadcom Corporation: Unknown device 4401 (rev 01)
>
> though for me it's more likely to be hitting inn really hard than the
> cdrom drive. Booting with "noapic" fixes it, though obviously at the
> cost of losing whatever advantages the APIC provides.

Followup to this: with 2.4.21-rc7-ac1, I get very different
behavior. If I boot with noapic, my machine goes into an endless loop
of

APIC error on CPU0: 40(40)

errors. If I boot regularly (APIC enabled), everything is fine. My
machine has been up for almost a full day without a single lost
interrupt message. This, BTW, is with ACPI enabled in both cases.
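
The two numbers in the "APIC error" line appear to be the local APIC error status register before and after it is cleared, printed in hex. A rough sketch for decoding such a value, assuming the commonly documented bit layout for that register (bits 0 to 7, with bit 4 reserved). The helper name decode_apic_esr is hypothetical:

```shell
# Hypothetical helper: decode a local APIC error status value given in hex
# (without the 0x prefix), printing one line per set bit.
decode_apic_esr() {
    esr_value=$(( 0x$1 ))
    esr_bit=0
    # Bit names in order from bit 0 to bit 7 (assumed layout).
    for esr_name in \
        "send checksum error" \
        "receive checksum error" \
        "send accept error" \
        "receive accept error" \
        "reserved" \
        "send illegal vector" \
        "received illegal vector" \
        "illegal register address"
    do
        if [ $(( (esr_value >> esr_bit) & 1 )) -eq 1 ]; then
            echo "bit $esr_bit: $esr_name"
        fi
        esr_bit=$(( esr_bit + 1 ))
    done
}
```

Under that reading, `decode_apic_esr 40` reports bit 6, a received illegal vector.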

--
Dave Meyer
[email protected]

2003-06-08 00:54:05

by Greg Norris

Subject: Re: lost interrupts with 2.4.21-rc6 and i875p chipset

On Sat, Jun 07, 2003 at 03:20:08PM -0400, [email protected] wrote:
> Followup to this: with 2.4.21-rc7-ac1, I get very different
> behavior. If I boot with noapic, my machine goes into an endless loop
> of
>
> APIC error on CPU0: 40(40)
>
> errors. If I boot regularly (APIC enabled), everything is fine. My
> machine has been up for almost a full day without a single lost
> interrupt message. This, BTW, is with ACPI enabled in both cases.

It looks like -ac1 works for me with APIC enabled as well. I haven't
tried rebooting with noapic yet...