2003-08-08 18:33:13

by Alistair John Strachan

Subject: 2.6.0-test2-mm5: scheduler problem & apic

Hi,

I'm experiencing occasional lockups under minor load with 2.6.0-test2-mm5. I
believe they are related to one or more of the scheduler changes (another
divide by zero, maybe?). They are sporadic. I do not experience such
instability on 2.6.0-test2-mm2. I'm writing this on 2.6.0-test2-mm5 with the
o13 patches reverted and it hasn't happened yet, but this boot could just be
lucky. My kernel is not tainted.

(I just noticed o14 went live. I'll try with this in a moment.)

Andrew, I don't know if anybody's given you any feedback regarding the nForce
2 APIC fix, but it appears to resolve the dead kernel problem, and I no
longer have problems with IRQ fallouts when using acpi, and I've removed
pci=noacpi from cmdline. ACPI and APIC now work together harmoniously.

However, on my EPoX 8RDA+ mainboard, I see the following in dmesg. Is this a
bug in the APIC changes or a BIOS bug? I'll contact EPoX about it if it's
the latter. It does not appear to impair function, but it is printed at a
high enough priority to show up even when booting with 'quiet' on cmdline
(if it is not that serious, which it does not appear to be, should this
message be demoted?).

..MP-BIOS bug: 8254 timer not connected to IO-APIC

(..and hidden in dmesg)

...trying to set up timer (IRQ0) through the 8259A ... failed.
...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ... works.
                                           ^
                                           and is therefore recoverable?

Cheers,
Alistair.


2003-08-08 20:30:22

by dth

Subject: Re: 2.6.0-test2-mm5: scheduler problem & apic

Alistair J Strachan <[email protected]> wrote:
>Andrew, I don't know if anybody's given you any feedback regarding the nForce
>2 APIC fix, but it appears to resolve the dead kernel problem, and I no
>longer have problems with IRQ fallouts when using acpi, and I've removed
>pci=noacpi from cmdline. ACPI and APIC now work together harmoniously.

I still occasionally see this behaviour:

New kernel install (booting from 2.6.0-test2 (vanilla)):

EXT3-fs: mounted filesystem with ordered data mode.
irq 19: nobody cared!
Call Trace:
[<c010be0a>] __report_bad_irq+0x32/0x90
[<c010bee0>] note_interrupt+0x50/0x78
[<c010c0e0>] do_IRQ+0xc0/0x124
[<c0108dd0>] default_idle+0x0/0x34
[<c0107000>] _stext+0x0/0x48
[<c02819ec>] common_interrupt+0x18/0x20
[<c0108dd0>] default_idle+0x0/0x34
[<c0107000>] _stext+0x0/0x48
[<c0108df9>] default_idle+0x29/0x34
[<c0108e83>] cpu_idle+0x37/0x48
[<c0107045>] _stext+0x45/0x48
[<c032276b>] start_kernel+0x147/0x150
handlers:
[<c01f3b34>] (ide_intr+0x0/0x160)
[<c01f3b34>] (ide_intr+0x0/0x160)
Disabling IRQ #19
hde: sata_error = 0x00400000, watchdog = 1, siimage_mmio_ide_dma_test_irq
hdg: sata_error = 0x00400000, watchdog = 1, siimage_mmio_ide_dma_test_irq

Power-cycling the machine let the kernel boot without this message, and it
then worked normally. This is a newsgate machine with HT and SATA.

Sound familiar?

Danny
--
I think so Brain, but why does a forklift
have to be so big if all it does is lift forks?