2011-05-12 13:29:32

by Simon Richter

Subject: Nobody Cared / Disabling IRQ with Intel E1000 card

Hi,

I have a server with two Intel E1000 cards and one Areca RAID card:

[...]
01:00.0 RAID bus controller [0104]: Areca Technology Corp. ARC-1231 12-Port PCI-Express to SATA RAID Controller [17d3:1280]
06:00.0 Ethernet controller [0200]: Intel Corporation 82541PI Gigabit Ethernet Controller [8086:107c] (rev 05)
06:02.0 Ethernet controller [0200]: Intel Corporation 82541PI Gigabit Ethernet Controller [8086:107c] (rev 05)
[...]

One of the Ethernet cards ends up sharing an IRQ with the RAID card,
which I already find suboptimal (I will send a separate mail about that).
The main issue at the moment, however, is that some IRQs from the
Ethernet card trigger this:

[ 315.481763] irq 16: nobody cared (try booting with the "irqpoll" option)
[ 315.481773] Pid: 0, comm: swapper Not tainted 2.6.38-2-amd64 #1
[ 315.481778] Call Trace:
[ 315.481781] <IRQ> [<ffffffff810923e9>] ? __report_bad_irq+0x30/0x80
[ 315.481796] [<ffffffff81092560>] ? note_interrupt+0x127/0x19f
[ 315.481804] [<ffffffff8100f502>] ? read_tsc+0x5/0x16
[ 315.481809] [<ffffffff81092ef7>] ? handle_fasteoi_irq+0xb4/0xdf
[ 315.481815] [<ffffffff8100be54>] ? handle_irq+0x17/0x1f
[ 315.481820] [<ffffffff8100b50e>] ? do_IRQ+0x45/0xaa
[ 315.481827] [<ffffffff81327213>] ? ret_from_intr+0x0/0x15
[ 315.481830] <EOI> [<ffffffff811f32fc>] ? acpi_hw_read_multiple+0x28/0x60
[ 315.481841] [<ffffffff8100f4d5>] ? sched_clock+0x5/0x8
[ 315.481854] [<ffffffffa026efc7>] ? acpi_idle_enter_bm+0x259/0x291 [processor]
[ 315.481862] [<ffffffffa026efc0>] ? acpi_idle_enter_bm+0x252/0x291 [processor]
[ 315.481870] [<ffffffff81259f27>] ? cpuidle_idle_call+0x11f/0x1cc
[ 315.481875] [<ffffffff81008d93>] ? cpu_idle+0xab/0xe1
[ 315.481881] [<ffffffff8169ed70>] ? start_kernel+0x3dc/0x3e7
[ 315.481886] [<ffffffff8169e3cd>] ? x86_64_start_kernel+0x107/0x114
[ 315.481890] handlers:
[ 315.481893] [<ffffffffa01b50b5>] (arcmsr_do_interrupt+0x0/0xe [arcmsr])
[ 315.481905] [<ffffffffa01bcee5>] (e1000_intr+0x0/0x100 [e1000])
[ 315.481915] Disabling IRQ #16

Afterwards, both the RAID controller and the network card are rather
slow. The problem seems to lie with the Ethernet card, because after some
time the same thing happens on IRQ 18, where e1000 is the only handler:

[28536.149807] irq 18: nobody cared (try booting with the "irqpoll" option)
[28536.149812] Pid: 0, comm: swapper Not tainted 2.6.38-2-amd64 #1
[28536.149814] Call Trace:
[28536.149816] <IRQ> [<ffffffff810923e9>] ? __report_bad_irq+0x30/0x80
[28536.149824] [<ffffffff81092560>] ? note_interrupt+0x127/0x19f
[28536.149828] [<ffffffff81092ef7>] ? handle_fasteoi_irq+0xb4/0xdf
[28536.149832] [<ffffffff8100be54>] ? handle_irq+0x17/0x1f
[28536.149834] [<ffffffff8100b50e>] ? do_IRQ+0x45/0xaa
[28536.149838] [<ffffffff81327213>] ? ret_from_intr+0x0/0x15
[28536.149842] [<ffffffff8100f502>] ? read_tsc+0x5/0x16
[28536.149846] [<ffffffff8104c7cd>] ? __do_softirq+0x55/0x1a0
[28536.149848] [<ffffffff8100a85c>] ? call_softirq+0x1c/0x30
[28536.149851] [<ffffffff8100be03>] ? do_softirq+0x3f/0x79
[28536.149854] [<ffffffff8104c6dd>] ? irq_exit+0x36/0x7b
[28536.149856] [<ffffffff8100b55d>] ? do_IRQ+0x94/0xaa
[28536.149859] [<ffffffff81327213>] ? ret_from_intr+0x0/0x15
[28536.149861] <EOI> [<ffffffff811f32fc>] ? acpi_hw_read_multiple+0x28/0x60
[28536.149867] [<ffffffff8100f4d5>] ? sched_clock+0x5/0x8
[28536.149875] [<ffffffffa026efc7>] ? acpi_idle_enter_bm+0x259/0x291 [processor]
[28536.149879] [<ffffffffa026efc0>] ? acpi_idle_enter_bm+0x252/0x291 [processor]
[28536.149883] [<ffffffff81259f27>] ? cpuidle_idle_call+0x11f/0x1cc
[28536.149886] [<ffffffff81008d93>] ? cpu_idle+0xab/0xe1
[28536.149889] [<ffffffff8169ed70>] ? start_kernel+0x3dc/0x3e7
[28536.149892] [<ffffffff8169e3cd>] ? x86_64_start_kernel+0x107/0x114
[28536.149894] handlers:
[28536.149895] [<ffffffffa01bcee5>] (e1000_intr+0x0/0x100 [e1000])
[28536.149903] Disabling IRQ #18

/proc/interrupts has

16: 302272 0 0 0 IO-APIC-fasteoi arcmsr, eth0
18: 1700001 0 0 0 IO-APIC-fasteoi eth1

which I find curious, as all the IRQs seem to be handled by the first
core.
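
As an illustration of how the two rows above can be read, here is a
minimal Python sketch that splits each row into per-CPU counts and the
list of devices on the line. It assumes the row layout quoted in this
mail (four CPU columns, a single-token chip name); other kernels print
the chip column differently. The names parse_interrupt_line, summarize
and SAMPLE are hypothetical and exist only for this example.

```python
def parse_interrupt_line(line, n_cpus):
    """Split one /proc/interrupts row into (irq, counts, chip, devices)."""
    label, rest = line.split(":", 1)
    fields = rest.split()
    counts = [int(x) for x in fields[:n_cpus]]       # one counter per CPU
    chip = fields[n_cpus]                            # e.g. IO-APIC-fasteoi
    devices = [d.strip() for d in " ".join(fields[n_cpus + 1:]).split(",")]
    return label.strip(), counts, chip, devices


# The two rows quoted in the mail (assumed layout: 4 CPUs).
SAMPLE = """\
 16: 302272 0 0 0 IO-APIC-fasteoi arcmsr, eth0
 18: 1700001 0 0 0 IO-APIC-fasteoi eth1
"""


def summarize(text, n_cpus=4):
    """Return {irq: {...}} with a flag for lines shared by several devices."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        irq, counts, chip, devices = parse_interrupt_line(line, n_cpus)
        result[irq] = {
            "counts": counts,
            "devices": devices,
            "shared": len(devices) > 1,
            # CPUs that serviced at least one interrupt on this line
            "active_cpus": [i for i, c in enumerate(counts) if c > 0],
        }
    return result


if __name__ == "__main__":
    for irq, info in summarize(SAMPLE).items():
        print(irq, info)
```

On the quoted rows this marks IRQ 16 as shared between arcmsr and eth0,
and shows CPU 0 as the only CPU with a non-zero count on both lines.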

/proc/cmdline is

BOOT_IMAGE=/vmlinuz-2.6.38-2-amd64 root=/dev/mapper/kiwi-root ro irqpoll

that is, I have already set the "irqpoll" option.
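
A small Python sketch of checking a kernel command line for a given
option. The helper name has_kernel_option is hypothetical; the string is
the one quoted above, and on a live system it would be read from
/proc/cmdline instead.

```python
def has_kernel_option(cmdline, option):
    """True if `option` appears as a bare token or as `option=value`."""
    return any(
        tok == option or tok.startswith(option + "=")
        for tok in cmdline.split()
    )


# Command line quoted in the mail.
CMDLINE = "BOOT_IMAGE=/vmlinuz-2.6.38-2-amd64 root=/dev/mapper/kiwi-root ro irqpoll"

if __name__ == "__main__":
    print("irqpoll set:", has_kernel_option(CMDLINE, "irqpoll"))
```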

A wild guess would be that the IRQ, which is level-triggered, is not
cleared fast enough and so triggers again, but there is no work to be
done during the second run, so this is flagged as a spurious IRQ.
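
The "nobody cared" message comes from the kernel's spurious-interrupt
accounting. Below is a minimal Python model of that accounting. It
assumes the thresholds in kernel/irq/spurious.c around 2.6.38, where the
ratio is evaluated every 100,000 interrupts and the line is disabled if
more than 99,900 of them went unhandled. These constants should be
checked against the running kernel's source, and the real code has
additional timing logic that is left out here. IrqLine and run are
hypothetical names used only for this illustration.

```python
WINDOW = 100_000     # assumed: interrupts counted before the ratio is checked
THRESHOLD = 99_900   # assumed: unhandled count in one window that disables the line


class IrqLine:
    """Simplified model of per-IRQ spurious-interrupt accounting."""

    def __init__(self):
        self.irq_count = 0
        self.irqs_unhandled = 0
        self.disabled = False

    def note_interrupt(self, handled):
        """Record one interrupt; `handled` means some handler claimed it."""
        if self.disabled:
            return
        if not handled:
            self.irqs_unhandled += 1
        self.irq_count += 1
        if self.irq_count < WINDOW:
            return
        # End of a window: decide, then start counting afresh.
        self.irq_count = 0
        if self.irqs_unhandled > THRESHOLD:
            self.disabled = True   # corresponds to "Disabling IRQ #N"
        self.irqs_unhandled = 0


def run(n_interrupts, handled_every):
    """Fire n interrupts; every `handled_every`-th is claimed (0 = never)."""
    line = IrqLine()
    for i in range(n_interrupts):
        handled = handled_every > 0 and i % handled_every == 0
        line.note_interrupt(handled)
    return line


if __name__ == "__main__":
    print("never handled :", run(WINDOW, 0).disabled)
    print("1 in 100 handled:", run(WINDOW, 100).disabled)
```

Under this model a line is disabled only when nearly every interrupt in a
window goes unclaimed by all registered handlers, not when an occasional
one does.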

Is there anything I can do to get rid of this problem?

Simon


Attachments:
(No filename) (4.38 kB)
dmesg (60.17 kB)
interrupts (2.71 kB)
lspci (2.49 kB)
cpuinfo (3.33 kB)