2008-12-26 08:53:47

by Nick Warne

Subject: [QUESTION] Rather high /proc/interrupts ERR count

Hi all,

Running 2.6.28 stable.

x86_64 system with AMD Athlon(tm) 64 X2 Dual Core Processor 5200+

The machine runs perfectly fine, but...

...should I be concerned about the rather high ERR count below (this
is after 41 minutes of uptime)? I also have to boot with noapic to stop
lockups (nvidia driver, I expect):

           CPU0       CPU1
  0:        125          1   XT-PIC-XT    timer
  1:       5534         97   XT-PIC-XT    i8042
  2:          0          0   XT-PIC-XT    cascade
  3:          1          0   XT-PIC-XT
  4:          1          1   XT-PIC-XT
  5:      78933      11713   XT-PIC-XT    sata_nv, ohci_hcd:usb2, Intel ICH
  6:          3          0   XT-PIC-XT    floppy
  7:      21189     333431   XT-PIC-XT
  8:          1          0   XT-PIC-XT    rtc0
  9:          0          0   XT-PIC-XT    acpi
 10:     246562       9267   XT-PIC-XT    eth0
 11:       2296         72   XT-PIC-XT    sata_nv, ehci_hcd:usb1, nvidia
 14:         98         51   XT-PIC-XT    pata_amd
 15:          0          0   XT-PIC-XT    pata_amd
NMI:          0          0   Non-maskable interrupts
LOC:     236200     234958   Local timer interrupts
RES:      33224      73287   Rescheduling interrupts
CAL:        890        964   Function call interrupts
TLB:        982       1127   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
SPU:          0          0   Spurious interrupts
ERR:     354616
MIS:          0


--
Free Software Foundation Associate Member 5508
http://linicks.net/


2008-12-29 03:39:44

by Len Brown

Subject: Re: [QUESTION] Rather high /proc/interrupts ERR count


On Fri, 26 Dec 2008, Nick Warne wrote:

> Hi all,
>
> Running 2.6.28 stable.
>
> x86_64 system with AMD Athlon(tm) 64 X2 Dual Core Processor 5200+
>
> The machine runs perfectly fine, but...
>
> ...should I be concerned about the rather high ERR count below (this
> is after 41 minutes of uptime)? I also have to boot with noapic to stop
> lockups (nvidia driver, I expect):
>
>            CPU0       CPU1
>   0:        125          1   XT-PIC-XT    timer
>   1:       5534         97   XT-PIC-XT    i8042
>   2:          0          0   XT-PIC-XT    cascade
>   3:          1          0   XT-PIC-XT
>   4:          1          1   XT-PIC-XT
>   5:      78933      11713   XT-PIC-XT    sata_nv, ohci_hcd:usb2, Intel ICH
>   6:          3          0   XT-PIC-XT    floppy
>   7:      21189     333431   XT-PIC-XT
>   8:          1          0   XT-PIC-XT    rtc0
>   9:          0          0   XT-PIC-XT    acpi
>  10:     246562       9267   XT-PIC-XT    eth0
>  11:       2296         72   XT-PIC-XT    sata_nv, ehci_hcd:usb1, nvidia
>  14:         98         51   XT-PIC-XT    pata_amd
>  15:          0          0   XT-PIC-XT    pata_amd
> NMI:          0          0   Non-maskable interrupts
> LOC:     236200     234958   Local timer interrupts
> RES:      33224      73287   Rescheduling interrupts
> CAL:        890        964   Function call interrupts
> TLB:        982       1127   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> THR:          0          0   Threshold APIC interrupts
> SPU:          0          0   Spurious interrupts
> ERR:     354616
> MIS:          0

21189+333431 = 354620

So virtually all of the ERR's are from IRQ7,
which is how the PIC identifies interrupts
when it has no idea of the real source.
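
The comparison above is easy to repeat on any box. What follows is a minimal Python sketch, assuming the /proc/interrupts layout shown in this thread (a label, a colon, then one count per CPU). The helper name parse_interrupts and the embedded sample are illustrative only; on a live system the text would come from reading /proc/interrupts instead.

```python
def parse_interrupts(text):
    """Map each row label ("7", "ERR", ...) to its list of leading integer counts."""
    rows = {}
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue  # header line such as "CPU0 CPU1", or a blank line
        # Split on the first colon only, so names like "ehci_hcd:usb1" stay intact.
        label, rest = line.split(":", 1)
        counts = []
        for token in rest.split():
            if not token.isdigit():
                break  # reached the controller name or description
            counts.append(int(token))
        rows[label.strip()] = counts
    return rows


# Sample rows copied from the report earlier in this thread.
SAMPLE = """\
CPU0 CPU1
7: 21189 333431 XT-PIC-XT
ERR: 354616
MIS: 0
"""

rows = parse_interrupts(SAMPLE)
irq7_total = sum(rows["7"])   # IRQ7 summed over both CPUs
err_total = sum(rows["ERR"])  # ERR is a single system-wide count
print(irq7_total, err_total, irq7_total - err_total)  # prints: 354620 354616 4
```

A difference of only a few counts between the two totals is what points at IRQ7 as the source of the ERR count.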

The PIC actually doesn't have a concept of directing
interrupts to non CPU0, and the fact that CPU1
on this box thinks it receives some of the interrupts
is in the undocumented area known as "implementation specific"
or maybe stated as "don't do this at home".

So the better thing to focus on would be how to get rid of
"noapic" on your cmdline and get the system into IOAPIC
mode, as it was designed to be.
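
To confirm whether the running kernel was actually booted with "noapic", the command line can be read back from /proc/cmdline. The snippet below is a minimal Python sketch; the helper names and the sample command line are made up for illustration and are not taken from the reporter's machine.

```python
def has_boot_option(cmdline, option):
    """Return True if `option` appears as a whole word on the kernel command line."""
    return option in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read()


# Hypothetical command line, used here so the example runs anywhere.
sample = "root=/dev/sda1 ro noapic"
print(has_boot_option(sample, "noapic"))  # prints: True
```

On a live system, has_boot_option(read_cmdline(), "noapic") performs the same check against the real command line. Matching whole words avoids a false hit on a longer option that merely starts with the same letters.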

-- Len Brown, Intel Open Source Technology Center

2008-12-29 11:17:28

by Nick Warne

Subject: Re: [QUESTION] Rather high /proc/interrupts ERR count

On Sun, 28 Dec 2008 22:39:32 -0500 (EST)
Len Brown wrote:

>
> On Fri, 26 Dec 2008, Nick Warne wrote:

> > ERR: 354616
> > MIS: 0
>
> 21189+333431 = 354620
>
> So virtually all of the ERR's are from IRQ7,
> which is how the PIC identifies interrupts
> when it has no idea of the real source.
>
> The PIC actually doesn't have a concept of directing
> interrupts to non CPU0, and the fact that CPU1
> on this box thinks it receives some of the interrupts
> is in the undocumented area known as "implementation specific"
> or maybe stated as "don't do this at home".
>
> So the better thing to focus on would be how to get rid of
> "noapic" on your cmdline and get the system into IOAPIC
> mode, as it was designed to be.

Thanks Len - you explained it to me perfectly, and as suggested:

           CPU0       CPU1
  0:        121        494   IO-APIC-edge      timer
  1:         17      20796   IO-APIC-edge      i8042
  6:          0          3   IO-APIC-edge      floppy
  7:          1          0   IO-APIC-edge
  8:          0          1   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 14:          2        139   IO-APIC-edge      pata_amd
 15:          0          0   IO-APIC-edge      pata_amd
 16:         86      85163   IO-APIC-fasteoi   nvidia
 20:          0          4   IO-APIC-fasteoi   ehci_hcd:usb1
 21:          0          0   IO-APIC-fasteoi   sata_nv
 22:        570     404175   IO-APIC-fasteoi   sata_nv, Intel ICH
 23:       1236    8996739   IO-APIC-fasteoi   ohci_hcd:usb2, eth0
NMI:          0          0   Non-maskable interrupts
LOC:    4497869    5381066   Local timer interrupts
RES:     999249     570052   Rescheduling interrupts
CAL:     105254     112444   Function call interrupts
TLB:      14640       6594   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
SPU:          0          0   Spurious interrupts
ERR:          1
MIS:          0

Now appears fine (I get one ERR at boot, for some reason).

Thanks for the information.

Nick
--
Free Software Foundation Associate Member 5508
http://linicks.net/