2002-12-19 10:16:14

by Gianni Tedesco

Subject: NMI: IOCK error (debug interrupt?) - nope

Hello,

A firewall of ours recently went tits up (2.4.19). It was still routing
traffic, but when I connected over SSH, for example, the SSH banner never
appeared; it looked like all of userspace was dead.

When we looked in the logs there was this. Presumably the hardware is
broken. But I wonder if anyone can confirm this? Thanks!

NMI: IOCK error (debug interrupt?)
CPU: 0
EIP: 0010:[default_idle+34/48] Not tainted
EIP: 0010:[<c0106e12>] Not tainted
EFLAGS: 00000246
eax: 00000000 ebx: c0106df0 ecx: 00000032 edx: 00000019
esi: c02f6000 edi: c02f6000 ebp: c0106df0 esp: c02f7fcc
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c02f7000)
Stack: c0106e92 00000002 00098700 c0105000 0008e000 c02f8759 c028e6c0 0001ffc0
       0001ffc0 0001ffc0 0001ffc0 c03404c0 c0100191
Call Trace: [cpu_idle+82/112] [_stext+0/48]
Call Trace: [<c0106e92>] [<c0105000>]

Code: f4 c3 fb c3 8d 76 00 8d bc 27 00 00 00 00 fb b8 ff ff ff ff
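
The bracketed `symbol+offset/size` entries in the dump appear to be decimal byte offsets into the named function. A minimal sketch of that arithmetic follows. The base address used for `default_idle` in the usage note is an assumption, inferred from the `ebx`/`ebp` values in the register dump rather than stated in the report, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Absolute address of "symbol+offset", given the symbol's base address. */
static uint32_t symbol_address(uint32_t base, uint32_t offset)
{
    return base + offset;
}

/* Base address of a symbol, given an absolute address and its offset. */
static uint32_t symbol_base(uint32_t address, uint32_t offset)
{
    return address - offset;
}

/* Non-zero when the offset falls inside a function of the given size. */
static int offset_within(uint32_t offset, uint32_t size)
{
    return offset < size;
}
```

Assuming `default_idle` starts at 0xc0106df0, `symbol_address(0xc0106df0, 34)` gives 0xc0106e12, which matches the reported EIP. Likewise `symbol_base(0xc0106e92, 82)` gives 0xc0106e40 as the implied start of `cpu_idle`.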

--
// Gianni Tedesco (gianni at ecsc dot co dot uk)
lynx --source http://www.scaramanga.co.uk/gianni-at-ecsc.asc | gpg --import
8646BE7D: 6D9F 2287 870E A2C9 8F60 3A3C 91B5 7669 8646 BE7D



2002-12-23 15:04:51

by Gianni Tedesco

Subject: [PATCH]: Re: NMI: IOCK error (debug interrupt?) - nope

On Thu, 2002-12-19 at 10:23, Gianni Tedesco wrote:
> Hello,
>
> A firewall of ours recently went tits up (2.4.19). It was still routing
> traffic, but when I connected over SSH, for example, the SSH banner never
> appeared; it looked like all of userspace was dead.
>
> When we looked in the logs there was this. Presumably the hardware is
> broken. But I wonder if anyone can confirm this? Thanks!
>
> NMI: IOCK error (debug interrupt?)

Turns out to be a 2-bit ECC error. The machine is a Dell PowerEdge 350.

--- linux-2.4.19.orig/arch/i386/kernel/traps.c Mon Dec 23 13:28:32 2002
+++ linux-2.4.19/arch/i386/kernel/traps.c Mon Dec 23 15:11:24 2002
@@ -613,7 +613,7 @@
 {
 	unsigned long i;
 
-	printk("NMI: IOCK error (debug interrupt?)\n");
+	printk("NMI: IOCK error (debug interrupt / ECC RAM error?)\n");
 	show_registers(regs);
 
 	/* Re-enable the IOCK line, wait for a few seconds */
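
For context on the message being patched: on PC-compatible hardware the NMI reason is conventionally read from system control port B (I/O port 0x61), where bit 7 flags a memory parity / SERR# error and bit 6 flags an I/O channel check (IOCK). The sketch below decodes such a status byte in plain user-space C and computes the values that would be written back to clear and re-arm the IOCK line via bit 3. The bit layout is the standard PC/AT one and is an assumption here, not something taken from the patch; all names are hypothetical, and no port I/O is actually performed.

```c
#include <stdint.h>

/* Status bits of port 0x61 (assumed standard PC/AT layout). */
#define NMI_REASON_PARITY 0x80u /* memory parity error / SERR# */
#define NMI_REASON_IOCK   0x40u /* I/O channel check */
#define NMI_IOCK_DISABLE  0x08u /* set to clear and mask the IOCK line */

enum nmi_source {
    NMI_UNKNOWN    = 0,
    NMI_MEM_PARITY = 1,
    NMI_IO_CHECK   = 2,
    NMI_BOTH       = 3
};

/* Classify an NMI from the status byte read from port 0x61. */
static enum nmi_source classify_nmi_reason(uint8_t reason)
{
    int src = NMI_UNKNOWN;

    if (reason & NMI_REASON_PARITY)
        src |= NMI_MEM_PARITY;
    if (reason & NMI_REASON_IOCK)
        src |= NMI_IO_CHECK;
    return (enum nmi_source)src;
}

/* Value to write to clear the IOCK condition: keep the low control
 * nibble and set the disable bit. */
static uint8_t iock_clear_value(uint8_t reason)
{
    return (uint8_t)((reason & 0x0fu) | NMI_IOCK_DISABLE);
}

/* Value to write afterwards to re-enable IOCK reporting. */
static uint8_t iock_rearm_value(uint8_t value)
{
    return (uint8_t)(value & (uint8_t)~NMI_IOCK_DISABLE);
}
```

Under these assumptions, a status byte of 0x40 classifies as `NMI_IO_CHECK`, which is the path that prints the "IOCK error" message. A double-bit ECC failure surfacing there rather than as a parity NMI would depend on how the chipset routes the error.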


--
// Gianni Tedesco (gianni at ecsc dot co dot uk)
lynx --source http://www.scaramanga.co.uk/gianni-at-ecsc.asc | gpg --import
8646BE7D: 6D9F 2287 870E A2C9 8F60 3A3C 91B5 7669 8646 BE7D



2002-12-23 15:20:56

by Dave Jones

Subject: Re: [PATCH]: Re: NMI: IOCK error (debug interrupt?) - nope

On Mon, Dec 23, 2002 at 03:12:22PM +0000, Gianni Tedesco wrote:
> > When we looked in the logs there was this. Presumably the hardware is
> > broken. But I wonder if anyone can confirm this? Thanks!
> >
> > NMI: IOCK error (debug interrupt?)
>
> Turns out to be a 2-bit ECC error. The machine is a Dell PowerEdge 350.

A while ago I mentioned it would be nice to get the ECC drivers
cleaned up and included. Any yays or nays to getting this stuff
done sometime?

Yes it's a feature, but it's also just extra drivers.
Decisions decisions...

Dave

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-12-23 16:35:18

by Randy.Dunlap

Subject: Re: [PATCH]: Re: NMI: IOCK error (debug interrupt?) - nope

On Mon, 23 Dec 2002, Dave Jones wrote:

| On Mon, Dec 23, 2002 at 03:12:22PM +0000, Gianni Tedesco wrote:
| > > When we looked in the logs there was this. Presumably the hardware is
| > > broken. But I wonder if anyone can confirm this? Thanks!
| > >
| > > NMI: IOCK error (debug interrupt?)
| >
| > Turns out to be a 2-bit ECC error. The machine is a Dell PowerEdge 350.
|
| A while ago I mentioned it would be nice to get the ECC drivers
| cleaned up and included. Any yays or nays to getting this stuff
| done sometime?

Yes!

| Yes it's a feature, but it's also just extra drivers.
| Decisions decisions...

Put that way, any driver is a feature, but that's not the point AFAIK.
So go with more drivers...

--
~Randy