2002-09-29 02:38:50

by Ben Greear

Subject: APIC error on CPU0: 00(02)


Kernel: 2.4.20-pre8

Machine: dual 1.66GHz AMD machine, Tyan motherboard 64/66 PCI,
2 Intel PRO/1000 MT NICs,
1 DFE-570TX 4-port tulip-based NIC,
2 built-in 3Com NICs.

I was running 20Mbps over each of the 4 tulip 100Mbps ports, and nothing over
the gigE ports... Things ran fine, but I occasionally saw messages like
this:

[root@localhost lanforge]# eth7: (19541091) System Error occured (2)
eth7: (19547299) System Error occured (2)
eth6: (19163408) System Error occured (2)
[root@localhost lanforge]# eth6: (19167052) System Error occured (2)

[root@localhost lanforge]# eth6: (19171414) System Error occured (2)
eth7: (19560749) System Error occured (2)
eth6: (19187463) System Error occured (2)


I mounted another machine over NFS using one of the 3Com ports, and
started this command: tar -xvzf /mnt/blah/kernel.foo.tgz

At this point, the machine starts behaving extremely slowly, but after
a few fits and bursts, it finishes and I can get a response at the prompt
again.

I see more messages similar to those above, and I also see these errors:

[root@localhost lanforge]# eth6: (19773823) System Error occured (2)
APIC error on CPU0: 00(02)
APIC error on CPU1: 00(02)


If anyone has any ideas, or can suggest more debugging information I can
gather to make the problem easier to solve, please let me know!

Thanks,
Ben

--
Ben Greear <Ben_Greear AT excite.com>
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear



2002-09-29 11:21:11

by Mikael Pettersson

Subject: Re: APIC error on CPU0: 00(02)

On Sat, 28 Sep 2002 19:44:08 -0700, Ben Greear wrote:
>Kernel: 2.4.20-pre8
>
>Machine: dual 1.66Ghz AMD machine, Tyan motherboard 64/66 PCI,
>...
>APIC error on CPU0: 00(02)
>APIC error on CPU1: 00(02)

Receive Checksum Error on the APIC bus.
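
For reference, the two hex numbers in that message are the local APIC
Error Status Register (ESR) values; as far as I can tell from the 2.4
source, the first is read before the handler writes to the ESR and the
second (in parentheses) after. The sketch below decodes such a line
using the bit meanings listed in the comment above
smp_error_interrupt(). The helper names (decode_esr, decode_apic_error)
are made up for illustration and are not part of any kernel tool.

```python
import re

# ESR bit meanings, following the comment above smp_error_interrupt()
# in the 2.4 kernel's apic.c. Bit 4 is reserved there.
ESR_BITS = {
    0: "Send CS error",
    1: "Receive CS error",
    2: "Send accept error",
    3: "Receive accept error",
    4: "Reserved",
    5: "Send illegal vector",
    6: "Received illegal vector",
    7: "Illegal register address",
}

# Matches lines such as "APIC error on CPU0: 00(02)".
LINE_RE = re.compile(
    r"APIC error on CPU(\d+): ([0-9a-fA-F]{2})\(([0-9a-fA-F]{2})\)"
)


def decode_esr(value):
    """Return the names of all ESR bits set in value, lowest bit first."""
    return [name for bit, name in ESR_BITS.items() if value & (1 << bit)]


def decode_apic_error(line):
    """Parse one log line into (cpu, first_value_bits, second_value_bits).

    Returns None if the line is not an APIC error message.
    """
    m = LINE_RE.search(line)
    if m is None:
        return None
    cpu = int(m.group(1))
    first = int(m.group(2), 16)   # value printed before the parentheses
    second = int(m.group(3), 16)  # value printed inside the parentheses
    return cpu, decode_esr(first), decode_esr(second)


if __name__ == "__main__":
    print(decode_apic_error("APIC error on CPU0: 00(02)"))
    # prints (0, [], ['Receive CS error'])
```

Here 02 has only bit 1 set, which is the receive checksum error.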

This indicates a hardware problem. Dual-Athlon boards
have a reputation for being unstable unless everything
is exactly right: power supply, cooling, not overclocked,
MP not XP CPUs, correct type memory modules. BIOS MP 1.1
or 1.4 settings may also be critical.

Also double-check all connectors between the mainboard and
the case: at least one board (the ASUS A7M266-D, I think)
has or had bugs in its manual.

/Mikael