2005-02-14 09:20:00

by alan

Subject: Odd problem with dual processor AMD system

I have a dual processor AMD machine.

It gives APIC errors after running for a while (usually after heavy
disk I/O).

APIC error on CPU0: 02(02)
APIC error on CPU1: 02(02)
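
For reference, a minimal sketch of how error lines like the ones above can be decoded. It assumes the bit meanings listed in the comment above smp_error_interrupt() in the 2.6 kernel source, and that the two hex values are the APIC Error Status Register read before and after the handler writes to it. The names `ESR_BITS`, `decode_esr`, and `decode_line` are made up for this illustration.

```python
import re

# Bit meanings of the local APIC Error Status Register (ESR), bit 0 first.
# Assumption: taken from the kernel source comment, not from this thread.
ESR_BITS = [
    "send checksum error",
    "receive checksum error",
    "send accept error",
    "receive accept error",
    "reserved",
    "send illegal vector",
    "received illegal vector",
    "illegal register address",
]

LINE_RE = re.compile(
    r"APIC error on CPU(\d+): ([0-9a-fA-F]{2})\(([0-9a-fA-F]{2})\)"
)


def decode_esr(value):
    """Return the names of the bits set in an 8-bit ESR value."""
    return [name for bit, name in enumerate(ESR_BITS) if value & (1 << bit)]


def decode_line(line):
    """Parse one 'APIC error on CPUn: xx(yy)' line.

    Returns (cpu, first_value_bits, second_value_bits), or None if the
    line does not match the expected format.
    """
    m = LINE_RE.search(line)
    if m is None:
        return None
    cpu = int(m.group(1))
    first = decode_esr(int(m.group(2), 16))
    second = decode_esr(int(m.group(3), 16))
    return cpu, first, second
```

Under those assumptions, 02 has only bit 1 set, which would make both lines receive checksum errors.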

After this occurs, reads from and writes to the drive slow down
substantially.

Unless...

If I set the scheduler to deadline (elevator=deadline on the kernel
command line), the APIC errors remain, but the disk slowdown goes away.
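
To confirm which I/O scheduler a disk is actually using, something like the following may help. It is a sketch that assumes the kernel exposes /sys/block/<dev>/queue/scheduler, where the active scheduler appears in square brackets; the device name "hda" and the function names are only examples.

```python
def parse_schedulers(text):
    """Split the sysfs scheduler text into (active, available).

    The file lists every scheduler name on one line and marks the
    active one with brackets, e.g. "noop anticipatory [deadline] cfq".
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(dev, sysfs_root="/sys/block"):
    """Read and parse the scheduler file for one block device."""
    with open(f"{sysfs_root}/{dev}/queue/scheduler") as f:
        return parse_schedulers(f.read())


# Example (hypothetical device name):
#   active, available = read_scheduler("hda")
```

On kernels that support changing the scheduler at runtime, writing a scheduler name to the same file as root should switch that one device without a reboot, which makes it easier to compare the slowdown with and without deadline.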

My initial thought is that something in the standard scheduler is
getting corrupted, but not when the deadline scheduler is used.

Is there a way to prove this?

This occurs on every 2.6.x kernel I have used.

In a similar but different vein, if I use "elevator=deadline" on
the dual processor AMD64 machine running Fedora Core 2 (64-bit version),
the kernel blows up real good, early enough to not leave a message in the
logs. (The machine is a couple of hundred miles from me, so I am not
certain what the error message on the screen is at the time of detonation.)

Ideas on how to log something that early in the boot process without
being in front of the machine?

--
I do not want to buy this lutefisk, it is scratched.


2005-02-14 20:01:11

by Michael J. Cohen

Subject: Re: Odd problem with dual processor AMD system

Alan wrote:

>Ideas on how to log something that early in the boot process without
>being in front of the machine?
>
Serial console is best. Remote power cycle, grub, and serial console
make my life easier every day of the week.
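
A rough sketch of the capturing side, assuming the remote box boots with something like console=ttyS0,115200n8 on its kernel command line and a second machine is cabled to that port. The device path, the log file name, and the `log_console` helper are assumptions for illustration; the port speed would need to be set beforehand (for example with stty).

```python
import time


def log_console(stream, out, clock=time.time):
    """Copy lines from a console byte stream to a text log.

    Each line gets a timestamp prefix, and the log is flushed after
    every line so the last output before a crash is not lost.
    Returns the number of lines written.
    """
    count = 0
    for raw in stream:
        # Console output may contain stray bytes; replace rather than fail.
        line = raw.decode("ascii", errors="replace").rstrip("\r\n")
        out.write(f"{clock():.3f} {line}\n")
        out.flush()
        count += 1
    return count


# Example (hypothetical paths, run on the machine at the other end of
# the serial cable):
#   with open("/dev/ttyS0", "rb", buffering=0) as port, \
#        open("boot.log", "a") as log:
#       log_console(port, log)
```

Flushing per line costs a little throughput, but with an early boot crash the final few lines are the ones that matter.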

Also, it would help with the other issues if you'd post your .config and
perhaps the rest of your dmesg to the lkml, though if it gets to be fairly
large, putting it on the web would be best.

HTH,
Michael