2002-06-20 12:21:31

by Helge Hafting

Subject: Re: The buggy APIC of the Abit BP6

"Maciej W. Rozycki" wrote:
>
> On Tue, 18 Jun 2002, Robbert Kouprie wrote:
>
> > Problem now is, in the ack_none function we only know about the
> > (illegal) vector we are getting, and not about the interrupt we need to
> > reset. Could there be some kind of link between these, so that
> > kick_IO_APIC_irq can be called from there?
>
> You get an invalid vector delivered due to massive transmission errors at
> the inter-APIC bus. The errors are a serious hardware problem that cannot
> and should not be fixed in software.

Yes, the hardware is at fault. I don't have money for
other hardware though, so working around it seems a good idea.

We could simplify the IDE driver a lot by dropping support for
all the broken controllers too. Or tell
people to not use DMA on them.


Of course such an option should default to OFF, and
perhaps live under "dangerous." It can keep the
BP6 going much longer, which is good enough
for a home machine.

Failing due to a stuck NIC after one week seems worse
than crashing due to a scrambled IPI after some months.
There are more interrupts than IPIs.

This sort of fix doesn't really make things worse; the
theoretical scrambled IPI will happen without it too.
The safe solution is NOAPIC, this fix simply makes it work
for a longer time using the bad apic.

>
> I'm told getting a better PSU may help, though.
Unfortunately not. I got a nice PSU when I ordered the BP6,
thinking that power was the only issue. (It was the only
cheap dual solution at the time.)

Helge Hafting


2002-06-20 13:09:41

by Maciej W. Rozycki

Subject: Re: The buggy APIC of the Abit BP6

On Thu, 20 Jun 2002, Helge Hafting wrote:

> Yes, the hardware is at fault. I don't have money for
> other hardware though, so working around it seems a good idea.

What's the problem with using a privately patched kernel then? I do that
all the time for various stuff.

> We could simplify the IDE driver a lot by dropping support for
> all the broken controllers too. Or tell
> people to not use DMA on them.

It depends on how intrusive and reliable the workarounds are. If merely
slowing down or using PIO is sufficient, then they may be OK to include.

> The safe solution is NOAPIC, this fix simply makes it work
> for a longer time using the bad apic.

Well, consider it *the* workaround, then.

--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +

2002-06-20 22:29:29

by Kevin Krieser

Subject: RE: The buggy APIC of the Abit BP6


Obviously, in my case, if I had known about the problems it would have
later, I wouldn't have bought it. But at the time, two 433 MHz Celerons were
cheaper than a Pentium III 600 system.

At least, with the additional fan I've added, and the "noapic" option, it is
pretty reliable. Up for weeks at a time before I reboot. Of course, when I
had my IBM hard drives on the HPT366 controller, it was more likely to crash
from a DMA error than from the APIC problems.

In fact, the last problems I had were SCSI related, fixed by adding a second
SCSI card for some external devices. Not motherboard related.

-----Original Message-----
From: [email protected]
[mailto:[email protected]]On Behalf Of Helge Hafting
Sent: Thursday, June 20, 2002 7:21 AM
To: Maciej W. Rozycki; [email protected]
Subject: Re: The buggy APIC of the Abit BP6
> I'm told getting a better PSU may help, though.
Unfortunately not. I got a nice PSU when I ordered the BP6,
thinking that power was the only issue. (It was the only
cheap dual solution at the time.)