2001-02-05 15:10:40

by Peter Horton

Subject: VIA silent disk corruption - likely fix

I've found the cause of silent disk corruption on my A7V motherboard,
and it might affect all boards with the same North bridge (KT133 etc).

For some reason the IDE controller(s) was sometimes picking up stale
data during bus master DMA to the drive. Assuming that there was no bug
in the CPU it had to be the North bridge that was caching the stuff when
it shouldn't have been. I assume the problem would also apply to other
bus masters (SCSI, NIC etc).

Scanning the motherboard manual showed up a chipset setting "PCI master
read caching" which I suspect is the culprit. According to the manual
this defaults to "on" for Athlons and "off" for Durons (obviously other
BIOSes / motherboards might treat this setting differently).
Unfortunately my BIOS does not allow me to change this setting
independently [1]; I only have the choice of running the machine in
"normal" or "optimal" configuration to alter this setting ("optimal" is
the default).

In "normal" mode my machine is rock solid and I see no corruption.
However, "normal" mode also changes a lot of other settings (AGP speed,
DRAM interleave etc), so this does not isolate the one setting. Anyone
experiencing such corruption should look for a BIOS setting which
disables this "feature".

If anyone out there has a BIOS which allows them to change just this one
setting, could they diff the "lspci -vvxxx" output with the setting off
and then on, so we can isolate which host bridge bit(s) control this
feature? Maybe we can then add it to 'pci_quirks' and reduce the number
of VIA corruption reports.
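
A minimal sketch of that comparison, assuming each dump was captured for
the host bridge alone (for example with "lspci -s 00:00.0 -xxx", where
00:00.0 is an assumed bus address) and saved as text. The function names
are made up for illustration; nothing here is specific to the KT133.

```python
import re

# Matches lspci hex dump lines such as "50: 16 f4 eb b4 ..."
HEX_LINE = re.compile(r"^([0-9a-fA-F]{2,3}):((?: [0-9a-fA-F]{2}){1,16})\s*$")


def parse_config_space(dump_text):
    """Return {offset: byte value} for the hex dump lines of ONE device."""
    space = {}
    for line in dump_text.splitlines():
        match = HEX_LINE.match(line)
        if not match:
            continue  # skip the verbose, human-readable lines
        base = int(match.group(1), 16)
        for i, byte in enumerate(match.group(2).split()):
            space[base + i] = int(byte, 16)
    return space


def diff_config_space(off_dump, on_dump):
    """List (offset, value_off, value_on, changed_bits) for each differing byte."""
    off_space = parse_config_space(off_dump)
    on_space = parse_config_space(on_dump)
    changes = []
    for offset in sorted(set(off_space) & set(on_space)):
        if off_space[offset] != on_space[offset]:
            changed_bits = off_space[offset] ^ on_space[offset]
            changes.append((offset, off_space[offset], on_space[offset], changed_bits))
    return changes
```

Feeding it the two saved dumps returns one tuple per changed byte; the
bits set in the last element of each tuple are the candidates for the
bit(s) controlling the feature.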

P.

[1] the BIOS appears to let you change the option, but it resets the
option to its default the moment you leave the "advanced settings"
screen :-(


2001-02-05 19:37:00

by Udo A. Steinberg

Subject: Re: VIA silent disk corruption - likely fix

Peter Horton wrote:
>
> I've found the cause of silent disk corruption on my A7V motherboard,
> and it might affect all boards with the same North bridge (KT133 etc).
>
> For some reason the IDE controller(s) was sometimes picking up stale
> data during bus master DMA to the drive. Assuming that there was no bug
> in the CPU it had to be the North bridge that was caching the stuff when
> it shouldn't have been. I assume the problem would also apply to other
> bus masters (SCSI, NIC etc).

Do you have a small test program to illustrate that bug? I have an A7V
with PCI Master Read Caching enabled and haven't seen any corruption so
far (which doesn't necessarily mean much). Or if you don't have a test
program, how did you identify it's caching too much?
Also, are you using a Thunderbird or a Duron?
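
A minimal sketch of what such a test could look like, not the method
used in the original report: write a reproducible pattern to a file,
force it to disk, and later compare the file against the regenerated
pattern. The function names and the 4096-byte block size are
assumptions. For the comparison to exercise the disk rather than the
page cache, the file would need to be much larger than RAM, or be
verified after a reboot.

```python
import hashlib
import os


def pattern_block(seed, index, size):
    """Deterministic pseudo-random content for block number `index`."""
    out = b""
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{seed}:{index}:{counter}".encode()).digest()
        counter += 1
    return out[:size]


def write_pattern(path, blocks, size=4096, seed=0):
    """Write `blocks` pattern blocks to `path` and flush them to the drive."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(pattern_block(seed, i, size))
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path, blocks, size=4096, seed=0):
    """Return the indices of blocks whose content differs from the pattern."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(size) != pattern_block(seed, i, size):
                bad.append(i)
    return bad
```

An empty list from verify_pattern() means every block read back as
written; any index in the list points at a block worth dumping and
comparing by hand.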

I'm using the 1003 Bios, which has proven to be the most stable so far.
Which one do you use?

-Udo.

P.S. I seem to recall that later BIOS versions (>=1004) disable Master
Read Caching by default, so maybe Asus has also noticed something
wrong with it.

2001-02-06 00:24:51

by Rogerio Brito

Subject: Re: VIA silent disk corruption - likely fix

On Feb 05 2001, Udo A. Steinberg wrote:
> Peter Horton wrote:
> > I've found the cause of silent disk corruption on my A7V motherboard,
> > and it might affect all boards with the same North bridge (KT133 etc).
>
> Do you have a small test program to illustrate that bug? I have an A7V
> with PCI Master Read Caching enabled and haven't seen any corruption so
> far (which doesn't necessarily mean much). Or if you don't have a test
> program, how did you identify it's caching too much?
> Also, are you using a Thunderbird or a Duron?

Just an extra data point here.

I also have an A7V here and I haven't seen anything wrong with
my setup (but I'm using 2.2.18 + the IDE patches). Perhaps
I'm not hitting the bad cases or I'm not stressing the system
enough. I'm using a Quantum lct15 drive with UDMA/66. I have a
Duron 600MHz and I remember that when I was setting up the
machine (after I bought it), I left everything at the default
settings (so PCI Master Read Caching is disabled).

> I'm using the 1003 Bios, which has proven to be the most stable so far.
> Which one do you use?

I also use 1003, but I have not tried anything else (for fear
of something going wrong when I'm upgrading -- like a power
outage). :-)


[]s, Roger...

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rogerio Brito - [email protected] - http://www.ime.usp.br/~rbrito/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

2001-02-07 21:01:24

by Matthias Andree

Subject: Re: VIA silent disk corruption - likely fix

On Mon, 05 Feb 2001, Peter Horton wrote:

> I've found the cause of silent disk corruption on my A7V motherboard,
> and it might affect all boards with the same North bridge (KT133 etc).

...

> [1] the BIOS appears to let you change the option but it defaults the
> option the moment you leave the "advanced settings" screen :-(

Is your BIOS current? Gigabyte 7ZXR BIOSes (F4), for example, have
exhibited similar troubles: once you set a suspend timeout, you could
not reset it unless you reloaded the entire BIOS anew, and with American
Megatrends' BIOS this means you lose ALL settings except your Standard
CMOS setup. This problem is fixed in F5J (I did not bother to look for
an official F5 release yet).

--
Matthias Andree