2001-02-05 19:07:38

by Peter Horton

Subject: VIA silent disk corruption - patch

Okay, looks like this fixes it (for me anyways).

Thanks to Mark Hahn and Andre for their help with this problem.

P.


--- linux-2.4.1/arch/i386/kernel/pci-pc.c Thu Jun 22 15:17:16 2000
+++ linux-2.4.1-bm-fix/arch/i386/kernel/pci-pc.c Mon Feb 5 18:37:35 2001
@@ -924,6 +924,22 @@
 	pcibios_max_latency = 32;
 }
 
+static void __init pci_fixup_vt8363(struct pci_dev *d)
+{
+	/*
+	 * VIA VT8363 host bridge has broken feature 'PCI Master Read
+	 * Caching'. It caches more than is good for it, sometimes
+	 * serving the bus master with stale data. Some BIOSes enable
+	 * it by default, so we disable it.
+	 */
+	u8 tmp;
+	pci_read_config_byte(d, 0x70, &tmp);
+	if(tmp & 4) {
+		printk("PCI: Bus master read caching disabled\n");
+		pci_write_config_byte(d, 0x70, tmp & ~4);
+	}
+}
+
 struct pci_fixup pcibios_fixups[] = {
 	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82451NX, pci_fixup_i450nx },
 	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82454GX, pci_fixup_i450gx },
@@ -936,6 +952,7 @@
 	{ PCI_FIXUP_HEADER, PCI_ANY_ID, PCI_ANY_ID, pci_fixup_ide_bases },
 	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_5597, pci_fixup_latency },
 	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_5598, pci_fixup_latency },
+	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8363_0, pci_fixup_vt8363 },
 	{ 0 }
 };
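
The read-modify-write in the patch above can be tried outside the kernel. What follows is only a sketch under assumptions: a plain byte array stands in for the host bridge's PCI configuration space, and the names `VT8363_REG_70`, `VT8363_MASTER_READ_CACHING` and `vt8363_clear_read_caching` are hypothetical labels for the offset (0x70) and bit (4) the patch uses, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset and bit mask taken from the patch; the macro names are hypothetical. */
#define VT8363_REG_70              0x70
#define VT8363_MASTER_READ_CACHING 0x04

/*
 * cfg stands in for the 256-byte PCI configuration space of the host bridge.
 * Returns 1 if the bit was set and has been cleared, 0 if no write was needed.
 */
int vt8363_clear_read_caching(uint8_t *cfg)
{
	uint8_t tmp;

	assert(cfg != NULL);

	/* Read the current register value, as pci_read_config_byte() does. */
	tmp = cfg[VT8363_REG_70];

	/* Leave the register alone if the feature is already off. */
	if (!(tmp & VT8363_MASTER_READ_CACHING))
		return 0;

	/* Clear only bit 2 and keep every other bit unchanged. */
	cfg[VT8363_REG_70] = (uint8_t)(tmp & ~VT8363_MASTER_READ_CACHING);
	return 1;
}
```

As a usage example, the byte at offset 0x70 in the register dump posted later in this thread is 0xde. Calling `vt8363_clear_read_caching()` on a buffer holding that value returns 1 and leaves 0xda in the register; a second call returns 0 and changes nothing.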


2001-02-05 19:22:18

by Petr Vandrovec

Subject: Re: VIA silent disk corruption - patch

On 5 Feb 01 at 19:05, Peter Horton wrote:

> Okay, looks like this fixes it (for me anyways).

> + * VIA VT8363 host bridge has broken feature 'PCI Master Read
> + * Caching'. It caches more than is good for it, sometimes
> + * serving the bus master with stale data. Some BIOSes enable
> + * it by default, so we disable it.

Hi,
I'll try it today, though I'm not sure that it will fix the lost last
dword on read. But at least it should stop the corruption on write...
After your mail I noticed that there are a couple of `unsettable'
options in the BIOS, and I have not yet tried switching the BIOS from
optimal to slow settings, so maybe there are more broken optimizations?
I'll keep you informed.
Thanks,
Petr Vandrovec
[email protected]


2001-02-06 15:52:55

by Dale Farnsworth

Subject: Re: VIA silent disk corruption - patch


In article <[email protected]>,
Peter Horton <[email protected]> wrote:
> + * VIA VT8363 host bridge has broken feature 'PCI Master Read
> + * Caching'. It caches more than is good for it, sometimes
> + * serving the bus master with stale data. Some BIOSes enable
> + * it by default, so we disable it.

Another data point:

I have an ASUS A7V motherboard with VIA VT82C686A and Promise PDC20265
IDE controllers. I noticed disk data corruption when I enabled DMA.
Each corrupted region was 4K bytes long, aligned on a 4K-byte boundary,
and occurred about once for every couple of gigabytes copied via cpio.
I saw this corruption when the disks were connected to the PDC20265
as well as to the 686A.

I also noticed that turning off read caching eliminated the corruption.

However, if I enable the BIOS parameter "I/O Recovery Time", I can still
enable read caching without seeing any data corruption.
The latest BIOS revision (1005C) enables "I/O Recovery Time" by default
where the previous revision I had (1004D) did not.

-Dale

--

Dale Farnsworth [email protected]
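
The corruption pattern reported in this message (4K bytes long, on 4K-byte boundaries) can be looked for by comparing a source and its copy one block at a time. The following is a minimal sketch under assumptions: the function name `find_bad_blocks` and the `BLOCK_SIZE` constant are illustrative, and it compares in-memory buffers rather than reading the files from disk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Block size matching the 4K corruption granularity reported above. */
#define BLOCK_SIZE 4096u

/*
 * Compare two buffers of len bytes in BLOCK_SIZE chunks.
 * The indices of differing blocks are stored in bad[] (at most max_bad
 * entries); the return value is the total number of differing blocks.
 */
size_t find_bad_blocks(const unsigned char *a, const unsigned char *b,
		       size_t len, size_t *bad, size_t max_bad)
{
	size_t count = 0;
	size_t off;

	assert(a != NULL && b != NULL);

	for (off = 0; off < len; off += BLOCK_SIZE) {
		/* The final block may be shorter than BLOCK_SIZE. */
		size_t n = (len - off < BLOCK_SIZE) ? len - off : BLOCK_SIZE;

		if (memcmp(a + off, b + off, n) != 0) {
			if (bad != NULL && count < max_bad)
				bad[count] = off / BLOCK_SIZE;
			count++;
		}
	}
	return count;
}
```

If a copy differs from its source only in whole blocks reported here, and the neighbouring blocks are intact, that matches the pattern described above. A mismatch alone does not say which side of the transfer was at fault.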

2001-02-06 16:02:16

by Udo A. Steinberg

Subject: Re: VIA silent disk corruption - patch

Dale Farnsworth wrote:
>
> However, if I enable the BIOS parameter "I/O Recovery Time", I can still
> enable read caching without seeing any data corruption.
> The latest BIOS revision (1005C) enables "I/O Recovery Time" by default
> where the previous revision I had (1004D) did not.

Interesting stuff.

Asus, Germany released 1005D today. It's available from
ftp://ftp.asuscom.de/pub/ASUSCOM/BIOS/Socket_A/VIA_Chipset/Apollo_KT133/A7V/1005D.zip

No comments about what they changed and/or fixed.

-Udo.

2001-02-06 18:23:47

by Peter Horton

Subject: Re: VIA silent disk corruption - patch

On Tue, Feb 06, 2001 at 08:52:23AM -0700, Dale Farnsworth wrote:
>
> In article <[email protected]>,
> Peter Horton <[email protected]> wrote:
> > + * VIA VT8363 host bridge has broken feature 'PCI Master Read
> > + * Caching'. It caches more than is good for it, sometimes
> > + * serving the bus master with stale data. Some BIOSes enable
> > + * it by default, so we disable it.
>
> Another data point:
>
> I have an ASUS A7V motherboard with via vt82c686a and Promise pdc20265
> IDE controllers. I noticed disk data corruption when I enabled DMA.
> The corrupted data was 4K bytes long on 4K byte boundaries and occurred
> about once for every couple of gigabytes copied via cpio.
> I saw this corruption when the disks were connected to the pdc20265
> as well as to the 686a.
>
> I also noticed that turning off read caching eliminated the corruption.
>
> However, if I enable the BIOS parameter "I/O Recovery Time", I can still
> enable read caching without seeing any data corruption.
> The latest BIOS revision (1005C) enables "I/O Recovery Time" by default
> where the previous revision I had (1004D) did not.
>

I still get corruption with "I/O Recovery Time" enabled :-(

I don't get corruption with the BIOS "normal" settings (1004D).

I might update my BIOS to the latest BIOS in case it changes any other
settings.

P.

2001-02-06 19:57:47

by Jonathan Morton

Subject: Re: VIA silent disk corruption - patch

>I still get corruption with "I/O Recovery Time" enabled :-(
>
>I don't get corruption with the BIOS "normal" settings (1004D).
>
>I might update my BIOS to the latest BIOS in case it changes any other
>settings.

I'm using an Abit KT7 m/board, which uses the same KT133 chipset that I
believe you are all talking about. Note this is distinct from the KT7-RAID
which has a UDMA-100 RAID chipset on it in addition to the normal IDE
bridge. I've had no problems with disk corruption, despite turning
everything I dare in the BIOS to "full optimisation" settings - this
includes the "Fast CPU Command Decode" and "Enhance Chipset Performance".

The CPU is a Duron 700MHz, and the drives in question are a Seagate
Barracuda ST310210A on hda and a TEAC CD-540E on hdc. /sbin/hdparm reports
both drives as NOT using DMA; I might try switching it on to see what
happens.

... half an hour later, I actually try it. The machine appears to be locked
up while running hdparm -t /dev/hda, but I'm waiting to see if it's actually
a timeout. Performance is abysmal when UDMA is off, incidentally - less than
5MB/sec from this 7200rpm drive. The 10,000rpm IBM SCSI drive also in that
machine benchmarks at around 35MB/sec.

... after about 10 minutes waiting, while adding to this e-mail, the box is
still hung. Hmph... *RESET*

--------------------------------------------------------------
from: Jonathan "Chromatix" Morton
mail: [email protected] (not for attachments)
big-mail: [email protected]
uni-mail: [email protected]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-----BEGIN GEEK CODE BLOCK-----
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r- y+
-----END GEEK CODE BLOCK-----


2001-02-06 20:09:28

by Jonathan Morton

Subject: Re: VIA silent disk corruption - patch

>... after about 10 minutes waiting, while adding to this e-mail, the box is
>still hung. Hmph... *RESET*

System log shows no "DMA timeout" messages after rebooting, and no errors
from the inevitable FSCK.



2001-02-07 23:26:39

by Peter Horton

Subject: Re: VIA silent disk corruption - patch

On Tue, Feb 06, 2001 at 05:01:46PM +0100, Udo A. Steinberg wrote:
> Dale Farnsworth wrote:
> >
> > However, if I enable the BIOS parameter "I/O Recovery Time", I can still
> > enable read caching without seeing any data corruption.
> > The latest BIOS revision (1005C) enables "I/O Recovery Time" by default
> > where the previous revision I had (1004D) did not.
>
> Interesting stuff.
>
> Asus, Germany released 1005D today. It's available from
> ftp://ftp.asuscom.de/pub/ASUSCOM/BIOS/Socket_A/VIA_Chipset/Apollo_KT133/A7V/1005D.zip
>
> No comments about what they changed and/or fixed.
>

Good news here: it looks like the new BIOS (1005D) fixes it. I've run a
heavy test for at least 10 hours without a single blip. The BIOS is set
for "optimal". Hoorah!

Here's the North bridge diff for anyone who can't get a BIOS update :-)

P.

--- bad.pci Sun Feb 4 22:29:22 2001
+++ new.pci Wed Feb 7 23:11:28 2001
@@ -1,7 +1,7 @@
 00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02)
 	Subsystem: Asustek Computer, Inc.: Unknown device 8033
 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
-	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR+
+	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
 	Latency: 0 set
 	Region 0: Memory at e4000000 (32-bit, prefetchable) [size=64M]
 	Capabilities: [a0] AGP version 2.0
@@ -10,13 +10,13 @@
 	Capabilities: [c0] Power Management version 2
 		Flags: PMEClk- AuxPwr- DSI- D1- D2- PME-
 		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
-00: 06 11 05 03 06 00 10 a2 02 00 00 06 00 00 00 00
+00: 06 11 05 03 06 00 10 22 02 00 00 06 00 00 00 00
 10: 08 00 00 e4 00 00 00 00 00 00 00 00 00 00 00 00
 20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
 30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-50: 17 a4 6b b4 4f 81 08 08 80 00 04 08 08 08 08 08
-60: 03 ff 00 a0 52 e5 e5 00 44 7c 86 0f 08 3f 00 00
+50: 17 a4 6b b4 07 28 08 08 80 00 04 08 08 08 08 08
+60: 03 ff 55 a0 52 e5 e5 00 44 7c 86 0f 08 3f 00 00
 70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 00
 80: 0f 40 00 00 c0 00 00 00 02 00 00 00 00 00 00 00
 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
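
To pull the changed registers out of the two dumps above, the rows can be compared byte by byte. This is a sketch under assumptions: the array names, `cfg_diff`, `cfg_status` and `STATUS_DETECTED_PARITY` are illustrative and not taken from any tool. Only the rows that differ in the diff (0x00 and 0x50-0x6f) are transcribed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 15 of the PCI Status register: Detected Parity Error (<PERR in lspci). */
#define STATUS_DETECTED_PARITY 0x8000u

/* Bytes 0x00-0x0f of the host bridge, old BIOS and new BIOS. */
const uint8_t bad_hdr[16] = {
	0x06, 0x11, 0x05, 0x03, 0x06, 0x00, 0x10, 0xa2,
	0x02, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00 };
const uint8_t new_hdr[16] = {
	0x06, 0x11, 0x05, 0x03, 0x06, 0x00, 0x10, 0x22,
	0x02, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00 };

/* Bytes 0x50-0x6f, old BIOS and new BIOS. */
const uint8_t bad_50[32] = {
	0x17, 0xa4, 0x6b, 0xb4, 0x4f, 0x81, 0x08, 0x08,
	0x80, 0x00, 0x04, 0x08, 0x08, 0x08, 0x08, 0x08,
	0x03, 0xff, 0x00, 0xa0, 0x52, 0xe5, 0xe5, 0x00,
	0x44, 0x7c, 0x86, 0x0f, 0x08, 0x3f, 0x00, 0x00 };
const uint8_t new_50[32] = {
	0x17, 0xa4, 0x6b, 0xb4, 0x07, 0x28, 0x08, 0x08,
	0x80, 0x00, 0x04, 0x08, 0x08, 0x08, 0x08, 0x08,
	0x03, 0xff, 0x55, 0xa0, 0x52, 0xe5, 0xe5, 0x00,
	0x44, 0x7c, 0x86, 0x0f, 0x08, 0x3f, 0x00, 0x00 };

/* Assemble the little-endian 16-bit Status register from bytes 0x06/0x07. */
uint16_t cfg_status(const uint8_t *hdr)
{
	assert(hdr != NULL);
	return (uint16_t)(hdr[6] | (hdr[7] << 8));
}

/*
 * Store the config-space offsets (base + index) at which the two snapshots
 * differ, up to max_offsets entries, and return the number of differences.
 */
size_t cfg_diff(const uint8_t *before, const uint8_t *after, size_t len,
		size_t base, size_t *offsets, size_t max_offsets)
{
	size_t count = 0;
	size_t i;

	assert(before != NULL && after != NULL);

	for (i = 0; i < len; i++) {
		if (before[i] != after[i]) {
			if (offsets != NULL && count < max_offsets)
				offsets[count] = base + i;
			count++;
		}
	}
	return count;
}
```

With these arrays, `cfg_diff()` reports offset 0x07 in the header and offsets 0x54, 0x55 and 0x62 in the 0x50-0x6f range. The change at 0x07 is the Detected Parity Error bit, set in the old status word (0xa210) and clear in the new one (0x2210), which matches the `<PERR+` to `<PERR-` line in the diff. The byte at 0x70 is 0xde in both dumps. What the registers at 0x54, 0x55 and 0x62 control is not stated in this thread.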