2001-12-01 09:58:32

by Ville Herva

Subject: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

This is a follow-up to my reports about the HPT370 corrupting data with a
pair of IBM-DPTA-373420's on Linux. To summarize, my testing showed that the
HPT370 + IBM-DPTA-373420 combination corrupts data on

- 2.2.18pre19 + ide patch
- 2.2.20 + ide patch
- 2.2.20 + ide patch + Tim Hockin's hpt366.c patch
- 2.2.20 + ide patch + Tim Hockin's hpt366.c patch, in UDMA33 mode (rather than UDMA66)
- 2.4.15 + Tim Hockin's hpt366.c patch

The test involved reading /dev/md0 (which consists of /dev/hde and /dev/hdg)
several times and comparing the md5sums. I also tried reading /dev/hde and
/dev/hdg in parallel, and that showed errors as well. The problem disappeared
when I moved the drives over to the VIA 686B interface.
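
A minimal sketch of the parallel read test described above, written in Python
instead of cat and md5sum. The function names, the chunk size and the default
run count are assumptions made for illustration; only the device paths come
from the message, and reading them requires root.

```python
import hashlib
import threading

CHUNK = 1024 * 1024  # bytes per read call; an arbitrary choice


def md5_of(path):
    """Return the hex MD5 of everything readable from path."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parallel_md5(paths):
    """Hash all paths at the same time, one thread per path."""
    results = {}

    def worker(p):
        results[p] = md5_of(p)

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def unstable_paths(paths, runs=5):
    """Repeat the parallel read; return paths whose sum changed between runs."""
    seen = {p: set() for p in paths}
    for _ in range(runs):
        for p, h in parallel_md5(paths).items():
            seen[p].add(h)
    return sorted(p for p, sums in seen.items() if len(sums) > 1)
```

Calling unstable_paths(["/dev/hde", "/dev/hdg"]) and getting an empty list
back would mean every run produced the same sums. On a device much smaller
than RAM, repeated reads may be served from the page cache rather than the
disk, so a clean result there says little about the controller.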

I reported the problem to Highpoint Tech Inc as well (they do explicitly
list IBM-DPTA-373420 as tested and compatible with HPT370), but as
anticipated, they didn't answer.


Now I have bought a pair of SAMSUNG SV8004H's. Since the drives were blank, I
was able to do a write test. The test (see http://v.iki.fi/~vherva/tmp/wrchk.c
for the quick'n'dirty proggie) fills /dev/md1 (which again consists of the
two SAMSUNG SV8004H's) with copies of a randomish 64MB block and then tries
to read them back. The write and read cycles are repeated over and over
again.
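
The write-read cycle can be sketched roughly as follows. This is not a
translation of wrchk.c, whose contents are not shown in this thread; the
function names and the use of os.urandom for the pattern are assumptions, and
only the 64MB block size is taken from the message.

```python
import os

BLOCK_SIZE = 64 * 1024 * 1024  # the message mentions a 64MB block


def make_pattern(size=BLOCK_SIZE):
    """Return `size` random bytes to use as the repeated pattern."""
    return os.urandom(size)


def fill(path, pattern, blocks):
    """Write `blocks` copies of the pattern to path and flush them to disk."""
    # "r+b" opens an existing file or device without truncating it.
    with open(path, "r+b") as f:
        for _ in range(blocks):
            f.write(pattern)
        f.flush()
        os.fsync(f.fileno())


def verify(path, pattern, blocks):
    """Read the blocks back; return the indices of blocks that differ."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(len(pattern)) != pattern:
                bad.append(i)
    return bad
```

One cycle is fill(path, pattern, n) followed by verify(path, pattern, n); a
non-empty list from verify corresponds to the mismatched blocks reported
here. As with any read-back test, data that still sits in the page cache is
not actually re-read from the disk, so the device should be much larger than
RAM for the result to be meaningful.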

This is with 2.2.20 + ide + Hockin's patch. Drives are in UDMA100 mode (the
default) and no hdparm adjustments have been made.

The first write-read cycle went well, but one block already mismatched on
the second run. I've only run the test overnight, but there are already
several mismatches (see http://v.iki.fi/~vherva/tmp/samsung-log for the
complete log.)

Even during the successful first run, the IDE system gave these warnings:

hdg: status error: status=0x58 { DriveReady SeekComplete DataRequest }
hdg: drive not ready for command

Later during the night, I got 27 of those errors. I also got 10 of these:

probable hardware bug: clock timer configuration lost - probably a VIA686a.

I didn't get either of these with the IBM disks.
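
For reference, the status byte in the hdg warning above can be decoded by
hand. The sketch below assumes the standard ATA status register bit layout;
the three names that appear in the log line are taken from the message, while
the other five names are how the Linux IDE driver of that era labels the
remaining bits, as far as I know.

```python
# Bit masks of the ATA status register, highest bit first.
# Names other than DriveReady, SeekComplete and DataRequest are assumptions.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]
```

decode_status(0x58) returns ['DriveReady', 'SeekComplete', 'DataRequest'],
matching the log line. DataRequest being set suggests the drive was still
waiting for a data transfer when the driver wanted to issue a new command,
which would fit the "drive not ready for command" message, but that reading
is an interpretation and not something the log states.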

Smartctl shows 'No Errors Logged' for both drives. It also reports the
temperature of the drives as always being under 35 degrees Celsius, at times
even under 30. I reckon temperature is not a problem.

Right now I'm wondering two things:

- how come nobody else is seeing this corruption (the Abit KT7A, never mind
the HPT370, is fairly popular)?
- is it safe to solder the bugger off the motherboard so I can introduce it
to my shotgun?


regards,

--
Ville Herva [email protected] +358-40-5756996
Viasys Oy Hannuntie 6 FIN-02360 Espoo +358-9-2313-2160
PGP key available: http://www.iki.fi/v/pgp.html fax +358-9-2313-2250


2001-12-01 10:34:13

by Sven.Riedel

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

On Sat, Dec 01, 2001 at 11:58:03AM +0200, Ville Herva wrote:
> - how come anyone else is not seeing this corruption (Abit KT7A, nevermind
> HPT370 is fairly popular)?

A friend of mine had an IBM DLTA drive attached to his HPT370
controller, and this combination proved to produce a whole lot of drive
errors (I can confirm this first hand), which went away after attaching
the drive to the main motherboard controller.
I can't say anything about data corruption though - I just asked him and
he said he didn't know of any, but that doesn't mean it didn't happen.

Regs,
Sven
--
Sven Riedel [email protected]
Osteroeder Str. 6 / App. 13 [email protected]
38678 Clausthal "Call me bored, but don't call me boring."
- Larry Wall

2001-12-01 10:40:04

by Ville Herva

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

On Sat, Dec 01, 2001 at 11:34:00AM +0100, you [[email protected]] claimed:
> On Sat, Dec 01, 2001 at 11:58:03AM +0200, Ville Herva wrote:
> > - how come anyone else is not seeing this corruption (Abit KT7A, nevermind
> > HPT370 is fairly popular)?
>
> A friend of mine had an IBM DLTA drive attached to his HPT370
> controller, and this combination proved to produce a whole lot of drive
> errors (I can confirm this first hand), which went away after attaching
> the drive to the main motherboard controller.
> I can't say anything about data corruption though - I just asked him and
> he said he didn't know of any, but that doesn't mean it didn't happen.

Of course the drive is no longer attached to the HPT370 and your friend is
probably reluctant to reattach it, but it would still be nice to know
whether he gets consistent results with, for example, this simple test:

cat /dev/hde | md5sum

run several (perhaps 5-10) times.

OTOH, I haven't had corruption with reading only
one disk at a time, but then again I haven't tried too hard as they
should really work in parallel.


-- v --

[email protected]

2001-12-01 18:20:57

by Matt Schulkind

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

> On Sat, Dec 01, 2001 at 11:34:00AM +0100, you [[email protected]] claimed:
> > On Sat, Dec 01, 2001 at 11:58:03AM +0200, Ville Herva wrote:
> > > - how come anyone else is not seeing this corruption (Abit KT7A, nevermind
> > > HPT370 is fairly popular)?
> >
> > A friend of mine had an IBM DLTA drive attached to his HPT370
> > controller, and this combination proved to produce a whole lot of drive
> > errors (I can confirm this first hand), which went away after attaching
> > the drive to the main motherboard controller.
> > I can't say anything about data corruption though - I just asked him and
> > he said he didn't know of any, but that doesn't mean it didn't happen.
>
> Of course the drive is no longer attached to the HPT370 and your friend is
> probably reluctant to reattach it, but it would still be nice to know
> whether he gets consistent results with, for example, this simple test:
>
> cat /dev/hde | md5sum
>
> run several (perhaps 5-10) times.
>
> OTOH, I haven't had corruption with reading only
> one disk at a time, but then again I haven't tried too hard as they
> should really work in parallel.
>
>
> -- v --
>
> [email protected]
> -

In my experience, the HPT370 chipset likes corrupting hard drives. When I was
using it, I had the PCI RAID version and it kept corrupting my hard drives.
I tried updating the BIOS, but the BIOS program locked up and completely
killed my board. When I RMAed the board, the new BIOS was put on for me and
since then I haven't had a single problem. Maybe you should try upgrading
the BIOS, but I don't know if you can for an onboard version.

-Matt Schulkind


2001-12-02 05:51:34

by Sven.Riedel

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

On Sat, Dec 01, 2001 at 12:39:33PM +0200, Ville Herva wrote:
> Of course the drive is no longer attached to the HPT370 and your friend is
> probably reluctant to reattach it, but it would still be nice to know
> whether he gets consistent results with, for example, this simple test:

Well, I asked him, but he doesn't want to rip his fileserver apart
again. Sorry.

Regs,
Sven
--
Sven Riedel [email protected]
Osteroeder Str. 6 / App. 13 [email protected]
38678 Clausthal "Call me bored, but don't call me boring."
- Larry Wall

2001-12-04 04:37:06

by gdf

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well


I concur with Matt Schulkind. This is usually a symptom of an old
Highpoint BIOS or even KT7A BIOS version.

If you believe it to be some sort of driver problem, I urge you to query
the ataraid mailing list.

Until then... for your motherboard I would point you to the KT7 FAQ.
http://www.viahardware.com/faq/kt7/kt7faq.htm

Though it is a bit Windows-centric, it describes in great detail many of
the issues with the board. (It seems like there are many, but at least
they are known and documented, unlike with many other MBs out there.)

-Gabe Friedmann

2001-12-04 14:44:28

by Ville Herva

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

On Mon, Dec 03, 2001 at 11:44:57PM -0500, you [gdf] claimed:
>
> I concur with Matt Schulkind. This is usually symptom of and old
> Highpoint bios or even KT7A bios version.

I've tried Highpoint BIOSes 1.0.3b1 (Abit bios ZT), 1.2.0612 (Abit
bios49b01+), 1.11.0402 (Abit bios 64). No help.

Can someone shed some light on how much the Highpoint bios actually matters
under linux?

> If you believe it to be some sort of a driver problem i urge you to query
> the atariad mailing list.
>
> Until then... for your mothe board i would point you to the KT7 FAQ.
> http://www.viahardware.com/faq/kt7/kt7faq.htm
>
> Though it is a bit windows centric, it describes in great detail many of
> the issues with the board.

True, I've been reading it.


-- v --

[email protected]

2001-12-04 16:03:51

by Lee Packham

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

Now, I have a KT7A-RAID with the HPT370... and I have no problems at all. I
have the 3R BIOS update (I think...) from the http://www.abit.com.tw site.

> On Mon, Dec 03, 2001 at 11:44:57PM -0500, you [gdf] claimed:
>>
>> I concur with Matt Schulkind. This is usually symptom of and old
>> Highpoint bios or even KT7A bios version.
>
> I've tried Highpoint BIOSes 1.0.3b1 (Abit bios ZT), 1.2.0612 (Abit
> bios49b01+), 1.11.0402 (Abit bios 64). No help.
>
> Can someone shed some light on how much the Highpoint bios actually
> matters under linux?
>
>> If you believe it to be some sort of a driver problem i urge you to
>> query the atariad mailing list.
>>
>> Until then... for your mothe board i would point you to the KT7 FAQ.
>> http://www.viahardware.com/faq/kt7/kt7faq.htm
>>
>> Though it is a bit windows centric, it describes in great detail many
>> of the issues with the board.
>
> True, I've been reading it.
>
>
> -- v --
>
> [email protected]
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel"
> in the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/


2001-12-04 18:18:42

by Jonathan Amery

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

In article <[email protected]> you write:
>- how come anyone else is not seeing this corruption (Abit KT7A, nevermind
> HPT370 is fairly popular)?

I've got the KT7-RAID (note, no A) but with only one drive attached to the
HPT370. I have seen no problems. I will try your recommended md5sum test
next time I'm at the console to be root...

Jonathan.

2001-12-04 19:27:32

by Ville Herva

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

On Tue, Dec 04, 2001 at 03:04:31PM -0200, you [William N. Zanatta] claimed:
> Hello,
>
> I don't know how much they care about it, but they have a drivers list
> including drivers for SuSE, RH, Caldera and Turbolinux...
>
> http://www.highpoint-tech.com/
>
> William
>
> PS: Also don't know whether they do or not work. No disks to test, sorry!

Yep, I saw those, but
- they are binary only
- they only support old kernels (2.2.16-22, 2.4.2-2, 2.2.14-5.0 (the 1.1
driver for Red Hat))
- nobody seems to have tried them
- nobody knows whether they are stolen from Hedrick's ide driver
or are a pure Highpoint-tech from scratch implementation


-- v --

[email protected]

2001-12-04 19:29:13

by Ville Herva

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

On Tue, Dec 04, 2001 at 06:15:51PM +0000, you [Jonathan Amery] claimed:
> In article <[email protected]> you write:
> >- how come anyone else is not seeing this corruption (Abit KT7A, nevermind
> > HPT370 is fairly popular)?
>
> I've got the KT7-RAID (note, no A) but with only one drive attached to the
> HPT370. I have seen no problems. I will try your recommended md5sum test
> next time I'm at the console to be root...

I've only ever seen the corruption when reading two disks in parallel (one on
each HPT IDE channel).


-- v --

[email protected]

2001-12-04 21:00:41

by William N. Zanatta

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

Hello,

I don't know how much they care about it, but they have a drivers list
including drivers for SuSE, RH, Caldera and Turbolinux...

http://www.highpoint-tech.com/

William

PS: I also don't know whether they work or not. No disks to test, sorry!

>
> Can someone shed some light on how much the Highpoint bios actually matters
> under linux?


2001-12-04 22:23:53

by Anthony DeRobertis

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

Setup here is:

hde: WDC WD200EB-00BHF0, ATA DISK drive
hdg: WDC WD200EB-00BHF0, ATA DISK drive

hde: 39102336 sectors (20020 MB) w/2048KiB Cache,
CHS=38792/16/63, UDMA(100)
hdg: 39102336 sectors (20020 MB) w/2048KiB Cache,
CHS=38792/16/63, UDMA(100)

5GB of each in RAID0 on /dev/md/2

cat /dev/md/2 | md5sum has now done its fourth run; all OK.
920b175a519b578dcd7862b720eb9efb, if you care ;-)

This is 2.4.6, on a KT7-RAID board.

2001-12-05 22:54:18

by Ville Herva

Subject: Re: HPT370 (KT7A-RAID) *corrupts* data - SAMSUNG SV8004H does it as well

On Tue, Dec 04, 2001 at 05:23:19PM -0500, you [Anthony DeRobertis] claimed:
> Setup here is:
>
> hde: WDC WD200EB-00BHF0, ATA DISK drive
> hdg: WDC WD200EB-00BHF0, ATA DISK drive
>
> hde: 39102336 sectors (20020 MB) w/2048KiB Cache,
> CHS=38792/16/63, UDMA(100)
> hdg: 39102336 sectors (20020 MB) w/2048KiB Cache,
> CHS=38792/16/63, UDMA(100)
>
> 5GB of each in RAID0 on /dev/md/2
>
> cat /dev/md/2 | md5sum now done its fourth run; all OK.
> 920b175a519b578dcd7862b720eb9efb, if you care ;-)

I do care (not about the sum itself, but about the fact that your sums are
consistent). Thank you for testing!

> This is 2.4.6, on a KT7-RAID board.

So it is a KT7-RAID, not a KT7A-RAID like mine... Could that have something
to do with it?


-- v --

[email protected]