2002-10-31 22:29:45

by Giuliano Pochini

Subject: aic7xxx and error recovery


I have a magneto-optical drive. The recoverable error rate is quite
high in this kind of device (1 bit every 10^5, according to the specs,
but it's actually much lower IMHO). I was playing with the SCSI error
recovery page and I noticed that when I enable the PER flag (which
makes the drive tell the initiator when a recoverable medium
error occurs) strange things happen. I wrote a small program that
writes random patterns, reads them back and compares them with the
originals. When a recoverable error occurs (as reported in the
system logs), either read(2) returns a value smaller than requested
and the loaded data is identical to the pattern, or read() completes
but the data is wrong. These two cases seem to be mutually
exclusive; I've tried many times. I don't know why this happens,
but IMO if read(length)==length then the data I get shouldn't be
corrupted. I believe there is a bug in the SCSI driver, because
with PER==0 I never get corrupted data, and PER==1 doesn't affect
the data sent to the initiator; it only reports recovered errors.
Comments?
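
For readers who want to reproduce this kind of check, here is a minimal sketch of a write/read-back test. It is not the program used above; the function names (make_patterns, write_patterns, read_back) and the block sizes are made up for illustration. It uses os.read and os.write, which are thin wrappers around read(2) and write(2), and counts each block as ok, short (fewer bytes than requested, but matching the pattern), mismatch (bytes differ from the pattern) or error (read failed outright).

```python
import os
import random


def make_patterns(blocks, block_size, seed=0):
    """Build a reproducible list of pseudo-random blocks (hypothetical helper)."""
    rng = random.Random(seed)
    return [bytes(rng.getrandbits(8) for _ in range(block_size))
            for _ in range(blocks)]


def write_patterns(path, patterns):
    """Write all blocks back to back and flush them to the device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for pattern in patterns:
            written = 0
            while written < len(pattern):  # write(2) may be partial
                written += os.write(fd, pattern[written:])
        os.fsync(fd)
    finally:
        os.close(fd)


def read_back(path, patterns):
    """Read each block once and classify the outcome of that single read."""
    counts = {"ok": 0, "short": 0, "mismatch": 0, "error": 0}
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        for pattern in patterns:
            os.lseek(fd, offset, os.SEEK_SET)
            offset += len(pattern)
            try:
                data = os.read(fd, len(pattern))
            except OSError:  # e.g. EIO reported by the driver
                counts["error"] += 1
                continue
            if data != pattern[:len(data)]:
                counts["mismatch"] += 1  # returned bytes differ from pattern
            elif len(data) < len(pattern):
                counts["short"] += 1     # fewer bytes, but all of them correct
            else:
                counts["ok"] += 1
    finally:
        os.close(fd)
    return counts
```

Note that on a real device the reads may be served from the page cache rather than from the medium, so a meaningful run needs some way of bypassing the cache (for example ejecting and reinserting the disk between the write and read phases).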

[Linux Jay 2.4.19 #3 mer ago 14 15:29:00 CEST 2002 ppc unknown]

Bye.


2002-11-01 08:10:46

by Giuliano Pochini

Subject: Re: aic7xxx and error recovery

Giuliano Pochini wrote:
>
> [...] It happens that when a recoverable error occurs (as
> reported in the sys logs) read(2) returns a value smaller than
> requested, and the loaded data is identical to the pattern, or
> read() completes, but the data is wrong.

Ehm, I made a stupid typo in my test program. read() does not
succeed in the second case. Anyway, the problem is still here:
why does it fail on recovered errors?
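
To make the question concrete, here is a small sketch of how the sense key can be pulled out of fixed-format SCSI sense data. In the SCSI specs, sense key 1h (RECOVERED ERROR) means the command completed successfully after the device applied some recovery action, while 3h (MEDIUM ERROR) means it did not. The function names are hypothetical, and this illustrates the spec semantics only; it is not what the Linux SCSI layer or aic7xxx actually does.

```python
NO_SENSE = 0x0
RECOVERED_ERROR = 0x1
MEDIUM_ERROR = 0x3


def sense_key(sense):
    """Return the sense key from fixed-format sense data (response code 70h/71h)."""
    if len(sense) < 3:
        raise ValueError("sense buffer too short")
    response_code = sense[0] & 0x7F  # bit 7 is the VALID bit, not part of the code
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return sense[2] & 0x0F           # the sense key is the low nibble of byte 2


def data_is_usable(sense):
    """Assumption for illustration: data is good if the key says 'recovered'."""
    return sense_key(sense) in (NO_SENSE, RECOVERED_ERROR)
```

Under that reading, a read that ends with RECOVERED ERROR would still be expected to hand the full, correct data to the caller.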


Bye.

2002-11-01 16:54:40

by Nicholas Berry

Subject: Re: aic7xxx and error recovery

At the time I read your original post, I was investigating why one drive kept being kicked out of an md array.

This is on two systems, 2.4.20-pre11 and 2.4.20-rc1, both using a symc53c875 with 36 GB IBM drives.

Turns out it's recovered errors, just like you see.

So it seems to be wider than aic7xxx. I've just rebuilt both arrays with PER 0, and they're working fine.

Another array on 2.4.19-pre7 & aic7xxx works fine with PER 1.
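
For reference, PER is bit 2 of byte 2 in the Read-Write Error Recovery mode page (page code 01h). The sketch below decodes and toggles that bit on a raw byte value; the helper names are made up, and actually reading or writing the page on a drive (via MODE SENSE / MODE SELECT) is not shown.

```python
# Flag names for byte 2 of the Read-Write Error Recovery page, bit 0 first.
FLAGS = ("DCR", "DTE", "PER", "EER", "RC", "TB", "ARRE", "AWRE")
PER_MASK = 1 << FLAGS.index("PER")  # 0x04


def decode_recovery_flags(byte2):
    """Map each flag name to True/False for the given byte 2 value."""
    return {name: bool((byte2 >> bit) & 1) for bit, name in enumerate(FLAGS)}


def set_per(byte2, enabled):
    """Return byte 2 with PER set or cleared, leaving the other bits alone."""
    if enabled:
        return byte2 | PER_MASK
    return byte2 & ~PER_MASK & 0xFF
```

For example, a byte 2 value of 0xC4 decodes as AWRE, ARRE and PER set, and set_per(0xC4, False) gives 0xC0.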

Nik

