2002-04-27 11:10:28

by Dave Jones

Subject: linux-2.5.x-dj and SCSI error handling.

Folks, I just woke up to about a dozen reports of the same 'bug',
all with patches which are so wrong they stand no chance of application
by me or Linus. Posting here is quicker than pointing each of those people
(and those who may follow them) to the answer.

The recent patch from Christoph Hellwig kills off
the last remaining remnants of the old-style SCSI error handling.
These drivers no longer compile, failing on missing 'abort' and
'reset' functions, because they need converting to
new-style error handling.

For instructions on how to do this, read http://www.andante.org/scsi_eh.html

For instructions on how not to do this:
o Do not send me patches that just remove the reset: and abort:
functions. This gives us no error handling whatsoever.
o Do not send me patches re-adding the 'missing' reset & abort functions
to the Scsi_Host_Template struct. This gets us nowhere.
o Yes, I know ide-scsi, and various other scsi drivers are broken.
Fix them, or be patient and wait for them to be fixed.


Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs


2002-04-27 11:29:52

by Arjan van de Ven

Subject: Re: linux-2.5.x-dj and SCSI error handling.

In article <[email protected]> you wrote:
> For instructions on how not to do this:
> o Do not send me patches that just remove the reset: and abort:
> functions. This gives us no error handling whatsoever.
> o Do not send me patches re-adding the 'missing' reset & abort functions
> to the Scsi_Host_Template struct. This gets us nowhere.

And this also gives you no error handling whatsoever......

2002-04-27 13:48:56

by Mr. James W. Laferriere

Subject: Re: linux-2.5.x-dj and SCSI error handling.


Hello Dave, it might be nice to also mention the drivers that were
being complained about, so their respective maintainers can
benefit from your email. TIA, JimL

On Sat, 27 Apr 2002, Dave Jones wrote:

> Folks, I just woke up to about a dozen reports of the same 'bug',
> all with patches which are so wrong they stand no chance of application
> by me or Linus. Posting here is quicker than pointing each of those people
> (and those who may follow them) to the answer.
>
> The recent patch from Christoph Hellwig kills off
> the last remaining remnants of the old-style SCSI error handling.
> These drivers no longer compile, failing on missing 'abort' and
> 'reset' functions, because they need converting to
> new-style error handling.
>
> For instructions on how to do this, read http://www.andante.org/scsi_eh.html
>
> For instructions on how not to do this:
> o Do not send me patches that just remove the reset: and abort:
> functions. This gives us no error handling whatsoever.
> o Do not send me patches re-adding the 'missing' reset & abort functions
> to the Scsi_Host_Template struct. This gets us nowhere.
> o Yes, I know ide-scsi, and various other scsi drivers are broken.
> Fix them, or be patient and wait for them to be fixed.
> Dave.

+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network Engineer | P.O. Box 854 | Give me Linux |
| [email protected] | Coudersport PA 16915 | only on AXP |
+------------------------------------------------------------------+

2002-04-27 13:53:46

by Dave Jones

Subject: Re: linux-2.5.x-dj and SCSI error handling.

On Sat, Apr 27, 2002 at 09:48:37AM -0400, Mr. James W. Laferriere wrote:
> Hello Dave, it might be nice to also mention the drivers that were
> being complained about, so their respective maintainers can
> benefit from your email. TIA, JimL

Noted. I'll do a full compile later today and post back the list of
drivers broken due to this issue. The only one everyone seems to be
complaining about is ide-scsi, but there are definitely others.

Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-04-27 14:35:50

by Douglas Gilbert

Subject: Re: linux-2.5.x-dj and SCSI error handling.

Dave Jones <[email protected]> wrote:
>On Sat, Apr 27, 2002 at 09:48:37AM -0400, Mr. James W. Laferriere wrote:
> > Hello Dave, it might be nice to also mention the drivers that were
> > being complained about, so their respective maintainers can
> > benefit from your email. TIA, JimL
>
>Noted. I'll do a full compile later today and post back the list of
>drivers broken due to this issue. The only one everyone seems to be
>complaining about is ide-scsi, but there are definitely others.

Dave,
We only really need one of the new "eh" handlers,
either:
eh_device_reset_handler() or
eh_bus_reset_handler()
to be implemented in ide-scsi.c in order to go forward.
Assuming we pick the first one, perhaps someone could
tell me, from the context of ide-scsi, how to (politely)
reset the ATAPI device it is referring to?

[A similar approach will suffice for any other scsi drivers
"broken" by the removal of abort() and reset(). Eric
Youngdale warned of their impending removal about 3 years
ago!]

Doug Gilbert
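
A rough userspace sketch of the approach described above: a host template
that supplies only one of the new handlers. Apart from the two handler names
quoted in the message, everything here is an assumption made for illustration
(the mock_ types, the return codes, and the order in which the handlers are
tried). It is not the kernel's Scsi_Host_Template or the real mid-level
logic, and how to actually reset the ATAPI device is left open, as in the
thread.

```c
/* Userspace mock, NOT kernel code. All mock_/MOCK_ names are hypothetical. */
#include <stddef.h>

#define MOCK_SUCCESS 0
#define MOCK_FAILED  1

struct mock_cmd {
    int target;   /* device the failed command was addressed to */
};

/* Stand-in for a host template carrying the two new-style handlers. */
struct mock_host_template {
    const char *name;
    int (*eh_device_reset_handler)(struct mock_cmd *cmd);
    int (*eh_bus_reset_handler)(struct mock_cmd *cmd);
};

static int device_reset_count;

/* A real driver would reset the device here; this mock only counts calls. */
static int mock_idescsi_device_reset(struct mock_cmd *cmd)
{
    (void)cmd;
    device_reset_count++;
    return MOCK_SUCCESS;
}

/* Only the device reset handler is provided; the bus reset stays NULL. */
static const struct mock_host_template mock_idescsi_template = {
    .name = "mock-ide-scsi",
    .eh_device_reset_handler = mock_idescsi_device_reset,
    .eh_bus_reset_handler = NULL,
};

/* Assumed recovery order: device reset first, then bus reset if present. */
static int mock_recover(const struct mock_host_template *t,
                        struct mock_cmd *cmd)
{
    if (t->eh_device_reset_handler &&
        t->eh_device_reset_handler(cmd) == MOCK_SUCCESS)
        return MOCK_SUCCESS;
    if (t->eh_bus_reset_handler &&
        t->eh_bus_reset_handler(cmd) == MOCK_SUCCESS)
        return MOCK_SUCCESS;
    return MOCK_FAILED;   /* no handler available, or all of them failed */
}
```

Calling mock_recover(&mock_idescsi_template, &cmd) succeeds through the single
handler, while a template with both pointers NULL (the result of simply
deleting abort: and reset:) has nothing to call and reports failure.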

2002-04-27 16:30:45

by Andries E. Brouwer

Subject: Re: linux-2.5.x-dj and SCSI error handling.

> The recent patch from Christoph Hellwig which kills off
> the last remaining remnants of the old style SCSI error handling.
> ...

Is the new scsi-eh generally regarded as a good thing?

(Personally I have only bad experiences with it.
Usually commented it out, otherwise its error handling
would kill my box as soon as a SCSI error occurred.
I vastly prefer an error return "I/O error"
to a long series (several minutes) of retries,
device resets, bus resets making the machine entirely
unusable during this time, and often causing an oops
in the end, killing off the machine entirely.
However, I have no recent experiences here.)

Andries


2002-04-27 18:04:41

by Douglas Gilbert

Subject: Re: linux-2.5.x-dj and SCSI error handling.

[email protected] wrote:
> > The recent patch from Christoph Hellwig which kills off
> > the last remaining remnants of the old style SCSI error handling.
> > ...
>
> Is the new scsi-eh generally regarded as a good thing?

Andries,
If it isn't a good thing at the moment, James Bottomley
intends to make it a good thing shortly.

The new scsi_eh is a better design, and in 2.5 the scsi
mid level only has to concentrate on one rather than
two error correction mechanisms. That should make it
easier to get right.

Doug Gilbert

2002-04-27 18:19:47

by Alan

Subject: Re: linux-2.5.x-dj and SCSI error handling.

> device resets, bus resets making the machine entirely
> unusable during this time, and often causing an oops
> in the end, killing off the machine entirely.
> However, I have no recent experiences here.)

The old scsi eh code is dire; the new scsi eh code is currently merely
bad. However, the interface for the newer scsi_eh is probably right, which
is the important bit.

2002-04-29 07:37:53

by Rogier Wolff

Subject: Re: linux-2.5.x-dj and SCSI error handling.

[email protected] wrote:
> > The recent patch from Christoph Hellwig which kills off
> > the last remaining remnants of the old style SCSI error handling.
> > ...
>
> Is the new scsi-eh generally regarded as a good thing?

Oh, and if someone is working in the area of "error handling", please,
please allow me to turn off "retries". That's not going to cost much
in terms of code or performance, but it would be incredibly valuable for
me: we do data recovery. So we regularly have "bad" disks in here, and when
the drive reports an error on block <N> we would much rather have the
drive try block <N+1> than have the OS retry block <N>. Given that the
drive has already reported an error for block <N>, the chances of a retry
getting results are lower than about 5%, while in general the chances of
a random block getting recovered are better than 99%.

Roger.

--
** [email protected] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots.
* There are also old, bald pilots.
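
The retry switch requested above might be sketched as follows, in plain
userspace C. This is an illustration under assumptions, not an existing
kernel or SCSI interface: read_block_fn, scan_blocks, max_retries, and the
mock device are all hypothetical names. With max_retries set to 0, a failed
block is recorded and the scan moves straight on to the next block.

```c
#include <stddef.h>

/* Hypothetical read callback: returns 0 on success, nonzero on I/O error. */
typedef int (*read_block_fn)(unsigned long block, void *ctx);

struct scan_result {
    unsigned long recovered;  /* blocks read successfully */
    unsigned long failed;     /* blocks given up on */
    unsigned long attempts;   /* total read attempts issued */
};

/* Read `count` blocks starting at `first`. Each block gets one initial
 * attempt plus at most `max_retries` retries; 0 disables retries. */
static struct scan_result scan_blocks(read_block_fn read_block, void *ctx,
                                      unsigned long first, unsigned long count,
                                      unsigned int max_retries)
{
    struct scan_result r = {0, 0, 0};

    for (unsigned long b = first; b < first + count; b++) {
        unsigned int tries = 0;
        int ok;

        do {
            r.attempts++;
            ok = (read_block(b, ctx) == 0);
        } while (!ok && tries++ < max_retries);

        if (ok)
            r.recovered++;
        else
            r.failed++;       /* skip ahead instead of hammering block b */
    }
    return r;
}

/* Mock device for experimenting: one block that never reads back. */
static int mock_read(unsigned long block, void *ctx)
{
    const unsigned long *bad = ctx;
    return block == *bad;     /* nonzero (error) only for the bad block */
}
```

On a mock disk of ten blocks with one permanently bad block, max_retries = 0
costs ten read attempts in total, while max_retries = 3 costs thirteen and
recovers nothing extra.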