2000-11-02 21:55:31

by chen, xiangping

Subject: scsi init problem in 2.4.0-test10?

Hello,

I ran into a problem when trying to upgrade my Linux kernel to 2.4.0-test10.
The machine is a Compaq AP550: dual processor, 512 MB of memory, 863 MHz.
It has two SCSI host adapters. One is an AIC-7892 Ultra160/m connected to
the internal hard disk, and the other is an AHA-3944 Ultra SCSI connected to
an attached disk. The boot process stops after detection of the first
SCSI host. The error is:
scsi: aborting command due to time out: pid0, scsi1, channel 0,
id 0, lun 0, Inquiry 00 00 00 ff 00

The previous OS on this machine was Red Hat 6.2 with kernel version 2.2.14.

Looking forward to your help!

Xiangping


2000-11-02 22:16:45

by Elizabeth Morris-Baker

Subject: Re: scsi init problem in 2.4.0-test10?

>
> Hello,

Yes, I encountered the same problem, and have a fix, but
want to test it. If the author of scsi_scan.c would like
to correct it, then that would be fine.

Basically the problem is in scan_scsis_single.
Some SCSI devices are notoriously brain dead
about answering an INQUIRY without first having
received a TUR (TEST UNIT READY) and then spinning up.
The problem seems to be the disk, not the controller,
if this is the same problem.

The problem appeared in the test kernels because
the TUR *used* to be there, now it is not.
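
For readers unfamiliar with the two commands, here is a minimal user-space sketch of their 6-byte command blocks (CDBs), assuming the SCSI-2 layout with the LUN in the top three bits of byte 1. The helper names `build_tur` and `build_inquiry` are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <string.h>

#define TEST_UNIT_READY 0x00   /* SCSI opcode */
#define INQUIRY         0x12   /* SCSI opcode */

/* Build a 6-byte TEST UNIT READY CDB. SCSI-2 puts the LUN in the top
 * three bits of byte 1; every other byte is zero. */
static void build_tur(unsigned char cdb[6], unsigned int lun)
{
    assert(lun < 8);
    memset(cdb, 0, 6);
    cdb[0] = TEST_UNIT_READY;
    cdb[1] = (unsigned char)((lun << 5) & 0xe0);
}

/* Build a 6-byte INQUIRY CDB; byte 4 is the allocation length, i.e. the
 * most response data the initiator is prepared to accept. */
static void build_inquiry(unsigned char cdb[6], unsigned int lun,
                          unsigned char alloc_len)
{
    assert(lun < 8);
    memset(cdb, 0, 6);
    cdb[0] = INQUIRY;
    cdb[1] = (unsigned char)((lun << 5) & 0xe0);
    cdb[4] = alloc_len;
}
```

Calling `build_inquiry(cdb, 0, 0xff)` gives the bytes 12 00 00 00 ff 00. The five bytes after the opcode match the `Inquiry 00 00 00 ff 00` in the timeout message, so the command that timed out looks like a plain 255-byte INQUIRY to LUN 0.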

Hope this helps.

Just curious, what kind of scsi disk do you have??
lemme guess... Compaq Atlas?? :>

cheers,

Elizabeth

>
> I ran into a problem when trying to upgrade my Linux kernel to 2.4.0-test10.
> The machine is a Compaq AP550: dual processor, 512 MB of memory, 863 MHz.
> It has two SCSI host adapters. One is an AIC-7892 Ultra160/m connected to
> the internal hard disk, and the other is an AHA-3944 Ultra SCSI connected to
> an attached disk. The boot process stops after detection of the first
> SCSI host. The error is:
> scsi: aborting command due to time out: pid0, scsi1, channel 0,
> id 0, lun 0, Inquiry 00 00 00 ff 00
>
> Previous OS on this machine was RedHat 6.2 kernel version 2.2.14
>
> looking forward to your help!
>
> Xiangping
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> Please read the FAQ at http://www.tux.org/lkml/
>

2000-11-02 22:54:28

by Matthew Dharm

Subject: Re: scsi init problem in 2.4.0-test10?

On Thu, Nov 02, 2000 at 03:58:24PM -0600, Elizabeth Morris-Baker wrote:
> Basically the problem is in scan_scsis_single.
> Some scsi devices are notoriously brain dead
> about answering inquiries without having
> received a TUR and then spinning up.
> The problem seems to be the disk, not the controller,
> if this is the same problem.
>
> The problem appeared in the test kernels because
> the TUR *used* to be there, now it is not.

Strictly speaking, shouldn't we send a START_STOP, not a TUR to get the
disks (or other devices) to spin up?

Matt

--
Matthew Dharm Home: [email protected]
Maintainer, Linux USB Mass Storage Driver

S: Another stupid question?
G: There's no such thing as a stupid question, only stupid people.
-- Stef and Greg
User Friendly, 7/15/1998



2000-11-02 23:05:09

by Elizabeth Morris-Baker

Subject: Re: scsi init problem in 2.4.0-test10?

>

You need to send the TUR first, but yes,
START_STOP will guarantee that you are
ready to rock and roll.
The first fix I wrote did a TUR, then
3 tries at a START_STOP, till it worked.
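
Roughly, that sequence can be sketched in user space as below. This is only an illustration under stated assumptions: `send_cdb_fn`, `wake_unit`, `fake_disk` and `fake_send` are hypothetical names, the transport is a stand-in rather than the kernel's request path, and the LUN encoding follows SCSI-2.

```c
#include <assert.h>
#include <string.h>

#define TEST_UNIT_READY 0x00
#define START_STOP      0x1b

/* Hypothetical transport hook: sends one 6-byte CDB and returns 0 when
 * the device reports GOOD status, nonzero otherwise. */
typedef int (*send_cdb_fn)(void *dev, const unsigned char cdb[6]);

/* One TEST UNIT READY, then up to three START STOP UNIT attempts.
 * Returns 0 as soon as either command succeeds, -1 if none does. */
static int wake_unit(send_cdb_fn send, void *dev, unsigned int lun)
{
    unsigned char cdb[6];
    int tries;

    assert(lun < 8);

    memset(cdb, 0, sizeof(cdb));
    cdb[0] = TEST_UNIT_READY;
    cdb[1] = (unsigned char)((lun << 5) & 0xe0);
    if (send(dev, cdb) == 0)
        return 0;                   /* already ready, no spin-up needed */

    for (tries = 0; tries < 3; tries++) {
        memset(cdb, 0, sizeof(cdb));
        cdb[0] = START_STOP;
        cdb[1] = (unsigned char)((lun << 5) & 0xe0);
        cdb[4] = 1;                 /* START bit: spin the unit up */
        if (send(dev, cdb) == 0)
            return 0;
    }
    return -1;
}

/* Stand-in for a real drive, for illustration only: it fails every
 * command until it has seen starts_needed START STOP UNIT commands. */
struct fake_disk {
    int starts_needed;
    int starts_seen;
};

static int fake_send(void *dev, const unsigned char cdb[6])
{
    struct fake_disk *d = dev;

    if (cdb[0] == START_STOP && (cdb[4] & 1))
        d->starts_seen++;
    return d->starts_seen >= d->starts_needed ? 0 : 1;
}
```

The sketch leaves the IMMED bit (bit 0 of byte 1) clear, so a conforming device returns status only after the spin-up has finished. Setting IMMED instead means the caller has to poll with further TURs.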

cheers,

Elizabeth


2000-11-02 23:31:06

by chen, xiangping

Subject: RE: scsi init problem in 2.4.0-test10?

Hi,

Thanks for replying. What bothers me is that it works
on another PC with the same external disk setup
but a different internal SCSI adapter. So I did not think
the kernel software was causing the problem. Any other
guesses?

Thanks,

Xiangping





2000-11-02 23:47:51

by Torben Mathiasen

Subject: Re: scsi init problem in 2.4.0-test10?

The SCSI spec says that INQUIRY and not
TUR + INQUIRY is the way to go, but maybe we
should make it a compile time option for buggy
drives.
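
A compile-time switch of that kind could look something like the sketch below. The macro `SCSI_SCAN_TUR_FIRST` is a hypothetical name, not an existing kernel config symbol, and `scan_opcodes` only records which opcodes a scan would issue instead of talking to hardware.

```c
#include <assert.h>
#include <stddef.h>

#define TEST_UNIT_READY 0x00
#define INQUIRY         0x12

/* Hypothetical option name, not an existing kernel config symbol.
 * Build with -DSCSI_SCAN_TUR_FIRST to enable the workaround. */
#ifdef SCSI_SCAN_TUR_FIRST
#define SCAN_TUR_FIRST 1
#else
#define SCAN_TUR_FIRST 0
#endif

/* Write the opcodes the scan would issue for one device into out
 * (room for max entries) and return how many were written. */
static size_t scan_opcodes(unsigned char *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    if (SCAN_TUR_FIRST && n < max)
        out[n++] = TEST_UNIT_READY;  /* workaround for drives that need it */
    if (n < max)
        out[n++] = INQUIRY;          /* what the spec asks for */
    return n;
}
```

With the option off, the scan issues INQUIRY alone. With it on, every device gets a TUR first, including the well-behaved ones.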


On Thu, Nov 02 2000, Elizabeth Morris-Baker wrote:
> >
>
> You need to send the TUR first, but yes,
> START_STOP will guarantee that you are
> ready to rock and roll.
> The first fix I wrote did a TUR, then
> 3 tries at a START_STOP, till it worked.
>
> cheers,
>
> Elizabeth
>

[deleted]

--
Torben Mathiasen <[email protected]>
Linux ThunderLAN maintainer
http://tlan.kernel.dk

2000-11-03 00:43:17

by Elizabeth Morris-Baker

Subject: Re: scsi init problem in 2.4.0-test10? [PATCH]

>

Yes, I know that is in the spec, but truly,
some scsi devices do act this way....
Maybe they need to read the spec :>

I have included the START_STOP for Matthew, but
I never see it execute with the ATLAS disks...
A diff follows for those that want to try it..

cheers,

Elizabeth

> The SCSI spec says that INQUIRY and not
> TUR + INQUIRY is the way to go, but maybe we
> should make it a compile time option for buggy
> drives.

-------------------------cut here ----------------------
*** scsi_scan.c.orig Tue Oct 24 14:01:54 2000
--- scsi_scan.c Thu Nov 2 18:59:30 2000
***************
*** 471,476 ****
--- 471,479 ----
      Scsi_Request * SRpnt;
      int bflags, type = -1;
      extern devfs_handle_t scsi_devfs_handle;
+     int spintime = 0;
+     int retries = 0;
+     unsigned long spintime_value = 0;

      SDpnt->host = shpnt;
      SDpnt->id = dev;
***************
*** 499,504 ****
--- 502,574 ----
       * not really necessary. Spec recommends using INQUIRY to scan for
       * devices (and TEST_UNIT_READY to poll for media change). - Paul G.
       */
+     /* Add TUR back in to sync up the disk --
+        mostly borrowed from 2.2 kernel -- eamb */
+
+     do
+     {
+         retries = 0;
+
+         while (retries < 3)
+         {
+             scsi_cmd[0] = TEST_UNIT_READY;
+             scsi_cmd[1] = (lun << 5) & 0xe0;
+             memset((void *) &scsi_cmd[2], 0, 8);
+             SRpnt->sr_cmd_len = 0;
+             SRpnt->sr_sense_buffer[0] = 0;
+             SRpnt->sr_sense_buffer[2] = 0;
+             SRpnt->sr_data_direction = SCSI_DATA_READ;
+
+             scsi_wait_req (SRpnt, (void *) scsi_cmd,
+                            (void *) scsi_result,
+                            256, SCSI_TIMEOUT+4*HZ, 3);
+
+             retries++;
+             if (SRpnt->sr_result == 0
+                 || SRpnt->sr_sense_buffer[2] != UNIT_ATTENTION)
+                 break;
+         }
+
+         if( SRpnt->sr_result != 0
+             && ((driver_byte(SRpnt->sr_result) & DRIVER_SENSE) != 0)
+             && SRpnt->sr_sense_buffer[2] == UNIT_ATTENTION)
+         {
+             break;
+         }
+
+         /* Look for devices that are NOT_READY.
+          * Issue command to spin up drive for these cases. */
+         if(SRpnt->sr_sense_buffer[2] == NOT_READY)
+         {
+             unsigned long time1;
+             if (!spintime)
+             {
+                 scsi_cmd[0] = START_STOP;
+                 scsi_cmd[1] = (lun << 5) & 0xe0;
+                 scsi_cmd[1] |= 1; /* Return immediately */
+                 memset((void *) &scsi_cmd[2], 0, 8);
+                 scsi_cmd[4] = 1; /* Start spin cycle */
+                 SRpnt->sr_cmd_len = 0;
+                 SRpnt->sr_sense_buffer[0] = 0;
+                 SRpnt->sr_sense_buffer[2] = 0;
+
+                 SRpnt->sr_data_direction = SCSI_DATA_READ;
+                 scsi_wait_req (SRpnt, (void *) scsi_cmd,
+                                (void *) scsi_result,
+                                256, SCSI_TIMEOUT+4*HZ, 3);
+             }
+             spintime = 1;
+             spintime_value = jiffies;
+             time1 = HZ;
+             /* Wait 1 second for next try */
+             do
+             {
+                 current->state = TASK_UNINTERRUPTIBLE;
+                 time1 = schedule_timeout(time1);
+             } while(time1);
+         }
+     } while (SRpnt->sr_result && spintime && (retries < 3) &&
+              time_after(spintime_value + 100 * HZ, jiffies));

      SCSI_LOG_SCAN_BUS(3, printk("scsi: performing INQUIRY\n"));
      /*
-------------------------cut here ----------------------
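
The outer loop in the diff is bounded by `time_after(spintime_value + 100 * HZ, jiffies)`, i.e. it keeps retrying only while the deadline is still ahead of the current tick count. A small user-space sketch of that comparison follows. The macro mirrors the idea of the kernel's `time_after` without its type checking, `HZ` is assumed to be 100 purely for illustration, and `window_open` is a made-up helper name.

```c
#include <assert.h>
#include <limits.h>

/* Same idea as the kernel's time_after(a, b): true when a is later than b.
 * Working on the difference keeps the answer right across counter
 * wraparound, provided the two values are less than half the counter
 * range apart. The unsigned-to-signed cast relies on two's complement
 * behaviour, as the kernel macro does. */
#define time_after(a, b) ((long)((b) - (a)) < 0)

#define HZ 100  /* assumed tick rate, for illustration only */

/* True while the deadline (start + window) is still after now. */
static int window_open(unsigned long start, unsigned long window,
                       unsigned long now)
{
    assert(window <= (unsigned long)LONG_MAX);  /* comparison needs this */
    return time_after(start + window, now);
}
```

A plain `start + window > now` would give the wrong answer once the tick counter wraps, which is why the difference is compared against zero instead.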

2000-11-03 00:55:59

by chen, xiangping

Subject: RE: scsi init problem in 2.4.0-test10?

Hi,

The problem was solved by replacing the AHA-3944 card with an
AIC-7895, which switched the order of SCSI discovery.
It seems that it is still a SCSI ordering problem.

Thanks anyway!

Xiangping


2000-11-03 01:02:40

by David Weinehall

Subject: Re: scsi init problem in 2.4.0-test10? [PATCH]

On Thu, Nov 02, 2000 at 06:24:47PM -0600, Elizabeth Morris-Baker wrote:
> >
>
> Yes, I know that is in the spec, but truly,
> some scsi devices do act this way....
> Maybe they need to read the spec :>
>
> I have included the START_STOP for Matthew, but
> I never see it execute with the ATLAS disks...
> A diff follows for those that want to try it..
>
> cheers,
>
> Elizabeth

Well, if I'm not entirely mistaken, this is the code that was removed from
the kernel earlier because it caused some SCSI adapters to hang during the
SCSI scan?! If so, which is better: to follow the spec and penalise the
bad guys, or to ignore the spec and penalise the good guys?


/David Weinehall
_ _
// David Weinehall <[email protected]> /> Northern lights wander \\
// Project MCA Linux hacker // Dance across the winter sky //
\> http://www.acc.umu.se/~tao/ </ Full colour fire </

2000-11-03 01:37:47

by stefan mojschewitsch

Subject: Re: scsi init problem in 2.4.0-test10?

"chen, xiangping" wrote:
>
> Hello,
>
> I ran into a problem when trying to upgrade my Linux kernel to 2.4.0-test10.
> The machine is a Compaq AP550: dual processor, 512 MB of memory, 863 MHz.
> It has two SCSI host adapters. One is an AIC-7892 Ultra160/m connected to
> the internal hard disk, and the other is an AHA-3944 Ultra SCSI connected to
> an attached disk. The boot process stops after detection of the first
> SCSI host. The error is:
> scsi: aborting command due to time out: pid0, scsi1, channel 0,
> id 0, lun 0, Inquiry 00 00 00 ff 00
>
I'm having this problem too, on a quad PPro 166 MHz machine with older
SCSI controllers. When booting 2.2.1[67] or 2.4.0-test[6-9] with
noapic on the kernel command line, it's okay.
On my machine, an AIC-7870 as scsi0 is aborting when booting with the APIC
enabled.

tnx for reading

stefan

--
stefan mojschewitsch

2000-11-03 02:03:12

by Elizabeth Morris-Baker

Subject: Re: scsi init problem in 2.4.0-test10? [PATCH]

>

Thank you for the information.
I will give it some thought and see if I can come up
with something that will fit both bills...

The problem is that the disks that I have are
very widespread, I would imagine. Compaq is shipping
them in their newer machines, so some compromise has
to be reached.
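
One possible shape for such a compromise, offered only as an illustration and not as something anyone has tested against real adapters: send INQUIRY alone first, as the spec recommends, and fall back to a TUR plus a second INQUIRY only when the first one fails. All names below (`issue_fn`, `probe_device`, `fake_drive`, `fake_issue`) are hypothetical.

```c
#include <assert.h>

#define TEST_UNIT_READY 0x00
#define INQUIRY         0x12

/* Hypothetical transport hook: issues the command with the given opcode
 * and returns 0 on success, nonzero on failure or timeout. */
typedef int (*issue_fn)(void *dev, unsigned char opcode);

/* Spec-first probe: INQUIRY alone for well-behaved devices; a TUR and a
 * second INQUIRY only after the first INQUIRY fails.
 * Returns 0 if the device answered INQUIRY, -1 otherwise. */
static int probe_device(issue_fn issue, void *dev)
{
    assert(issue != 0);
    if (issue(dev, INQUIRY) == 0)
        return 0;
    (void)issue(dev, TEST_UNIT_READY);  /* result ignored: only a nudge */
    return issue(dev, INQUIRY) == 0 ? 0 : -1;
}

/* Illustrative stand-in for a drive that ignores INQUIRY until it has
 * seen a TEST UNIT READY (when needs_tur is set). */
struct fake_drive {
    int needs_tur;
    int saw_tur;
    int commands;
};

static int fake_issue(void *dev, unsigned char opcode)
{
    struct fake_drive *d = dev;

    d->commands++;
    if (opcode == TEST_UNIT_READY) {
        d->saw_tur = 1;
        return 0;
    }
    return (d->needs_tur && !d->saw_tur) ? 1 : 0;
}
```

Devices that follow the spec never see the TUR in this arrangement. The cost falls on the quirky drives, which pay for one failed INQUIRY (in practice a full command timeout) before the fallback runs.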

I will look into the matter further.
Thanks again.

cheers,

eamb
