2002-10-02 12:34:51

by Eriksson Stig

Subject: aic7xxx problems?

Hi

Maybe you can help me out with this one...
I have an HP DLT drive connected to an Adaptec SCSI board.

This is a part of dmesg output:
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.6
<Adaptec (Compaq OEM) 3960D Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.6
<Adaptec (Compaq OEM) 3960D Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs

Vendor: BNCHMARK Model: DLT1 Rev: 5032
Type: Sequential-Access ANSI SCSI revision: 02

The "BNCHMARK DLT1" is actually an HP DLT.
When rewinding this tape using "mt -f /dev/nst0 rewind" with a reasonable
amount of data on the tape (~2 GB), I get the following in
/var/log/messages:

Sep 9 14:01:21 lack kernel: scsi1:0:5:0: Attempting to queue an ABORT
message
Sep 9 14:01:21 lack kernel: scsi1: Dumping Card State in Command phase, at
SEQADDR 0x168
Sep 9 14:01:21 lack kernel: ACCUM = 0x80, SINDEX = 0xa0, DINDEX = 0xe4,
ARG_2 = 0x0
Sep 9 14:01:21 lack kernel: HCNT = 0x0 SCBPTR = 0x0
Sep 9 14:01:21 lack kernel: SCSISEQ = 0x12, SBLKCTL = 0x6
Sep 9 14:01:21 lack kernel: DFCNTRL = 0x4, DFSTATUS = 0x89
Sep 9 14:01:21 lack kernel: LASTPHASE = 0x80, SCSISIGI = 0x84, SXFRCTL0 =
0x88
Sep 9 14:01:21 lack kernel: SSTAT0 = 0x7, SSTAT1 = 0x0
Sep 9 14:01:21 lack kernel: SCSIPHASE = 0x0
Sep 9 14:01:21 lack kernel: STACK == 0x175, 0x160, 0xe7, 0x34
Sep 9 14:01:21 lack kernel: SCB count = 4
Sep 9 14:01:21 lack kernel: Kernel NEXTQSCB = 3
Sep 9 14:01:21 lack kernel: Card NEXTQSCB = 3
Sep 9 14:01:21 lack kernel: QINFIFO entries:
Sep 9 14:01:21 lack kernel: Waiting Queue entries:
Sep 9 14:01:21 lack kernel: Disconnected Queue entries:
Sep 9 14:01:21 lack kernel: QOUTFIFO entries:
Sep 9 14:01:21 lack kernel: Sequencer Free SCB List: 1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Sep 9 14:01:22 lack kernel: Sequencer SCB Info: 0(c 0x40, s 0x57, l 0, t
0x2) 1(c 0x0, s 0x7f, l 223, t 0xff) 2(c 0x0, s 0xff, l 255, t 0xff) 3(c
0x0, s 0x7f, l 239, t 0xff) 4(c 0x0, s 0x0, l 0, t 0xff) 5(c 0x0, s 0x77, l
255, t 0xff) 6(c 0x0, s 0xfd, l 253, t 0xff) 7(c 0x0, s 0xff, l 255, t 0xff)
8(c 0x0, s 0x5f, l 190, t 0xff) 9(c 0x0, s 0xff, l 111, t 0xff) 10(c 0x0, s
0xb6, l 59, t 0xff) 11(c 0x0, s 0xdf, l 223, t 0xff) 12(c 0x0, s 0xff, l
254, t 0xff) 13(c 0x0, s 0xff, l 222, t 0xff) 14(c 0x0, s 0xfd, l 109, t
0xff) 15(c 0x0, s 0xe7, l 239, t 0xff) 16(c 0x0, s 0xd6, l 253, t 0xff) 17(c
0x0, s 0xfe, l 238, t 0xff) 18(c 0x0, s 0xe7, l 253, t 0xff) 19(c 0x0, s
0xff, l 127, t 0xff) 20(c 0x0, s 0x75, l 227, t 0xff) 21(c 0x0, s 0x5f, l
239, t 0xff) 22(c 0x0, s 0xff, l 255, t 0xff) 23(c 0x0, s 0x3f, l 255, t
0xff) 24(c 0x0, s 0xf6, l 127, t 0xff) 25(c 0x0, s 0xff, l 255, t 0xff) 26(c
0x0, s 0x3f, l 247, t 0xff) 27(c 0x0, s 0xf3, l 255, t 0xff) 28(c 0x0, s
0xef, l 255, t 0xff) 29(c 0x0, s 0xff, l 254, t 0xff) 3
Sep 9 14:01:22 lack kernel: (c 0x0, s 0xf6, l 221, t 0xff) 31(c 0x0, s
0xff, l 255, t 0xff)
Sep 9 14:01:22 lack kernel: Pending list: 2(c 0x40, s 0x57, l 0)
Sep 9 14:01:22 lack kernel: Kernel Free SCB list: 1 0
Sep 9 14:01:22 lack kernel: Untagged Q(5): 2
Sep 9 14:01:22 lack kernel: DevQ(0:5:0): 0 waiting
Sep 9 14:01:22 lack kernel: scsi1:0:5:0: Device is active, asserting ATN
Sep 9 14:01:22 lack kernel: Recovery code sleeping
Sep 9 14:01:22 lack kernel: (scsi1:A:5:0): Abort Message Sent
Sep 9 14:01:22 lack kernel: (scsi1:A:5:0): SCB 2 - Abort Completed.
Sep 9 14:01:22 lack kernel: Recovery SCB completes
Sep 9 14:01:22 lack kernel: Recovery code awake
Sep 9 14:01:22 lack kernel: aic7xxx_abort returns 0x2002

This does not happen with a small amount of data on the tape, only when
the rewind takes a *long* time.
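
The register values in a dump like the one above are easier to compare between
incidents once they are pulled into a table. What follows is a minimal sketch,
assuming Python 3 and that the lines wrapped by the mail client have been
rejoined first. The names parse_registers and REG_RE are invented for this
illustration and are not part of any driver tooling; the sample lines are
copied from the log above.

```python
import re

# Matches "NAME = 0x1f" pairs as printed in the card state dump.
# Decimal values (e.g. "SCB count = 4") are deliberately not matched.
REG_RE = re.compile(r"([A-Z][A-Z0-9_]*)\s*=\s*(0x[0-9a-fA-F]+)")


def parse_registers(lines):
    """Return {register name: integer value} for every NAME = 0x.. pair."""
    regs = {}
    for line in lines:
        # Drop the syslog prefix ("Sep 9 ... kernel: ") when it is present.
        _, _, payload = line.partition("kernel: ")
        for name, value in REG_RE.findall(payload or line):
            regs[name] = int(value, 16)
    return regs


# Lines taken from the dump above, with mail-wrapped lines rejoined.
sample = [
    "Sep 9 14:01:21 lack kernel: ACCUM = 0x80, SINDEX = 0xa0, DINDEX = 0xe4, ARG_2 = 0x0",
    "Sep 9 14:01:21 lack kernel: HCNT = 0x0 SCBPTR = 0x0",
    "Sep 9 14:01:21 lack kernel: LASTPHASE = 0x80, SCSISIGI = 0x84, SXFRCTL0 = 0x88",
    "Sep 9 14:01:21 lack kernel: SSTAT0 = 0x7, SSTAT1 = 0x0",
]

if __name__ == "__main__":
    for name, value in sorted(parse_registers(sample).items()):
        print(f"{name:10s} {value:#04x}")
```

The sketch only extracts values; it makes no attempt to interpret what each
register means.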

Best Regards
--
Stig Eriksson email: [email protected]
Cactus Automation AB phone: +46 31 86 97 10
Krokslätts Fabriker 30 fax: +46 31 86 97 24
431 37 Mölndal http: http://www.cactus.se


2002-10-02 14:16:32

by Jos Hulzink

Subject: Re: aic7xxx problems?

On Wed, 2 Oct 2002, Eriksson Stig wrote:

> Hi
>
> The "BNCHMARK DLT1" is actually a hp DLT.
> When rewinding this tape, using "mt -f /dev/nst0 rewind" with a reasonable
> amount of data on the tape (~2 Gigs), i get the following in
> /var/log/messages:
>
> Sep 9 14:01:21 lack kernel: scsi1:0:5:0: Attempting to queue an ABORT
> message
> Sep 9 14:01:21 lack kernel: scsi1: Dumping Card State in Command phase, at
> SEQADDR 0x168

This bug seems almost as old as 2.5. I had errors like this in (IIRC) 2.5.9
already. Since I no longer trusted any 2.5 kernel (I still see serious PIIX
bugs appearing on this list), I never tried to debug it. Maybe I'll say
a prayer and install 2.5.40 tonight.

Jos

2002-10-02 17:03:31

by Justin T. Gibbs

Subject: Re: aic7xxx problems?

> Hi
>
> Maybe You can help me out with this one...
> I have hp DLT connected to an adaptec SCSI board.

From the perspective of the controller, the target has taken the
full command but has yet to REQ for either a cdb transfer retry
or a new phase. This looks like a target problem or a cabling
problem that prevents the initiator from seeing a REQ or two.

--
Justin

2002-10-02 17:05:38

by Justin T. Gibbs

Subject: Re: aic7xxx problems?

>> Hi
>>
>> Maybe You can help me out with this one...
>> I have hp DLT connected to an adaptec SCSI board.
>
> From the perspective of the controller, the target has taken the
> full command but has yet to REQ for either a cdb transfer retry
> or a new phase. This looks like a target problem or a cabling
> problem that prevents the initiator from seeing a REQ or two.

Actually, in reviewing your message more fully, the problem is that
the timeout for the rewind operation is too short for your configuration.
The timeout should go away if you bump up the timeout in the st driver
so that your tape drive can rewind in peace.
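
One way to raise the timeout at run time, rather than editing the driver
source, is the st driver's MTSETDRVBUFFER ioctl with the MT_ST_SET_LONG_TIMEOUT
option. The following is a rough sketch under several assumptions: the running
kernel's st driver supports the timeout options, rewind uses the "long"
timeout, the numeric constants match <linux/mtio.h> on your system, and the
MTIOCTOP request number uses the common ioctl encoding (x86, ARM; some other
architectures differ). The function names and the 1800-second value are made up
for illustration.

```python
import fcntl
import struct

# Values as defined in <linux/mtio.h>, copied here by hand.
MTSETDRVBUFFER = 24
MT_ST_TIMEOUTS = 0x70000000
MT_ST_SET_LONG_TIMEOUT = MT_ST_TIMEOUTS | 0x100000

# ASSUMPTION: _IOW('m', 1, struct mtop) under the common ioctl encoding.
MTIOCTOP = 0x40086D01


def long_timeout_request(seconds):
    """Build a struct mtop { short mt_op; int mt_count; } for the request."""
    if not 0 < seconds < 0x100000:
        raise ValueError("timeout must fit in the low 20 bits")
    return struct.pack("hi", MTSETDRVBUFFER, MT_ST_SET_LONG_TIMEOUT | seconds)


def set_long_timeout(device, seconds):
    """Ask the st driver to use `seconds` as its long timeout (needs root)."""
    with open(device, "rb", buffering=0) as tape:
        fcntl.ioctl(tape, MTIOCTOP, long_timeout_request(seconds))


if __name__ == "__main__":
    # Hypothetical value: allow 30 minutes for rewind and similar commands.
    set_long_timeout("/dev/nst0", 1800)
```

Where the mt-st package is installed, its "stlongtimeout" subcommand should
set the same value from the command line. The setting does not persist
across a reboot or a reload of the st module.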

--
Justin

2002-10-02 22:45:33

by Jos Hulzink

Subject: Re: aic7xxx problems?

On Wednesday 02 October 2002 19:10, Justin T. Gibbs wrote:
>
> Actually, in reviewing your message more fully, the problem is that
> the timeout for the rewind operation is too short for your configuration.
> The timeout should go away if you bump up the timeout in the st driver
> so that your tape drive can rewind in peace.

I guess there is something seriously wrong in the driver then: my SCSI cdrom
writers have the same problem. Result: lots of bad CDs.

Jos

2002-10-02 23:28:27

by Justin T. Gibbs

Subject: Re: aic7xxx problems?

> On Wednesday 02 October 2002 19:10, Justin T. Gibbs wrote:
>>
>> Actually, in reviewing your message more fully, the problem is that
>> the timeout for the rewind operation is too short for your configuration.
>> The timeout should go away if you bump up the timeout in the st driver
>> so that your tape drive can rewind in peace.
>
> I guess there is something seriously wrong in the driver then: my SCSI
> cdrom writers have the same problem. Result: lots of bad CDs.
>
> Jos

I would have to see the messages to say.

--
Justin

2002-10-03 07:25:37

by Jos Hulzink

Subject: Re: aic7xxx problems?

On Thursday 03 October 2002 01:33, Justin T. Gibbs wrote:
> > On Wednesday 02 October 2002 19:10, Justin T. Gibbs wrote:
> >
> > I guess there is something seriously wrong in the driver then: my SCSI
> > cdrom writers have the same problem. Result: lots of bad CDs.
> >
> > Jos
>
> I would have to see the messages to say.

Unfortunately all 2.5 log files are gone since the improved IDE driver did
some non-deterministic sector destruction. I'm compiling 2.5.40 at the
moment. I'll try to reproduce the errors.

Jos

2002-10-03 07:37:08

by Eriksson Stig

Subject: RE: aic7xxx problems?



> >> Hi
> >>
> >> Maybe You can help me out with this one...
> >> I have hp DLT connected to an adaptec SCSI board.
> >
> > From the perspective of the controller, the target has taken the
> > full command but has yet to REQ for either a cdb transfer retry
> > or a new phase. This looks like a target problem or a cabling
> > problem that prevents the initiator from seeing a REQ or two.
>
> Actually, in reviewing your message more fully, the problem is that
> the timeout for the rewind operation is too short for your
> configuration.
> The timeout should go away if you bump up the timeout in the st driver
> so that your tape drive can rewind in peace.

The rewind is not *that* long, about 60 seconds...

--
Stig Eriksson

2002-10-03 18:22:53

by Justin T. Gibbs

Subject: RE: aic7xxx problems?

>> Actually, in reviewing your message more fully, the problem is that
>> the timeout for the rewind operation is too short for your
>> configuration.
>> The timeout should go away if you bump up the timeout in the st driver
>> so that your tape drive can rewind in peace.
>
> The rewind is not *that* long, about 60 seconds...

Well, we are still waiting on the drive to do something, so it's not
the aic7xxx driver's fault.

--
Justin

2002-10-04 18:35:13

by Doug Ledford

Subject: Re: aic7xxx problems?

On Thu, Oct 03, 2002 at 12:24:03PM -0600, Justin T. Gibbs wrote:
> >> Actually, in reviewing your message more fully, the problem is that
> >> the timeout for the rewind operation is too short for your
> >> configuration.
> >> The timeout should go away if you bump up the timeout in the st driver
> >> so that your tape drive can rewind in peace.
> >
> > The rewind is not *that* long, about 60 seconds...
>
> Well, we are still waiting on the drive to do something, so its not
> the aic7xxx driver's fault.

It's possible that the controller could have disconnect disabled for the
tape drive, causing it to hold the bus the entire time and making other
commands time out (although unlikely unless someone actually went in and
turned it off in the adapter config...)

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2002-10-04 20:16:30

by jbradford

Subject: Re: aic7xxx problems?

> > >> Actually, in reviewing your message more fully, the problem is that
> > >> the timeout for the rewind operation is too short for your
> > >> configuration.
> > >> The timeout should go away if you bump up the timeout in the st driver
> > >> so that your tape drive can rewind in peace.
> > >
> > > The rewind is not *that* long, about 60 seconds...
> >
> > Well, we are still waiting on the drive to do something, so its not
> > the aic7xxx driver's fault.
>
> It's possible that the controller could have disconnect disabled for the
> tape drive, causing it to hold the bus the entire time and making other
> commands time out (although unlikely unless someone actually went in and
> turned it off in the adapter config...)

Have you checked the settings in the adapter's own BIOS? Most (all?) Adaptec
cards let you change things like disconnect and sync negotiation on a
per-device basis. Just press Ctrl-A at boot to run the SCSISelect utility.

Also, do you have the latest firmware for your card? If it is a genuine
Adaptec card, _not_ an OEM one, then I believe they will send you a new
BIOS for it.

John.