2003-05-31 16:46:42

by Daniel Podlejski

[permalink] [raw]
Subject: AIC7xxx problem

I have Adaptec SCSI controler, which with 2.4.20-ac2 boots ok:

====================================================================
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
<Adaptec 2940 Ultra2 SCSI adapter>
aic7890/91: Ultra2 Wide Channel A, SCSI Id=15, 32/253 SCBs

(scsi0:A:0): 80.000MB/s transfers (40.000MHz, offset 63, 16bit)
Vendor: IBM Model: DPSS-318350N Rev: S96H
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:0:0: Tagged Queuing enabled. Depth 16
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 35843670 512-byte hdwr sectors (18352 MB)
Partition check:
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 >
====================================================================

but performance is poor - periodically all disk operations
stops for few seconds. I try to use newer drivers, but without
positiver results. Here is log from boot with verbose option:

====================================================================
[...]
SCSI subsystem driver Revision: 1.00
ahc_pci:2:10:0: Reading SEEPROM...done.
ahc_pci:2:10:0: Manual LVD Termination
ahc_pci:2:10:0: BIOS eeprom is present
ahc_pci:2:10:0: Secondary High byte termination Enabled
ahc_pci:2:10:0: Secondary Low byte termination Enabled
ahc_pci:2:10:0: Primary Low Byte termination Enabled
ahc_pci:2:10:0: Primary High Byte termination Enabled
ahc_pci:2:10:0: Downloading Sequencer Program... 423 instructions downloaded
ahc_pci:2:10:0: Features 0x56f6, Bugs 0x6, Flags 0x20485500
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.35
<Adaptec 2940 Ultra2 SCSI adapter>
aic7890/91: Ultra2 Wide Channel A, SCSI Id=15, 32/253 SCBs

(scsi0:A:0): 980KB/s transfers (0.980MHz, offset 255)
scsi0: target 0 using 8bit transfers
(scsi0:A:0): 3.300MB/s transfers
scsi0: target 0 using asynchronous transfers
(scsi0:A:1): 980KB/s transfers (0.980MHz, offset 255)
scsi0: target 1 using 8bit transfers
(scsi0:A:1): 3.300MB/s transfers
scsi0: target 1 using asynchronous transfers

[...]

(scsi0:A:14): 980KB/s transfers (0.980MHz, offset 255)
scsi0: target 14 using 8bit transfers
(scsi0:A:14): 3.300MB/s transfers
scsi0: target 14 using asynchronous transfers
scsi0: target 15 using 8bit transfers
scsi0: target 15 using asynchronous transfers
scsi0: target 0 using 8bit transfers
scsi0: target 0 using asynchronous transfers
scsi0: target 1 using 8bit transfers
scsi0: target 1 using asynchronous transfers

[...]

scsi0: target 12 using 8bit transfers
scsi0: target 12 using asynchronous transfers
scsi0: target 13 using 8bit transfers
scsi0: target 13 using asynchronous transfers
scsi0:0:0:0: Attempting to queue an ABORT message
CDB: 0x12 0x0 0x0 0x0 0xff 0x0
ahc_intr: HOST_MSG_LOOP bad phase 0x0
scsi0: At time of recovery, card was paused
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi0: Dumping Card State while idle, at SEQADDR 0x45
Card was paused
ACCUM = 0xa0, SINDEX = 0x61, DINDEX = 0xe4, ARG_2 = 0x1
HCNT = 0x0 SCBPTR = 0x0
SCSISIGI[0x0] ERROR[0x0] SCSIBUSL[0x0] LASTPHASE[0x1]:(P_BUSFREE)
SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI) SBLKCTL[0xa]:(SELWIDE|SELBUSB)
SCSIRATE[0x0] SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0x40]:(NO_CDB_SENT)
SSTAT0[0x0] SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x8]:(ENSWRAP)
SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) SXFRCTL0[0x88]:(SPIOEN|DFON)
DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
STACK: 0x3 0xe3 0x0 0x0
SCB count = 5
Kernel NEXTQSCB = 3
Card NEXTQSCB = 4
QINFIFO entries: 4
Waiting Queue entries:
Disconnected Queue entries:
QOUTFIFO entries:
Sequencer Free SCB List: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 3 Sequencer SCB Info:
0 SCB_CONTROL[0x50]:(MK_MESSAGE|DISCENB) SCB_SCSIID[0xf]:(OID)
SCB_LUN[0x0] SCB_TAG[0xff]
1 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
2 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
3 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
4 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
5 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
6 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
7 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
8 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
9 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
10 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
11 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
12 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
13 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
14 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
15 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
16 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
17 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
18 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
19 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
20 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
21 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
22 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
23 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
24 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
25 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
26 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
27 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
28 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
29 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
30 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
31 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
Pending list:
4 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0xf]:(OID) SCB_LUN[0x0]
Kernel Free SCB list: 2 1 0
Untagged Q(0): 4
DevQ(0:0:0): 0 waiting

<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
scsi0:0:0:0: Cmd aborted from QINFIFO
aic7xxx_abort returns 0x2002
scsi0: target 14 using 8bit transfers
scsi0: target 14 using asynchronous transfers
====================================================================

Any ideas to fix ?

--
Daniel Podlejski <[email protected]>
... 'Cause yesterday's got nothin' for me
Old pictures that I'll always see ...


2003-06-01 08:06:22

by Daniel Podlejski

[permalink] [raw]
Subject: Re: AIC7xxx problem

Daniel Podlejski wrote:
[...]
: I have Adaptec SCSI controler, which with 2.4.20-ac2 boots ok:
: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
: scsi0:0:0:0: Cmd aborted from QINFIFO
: aic7xxx_abort returns 0x2002
: scsi0: target 14 using 8bit transfers
: scsi0: target 14 using asynchronous transfers
: ====================================================================
:
: Any ideas to fix ?

After switch off APIC support works fine.

--
Daniel Podlejski <[email protected]>
... You can check out any time you like
But you can never leave ...

2003-06-01 08:23:48

by Willy Tarreau

[permalink] [raw]
Subject: Re: AIC7xxx problem

On Sat, May 31, 2003 at 06:59:45PM +0200, Daniel Podlejski wrote:

> (scsi0:A:0): 80.000MB/s transfers (40.000MHz, offset 63, 16bit)
> Vendor: IBM Model: DPSS-318350N Rev: S96H
> Type: Direct-Access ANSI SCSI revision: 03

<...>

> (scsi0:A:0): 980KB/s transfers (0.980MHz, offset 255)
> scsi0: target 0 using 8bit transfers
> (scsi0:A:0): 3.300MB/s transfers
> scsi0: target 0 using asynchronous transfers

Hmmm that makes quite a difference ! I didn't understand what happened between
these two outputs. Also, did you try with Justin's latest version of the driver:

http://people.freebsd.org/~gibbs/linux/SRC/

It fixed many problems for several of us.

Regards,
Willy

2003-06-01 20:21:21

by Justin T. Gibbs

[permalink] [raw]
Subject: Re: AIC7xxx problem

> Hmmm that makes quite a difference ! I didn't understand what happened between
> these two outputs. Also, did you try with Justin's latest version of the driver:
>

My driver can't fix interrupt routing issues which is what Daniel's
problem turned out to be. I'm really tempted to add an interrupt
test to the driver attach so that these kinds of problems are clearly
flagged and my driver doesn't continue to get blamed for interrupt
routing it can't control.

--
Justin

2003-06-01 20:32:05

by Willy Tarreau

[permalink] [raw]
Subject: Re: AIC7xxx problem

On Sun, Jun 01, 2003 at 02:34:40PM -0600, Justin T. Gibbs wrote:
> > Hmmm that makes quite a difference ! I didn't understand what happened between
> > these two outputs. Also, did you try with Justin's latest version of the driver:
> >
>
> My driver can't fix interrupt routing issues which is what Daniel's
> problem turned out to be. I'm really tempted to add an interrupt
> test to the driver attach so that these kinds of problems are clearly
> flagged and my driver doesn't continue to get blamed for interrupt
> routing it can't control.

If this is (relatively) easy to do, I really think it could be a valuable
diagnostic tool. I'd prefer to get a clear "fix your APIC" or any insult
about my hardware config than devices detection dying in endless timeout
loops.

This principle may even be generalized to any other driver which can make the
device trigger an interrupt.

Cheers,
Willy

2003-06-01 20:53:48

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: AIC7xxx problem

On Sun, 1 Jun 2003, Justin T. Gibbs wrote:

> > Hmmm that makes quite a difference ! I didn't understand what happened between
> > these two outputs. Also, did you try with Justin's latest version of the driver:
> >
>
> My driver can't fix interrupt routing issues which is what Daniel's
> problem turned out to be. I'm really tempted to add an interrupt
> test to the driver attach so that these kinds of problems are clearly
> flagged and my driver doesn't continue to get blamed for interrupt
> routing it can't control.

Which aspect of interrupt routing is broken so that we at least can have a
go at fixing it? I might be missing something here but it looks fine,
could you elaborate?

2.4.18

IRQ to pin mappings:
IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
IRQ16 -> 1:0
IRQ17 -> 1:1
IRQ18 -> 1:2
IRQ19 -> 1:3
IRQ20 -> 1:4
IRQ21 -> 1:5
IRQ22 -> 1:6
IRQ23 -> 1:7
IRQ28 -> 1:12
IRQ29 -> 1:13

CPU0 CPU1 CPU2
0: 3354580 4108947 4515468 IO-APIC-edge timer
1: 2 0 0 IO-APIC-edge keyboard
2: 0 0 0 XT-PIC cascade
4: 434 467 729 IO-APIC-edge serial
8: 1 0 0 IO-APIC-edge rtc
19: 73764 78100 80631 IO-APIC-level eth0
28: 301389 301350 302498 IO-APIC-level aic7xxx
29: 79542 82186 83042 IO-APIC-level aic7xxx
NMI: 11978872 11978872 11978872
LOC: 11978887 11978722 11978731
ERR: 0
MIS: 0

2.5.70

IRQ to pin mappings:
IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
IRQ16 -> 1:0
IRQ17 -> 1:1
IRQ18 -> 1:2
IRQ19 -> 1:3
IRQ20 -> 1:4
IRQ21 -> 1:5
IRQ22 -> 1:6
IRQ23 -> 1:7
IRQ28 -> 1:12
IRQ29 -> 1:13

<no /proc/interrupts because it never makes it to a single user prompt>

--
function.linuxpower.ca

2003-06-01 21:24:38

by Justin T. Gibbs

[permalink] [raw]
Subject: Re: AIC7xxx problem

>> My driver can't fix interrupt routing issues which is what Daniel's
>> problem turned out to be. I'm really tempted to add an interrupt
>> test to the driver attach so that these kinds of problems are clearly
>> flagged and my driver doesn't continue to get blamed for interrupt
>> routing it can't control.
>
> Which aspect of interrupt routing is broken so that we at least can have a
> go at fixing it? I might be missing something here but it looks fine,
> could you elaborate?

Daniel is comparing 2.4.20-ac2 with 2.4.21-rc6. In 2.4.20-ac2, APIC
routing is disabled by default and his kernel works. In 2.4.21-rc6,
APIC routing is enabled by default and interrupts are not properly
routed to his SCSI controller. If he boots with noapic, everything
works fine. You'll have to ask Daniel for more details on his system
if you want to figure out why interrupts are not being delivered.
All I know is, from the output and his testing, it is pretty obvious
that interrupts are not being delivered.

--
Justin

2003-06-01 21:35:14

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: AIC7xxx problem

On Sun, 1 Jun 2003, Justin T. Gibbs wrote:

> Daniel is comparing 2.4.20-ac2 with 2.4.21-rc6. In 2.4.20-ac2, APIC
> routing is disabled by default and his kernel works. In 2.4.21-rc6,
> APIC routing is enabled by default and interrupts are not properly
> routed to his SCSI controller. If he boots with noapic, everything
> works fine. You'll have to ask Daniel for more details on his system
> if you want to figure out why interrupts are not being delivered.
> All I know is, from the output and his testing, it is pretty obvious
> that interrupts are not being delivered.

Ok i'll ask him about the details, but i've posted on a number of
occasions about aic7xxx oopsing unless i boot with noapic. Interrupts do
get delivered otherwise it wouldn't even get to mounting root. I can't
give you a 2.5.70 boot because raid is horked there too. If you want me to
fish out the emails again i can do that.

Zwane

--
function.linuxpower.ca