2003-01-03 10:05:47

by Dipankar Sarma

Subject: aic7xxx broken in 2.5.53/54 ?

Looks like the aic7xxx driver in 2.5.53 and 54 is broken on my hardware.
The older driver works fine. The new driver used to work until 2.5.52.
Does this look familiar to anyone ?

hda: ATAPI 48X CD-ROM drive, 120kB Cache, (U)DMA
Uniform CD-ROM driver Revision: 3.12
end_request: I/O error, dev hda, sector 0
aic7xxx: PCI Device 0:1:0 failed memory mapped test. Using PIO.
Uhhuh. NMI received for unknown reason 25 on CPU 0.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
aic7xxx: PCI Device 0:1:1 failed memory mapped test. Using PIO.
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.25
<Adaptec aic7896/97 Ultra2 SCSI adapter>
aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi0: PCI error Interrupt at seqaddr = 0x2
scsi0: Signaled a Target Abort
scsi1: PCI error Interrupt at seqaddr = 0x2
scsi1: Signaled a Target Abort
Uhhuh. NMI received for unknown reason 25 on CPU 0.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
(scsi1:A:0): 80.000MB/s transfers (40.000MHz, offset 63, 16bit)
(scsi1:A:1): 80.000MB/s transfers (40.000MHz, offset 63, 16bit)
(scsi1:A:2): 80.000MB/s transfers (40.000MHz, offset 63, 16bit)
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.25
<Adaptec aic7896/97 Ultra2 SCSI adapter>
aic7896/97: Ultra2 Wide Channel B, SCSI Id=7, 32/253 SCBs

Vendor: IBM-ESXS Model: ST318305LC !# Rev: B245
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:0:0: Tagged Queuing enabled. Depth 253
Vendor: IBM-ESXS Model: ST318305LC !# Rev: B245
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:1:0: Tagged Queuing enabled. Depth 253
Vendor: IBM-ESXS Model: ST318305LC !# Rev: B245
Type: Direct-Access ANSI SCSI revision: 03
scsi1:A:2:0: Tagged Queuing enabled. Depth 253
Vendor: IBM Model: AuSaV1S2 Rev: 0
Type: Processor ANSI SCSI revision: 02

The hardware (4-CPU P3 Xeon):

[root@llm04 root]# lspci
00:00.0 Host bridge: ServerWorks CNB20HE Host Bridge (rev 21)
00:00.1 Host bridge: ServerWorks CNB20HE Host Bridge (rev 01)
00:00.2 Host bridge: ServerWorks: Unknown device 0006
00:00.3 Host bridge: ServerWorks: Unknown device 0006
00:01.0 SCSI storage controller: Adaptec AIC-7896U2/7897U2
00:01.1 SCSI storage controller: Adaptec AIC-7896U2/7897U2
00:05.0 Ethernet controller: Advanced Micro Devices [AMD] 79c970 [PCnet LANCE]
00:06.0 VGA compatible controller: S3 Inc. Trio 64 3D (rev 01)
00:0f.0 ISA bridge: ServerWorks OSB4 South Bridge (rev 50)
00:0f.1 IDE interface: ServerWorks OSB4 IDE Controller
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 USB Controller (rev 04)
02:03.0 RAID bus controller: IBM Netfinity ServeRAID controller
02:05.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0c)
02:06.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0c)

Thanks
Dipankar


2003-01-03 15:07:45

by Justin T. Gibbs

Subject: Re: aic7xxx broken in 2.5.53/54 ?

> Looks like the aic7xxx driver in 2.5.53 and 54 is broken on my hardware.

It looks like the driver recovers fine.

...

> aic7xxx: PCI Device 0:1:0 failed memory mapped test. Using PIO.
> Uhhuh. NMI received for unknown reason 25 on CPU 0.

SERR must be enabled by your BIOS. I will change the driver so
that, should the memory mapped I/O test fail, an SERR (and thus an
NMI) is not generated.
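
A minimal sketch of the bit manipulation behind keeping a failed probe from
raising SERR#, assuming the standard PCI command-register layout (parity-error
response is bit 6, SERR# enable is bit 8). The function names and the sample
command word are hypothetical, and this is not the driver's actual code.

```python
# Bits of the 16-bit PCI command register (standard PCI layout).
PCI_COMMAND_PARITY = 0x0040  # respond to detected parity errors
PCI_COMMAND_SERR = 0x0100    # device may assert SERR#


def serr_enabled(command):
    """True if the command word lets the device signal SERR#."""
    return bool(command & PCI_COMMAND_SERR)


def mask_error_reporting(command):
    """Command word to use while probing: parity response and SERR# off."""
    return command & ~(PCI_COMMAND_PARITY | PCI_COMMAND_SERR) & 0xFFFF


# Hypothetical command word as a BIOS might leave it:
# memory decode + bus master + parity response + SERR# enable.
bios_command = 0x0146
probe_command = mask_error_reporting(bios_command)
```

Under these assumptions, a driver would write probe_command before the memory
mapped test and put bios_command back only once the test has passed.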

...

> scsi0: PCI error Interrupt at seqaddr = 0x2
> scsi0: Signaled a Target Abort

These are left over from the failed memory mapped I/O test. They
should have been cleared by the test, but the behavior must be
different for the 7896/97. I'll review the documentation for this
chip and see if I can quiet up the failure.
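
For readers unfamiliar with how such leftover error bits behave, here is a
small sketch that assumes the standard PCI status-register layout, where the
error bits are set by hardware and cleared by writing a 1 to them. The sample
status value and helper names are hypothetical; the 7896/97 may well differ,
which is the point being raised above.

```python
# Error bits of the 16-bit PCI status register (standard PCI layout).
# Hardware sets them; software clears each one by writing a 1 to it.
PCI_STATUS_ERRORS = {
    0x0100: "Master Data Parity Error",
    0x0800: "Signaled Target Abort",
    0x1000: "Received Target Abort",
    0x2000: "Received Master Abort",
    0x4000: "Signaled System Error",
    0x8000: "Detected Parity Error",
}
ERROR_MASK = sum(PCI_STATUS_ERRORS)  # sums the dict keys: 0xF900


def decode_errors(status):
    """Names of the error bits currently set in a status word."""
    return [name for bit, name in sorted(PCI_STATUS_ERRORS.items()) if status & bit]


def after_write_one_to_clear(status, written):
    """Model the status word after software writes `written` to the register."""
    return status & ~(written & ERROR_MASK) & 0xFFFF


# Hypothetical stale value: Signaled Target Abort plus two capability bits.
stale = 0x0A80
cleared = after_write_one_to_clear(stale, stale & ERROR_MASK)
```

If a test leaves such a bit set and never writes it back, a later interrupt
handler that reads the register would still see and report it.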

Just out of curiosity, do you have any strange PCI options enabled
in your BIOS? I remember seeing memory mapped I/O failures on this
ServerWorks chipset under FreeBSD in the past, but an updated BIOS
resolved the issue for the affected users. It seemed that the BIOS
incorrectly placed the Adaptec controller in a prefetchable region.
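
One quick way to check for this from user space is to look at the memory
regions that `lspci -v` reports for the controller. The sketch below parses
lines in the format that appears in the `lspci -v` output posted later in this
thread; the function name is hypothetical and the regular expression only
covers that one line format.

```python
import re

# Matches the memory-region lines that `lspci -v` prints, e.g.
#   Memory at f4300000 (64-bit, non-prefetchable) [size=4K]
REGION = re.compile(r"Memory at ([0-9a-f]+) \((\d+)-bit, (non-)?prefetchable\)")


def memory_regions(lspci_text):
    """Return (address, width_bits, prefetchable) for each memory region line."""
    return [
        (int(addr, 16), int(width), not non)  # `non` is "" when absent
        for addr, width, non in REGION.findall(lspci_text)
    ]


sample = """\
00:0c.0 SCSI storage controller: Adaptec 7896
Memory at f4300000 (64-bit, non-prefetchable) [size=4K]
00:14.0 VGA compatible controller: Cirrus Logic GD 5480 (rev 23) (prog-if 00 [VGA])
Memory at f6000000 (32-bit, prefetchable) [size=32M]
"""
regions = memory_regions(sample)
```

A controller register window that shows up with prefetchable set to True would
be the situation described above.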

--
Justin

2003-01-06 07:21:45

by Dipankar Sarma

Subject: Re: aic7xxx broken in 2.5.53/54 ?

Hi Justin,

On Fri, Jan 03, 2003 at 08:14:06AM -0700, Justin T. Gibbs wrote:
> > Looks like the aic7xxx driver in 2.5.53 and 54 is broken on my hardware.
>
> It looks like the driver recovers fine.

Not for long. It dies shortly afterwards.

> > aic7xxx: PCI Device 0:1:0 failed memory mapped test. Using PIO.
> > Uhhuh. NMI received for unknown reason 25 on CPU 0.
>
> SERR must be enabled by your BIOS. I will change the driver so
> that, should the memory mapped I/O test fail, an SERR (and thus an
> NMI) is not generated.

I guess having to use PIO with aic7xxx is bad. MMIO failure is
what we need to investigate.

>
> Just out of curiosity, do you have any strange PCI options enabled
> in your BIOS? I remember seeing memory mapped I/O failures on this
> ServerWorks chipset under FreeBSD in the past, but an updated BIOS
> resolved the issue for the affected users. It seemed that the BIOS
> incorrectly placed the Adaptec controller in a prefetchable region.
>

I didn't change anything in that box since it was delivered to me. FYI
it is an IBM x250. Would it help if I can get a PCI space dump and mtrr
dump ? FWIW, the older driver works fine. Does the older driver use
only PIO ?

Thanks
Dipankar

2003-01-06 16:08:33

by Justin T. Gibbs

Subject: Re: aic7xxx broken in 2.5.53/54 ?

> Hi Justin,
>
> On Fri, Jan 03, 2003 at 08:14:06AM -0700, Justin T. Gibbs wrote:
>> > Looks like the aic7xxx driver in 2.5.53 and 54 is broken on my
>> > hardware.
>>
>> It looks like the driver recovers fine.
>
> Not for long. It dies shortly afterwards.

In what fashion?

>> > aic7xxx: PCI Device 0:1:0 failed memory mapped test. Using PIO.
>> > Uhhuh. NMI received for unknown reason 25 on CPU 0.
>>
>> SERR must be enabled by your BIOS. I will change the driver so
>> that, should the memory mapped I/O test fail, an SERR (and thus an
>> NMI) is not generated.
>
> I guess having to use PIO with aic7xxx is bad. MMIO failure is
> what we need to investigate.

The only way that I know how to investigate these issues is
with a PCI bus analyzer. We're in the process of going through
all of the systems we have in our lab to see which ones fail and
why, but I certainly don't have one of every failing system on
the planet. 8-)

>> Just out of curiosity, do you have any strange PCI options enabled
>> in your BIOS? I remember seeing memory mapped I/O failures on this
>> ServerWorks chipset under FreeBSD in the past, but an updated BIOS
>> resolved the issue for the affected users. It seemed that the BIOS
>> incorrectly placed the Adaptec controller in a prefetchable region.
>>
>
> I didn't change anything in that box since it was delivered to me. FYI
> it is an IBM x250. Would it help if I can get a PCI space dump and mtrr
> dump ? FWIW, the older driver works fine. Does the older driver use
> only PIO ?

It would be good to know the chipset on the motherboard. As to why
the old driver worked, for 6.X.X drivers, you may have just been lucky.
For 5.X.X drivers, they perform a read after every register write to
"manually" prevent any byte-merging. These reads are actually more
expensive than just using PIO. Neither of these older drivers included
a test to try and catch fishy behavior.
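
To illustrate why a read after every write defeats byte-merging, here is a toy
model, not real hardware or driver code. It assumes a bridge that posts byte
writes and merges those landing in the same 32-bit word, and that a read
forces posted writes out first. The class, the function, and the register
addresses 0x60 and 0x61 are all hypothetical.

```python
class ToyBridge:
    """Toy bridge: posts byte writes and merges those in the same 32-bit word."""

    def __init__(self):
        self.pending = None   # (dword_address, {byte_offset: value}) or None
        self.bus_writes = []  # write transactions that reached the device

    def write8(self, addr, value):
        dword, offset = addr & ~3, addr & 3
        # A write to a different word pushes out whatever is pending.
        if self.pending is not None and self.pending[0] != dword:
            self.flush()
        if self.pending is None:
            self.pending = (dword, {})
        self.pending[1][offset] = value & 0xFF

    def flush(self):
        if self.pending is not None:
            self.bus_writes.append(self.pending)
            self.pending = None

    def read8(self, addr):
        # A read may not pass posted writes, so the buffer drains first.
        self.flush()
        return 0  # the value read is irrelevant to this model


def count_write_transactions(read_back):
    """Write two adjacent byte registers, optionally reading back after each."""
    bridge = ToyBridge()
    for addr, value in [(0x60, 0x01), (0x61, 0x02)]:
        bridge.write8(addr, value)
        if read_back:
            bridge.read8(addr)
    bridge.flush()
    return len(bridge.bus_writes)
```

In this model the two byte writes reach the device as one merged transaction
without the read-back and as two separate ones with it, at the cost of an
extra read per write.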

--
Justin

2003-01-06 16:27:58

by uaca

Subject: Re: aic7xxx broken in 2.5.53/54 ?

Hi,

I have to report the same problem. Tomorrow I will complete the info,
including the BIOS version and read-only tests on a disk device.

At least I can say I've seen these messages:

> scsi0: PCI error Interrupt at seqaddr = 0x2
> scsi0: Signaled a Target Abort
> scsi1: PCI error Interrupt at seqaddr = 0x2
> scsi1: Signaled a Target Abort

> It would be good to know the chipset on the motherboard. As to why
> the old driver worked, for 6.X.X drivers, you may have just been lucky.
> For 5.X.X drivers, they perform a read after every register write to
> "manually" prevent any byte-merging. These reads are actually more
> expensive than just using PIO. Neither of these older drivers included
> a test to try and catch fishy behavior.


Kernel 2.4.x works fine

now some info about chipset, etc...


00:00.0 Host bridge: Intel Corporation 440GX - 82443GX Host bridge
Flags: bus master, medium devsel, latency 64
Memory at f8000000 (32-bit, prefetchable) [size=64M]
Capabilities: <available only to root>

00:01.0 PCI bridge: Intel Corporation 440GX - 82443GX AGP bridge (prog-if 00 [Normal decode])
Flags: bus master, 66Mhz, medium devsel, latency 64
Bus: primary=00, secondary=01, subordinate=02, sec-latency=64

00:0b.0 ATM network controller: FORE Systems Inc PCA-200E
Flags: bus master, medium devsel, latency 64, IRQ 18
Memory at f4000000 (32-bit, non-prefetchable) [size=2M]
Expansion ROM at <unassigned> [disabled] [size=8K]

00:0c.0 SCSI storage controller: Adaptec 7896
Subsystem: Adaptec: Unknown device 0053
Flags: bus master, medium devsel, latency 64, IRQ 19
BIST result: 00
I/O ports at 2000 [disabled] [size=256]
Memory at f4300000 (64-bit, non-prefetchable) [size=4K]
Expansion ROM at <unassigned> [disabled] [size=128K]
Capabilities: <available only to root>

00:0c.1 SCSI storage controller: Adaptec 7896
Subsystem: Adaptec: Unknown device 0053
Flags: bus master, medium devsel, latency 64, IRQ 19
BIST result: 00
I/O ports at 2400 [disabled] [size=256]
Memory at f4301000 (64-bit, non-prefetchable) [size=4K]
Capabilities: <available only to root>

00:0e.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
Subsystem: Intel Corporation 82559 Fast Ethernet LAN on Motherboard
Flags: bus master, medium devsel, latency 64, IRQ 21
Memory at f4302000 (32-bit, non-prefetchable) [size=4K]
I/O ports at 2800 [size=64]
Memory at f4200000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at <unassigned> [disabled] [size=1M]
Capabilities: <available only to root>

00:12.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
Flags: bus master, medium devsel, latency 0

00:12.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
Flags: bus master, medium devsel, latency 64
I/O ports at 2860 [size=16]

00:12.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01) (prog-if 00 [UHCI])
Flags: bus master, medium devsel, latency 64, IRQ 21
I/O ports at 2840 [size=32]

00:12.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
Flags: medium devsel, IRQ 9

00:14.0 VGA compatible controller: Cirrus Logic GD 5480 (rev 23) (prog-if 00 [VGA])
Subsystem: Cirrus Logic CL-GD5480
Flags: bus master, medium devsel, latency 64
Memory at f6000000 (32-bit, prefetchable) [size=32M]
Memory at f4303000 (32-bit, non-prefetchable) [size=4K]
Expansion ROM at <unassigned> [disabled] [size=32K]

01:0f.0 PCI bridge: Digital Equipment Corporation DECchip 21150 (rev 06) (prog-if 00 [Normal decode])
Flags: bus master, fast Back2Back, 66Mhz, medium devsel, latency 240
Bus: primary=01, secondary=02, subordinate=02, sec-latency=68
Capabilities: <available only to root>


processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 3
cpu MHz : 796.559
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1589.24

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 3
cpu MHz : 796.559
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1592.52



Debian GNU/Linux: a dream come true
-----------------------------------------------------------------------------
"Computers are useless. They can only give answers." Pablo Picasso

---> Visit http://www.valux.org/ to learn about the <---
---> Asociación Valenciana de Usuarios de Linux <---

2003-01-07 11:27:34

by uaca

Subject: Re: aic7xxx broken in 2.5.53/54 ?


I'm running 2.5.54 and the system seems stable.

However, I still see the errors reported in my previous e-mail:

aic7xxx: PCI Device 0:12:0 failed memory mapped test. Using PIO.
aic7xxx: PCI Device 0:12:1 failed memory mapped test. Using PIO.
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.25
<Adaptec aic7896/97 Ultra2 SCSI adapter>
aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi0: PCI error Interrupt at seqaddr = 0x2
scsi0: Signaled a Target Abort
scsi1: PCI error Interrupt at seqaddr = 0x2
scsi1: Signaled a Target Abort
(scsi0:A:0): 80.000MB/s transfers (40.000MHz, offset 63, 16bit)
Vendor: QUANTUM Model: ATLAS_V_36_WLS Rev: 0230
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:0:0: Tagged Queuing enabled. Depth 64
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.25
<Adaptec aic7896/97 Ultra2 SCSI adapter>
aic7896/97: Ultra2 Wide Channel B, SCSI Id=7, 32/253 SCBs

SCSI device sda: drive cache: write back
SCSI device sda: 71722776 512-byte hdwr sectors (36722 MB)
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0

No other messages were seen while the system was running.


The controller's BIOS version is v2.57S2B3.


2003-01-08 02:32:47

by Tomas Szepe

Subject: Re: aic7xxx broken in 2.5.53/54 ?

> [[email protected]]
>
> These reads are actually more expensive than just using PIO. Neither of
> these older drivers included a test to try and catch fishy behavior.

Justin, are you quite sure that these tests actually work?
I too have just run into

aic7xxx: PCI Device 0:16:0 failed memory mapped test. Using PIO.
aic7xxx: PCI Device 0:17:0 failed memory mapped test. Using PIO.

with aic79xx-linux-2.4-20021230 (6.2.25) in Linux 2.4.21-pre3.
What makes me scratch my head in particular is:

o The chipset is i440BX aka the Compatibility King.
o I've never had *any* problems with 6.2.8.

Full ahc boot-up messages follow:

PCI: Found IRQ 11 for device 00:10.0
aic7xxx: PCI Device 0:16:0 failed memory mapped test. Using PIO.
PCI: Found IRQ 10 for device 00:11.0
aic7xxx: PCI Device 0:17:0 failed memory mapped test. Using PIO.
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.25
<Adaptec 2940 Ultra SCSI adapter>
aic7880: Ultra Single Channel A, SCSI Id=7, 16/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.25
<Adaptec 2940 Ultra SCSI adapter>
aic7880: Ultra Single Channel A, SCSI Id=7, 16/253 SCBs

(scsi1:A:4): 20.000MB/s transfers (20.000MHz, offset 15)
(scsi0:A:3): 20.000MB/s transfers (20.000MHz, offset 15)
Vendor: SEAGATE Model: ST39173N Rev: 6244
Type: Direct-Access ANSI SCSI revision: 02
scsi0:A:3:0: Tagged Queuing enabled. Depth 253
Vendor: SEAGATE Model: ST39173N Rev: 6244
Type: Direct-Access ANSI SCSI revision: 02
scsi1:A:4:0: Tagged Queuing enabled. Depth 253
Attached scsi disk sda at scsi0, channel 0, id 3, lun 0
Attached scsi disk sdb at scsi1, channel 0, id 4, lun 0
...
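
Comparing the lines above, the driver's "PCI Device 0:16:0" appears to be
bus:slot:function in decimal, while the "PCI: Found IRQ" lines (and lspci) use
hexadecimal, so 0:16:0 and 0:17:0 line up with 00:10.0 and 00:11.0. A small
sketch for pulling the affected devices out of a boot log in lspci notation,
with the decimal interpretation as an assumption drawn only from these logs
and a hypothetical function name:

```python
import re

FAILED = re.compile(
    r"aic7xxx: PCI Device (\d+):(\d+):(\d+) failed memory mapped test"
)


def failed_devices(log_text):
    """lspci-style addresses (hex bus:slot.func) of devices that fell back to PIO."""
    return [
        f"{int(bus):02x}:{int(slot):02x}.{int(func):x}"
        for bus, slot, func in FAILED.findall(log_text)
    ]


boot_log = """\
PCI: Found IRQ 11 for device 00:10.0
aic7xxx: PCI Device 0:16:0 failed memory mapped test. Using PIO.
PCI: Found IRQ 10 for device 00:11.0
aic7xxx: PCI Device 0:17:0 failed memory mapped test. Using PIO.
"""
affected = failed_devices(boot_log)
```

The resulting addresses can be passed to `lspci -s` to inspect the matching
controllers.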

--
Tomas Szepe <[email protected]>

2003-01-08 04:14:49

by Justin T. Gibbs

Subject: Re: aic7xxx broken in 2.5.53/54 ?

>> [[email protected]]
>>
>> These reads are actually more expensive than just using PIO. Neither of
>> these older drivers included a test to try and catch fishy behavior.
>
> Justin, are you quite sure that these tests actually work?
> I too have just run into

See my recent post to the SCSI list. The tests don't work on
certain older controllers that lack a feature I was using. The
latest csets submitted to Linus correct this problem (as verified
on a dusty dual P-90 PCI/EISA box just added to our regression cluster).

--
Justin

2003-01-08 09:56:51

by Tomas Szepe

Subject: Re: aic7xxx broken in 2.5.53/54 ?

> [[email protected]]
>
> > [[email protected]]
> >
> > These reads are actually more expensive than just using PIO. Neither of
> > these older drivers included a test to try and catch fishy behavior.
> >
> > Justin, are you quite sure that these tests actually work?
> > I too have just run into
>
> See my recent post to the SCSI list. The tests don't work on
> certain older controllers that lack a feature I was using. The
> latest csets submitted to Linus correct this problem (as verified
> on a dusty dual P-90 PCI/EISA box just added to our regression cluster).

Ok. I can confirm 6.2.26 fixes the false positive here:

PCI: Found IRQ 11 for device 00:10.0
PCI: Found IRQ 10 for device 00:11.0
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.26
<Adaptec 2940 Ultra SCSI adapter>
aic7880: Ultra Single Channel A, SCSI Id=7, 16/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.26
<Adaptec 2940 Ultra SCSI adapter>
aic7880: Ultra Single Channel A, SCSI Id=7, 16/253 SCBs

Thanks,
--
Tomas Szepe <[email protected]>

2003-01-08 15:30:54

by uaca

Subject: Re: aic7xxx broken in 2.5.53/54 ?

On Tue, Jan 07, 2003 at 09:23:04PM -0700, Justin T. Gibbs wrote:
> >> [[email protected]]
> >>
> >> These reads are actually more expensive than just using PIO. Neither of
> >> these older drivers included a test to try and catch fishy behavior.
> >
> > Justin, are you quite sure that these tests actually work?
> > I too have just run into
>
> See my recent post to the SCSI list. The tests don't work on
> certain older controllers that lack a feature I was using. The
> latest csets submitted to Linus correct this problem (as verified
> on a dusty dual P-90 PCI/EISA box just added to our regression cluster).

It also seems to work for me:


scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.26
<Adaptec aic7896/97 Ultra2 SCSI adapter>
aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs

(scsi0:A:0): 80.000MB/s transfers (40.000MHz, offset 63, 16bit)
Vendor: QUANTUM Model: ATLAS_V_36_WLS Rev: 0230
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:0:0: Tagged Queuing enabled. Depth 64
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.26
<Adaptec aic7896/97 Ultra2 SCSI adapter>
aic7896/97: Ultra2 Wide Channel B, SCSI Id=7, 32/253 SCBs

SCSI device sda: drive cache: write back
SCSI device sda: 71722776 512-byte hdwr sectors (36722 MB)
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0


These are the controllers:

00:0c.0 SCSI storage controller: Adaptec 7896
Subsystem: Adaptec: Unknown device 0053
Flags: bus master, medium devsel, latency 64, IRQ 19
BIST result: 00
I/O ports at 2000 [disabled] [size=256]
Memory at f4300000 (64-bit, non-prefetchable) [size=4K]
Expansion ROM at <unassigned> [disabled] [size=128K]
Capabilities: <available only to root>

00:0c.1 SCSI storage controller: Adaptec 7896
Subsystem: Adaptec: Unknown device 0053
Flags: bus master, medium devsel, latency 64, IRQ 19
BIST result: 00
I/O ports at 2400 [disabled] [size=256]
Memory at f4301000 (64-bit, non-prefetchable) [size=4K]
Capabilities: <available only to root>



Thanks


Ulisses


2003-01-09 11:57:09

by David Lang

Subject: Re: aic7xxx broken in 2.5.53/54 ?

I just tried 2.5.55 and it still locks up. I will hook up my laptop and
see if I can get a serial console dump tomorrow night.

The messages are:

Slave Alloc 0
launching DV thread
begin domain validation
scsi0:2477 going from state 0 to state 1
scsi0:A:0:0: sending INQ
scsi0:timeout while doing DV command 12
scsi0:0:0:0 command completed status=0x90000
scsi0:A:0:0 entering ahc_linux_dv_transition, state=1 status=0x14005, cmd->result=0x90000
scsi0:2645 going from state 1 to state 1

At this point, all the messages between the 'going to state' messages
repeat exactly. This happens for a couple of minutes, and then a whole bunch
of other stuff scrolls by. (I don't know if this happens on previous versions;
I had given up before that much time had passed.) The final message is
something about a recovery sleep, and then the machine stops responding. (I
waited 10 minutes this time to make sure it wasn't going to start working
again.)

David Lang
