2004-04-29 23:44:14

by CaT

Subject: libata + siI3112 + 2.6.5-rc3 hang

Just acquired a Seagate 200GB SATA HD (yeah, baby, yeah ;) and hooked
it up to the onboard Silicon Image SiI 3112 SATA RAID controller of my
Gigabyte nforce2 MB. Things work fine for the most part except when
heavy IO is done on the drive. Then the system hangs totally with no
console error msgs displayed. This also happens under Debian sarge's
2.4.25 and has occurred when I did a mke2fs -c on a partition
and (twice) with hdparm -tT. The first time hdparm works fine and
in fact clocks the HD at 62MB/s (wowsers!), but the second time the
system hangs.

I'm not 100% sure what info to provide here so I took a stab at a few
things:

0000:01:0d.0 Unknown mass storage controller: CMD Technology Inc Silicon Image SiI 3112 SATARaid Controller (rev 02)
Subsystem: CMD Technology Inc Silicon Image SiI 3112 SATARaid Controller
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32, Cache Line Size: 0x08 (32 bytes)
Interrupt: pin A routed to IRQ 15
Region 0: I/O ports at a800
Region 1: I/O ports at ac00 [size=4]
Region 2: I/O ports at b000 [size=8]
Region 3: I/O ports at b400 [size=4]
Region 4: I/O ports at b800 [size=16]
Region 5: Memory at e1001000 (32-bit, non-prefetchable) [size=512]
Capabilities: [60] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-

--- 8< ---

NFORCE2: IDE controller at PCI slot 0000:00:09.0
NFORCE2: chipset revision 162
NFORCE2: not 100% native mode: will probe irqs later
NFORCE2: 0000:00:09.0 (rev a2) UDMA133 controller
NFORCE2: neither IDE port enabled (BIOS)
ide-floppy driver 0.99.newide
libata version 1.02 loaded.
sata_sil version 0.54
ata1: SATA max UDMA/100 cmd 0xE0847080 ctl 0xE084708A bmdma 0xE0847000 irq 15
ata2: SATA max UDMA/100 cmd 0xE08470C0 ctl 0xE08470CA bmdma 0xE0847008 irq 15
ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:207f
ata1: dev 0 ATA, max UDMA/133, 390721968 sectors (lba48)
ata1: dev 0 configured for UDMA/100
scsi0 : sata_sil
ata2: no device found (phy stat 00000000)
ata2: thread exiting
scsi1 : sata_sil
Vendor: ATA Model: ST3200822AS Rev: 1.02
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
SCSI device sda: drive cache: write through
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0

--- 8< ---

scsi0? I thought it detected it at scsi1? This reminds me. The MB has
the connector labeled as SATA1 but on bootup it's detected as the primary
SATA drive.

--- 8< ---

CONFIG_SCSI_SATA=y
# CONFIG_SCSI_SATA_SVW is not set
# CONFIG_SCSI_SATA_PROMISE is not set
CONFIG_SCSI_SATA_SIL=y
# CONFIG_SCSI_SATA_SIS is not set
# CONFIG_SCSI_SATA_VIA is not set
# CONFIG_SCSI_SATA_VITESSE is not set

--- 8< ---

lexx:/proc# more interrupts
           CPU0
  0:    5107953    IO-APIC-edge  timer
  1:          9    IO-APIC-edge  i8042
  2:          0          XT-PIC  cascade
  8:          1    IO-APIC-edge  rtc
  9:          0          XT-PIC  NVidia nForce2
 11:       1872    IO-APIC-edge  eth0
 12:         58    IO-APIC-edge  i8042
 15:       3200    IO-APIC-edge  libata
NMI:          0
LOC:    5107919
ERR:          0
MIS:          0

If you need more info, any debugging done, etc please yell. This is going
to be the primary and only linux drive in the system (I'm relegating
windows to 'PATA') and so it'd be nice to have it stable. :)

Thanks.

--
Red herrings strewn hither and yon.


Subject: Re: libata + siI3112 + 2.6.5-rc3 hang


Probably your drive needs mod15write quirk. please try this.

[PATCH] sata_sil.c: ST3200822AS needs MOD15WRITE quirk

linux-2.6.6-rc2-bk4-bzolnier/drivers/scsi/sata_sil.c | 1 +
1 files changed, 1 insertion(+)

diff -puN drivers/scsi/sata_sil.c~sata_sil_fix drivers/scsi/sata_sil.c
--- linux-2.6.6-rc2-bk4/drivers/scsi/sata_sil.c~sata_sil_fix 2004-04-30 02:00:37.387289528 +0200
+++ linux-2.6.6-rc2-bk4-bzolnier/drivers/scsi/sata_sil.c 2004-04-30 02:00:53.417852512 +0200
@@ -82,6 +82,7 @@ struct sil_drivelist {
 	{ "ST360015AS", SIL_QUIRK_MOD15WRITE },
 	{ "ST380023AS", SIL_QUIRK_MOD15WRITE },
 	{ "ST3120023AS", SIL_QUIRK_MOD15WRITE },
+	{ "ST3200822AS", SIL_QUIRK_MOD15WRITE },
 	{ "ST340014ASL", SIL_QUIRK_MOD15WRITE },
 	{ "ST360014ASL", SIL_QUIRK_MOD15WRITE },
 	{ "ST380011ASL", SIL_QUIRK_MOD15WRITE },

_
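
For readers unfamiliar with quirk tables, here is a minimal, self-contained
sketch of the idea behind the one-line patch above: the driver keeps a list
of drive model strings and looks up the attached drive's model to decide
which workarounds to apply. The names `drive_quirk`, `quirk_table`,
`lookup_quirks` and the flag value are hypothetical and chosen for
illustration; the exact-match comparison is an assumption, and this is not
the sata_sil.c code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag value, for illustration only. */
#define QUIRK_MOD15WRITE (1u << 0)

struct drive_quirk {
	const char *model;	/* model string as reported by the drive */
	unsigned int quirks;	/* bitmask of workarounds to apply */
};

/* A few entries taken from the patch; a NULL model ends the table. */
static const struct drive_quirk quirk_table[] = {
	{ "ST360015AS",  QUIRK_MOD15WRITE },
	{ "ST380023AS",  QUIRK_MOD15WRITE },
	{ "ST3120023AS", QUIRK_MOD15WRITE },
	{ "ST3200822AS", QUIRK_MOD15WRITE },
	{ NULL, 0 }
};

/* Return the quirk bitmask for an exact model match, or 0 if none. */
static unsigned int lookup_quirks(const char *model)
{
	assert(model != NULL);
	for (size_t i = 0; quirk_table[i].model != NULL; i++) {
		if (strcmp(quirk_table[i].model, model) == 0)
			return quirk_table[i].quirks;
	}
	return 0;
}
```

With this table, `lookup_quirks("ST3200822AS")` returns `QUIRK_MOD15WRITE`,
while a model that is not listed returns 0 and gets no workaround.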

On Friday 30 of April 2004 01:42, CaT wrote:
> Just acquired a Seagate 200GB SATA HD (yeah, baby, yeah ;) and hooked
> it up to the onboard Silicon Image SiI 3112 SATA RAID controller of my
> Gigabyte nforce2 MB. Things work fine for the most part except when
> heavy IO is done on the drive. Then the system hangs totally with no
> console error msgs displayed. This also happens under Debian sarge's
> 2.4.25 and has occurred when I did a mke2fs -c on a partition
> and (twice) with hdparm -tT. The first time hdparm works fine and
> in fact clocks the HD at 62MB/s (wowsers!), but the second time the
> system hangs.

The throughput will go down with the quirk :( Blame SiI for not providing chipset errata.

> scsi0? I thought it detected it at scsi1? This reminds me. The MB has
> the connector labeled as SATA1 but on bootup it's detected as the primary
> SATA drive.

libata has zero knowledge about legacy ordering and that's a GOOD thing.

Cheers,
Bartlomiej

2004-04-30 00:40:49

by CaT

Subject: Re: libata + siI3112 + 2.6.5-rc3 hang

On Fri, Apr 30, 2004 at 02:08:32AM +0200, Bartlomiej Zolnierkiewicz wrote:
>
> Probably your drive needs mod15write quirk. please try this.

The system hung while linking the kernel. :) I'll try compiling again
in either 2-3 hrs or 6-7, depending on when I can get home.

Figures. :/

> On Friday 30 of April 2004 01:42, CaT wrote:
> > 2.4.25 and has occurred when I did a mke2fs -c on a partition
> > and (twice) with hdparm -tT. The first time hdparm works fine and
> > in fact clocks the HD at 62MB/s (wowsers!), but the second time the
> > system hangs.
>
> The throughput will go down with the quirk :( Blame SiI for not providing chipset errata.

Actively doing so as I glare at my dead ssh connection. :)

> > scsi0? I thought it detected it at scsi1? This reminds me. The MB has
> > the connector labeled as SATA1 but on bootup it's detected as the primary
> > SATA drive.
>
> libata has zero knowledge about legacy ordering and that's a GOOD thing.

Not complaining. This is all new to me and I'm trying to get it all
straight in my head (one of the reasons why I went with a SATA drive
rather than PATA).

--
Red herrings strewn hither and yon.

2004-04-30 09:40:43

by CaT

Subject: Re: libata + siI3112 + 2.6.5-rc3 hang

On Fri, Apr 30, 2004 at 02:08:32AM +0200, Bartlomiej Zolnierkiewicz wrote:
>
> Probably your drive needs mod15write quirk. please try this.
>
> [PATCH] sata_sil.c: ST3200822AS needs MOD15WRITE quirk

Didn't work. Still hangs rather well. :/

--
Red herrings strewn hither and yon.

Subject: Re: libata + siI3112 + 2.6.5-rc3 hang

On Friday 30 of April 2004 11:39, CaT wrote:
> On Fri, Apr 30, 2004 at 02:08:32AM +0200, Bartlomiej Zolnierkiewicz wrote:
> > Probably your drive needs mod15write quirk. please try this.
> >
> > [PATCH] sata_sil.c: ST3200822AS needs MOD15WRITE quirk
>
> Didn't work. Still hangs rather well. :/

I have no idea then. Jeff?

2004-05-01 03:08:57

by CaT

Subject: Re: libata + siI3112 + 2.6.5-rc3 hang

On Fri, Apr 30, 2004 at 06:00:08PM +0200, Bartlomiej Zolnierkiewicz wrote:
> On Friday 30 of April 2004 11:39, CaT wrote:
> > On Fri, Apr 30, 2004 at 02:08:32AM +0200, Bartlomiej Zolnierkiewicz wrote:
> > > Probably your drive needs mod15write quirk. please try this.
> > >
> > > [PATCH] sata_sil.c: ST3200822AS needs MOD15WRITE quirk
> >
> > Didn't work. Still hangs rather well. :/
>
> I have no idea then. Jeff?

A solution has come forth! Whee! :) Joe Rutledge sent me a private
message about his issues with the sil3112 and the local APIC. The patch
below solved the hang for him and appears to have solved it for me too:
I've run many a hdparm -tT on the drive and got up to 62MB/s each go,
whereas before I could run it once at most, with the 2nd try resulting
in a hang.

Happy days. Linux doesn't hang anymore on my PC, my SATA drive does 62MB/s
thanks to libata (tons of thanks for the work on that - it did 35MB/s using
the normal IDE SATA driver) and I found a reclusive easter egg next to my
keyboard. Joy. :)

Here's the patch that Joe sent me. It doesn't apply cleanly mainly due
to formatting errors in the patch but a bit of manual fixerupping made
it all apply.

--- 8< ---
--- linux-2.6.4-orig/arch/i386/pci/fixup.c 2004-03-11 03:55:36.000000000 +0100
+++ linux-2.6.4/arch/i386/pci/fixup.c 2004-03-16 13:12:25.706569480 +0100
@@ -187,6 +187,22 @@
 	dev->transparent = 1;
 }
 
+/*
+ * Halt Disconnect and Stop Grant Disconnect (bit 4 at offset 0x6F)
+ * must be disabled when APIC is used (or lockups will happen).
+ */
+static void __devinit pci_fixup_nforce2_disconnect(struct pci_dev *d)
+{
+	u8 t;
+
+	pci_read_config_byte(d, 0x6F, &t);
+	if (t & 0x10) {
+		printk(KERN_INFO "PCI: disabling nForce2 Halt Disconnect"
+		       " and Stop Grant Disconnect\n");
+		pci_write_config_byte(d, 0x6F, (t & 0xef));
+	}
+}
+
 struct pci_fixup pcibios_fixups[] = {
 	{
 		.pass = PCI_FIXUP_HEADER,
@@ -290,5 +306,11 @@
 		.device = PCI_ANY_ID,
 		.hook = pci_fixup_transparent_bridge
 	},
+	{
+		.pass = PCI_FIXUP_HEADER,
+		.vendor = PCI_VENDOR_ID_NVIDIA,
+		.device = PCI_DEVICE_ID_NVIDIA_NFORCE2,
+		.hook = pci_fixup_nforce2_disconnect
+	},
 	{ .pass = 0 }
 };
--- 8< ---

--
Red herrings strewn hither and yon.