2007-10-11 12:10:15

by Bernd Schubert

[permalink] [raw]
Subject: [PATCHES] Re: sil3114 data corruption

On Wednesday 10 October 2007 11:12:20 Bernd Schubert wrote:
> On Monday 08 October 2007 17:09:17 Bernd Schubert wrote:
> > [sorry for sending twice, but after I read the sil sources, I see the
> > mail address had been wrong]
> >
> > Hi,
> >
> > somehow the sil3114 causes data corruption with some (newer?) disks.
> > Simply filling the filesystem with zeros and reading the these data will
> > make the kernel to report filesystem corruption.
> > This is definitely not an issue of memory, since the systems (several
> > tested) do have ECC memory and the memory is monitored with EDAC.
> >
> > kernel versions tested: 2.6.15-2.6.20
>
> Update: Setting sata_sil.slow_down=1 fill fix the problem, seems there are
> some drives missing in the quirk table.
>
> Jeff, I found an old patch/workaround from you
> (http://uwsg.indiana.edu/hypermail/linux/kernel/0403.1/1957.html), can you
> give me any further information why this never went into the driver?
>

I will send 3 mails with patches to fix this corruption.


--
Bernd Schubert
Q-Leap Networks GmbH


2007-10-11 12:15:49

by Bernd Schubert

[permalink] [raw]
Subject: [PATCH 1/3] Re: sil3114 data corruption

This will add the Seagate ST3250820AS to the mod15 blacklist.

I think this is rather trivial and should go into any any release as soon as
possible, since there will be data corruption without it for this disk.

Signed-off-by: Bernd Schubert <[email protected]>

Index: linux-2.6.23-rc9/drivers/ata/sata_sil.c
===================================================================
--- linux-2.6.23-rc9.orig/drivers/ata/sata_sil.c 2007-10-11 10:44:57.000000000
+0200
+++ linux-2.6.23-rc9/drivers/ata/sata_sil.c 2007-10-11 10:45:02.000000000
+0200
@@ -151,6 +151,7 @@ static const struct sil_drivelist {
{ "ST380011ASL", SIL_QUIRK_MOD15WRITE },
{ "ST3120022ASL", SIL_QUIRK_MOD15WRITE },
{ "ST3160021ASL", SIL_QUIRK_MOD15WRITE },
+ { "ST3250820AS", SIL_QUIRK_MOD15WRITE },
{ "Maxtor 4D060H3", SIL_QUIRK_UDMA5MAX },
{ }
};



--
Bernd Schubert
Q-Leap Networks GmbH

2007-10-11 12:20:49

by Bernd Schubert

[permalink] [raw]
Subject: Re: [PATCH 2/3] Re: sil3114 data corruption

This will add the sil3114 back to the controllers with the mod15 bug. Without
this patch no workaround for this controller is done and people might/will
suffer from data corruption.

Also rather trivial, though with a huge effect, the speed for the effected
disks will go down from about 45-50MB/s to 20-25MB/s. But better safe than
lost data or damaged filesystem.

Signed-off-by: Bernd Schubert <[email protected]>

Index: linux-2.6.23-rc9/drivers/ata/sata_sil.c
===================================================================
--- linux-2.6.23-rc9.orig/drivers/ata/sata_sil.c 2007-10-11 10:45:02.000000000
+0200
+++ linux-2.6.23-rc9/drivers/ata/sata_sil.c 2007-10-11 10:45:08.000000000
+0200
@@ -241,7 +241,8 @@ static const struct ata_port_info sil_po
},
/* sil_3114 */
{
- .flags = SIL_DFL_PORT_FLAGS | SIL_FLAG_RERR_ON_DMA_ACT,
+ .flags = SIL_DFL_PORT_FLAGS | SIL_FLAG_RERR_ON_DMA_ACT
+ | SIL_FLAG_MOD15WRITE,
.pio_mask = 0x1f, /* pio0-4 */
.mwdma_mask = 0x07, /* mwdma0-2 */
.udma_mask = ATA_UDMA5,



--
Bernd Schubert
Q-Leap Networks GmbH

2007-10-11 12:33:38

by Bernd Schubert

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

This is based on a patch from Jeff from 2004, but backported to 2.6.23 and
furthermore, it will use the 7.5kiB/512B splitoff for blacklisted drives
only.

Jeff, why did you replace ATA_SHT_USE_CLUSTERING and ATA_DMA_BOUNDARY?

drivers/ata/libata-core.c | 9 ++++-
drivers/ata/sata_sil.c | 58 ++++++++++++++++++++++++++++++------
include/linux/libata.h | 6 +++
3 files changed, 62 insertions(+), 11 deletions(-)

Signed-off-by: Bernd Schubert <[email protected]>


Index: linux-2.6.23-rc9/drivers/ata/libata-core.c
===================================================================
--- linux-2.6.23-rc9.orig/drivers/ata/libata-core.c 2007-10-02
17:21:12.000000000 +0200
+++ linux-2.6.23-rc9/drivers/ata/libata-core.c 2007-10-11 10:46:18.000000000
+0200
@@ -4073,7 +4073,7 @@ void ata_sg_clean(struct ata_queued_cmd
* spin_lock_irqsave(host lock)
*
*/
-static void ata_fill_sg(struct ata_queued_cmd *qc)
+void ata_fill_sg(struct ata_queued_cmd *qc)
{
struct ata_port *ap = qc->ap;
struct scatterlist *sg;
@@ -4217,10 +4217,15 @@ int ata_check_atapi_dma(struct ata_queue
*/
void ata_qc_prep(struct ata_queued_cmd *qc)
{
+ struct ata_port *ap = qc->ap;
+
if (!(qc->flags & ATA_QCFLAG_DMAMAP))
return;

- ata_fill_sg(qc);
+ if (ap->ops->fill_sg)
+ ap->ops->fill_sg(qc);
+ else
+ ata_fill_sg(qc);
}

/**
Index: linux-2.6.23-rc9/drivers/ata/sata_sil.c
===================================================================
--- linux-2.6.23-rc9.orig/drivers/ata/sata_sil.c 2007-10-11 10:45:08.000000000
+0200
+++ linux-2.6.23-rc9/drivers/ata/sata_sil.c 2007-10-11 10:57:51.000000000
+0200
@@ -120,6 +120,7 @@ static int sil_scr_write(struct ata_port
static int sil_set_mode (struct ata_port *ap, struct ata_device **r_failed);
static void sil_freeze(struct ata_port *ap);
static void sil_thaw(struct ata_port *ap);
+static void sil_fill_sg(struct ata_queued_cmd *qc);


static const struct pci_device_id sil_pci_tbl[] = {
@@ -174,12 +175,12 @@ static struct scsi_host_template sil_sht
.queuecommand = ata_scsi_queuecmd,
.can_queue = ATA_DEF_QUEUE,
.this_id = ATA_SHT_THIS_ID,
- .sg_tablesize = LIBATA_MAX_PRD,
+ .sg_tablesize = 120, /* max 15 kiB sectors ? */
.cmd_per_lun = ATA_SHT_CMD_PER_LUN,
.emulated = ATA_SHT_EMULATED,
- .use_clustering = ATA_SHT_USE_CLUSTERING,
+ .use_clustering = 1,
.proc_name = DRV_NAME,
- .dma_boundary = ATA_DMA_BOUNDARY,
+ .dma_boundary = 0x1fff,
.slave_configure = ata_scsi_slave_config,
.slave_destroy = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
@@ -187,6 +188,7 @@ static struct scsi_host_template sil_sht

static const struct ata_port_operations sil_ops = {
.port_disable = ata_port_disable,
+ .fill_sg = sil_fill_sg,
.dev_config = sil_dev_config,
.tf_load = ata_tf_load,
.tf_read = ata_tf_read,
@@ -278,9 +280,9 @@ MODULE_LICENSE("GPL");
MODULE_DEVICE_TABLE(pci, sil_pci_tbl);
MODULE_VERSION(DRV_VERSION);

-static int slow_down = 0;
-module_param(slow_down, int, 0444);
-MODULE_PARM_DESC(slow_down, "Sledgehammer used to work around random
problems, by limiting commands to 15 sectors (0=off, 1=on)");
+static int mod15_quirk = 0;
+module_param(mod15_quirk, int, 0444);
+MODULE_PARM_DESC(mod15_quirk, "Some disks from Seagate need a mod15
workaround.");


static unsigned char sil_get_device_cache_line(struct pci_dev *pdev)
@@ -534,6 +536,44 @@ static void sil_thaw(struct ata_port *ap
writel(tmp, mmio_base + SIL_SYSCFG);
}

+static void sil_fill_sg(struct ata_queued_cmd *qc)
+{
+ struct ata_port *ap = qc->ap;
+ u32 addr, len;
+ unsigned int idx;
+
+ ata_fill_sg(qc);
+
+ /* check if we need the MOD15 workaround */
+ if (!(qc->dev->quirk & SIL_FLAG_MOD15WRITE))
+ return;
+
+ if (unlikely(qc->n_elem < 1))
+ return;
+
+ /* hardware S/G list may be longer (or shorter) than number of
+ * PCI-mapped S/G entries (qc->n_elem), due to splitting
+ * in ata_fill_sg(). Start at zero, and skip to end
+ * of list, if we're not already there.
+ */
+ idx = 0;
+ while ((le32_to_cpu(ap->prd[idx].flags_len) & ATA_PRD_EOT) == 0)
+ idx++;
+
+ /* Errata workaround: if last segment is exactly 8K, split
+ * into 7.5K and 512b pieces.
+ */
+ len = le32_to_cpu(ap->prd[idx].flags_len) & 0xffff;
+ if (len == 8192) {
+ addr = le32_to_cpu(ap->prd[idx].addr);
+ ap->prd[idx].flags_len = cpu_to_le32(15 * 512);
+
+ idx++;
+ ap->prd[idx].addr = cpu_to_le32(addr + (15 * 512));
+ ap->prd[idx].flags_len = cpu_to_le32(512 | ATA_PRD_EOT);
+ }
+}
+
/**
* sil_dev_config - Apply device/host-specific errata fixups
* @dev: Device to be examined
@@ -577,14 +617,14 @@ static void sil_dev_config(struct ata_de
break;
}

- /* limit requests to 15 sectors */
- if (slow_down ||
+ /* mod15 bug */
+ if (mod15_quirk ||
((ap->flags & SIL_FLAG_MOD15WRITE) &&
(quirks & SIL_QUIRK_MOD15WRITE))) {
if (print_info)
ata_dev_printk(dev, KERN_INFO, "applying Seagate "
"errata fix (mod15write workaround)\n");
- dev->max_sectors = 15;
+ dev->quirk |= SIL_FLAG_MOD15WRITE;
return;
}

Index: linux-2.6.23-rc9/include/linux/libata.h
===================================================================
--- linux-2.6.23-rc9.orig/include/linux/libata.h 2007-10-02 17:21:28.000000000
+0200
+++ linux-2.6.23-rc9/include/linux/libata.h 2007-10-11 10:53:31.000000000
+0200
@@ -471,6 +471,9 @@ struct ata_device {
/* error history */
struct ata_ering ering;
int spdn_cnt;
+
+ /* driver specific flags */
+ unsigned int quirk;
};

/* Offset into struct ata_device. Fields above it are maintained
@@ -641,6 +644,8 @@ struct ata_port_operations {

void (*bmdma_stop) (struct ata_queued_cmd *qc);
u8 (*bmdma_status) (struct ata_port *ap);
+
+ void (*fill_sg) (struct ata_queued_cmd *qc);
};

struct ata_port_info {
@@ -789,6 +794,7 @@ extern void ata_data_xfer_noirq(struct a
unsigned int buflen, int write_data);
extern void ata_dumb_qc_prep(struct ata_queued_cmd *qc);
extern void ata_qc_prep(struct ata_queued_cmd *qc);
+extern void ata_fill_sg(struct ata_queued_cmd *qc);
extern void ata_noop_qc_prep(struct ata_queued_cmd *qc);
extern unsigned int ata_qc_issue_prot(struct ata_queued_cmd *qc);
extern void ata_sg_init_one(struct ata_queued_cmd *qc, void *buf,


--
Bernd Schubert
Q-Leap Networks GmbH

2007-10-11 13:22:14

by Alan

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

> -static void ata_fill_sg(struct ata_queued_cmd *qc)
> +void ata_fill_sg(struct ata_queued_cmd *qc)
> {
> struct ata_port *ap = qc->ap;
> struct scatterlist *sg;
> @@ -4217,10 +4217,15 @@ int ata_check_atapi_dma(struct ata_queue
> */
> void ata_qc_prep(struct ata_queued_cmd *qc)
> {
> + struct ata_port *ap = qc->ap;
> +
> if (!(qc->flags & ATA_QCFLAG_DMAMAP))
> return;
>
> - ata_fill_sg(qc);
> + if (ap->ops->fill_sg)
> + ap->ops->fill_sg(qc);
> + else
> + ata_fill_sg(qc);
> }

Its probably better to simply make your own sil_qc_prep function for this
case rather than touch the core code.

> - .sg_tablesize = LIBATA_MAX_PRD,
> + .sg_tablesize = 120, /* max 15 kiB sectors ? */

If you are just fiddling with the way the data is split then
LIBATA_MAX_PRD - 1 should be totally safe)

> .cmd_per_lun = ATA_SHT_CMD_PER_LUN,
> .emulated = ATA_SHT_EMULATED,
> - .use_clustering = ATA_SHT_USE_CLUSTERING,
> + .use_clustering = 1,

Un-needed

> .proc_name = DRV_NAME,
> - .dma_boundary = ATA_DMA_BOUNDARY,
> + .dma_boundary = 0x1fff,

Ok

> .slave_configure = ata_scsi_slave_config,
> .slave_destroy = ata_scsi_slave_destroy,
> .bios_param = ata_std_bios_param,

> + /* Errata workaround: if last segment is exactly 8K, split
> + * into 7.5K and 512b pieces.
> + */
> + len = le32_to_cpu(ap->prd[idx].flags_len) & 0xffff;
> + if (len == 8192) {
> + addr = le32_to_cpu(ap->prd[idx].addr);
> + ap->prd[idx].flags_len = cpu_to_le32(15 * 512);
> +
> + idx++;
> + ap->prd[idx].addr = cpu_to_le32(addr + (15 * 512));
> + ap->prd[idx].flags_len = cpu_to_le32(512 | ATA_PRD_EOT);
> + }
> +}

And since in this approach we are merely splitting the last PRD entry in
some obscure cases we might as well do it by default as it should have no
performance impact of any note done this way.


2007-10-11 14:19:54

by Jeff Garzik

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

Alan Cox wrote:
>> -static void ata_fill_sg(struct ata_queued_cmd *qc)
>> +void ata_fill_sg(struct ata_queued_cmd *qc)
>> {
>> struct ata_port *ap = qc->ap;
>> struct scatterlist *sg;
>> @@ -4217,10 +4217,15 @@ int ata_check_atapi_dma(struct ata_queue
>> */
>> void ata_qc_prep(struct ata_queued_cmd *qc)
>> {
>> + struct ata_port *ap = qc->ap;
>> +
>> if (!(qc->flags & ATA_QCFLAG_DMAMAP))
>> return;
>>
>> - ata_fill_sg(qc);
>> + if (ap->ops->fill_sg)
>> + ap->ops->fill_sg(qc);
>> + else
>> + ata_fill_sg(qc);
>> }
>
> Its probably better to simply make your own sil_qc_prep function for this
> case rather than touch the core code.
>
>> - .sg_tablesize = LIBATA_MAX_PRD,
>> + .sg_tablesize = 120, /* max 15 kiB sectors ? */
>
> If you are just fiddling with the way the data is split then
> LIBATA_MAX_PRD - 1 should be totally safe)
>
>> .cmd_per_lun = ATA_SHT_CMD_PER_LUN,
>> .emulated = ATA_SHT_EMULATED,
>> - .use_clustering = ATA_SHT_USE_CLUSTERING,
>> + .use_clustering = 1,
>
> Un-needed
>
>> .proc_name = DRV_NAME,
>> - .dma_boundary = ATA_DMA_BOUNDARY,
>> + .dma_boundary = 0x1fff,
>
> Ok
>
>> .slave_configure = ata_scsi_slave_config,
>> .slave_destroy = ata_scsi_slave_destroy,
>> .bios_param = ata_std_bios_param,
>
>> + /* Errata workaround: if last segment is exactly 8K, split
>> + * into 7.5K and 512b pieces.
>> + */
>> + len = le32_to_cpu(ap->prd[idx].flags_len) & 0xffff;
>> + if (len == 8192) {
>> + addr = le32_to_cpu(ap->prd[idx].addr);
>> + ap->prd[idx].flags_len = cpu_to_le32(15 * 512);
>> +
>> + idx++;
>> + ap->prd[idx].addr = cpu_to_le32(addr + (15 * 512));
>> + ap->prd[idx].flags_len = cpu_to_le32(512 | ATA_PRD_EOT);
>> + }
>> +}
>
> And since in this approach we are merely splitting the last PRD entry in
> some obscure cases we might as well do it by default as it should have no
> performance impact of any note done this way.

Unfortunately all this stuff is quite meaningless, which was why my
patch was never merged.

The problem is that the 3112 generates Data FIS's of a size other than a
multiple of 512 bytes. Spec-legal, but exposed firmware bugs in many
early SATA drives. Early Seagate hard drives choked when the formula
(sector%15)==1 was satisfied (or something along those lines).

The problem with the fix is that Data FIS size is only roughly
correlated to PRD segment length or DMA boundary -- the chip could
decide to send out a frame even if the PRD length is < 8K. The 3112 can
generate not-512b-sized FIS's at any time, not just at the end of the
transfer.

That leaves us with two observations:

1) Just about the only valid optimization is to ensure that only the
write path must be limited to small chunks, not both read- and
write-paths. Tejun had a patch to do this a long time ago, but it's an
open question whether the large amount of code is worth it for a rare
combination.

2) Once we identified, over time, the set of drives affected by this
3112 quirk (aka drives that didn't fully comply to SATA spec), the
debugging of corruption cases largely shifted to the standard routine:
update the BIOS, replace the cables/RAM/power/mainboard/slot/etc. to be
certain of problem location.

Jeff


2007-10-11 14:39:53

by Bernd Schubert

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

On Thursday 11 October 2007 16:19:37 Jeff Garzik wrote:
> 1) Just about the only valid optimization is to ensure that only the
> write path must be limited to small chunks, not both read- and
> write-paths. Tejun had a patch to do this a long time ago, but it's an
> open question whether the large amount of code is worth it for a rare
> combination.

How large? This patch is rather small? Where can I find it?

>
> 2) Once we identified, over time, the set of drives affected by this
> 3112 quirk (aka drives that didn't fully comply to SATA spec), the
> debugging of corruption cases largely shifted to the standard routine:
> update the BIOS, replace the cables/RAM/power/mainboard/slot/etc. to be
> certain of problem location.

Replace this disk or the sata controller maybe, but usually people don't want
to replace a big cluster, even if it is already 3 years old, this has to wait
at least another 3 years.

The problem came up, when 200GB drives were replaced by *newer* 250GB drives
(well maybe not the newest, no idea were they came from).

Anyway, I'm testing for more than 24h already and didn't observe any data
corruption as without the patch. I know this is only an obersavation and no
definite prove...
Also, this is with 3114, maybe this chip behaves a bit different than 3112?

Thanks,
Bernd



--
Bernd Schubert
Q-Leap Networks GmbH

2007-10-11 14:46:06

by Alan

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

> The problem is that the 3112 generates Data FIS's of a size other than a
> multiple of 512 bytes. Spec-legal, but exposed firmware bugs in many
> early SATA drives. Early Seagate hard drives choked when the formula
> (sector%15)==1 was satisfied (or something along those lines).

And the 3114 is the same ?

> 2) Once we identified, over time, the set of drives affected by this
> 3112 quirk (aka drives that didn't fully comply to SATA spec), the
> debugging of corruption cases largely shifted to the standard routine:
> update the BIOS, replace the cables/RAM/power/mainboard/slot/etc. to be
> certain of problem location.

Except for the continued series of later SI + Nvidia chipset (mostly)
pattern which seems unanswered but also being later chips I assume
unrelated to this problem.

2007-10-11 14:59:19

by Jeff Garzik

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

Alan Cox wrote:
>> The problem is that the 3112 generates Data FIS's of a size other than a
>> multiple of 512 bytes. Spec-legal, but exposed firmware bugs in many
>> early SATA drives. Early Seagate hard drives choked when the formula
>> (sector%15)==1 was satisfied (or something along those lines).
>
> And the 3114 is the same ?

3114 should not be affected by this problem (see below).

Most likely we are led down this road because the 'slow_down' module
parameter has an excellent capacity for hiding all manner of problems.

As a tangent from this thread, this is why I was OK with adding the
libata.dma even for SATA. Sometimes knobs are found useful by users,
though perhaps not its original intended use. Sometimes masking a
hardware problem can help you get through the rest of your day on hold
with vendor support :)


>> 2) Once we identified, over time, the set of drives affected by this
>> 3112 quirk (aka drives that didn't fully comply to SATA spec), the
>> debugging of corruption cases largely shifted to the standard routine:
>> update the BIOS, replace the cables/RAM/power/mainboard/slot/etc. to be
>> certain of problem location.
>
> Except for the continued series of later SI + Nvidia chipset (mostly)
> pattern which seems unanswered but also being later chips I assume
> unrelated to this problem.

The SIL_FLAG_MOD15WRITE flag is set in sil_port_info[] is set according
to the best info we have from SiI, which indicates that 3114 and 3512 do
not have the same problem as the 3112.

Jeff


2007-10-11 15:05:01

by Jeff Garzik

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

Bernd Schubert wrote:
> On Thursday 11 October 2007 16:19:37 Jeff Garzik wrote:
>> 1) Just about the only valid optimization is to ensure that only the
>> write path must be limited to small chunks, not both read- and
>> write-paths. Tejun had a patch to do this a long time ago, but it's an
>> open question whether the large amount of code is worth it for a rare
>> combination.
>
> How large? This patch is rather small? Where can I find it?

http://home-tj.org/wiki/index.php/Sil_m15w

> The problem came up, when 200GB drives were replaced by *newer* 250GB drives
> (well maybe not the newest, no idea were they came from).
>
> Anyway, I'm testing for more than 24h already and didn't observe any data
> corruption as without the patch. I know this is only an obersavation and no
> definite prove...
> Also, this is with 3114, maybe this chip behaves a bit different than 3112?

3114 + new SATA drive is definitely a new one for us.

It would help to (a) use the latest kernel, (b) post your .config with
the latest kernel, (c) post lspci booted into latest kernel, and (d)
post dmesg booted into latest kernel.

Jeff


2007-10-11 15:19:06

by Bernd Schubert

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

On Thursday 11 October 2007 17:04:45 Jeff Garzik wrote:
> Bernd Schubert wrote:
> > On Thursday 11 October 2007 16:19:37 Jeff Garzik wrote:
> >> 1) Just about the only valid optimization is to ensure that only the
> >> write path must be limited to small chunks, not both read- and
> >> write-paths. Tejun had a patch to do this a long time ago, but it's an
> >> open question whether the large amount of code is worth it for a rare
> >> combination.
> >
> > How large? This patch is rather small? Where can I find it?
>
> http://home-tj.org/wiki/index.php/Sil_m15w

Thanks, I will take a look later on.

>
> > The problem came up, when 200GB drives were replaced by *newer* 250GB
> > drives (well maybe not the newest, no idea were they came from).
> >
> > Anyway, I'm testing for more than 24h already and didn't observe any data
> > corruption as without the patch. I know this is only an obersavation and
> > no definite prove...
> > Also, this is with 3114, maybe this chip behaves a bit different than
> > 3112?
>
> 3114 + new SATA drive is definitely a new one for us.
>
> It would help to (a) use the latest kernel, (b) post your .config with
> the latest kernel, (c) post lspci booted into latest kernel, and (d)
> post dmesg booted into latest kernel.

a) 2.6.23 + sil-patch I posted, this is on a customer system (though my former
group), I wouldn't like to use -mm there.

b) .config is attached

c) attached

d) attached (don't get irritaded by those machine check events, thats "GART
TLB errorr", harmless warnings, just not disabled in the bios).


Thanks,
Bernd

--
Bernd Schubert
Q-Leap Networks GmbH


Attachments:
(No filename) (1.58 kB)
config-2.6.23 (33.41 kB)
lspci.out (6.17 kB)
dmesg.out.gz (7.36 kB)
Download all attachments

2007-10-12 21:08:37

by Jeff Garzik

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

Bernd Schubert wrote:
> a) 2.6.23 + sil-patch I posted, this is on a customer system (though my former
> group), I wouldn't like to use -mm there.
>
> b) .config is attached
>
> c) attached
>
> d) attached (don't get irritaded by those machine check events, thats "GART
> TLB errorr", harmless warnings, just not disabled in the bios).

Any chance you could provide dmesg on 2.6.23 without the sil patch?

Jeff



2007-10-15 10:18:56

by Bernd Schubert

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

On Friday 12 October 2007 23:08:21 Jeff Garzik wrote:
> Bernd Schubert wrote:
> > a) 2.6.23 + sil-patch I posted, this is on a customer system (though my
> > former group), I wouldn't like to use -mm there.
> >
> > b) .config is attached
> >
> > c) attached
> >
> > d) attached (don't get irritaded by those machine check events, thats
> > "GART TLB errorr", harmless warnings, just not disabled in the bios).
>
> Any chance you could provide dmesg on 2.6.23 without the sil patch?

Its attached.


Bernd

--
Bernd Schubert
Q-Leap Networks GmbH


Attachments:
(No filename) (546.00 B)
dmesg.out (21.87 kB)
Download all attachments

2007-10-23 08:08:27

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

Jeff Garzik wrote:
> Alan Cox wrote:
>>> 2) Once we identified, over time, the set of drives affected by this
>>> 3112 quirk (aka drives that didn't fully comply to SATA spec), the
>>> debugging of corruption cases largely shifted to the standard
>>> routine: update the BIOS, replace the
>>> cables/RAM/power/mainboard/slot/etc. to be certain of problem location.
>>
>> Except for the continued series of later SI + Nvidia chipset (mostly)
>> pattern which seems unanswered but also being later chips I assume
>> unrelated to this problem.
>
> The SIL_FLAG_MOD15WRITE flag is set in sil_port_info[] is set according
> to the best info we have from SiI, which indicates that 3114 and 3512 do
> not have the same problem as the 3112.

I don't think this data corruption problem w/ sil3114 is related to
m15w. m15w workaround slows down things quite a bit and is likely to
hide problems on PCI bus side. There are reports of data corruption
with 3114 on nvidia (most common), via and now amd chipsets. There's
one on intel too but IIRC wasn't too definite.

According to a user, freebsd didn't have data corruption problem on the
same hardware. I copied PCI FIFO setup code (ours is broken BTW) but it
didn't fix the problem.

I'll try to reproduce the problem locally and hunt it down.

Thanks.

--
tejun

2007-10-23 17:29:08

by Bernd Schubert

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

Hello Tejun,

On Tuesday 23 October 2007 10:08:01 Tejun Heo wrote:
> Jeff Garzik wrote:
> > Alan Cox wrote:
> >>> 2) Once we identified, over time, the set of drives affected by this
> >>> 3112 quirk (aka drives that didn't fully comply to SATA spec), the
> >>> debugging of corruption cases largely shifted to the standard
> >>> routine: update the BIOS, replace the
> >>> cables/RAM/power/mainboard/slot/etc. to be certain of problem location.
> >>
> >> Except for the continued series of later SI + Nvidia chipset (mostly)
> >> pattern which seems unanswered but also being later chips I assume
> >> unrelated to this problem.
> >
> > The SIL_FLAG_MOD15WRITE flag is set in sil_port_info[] is set according
> > to the best info we have from SiI, which indicates that 3114 and 3512 do
> > not have the same problem as the 3112.
>
> I don't think this data corruption problem w/ sil3114 is related to
> m15w. m15w workaround slows down things quite a bit and is likely to
> hide problems on PCI bus side. There are reports of data corruption
> with 3114 on nvidia (most common), via and now amd chipsets. There's
> one on intel too but IIRC wasn't too definite.
>
> According to a user, freebsd didn't have data corruption problem on the
> same hardware. I copied PCI FIFO setup code (ours is broken BTW) but it
> didn't fix the problem.
>
> I'll try to reproduce the problem locally and hunt it down.

thanks for your help and please tell me, if I can do anything. We have this
problem on a production system, but the node in question will be rebooted in
Thursday (ups needs to be replaced). If there are some tests/reboots/whatever
I could do, it would be best to do it shortly after the scheduled reboot.

Actually I now would have attempted to port your mod15 patch
(http://home-tj.org/wiki/index.php/Sil_m15w#Patches) to 2.6.23, hoping it
would solve Soerens problem and ours as well (ours magically already went
away using the mod15 fix). Well, maybe I port it anyway to 2.6.23 to see if
it also solves our problem.


Thanks,
Bernd


--
Bernd Schubert
Q-Leap Networks GmbH

2007-10-24 13:41:45

by Soeren Sonnenburg

[permalink] [raw]
Subject: Re: [PATCH 3/3] faster workaround

On Tue, 2007-10-23 at 17:08 +0900, Tejun Heo wrote:
> Jeff Garzik wrote:
> > Alan Cox wrote:
> >>> 2) Once we identified, over time, the set of drives affected by this
> >>> 3112 quirk (aka drives that didn't fully comply to SATA spec), the
> >>> debugging of corruption cases largely shifted to the standard
> >>> routine: update the BIOS, replace the
> >>> cables/RAM/power/mainboard/slot/etc. to be certain of problem location.
> >>
> >> Except for the continued series of later SI + Nvidia chipset (mostly)
> >> pattern which seems unanswered but also being later chips I assume
> >> unrelated to this problem.
> >
> > The SIL_FLAG_MOD15WRITE flag is set in sil_port_info[] is set according
> > to the best info we have from SiI, which indicates that 3114 and 3512 do
> > not have the same problem as the 3112.
>
> I don't think this data corruption problem w/ sil3114 is related to
> m15w. m15w workaround slows down things quite a bit and is likely to
> hide problems on PCI bus side. There are reports of data corruption
> with 3114 on nvidia (most common), via and now amd chipsets. There's
> one on intel too but IIRC wasn't too definite.

err wait, the motherboard I am having is also via based. however the
m15w workaround just slowed down everything but the problem still
appeared.

also to be sure that it is really some problem related to the particular
seagate drive vs. sil3114 I created a 200G file of zeros on one of the
known to work seagates. and indeed it was all zeros...

> According to a user, freebsd didn't have data corruption problem on the
> same hardware. I copied PCI FIFO setup code (ours is broken BTW) but it
> didn't fix the problem.
>
> I'll try to reproduce the problem locally and hunt it down.

Thanks in advance...
Soeren