2008-07-25 04:30:45

by Jeffrey Baker

Subject: 2.6.24 + ICH8M + high SATA load == death

On 2.6.24 with a SATA controller: Intel Corporation 82801HBM/HEM
(ICH8M/ICH8M-E) SATA AHCI Controller (rev 03) and a Vendor: ATA
Model: SAMSUNG MCBQE32G Rev: PS10 flash disk, I get this error when
doing 32 parallel runs of pgbench:

ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xa frozen
ata1.00: irq_stat 0x00400001, PHY RDY changed
ata1: SError: { PHYRdyChg CommWake }
ata1.00: cmd c8/00:10:67:38:97/00:00:00:00:00/e1 tag 0 dma 8192 in
res 50/00:00:76:38:97/00:00:00:00:00/e1 Emask 0x10 (ATA bus error)
ata1.00: status: { DRDY }

Afterwards the machine was in some kind of bad state where it would do
only about 1MB/s to the disk, and I had to power it off.

Basically I have no idea what any of that gibberish means. Note that
this device is about 80 times faster than the spinning disk it
replaced, so it may be stressing parts of the software that are not
normally stressed. Note also that it could just be crap hardware. I
don't really know. However, I do note that someone recently posted a
very similar error using Western Digital disks and the same SATA
controller. I don't think the problem is cables, since this is a
laptop. Any advice welcome.

-jwb


2008-07-25 06:33:34

by Robert Hancock

Subject: Re: 2.6.24 + ICH8M + high SATA load == death

Jeffrey Baker wrote:
> On 2.6.24 with a SATA controller: Intel Corporation 82801HBM/HEM
> (ICH8M/ICH8M-E) SATA AHCI Controller (rev 03) and a Vendor: ATA
> Model: SAMSUNG MCBQE32G Rev: PS10 flash disk, I get this error when
> doing 32 parallel runs of pgbench:
>
> ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xa frozen
> ata1.00: irq_stat 0x00400001, PHY RDY changed
> ata1: SError: { PHYRdyChg CommWake }
> ata1.00: cmd c8/00:10:67:38:97/00:00:00:00:00/e1 tag 0 dma 8192 in
> res 50/00:00:76:38:97/00:00:00:00:00/e1 Emask 0x10 (ATA bus error)
> ata1.00: status: { DRDY }
>
> Afterwards the machine was in some kind of bad state where it would do
> only about 1MB/s to the disk, and I had to power it off.
>
> Basically I have no idea what any of that gibberish means. Note that
> this device is about 80 times faster than the spinning disk it
> replaced, so it may be stressing parts of the software that are not
> normally stressed. Note also that it could just be crap hardware. I
> don't really know. However, I do note that someone recently posted a
> very similar error using Western Digital disks and the same SATA
> controller. I don't think the problem is cables, since this is a
> laptop. Any advice welcome.

PHYRdyChg in SError means that the controller detected a change in the
PHY ready state: the drive disconnected, or the controller lost
communication with it. This is almost certainly a hardware problem of
some sort. A power issue, perhaps?
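
For anyone else puzzling over the SErr value in the log above, here is a
minimal sketch of how it breaks down into the names printed on the
"SError: { ... }" line. The bit positions follow the SATA SError register
layout. The function name decode_serror and the table SERROR_BITS are
made up for this illustration, and the short names other than the two
that appear in the log should be treated as approximate.

```python
# Sketch only: map SError register bits to short names.
# Bits 0-11 are the ERR field, bits 16-26 are the DIAG field.
# SERROR_BITS and decode_serror are hypothetical names for this example.
SERROR_BITS = {
    0: "RecovData",    # recovered data integrity error
    1: "RecovComm",    # recovered communications error
    8: "UnrecovData",  # non-recovered transient data integrity error
    9: "Persist",      # persistent communication or data integrity error
    10: "Proto",       # protocol error
    11: "HostInt",     # internal host adapter error
    16: "PHYRdyChg",   # PHY ready state changed
    17: "PHYInt",      # PHY internal error
    18: "CommWake",    # COMWAKE signal detected
    19: "10B8B",       # 10b to 8b decode error
    20: "Dispar",      # disparity error
    21: "BadCRC",      # CRC error
    22: "Handshk",     # handshake error
    23: "LinkSeq",     # link sequence error
    24: "TrStaTrns",   # transport state transition error
    25: "UnrecFIS",    # unrecognized FIS type
    26: "DevExch",     # device exchanged
}


def decode_serror(value):
    """Return the names of all known bits set in an SError value."""
    names = []
    for bit in sorted(SERROR_BITS):
        if value & (1 << bit):
            names.append(SERROR_BITS[bit])
    return names


# SErr value taken from the log in this thread.
print(decode_serror(0x50000))
```

For 0x50000 only bits 16 and 18 are set, which matches the
{ PHYRdyChg CommWake } that the kernel printed: the link dropped and then
a wake signal was seen while it came back.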