2010-06-01 05:36:49

by Jeff Garzik

Subject: Re: SSD + sata_nv + btrfs oops

On 05/31/2010 11:04 PM, Dave Airlie wrote:
> Hi guys,
>
> I've been running an Intel SSD (the KS one) on my Dell XPS710 desktop
> machine, with btrfs on it.
>
> I'm not sure the btrfs oops isn't due to the disk/controller doing
> something bad (almost guaranteed).

The btrfs oops may be poor handling of an I/O error thrown by the block
layer.

Root cause is definitely your SATA PHY throwing some hardware errors
from the transport layer (low level SATA packet transmission failures).
Everything else sorta falls apart after that.

First guesses are the usual suspects: cabling, temperature, power or
SATA ports on the [SATA controller | SATA device] going bad.

Disabling swncq will only improve things from the perspective of slowing
things down and giving the hardware less to do. swncq makes things
parallel, so forcing only one transaction at a time certainly increases
the chances of success by reducing complexity and serializing transactions.

Jeff



2010-06-01 11:25:38

by Chris Mason

Subject: Re: SSD + sata_nv + btrfs oops

On Tue, Jun 01, 2010 at 01:36:42AM -0400, Jeff Garzik wrote:
> On 05/31/2010 11:04 PM, Dave Airlie wrote:
> >Hi guys,
> >
> >I've been running an Intel SSD (the KS one) on my Dell XPS710 desktop
> >machine, with btrfs on it.
> >
> >I'm not sure the btrfs oops isn't due to the disk/controller doing
> >something bad (almost guaranteed).
>
> The btrfs oops may be poor handling of an I/O error thrown by the
> block layer.

Correct, we do pretty well when there is an alternate copy, but we're
still working on handling EIOs when there is only one.

Dave, once the hardware side is sorted out, if you have trouble with the
btrfs data let me know.

-chris