2007-01-06 14:26:48

by Kasper Sandberg

Subject: libata error handling

Hello.

I have a question regarding libata's error handling, specifically
with the PATA drivers.

I'll start by explaining something that happens to me with the normal
IDE drivers (VIA IDE and pdc202xx_new).

This is what I get once the machine has been in use for a while:
hde: dma_intr: bad DMA status (dma_stat=75)
hde: dma_intr: status=0x50 { DriveReady SeekComplete }
ide: failed opcode was: unknown
hde: dma_timer_expiry: dma status == 0x60
hde: DMA timeout retry
PDC202XX: Primary channel reset.
hde: timeout waiting for DMA

It is ALWAYS hde, and it is on the Promise controller. I tried
replacing the Promise controller with other controllers, but I got the
same error. I have tried replacing cables too, and swapping the hard
drives around; it is ALWAYS the last hard drive that gives me this.
After this, my RAID (6x300GB drives in RAID5) would go nuts: the data
seemed to be there, but skewed, as if I were reading it all from an offset.

This has been going on for as long as I have had this box, from .15 to
.17. Now I have updated to .20-rc3-git4 and moved to the pata-on-libata
drivers, where I think this has stopped, or at least it is not causing
WEIRD errors anymore. I have observed some stalls, but I am not sure
whether they are due to this problem or simply to syncing. I no longer
get messages like the ones above from the kernel.

I have heard that libata has much better error handling (this is what
made me try it), and from initial observations that appears to be very
true. However, I am wondering: is there something I can do to get
extremely verbose information from libata, for example when it corrects
errors? I would really like to know whether this still happens, and
whether I perhaps still get corruption as before, even if less severe.


Regards,
Kasper Sandberg
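
For reference, the status byte in the log above (status=0x50, printed as
"DriveReady SeekComplete") can be decoded by hand. The following is a
minimal sketch in Python, not code from the kernel: the function name
decode_ata_status and the table ATA_STATUS_BITS are made up for this
illustration, and the bit names are assumed to match what the old IDE
driver prints.

```python
# Illustrative decoder for the ATA status register byte.
# Assumption: bit names mirror the strings the legacy IDE driver logs;
# this table is a sketch, not a copy of the kernel source.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(status):
    """Return the names of all bits set in an 8-bit ATA status value."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    # 0x50 is the value reported for hde in the log above.
    print(" ".join(decode_ata_status(0x50)))
```

Note that 0x50 has the Error bit clear, so the drive itself is not
flagging a failed command in that line; the complaint comes from the DMA
side (dma_stat and the subsequent timeout).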


2007-01-06 18:21:58

by Robert Hancock

Subject: Re: libata error handling

Kasper Sandberg wrote:
> i have heard that libata has much better error handling (this is what
> made me try it), and from initial observations, that appears to be very
> true, however, im wondering, is there something i can do to get
> extremely verbose information from libata? for example if it corrects
> errors? cause i'd really like to know if it still happens, and if i
> perhaps get corruption as before, even though not severe.

Any errors, timeouts or retries would show up in dmesg.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/
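
Since the advice above is to watch dmesg, one way to do that
systematically is to filter the kernel log for suspicious lines. The
following Python sketch is only an illustration under stated assumptions:
the legacy patterns are taken from the log quoted at the start of this
thread, while the libata side is a guess that relevant lines start with
an "ataN:" or "ataN.MM:" prefix and contain words such as "exception",
"timeout", "error", "reset" or "failed". The function name
find_suspect_lines and the keyword list are hypothetical and may need
adjusting for a given kernel version.

```python
import re

# Messages copied from the legacy IDE log quoted earlier in this thread.
LEGACY_IDE_PATTERNS = [
    r"dma_intr: bad DMA status",
    r"dma_timer_expiry",
    r"DMA timeout retry",
    r"channel reset",
    r"timeout waiting for DMA",
]

# Assumption: libata messages carry an "ataN:" or "ataN.MM:" prefix.
LIBATA_PREFIX = re.compile(r"\bata\d+(?:\.\d+)?: ")
# Assumption: these keywords indicate trouble; the list is a guess.
LIBATA_KEYWORDS = re.compile(r"exception|timeout|error|reset|failed",
                             re.IGNORECASE)


def find_suspect_lines(log_text):
    """Return the log lines that look like ATA errors, timeouts or resets."""
    hits = []
    for line in log_text.splitlines():
        if any(re.search(p, line) for p in LEGACY_IDE_PATTERNS):
            hits.append(line)
        elif LIBATA_PREFIX.search(line) and LIBATA_KEYWORDS.search(line):
            hits.append(line)
    return hits
```

To use it, save the output of dmesg to a file, read the file into a
string, and pass that string to find_suspect_lines. An empty result only
means none of the assumed patterns matched, so it is worth skimming the
raw log as well.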

2007-01-06 18:58:01

by Kasper Sandberg

Subject: Re: libata error handling

On Sat, 2007-01-06 at 12:21 -0600, Robert Hancock wrote:
> Kasper Sandberg wrote:
> > i have heard that libata has much better error handling (this is what
> > made me try it), and from initial observations, that appears to be very
> > true, however, im wondering, is there something i can do to get
> > extremely verbose information from libata? for example if it corrects
> > errors? cause i'd really like to know if it still happens, and if i
> > perhaps get corruption as before, even though not severe.
>
> Any errors, timeouts or retries would be showing up in dmesg..
How sure can I be of this? Is it 100% certain, then, that I have not
hit this error?
>

2007-01-06 19:01:59

by Robert Hancock

Subject: Re: libata error handling

Kasper Sandberg wrote:
> On Sat, 2007-01-06 at 12:21 -0600, Robert Hancock wrote:
>> Kasper Sandberg wrote:
>>> i have heard that libata has much better error handling (this is what
>>> made me try it), and from initial observations, that appears to be very
>>> true, however, im wondering, is there something i can do to get
>>> extremely verbose information from libata? for example if it corrects
>>> errors? cause i'd really like to know if it still happens, and if i
>>> perhaps get corruption as before, even though not severe.
>> Any errors, timeouts or retries would be showing up in dmesg..
> how sure can i be of this? is it 100% sure that i have not encountered
> this error then?

Pretty sure. I'm quite certain libata never does any silent error recovery.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/



2007-01-06 19:08:35

by Kasper Sandberg

Subject: Re: libata error handling

On Sat, 2007-01-06 at 13:01 -0600, Robert Hancock wrote:
> Kasper Sandberg wrote:
> > On Sat, 2007-01-06 at 12:21 -0600, Robert Hancock wrote:
> >> Kasper Sandberg wrote:
> >>> i have heard that libata has much better error handling (this is what
> >>> made me try it), and from initial observations, that appears to be very
> >>> true, however, im wondering, is there something i can do to get
> >>> extremely verbose information from libata? for example if it corrects
> >>> errors? cause i'd really like to know if it still happens, and if i
> >>> perhaps get corruption as before, even though not severe.
> >> Any errors, timeouts or retries would be showing up in dmesg..
> > how sure can i be of this? is it 100% sure that i have not encountered
> > this error then?
>
> Pretty sure, I'm quite certain libata never does any silent error recovery..
Okay, I suppose I face two possibilities then:
1: the libata drivers are simply better, and the error no longer occurs
because it was caused by driver bugs in the old IDE drivers
2: it hasn't happened to me on libata yet (though this is also a bit
weird, as it has now run far longer than was previously required to hit
the errors)
>

by Bartlomiej Zolnierkiewicz

Subject: Re: libata error handling

On 1/6/07, Kasper Sandberg <[email protected]> wrote:
> On Sat, 2007-01-06 at 13:01 -0600, Robert Hancock wrote:
> > Kasper Sandberg wrote:
> > > On Sat, 2007-01-06 at 12:21 -0600, Robert Hancock wrote:
> > >> Kasper Sandberg wrote:
> > >>> i have heard that libata has much better error handling (this is what
> > >>> made me try it), and from initial observations, that appears to be very
> > >>> true, however, im wondering, is there something i can do to get
> > >>> extremely verbose information from libata? for example if it corrects
> > >>> errors? cause i'd really like to know if it still happens, and if i
> > >>> perhaps get corruption as before, even though not severe.
> > >> Any errors, timeouts or retries would be showing up in dmesg..
> > > how sure can i be of this? is it 100% sure that i have not encountered
> > > this error then?
> >
> > Pretty sure, I'm quite certain libata never does any silent error recovery..

AFAIR this is true
(at least it was the last time I looked at the libata EH code)

> okay, i suppose i face two possibilities then:
> 1: libata drivers are simply better, and the error does not occur
> because of driver bugs in the old ide drivers

Very likely. However, the pdc202xx_new bugs should be fixed in 2.6.20-rc3
(it contains a lot of bugfixes for this driver from Sergei Shtylyov).

> 2: it hasnt happened to me on libata yet (though this is also abit
> weird, as it has now ran far longer than were previously required to hit
> the errors)

2007-01-07 20:07:47

by Kasper Sandberg

Subject: Re: libata error handling

On Sat, 2007-01-06 at 20:28 +0100, Bartlomiej Zolnierkiewicz wrote:
> On 1/6/07, Kasper Sandberg <[email protected]> wrote:
> > On Sat, 2007-01-06 at 13:01 -0600, Robert Hancock wrote:
> > > Kasper Sandberg wrote:
> > > > On Sat, 2007-01-06 at 12:21 -0600, Robert Hancock wrote:
> > > >> Kasper Sandberg wrote:
> > > >>> i have heard that libata has much better error handling (this is what
> > > >>> made me try it), and from initial observations, that appears to be very
> > > >>> true, however, im wondering, is there something i can do to get
> > > >>> extremely verbose information from libata? for example if it corrects
> > > >>> errors? cause i'd really like to know if it still happens, and if i
> > > >>> perhaps get corruption as before, even though not severe.
> > > >> Any errors, timeouts or retries would be showing up in dmesg..
> > > > how sure can i be of this? is it 100% sure that i have not encountered
> > > > this error then?
> > >
> > > Pretty sure, I'm quite certain libata never does any silent error recovery..
>
> AFAIR this is true
> (at least it was last time that I've looked at libata eh code)
>
> > okay, i suppose i face two possibilities then:
> > 1: libata drivers are simply better, and the error does not occur
> > because of driver bugs in the old ide drivers
>
> very likely however pdc202xx_new bugs should be fixed in 2.6.20-rc3
> (as it contains a lot of bugfixes for this driver from Sergei Shtylyov)
Are these fixes also in the libata driver?
>
> > 2: it hasnt happened to me on libata yet (though this is also abit
> > weird, as it has now ran far longer than were previously required to hit
> > the errors)
>

Subject: Re: libata error handling

On 1/7/07, Kasper Sandberg <[email protected]> wrote:
> On Sat, 2007-01-06 at 20:28 +0100, Bartlomiej Zolnierkiewicz wrote:
> > On 1/6/07, Kasper Sandberg <[email protected]> wrote:
> > > On Sat, 2007-01-06 at 13:01 -0600, Robert Hancock wrote:
> > > > Kasper Sandberg wrote:
> > > > > On Sat, 2007-01-06 at 12:21 -0600, Robert Hancock wrote:
> > > > >> Kasper Sandberg wrote:
> > > > >>> i have heard that libata has much better error handling (this is what
> > > > >>> made me try it), and from initial observations, that appears to be very
> > > > >>> true, however, im wondering, is there something i can do to get
> > > > >>> extremely verbose information from libata? for example if it corrects
> > > > >>> errors? cause i'd really like to know if it still happens, and if i
> > > > >>> perhaps get corruption as before, even though not severe.
> > > > >> Any errors, timeouts or retries would be showing up in dmesg..
> > > > > how sure can i be of this? is it 100% sure that i have not encountered
> > > > > this error then?
> > > >
> > > > Pretty sure, I'm quite certain libata never does any silent error recovery..
> >
> > AFAIR this is true
> > (at least it was last time that I've looked at libata eh code)
> >
> > > okay, i suppose i face two possibilities then:
> > > 1: libata drivers are simply better, and the error does not occur
> > > because of driver bugs in the old ide drivers
> >
> > very likely however pdc202xx_new bugs should be fixed in 2.6.20-rc3
> > (as it contains a lot of bugfixes for this driver from Sergei Shtylyov)
> these fixes are also in the libata driver?

Some were backported directly from the libata driver and a few were
pdc202xx_new specific, so pata_pdc2027x is probably fine as well.

> > > 2: it hasnt happened to me on libata yet (though this is also abit
> > > weird, as it has now ran far longer than were previously required to hit
> > > the errors)