2005-01-15 20:25:42

by Erik Steffl

Subject: SATA disk dead? ATA: abnormal status 0x59 on port 0xE407

I got these errors when accessing SATA disk (via scsi):

Jan 15 11:56:50 jojda kernel: ata2: command 0x25 timeout, stat 0x59 host_stat 0x21
Jan 15 11:56:50 jojda kernel: ata2: status=0x59 { DriveReady SeekComplete DataRequest Error }
Jan 15 11:56:50 jojda kernel: ata2: error=0x40 { UncorrectableError }
Jan 15 11:56:50 jojda kernel: scsi1: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 00 00 01 26 00 00 29 00
Jan 15 11:56:50 jojda kernel: Current sda: sense key Medium Error
Jan 15 11:56:50 jojda kernel: Additional sense: Unrecovered read error - auto reallocate failed
Jan 15 11:56:50 jojda kernel: end_request: I/O error, dev sda, sector 294
Jan 15 11:56:50 jojda kernel: Buffer I/O error on device sda1, logical block 57
Jan 15 11:56:50 jojda kernel: ATA: abnormal status 0x59 on port 0xE407
Jan 15 11:56:50 jojda last message repeated 2 times

when the disk was mounted I got these errors only when accessing certain
directories, but now any disk access generates them, and processes
that touch the disk (I tried fsck, mount, dd_rescue) hang in disk wait
state. It looks like some of them get out of it after a very long time
(1h+).

I have another SATA drive (pretty much the same, both are Maxtor
DiamondMax 9, 250GB) and that one works when I connect it to the same SATA
and power cables, so I think the problem is with the disk (not my setup
or cables etc.).

Since I didn't see any read errors before, I think it might be the
electronics that are dead, not the disk itself. Considering that I have
another disk of the same model, is it possible to swap the electronics
between the disks? (Right now I can't try it because I don't have a
screwdriver that fits the screws on the disk.)

my system: kernel 2.6.9, debian unstable, SATA disks seen as scsi
disks (CONFIG_SCSI_SATA=y).

Is there anything I can do to rescue (some of) the data on the disk?

TIA,

erik


2005-01-16 02:12:14

by Alan

Subject: Re: SATA disk dead? ATA: abnormal status 0x59 on port 0xE407

On Sad, 2005-01-15 at 20:25, Erik Steffl wrote:
> I got these errors when accessing SATA disk (via scsi):
>
> Jan 15 11:56:50 jojda kernel: ata2: command 0x25 timeout, stat 0x59
> host_stat 0x21
> Jan 15 11:56:50 jojda kernel: ata2: status=0x59 { DriveReady
> SeekComplete DataRequest Error }
> Jan 15 11:56:50 jojda kernel: ata2: error=0x40 { UncorrectableError }

Bad sector - the disk has lost the data on some blocks. That's a physical
disk failure.
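
For readers following the log, here is a minimal sketch of how the two register values in it break down into the flags the kernel printed. It assumes the standard ATA status and error register bit layout. The names for bits that do not appear in the log (for example DeviceFault or BadCRC) are illustrative labels chosen for this sketch, not necessarily the kernel's own strings.

```python
# Status register bits (assumed: standard ATA layout).
ATA_STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",      # illustrative label
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",   # illustrative label
    0x02: "Index",            # illustrative label
    0x01: "Error",
}

# Error register bits (assumed: standard ATA layout).
ATA_ERROR_BITS = {
    0x80: "BadCRC",               # illustrative label
    0x40: "UncorrectableError",
    0x20: "MediaChanged",         # illustrative label
    0x10: "SectorIdNotFound",     # illustrative label
    0x08: "MediaChangeRequested", # illustrative label
    0x04: "DriveStatusError",     # illustrative label
    0x02: "TrackZeroNotFound",    # illustrative label
    0x01: "AddrMarkNotFound",     # illustrative label
}


def decode_bits(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in sorted(table.items(), reverse=True)
            if value & mask]


print(decode_bits(0x59, ATA_STATUS_BITS))
# ['DriveReady', 'SeekComplete', 'DataRequest', 'Error']
print(decode_bits(0x40, ATA_ERROR_BITS))
# ['UncorrectableError']
```

The Error bit in the status register says the error register is meaningful, and the single bit set there is the uncorrectable-data flag, which is what the "bad sector" reading rests on.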

2005-01-16 02:33:54

by Erik Steffl

Subject: Re: SATA disk dead? ATA: abnormal status 0x59 on port 0xE407

Alan Cox wrote:
> On Sad, 2005-01-15 at 20:25, Erik Steffl wrote:
>
>> I got these errors when accessing SATA disk (via scsi):
>>
>>Jan 15 11:56:50 jojda kernel: ata2: command 0x25 timeout, stat 0x59
>>host_stat 0x21
>>Jan 15 11:56:50 jojda kernel: ata2: status=0x59 { DriveReady
>>SeekComplete DataRequest Error }
>>Jan 15 11:56:50 jojda kernel: ata2: error=0x40 { UncorrectableError }
>
>
> Bad sector - the disk has lost the data on some blocks. That's a physical
> disk failure.

what's somewhat weird is that the disk _seemed_ OK (i.e. no errors
that I would notice, nothing in the syslog) and then suddenly the disk
does not respond at all, I tried dd_rescue and it ran for hours (more
than a day) and it rescued absolutely nothing. Is it possible that the
disk surface is OK but the electronics went bad? Is there anything that
can be done if that's the case? (I have another disk, same model).

erik

2005-01-17 04:17:02

by Bill Davidsen

Subject: Re: SATA disk dead? ATA: abnormal status 0x59 on port 0xE407

Erik Steffl wrote:
> Alan Cox wrote:
>
>> On Sad, 2005-01-15 at 20:25, Erik Steffl wrote:
>>
>>> I got these errors when accessing SATA disk (via scsi):
>>>
>>> Jan 15 11:56:50 jojda kernel: ata2: command 0x25 timeout, stat 0x59
>>> host_stat 0x21
>>> Jan 15 11:56:50 jojda kernel: ata2: status=0x59 { DriveReady
>>> SeekComplete DataRequest Error }
>>> Jan 15 11:56:50 jojda kernel: ata2: error=0x40 { UncorrectableError }
>>
>>
>>
>> Bad sector - the disk has lost the data on some blocks. That's a physical
>> disk failure.
>
>
> what's somewhat weird is that the disk _seemed_ OK (i.e. no errors
> that I would notice, nothing in the syslog) and then suddenly the disk
> does not respond at all, I tried dd_rescue and it ran for hours (more
> than a day) and it rescued absolutely nothing. Is it possible that the
> disk surface is OK but the electronics went bad? Is there anything that
> can be done if that's the case? (I have another disk, same model).

You probably void your warranty on both drives if you swap the control
board, and it may require special tools you don't have, although I have
done it in the past. Can you get to the point where it fails and cool it
with a shot of freon (or whatever is politically correct these days)? It
may be thermal, in which case you run it until you back it up, then
warranty it.

--
bill davidsen
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

2005-01-17 06:44:35

by Erik Steffl

Subject: Re: SATA disk dead? ATA: abnormal status 0x59 on port 0xE407

Bill Davidsen wrote:
> Erik Steffl wrote:
>
>> Alan Cox wrote:
>>
>>> On Sad, 2005-01-15 at 20:25, Erik Steffl wrote:
>>>
>>>> I got these errors when accessing SATA disk (via scsi):
>>>>
>>>> Jan 15 11:56:50 jojda kernel: ata2: command 0x25 timeout, stat 0x59
>>>> host_stat 0x21
>>>> Jan 15 11:56:50 jojda kernel: ata2: status=0x59 { DriveReady
>>>> SeekComplete DataRequest Error }
>>>> Jan 15 11:56:50 jojda kernel: ata2: error=0x40 { UncorrectableError }
>>>
>>>
>>>
>>>
>>> Bad sector - the disk has lost the data on some blocks. That's a physical
>>> disk failure.
>>
>>
>>
>> what's somewhat weird is that the disk _seemed_ OK (i.e. no errors
>> that I would notice, nothing in the syslog) and then suddenly the disk
>> does not respond at all, I tried dd_rescue and it ran for hours (more
>> than a day) and it rescued absolutely nothing. Is it possible that the
>> disk surface is OK but the electronics went bad? Is there anything
>> that can be done if that's the case? (I have another disk, same model).
>
>
> You probably void your warranty on both drives if you swap the control
> board, it may require special tools you don't have, and I have done it
> in the past. Can you get to the point where it fails and cool it with a
> shot of freon (or whatever is politically correct these days)? May be
> thermal, in which case you run it until you back it up, then warranty it.

it does not respond at all (right after I boot up the computer), so it
doesn't seem to be heat related. It is completely unreadable: I ran
dd_rescue on it for a long time and it read absolutely nothing. It
requires a star-shaped screwdriver, are those available somewhere?

erik

2005-01-17 09:08:53

by Mark Watts

Subject: Re: SATA disk dead? ATA: abnormal status 0x59 on port 0xE407

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> Bill Davidsen wrote:
> > Erik Steffl wrote:
> >> Alan Cox wrote:
> >>> On Sad, 2005-01-15 at 20:25, Erik Steffl wrote:
> >>>> I got these errors when accessing SATA disk (via scsi):
> >>>>
> >>>> Jan 15 11:56:50 jojda kernel: ata2: command 0x25 timeout, stat 0x59
> >>>> host_stat 0x21
> >>>> Jan 15 11:56:50 jojda kernel: ata2: status=0x59 { DriveReady
> >>>> SeekComplete DataRequest Error }
> >>>> Jan 15 11:56:50 jojda kernel: ata2: error=0x40 { UncorrectableError }
> >>>
> >>> Bad sector - the disk has lost the data on some blocks. That's a
> >>> physical disk failure.
> >>
> >> what's somewhat weird is that the disk _seemed_ OK (i.e. no errors
> >> that I would notice, nothing in the syslog) and then suddenly the disk
> >> does not respond at all, I tried dd_rescue and it ran for hours (more
> >> than a day) and it rescued absolutely nothing. Is it possible that the
> >> disk surface is OK but the electronics went bad? Is there anything
> >> that can be done if that's the case? (I have another disk, same model).
> >
> > You probably void your warranty on both drives if you swap the control
> > board, it may require special tools you don't have, and I have done it
> > in the past. Can you get to the point where it fails and cool it with a
> > shot of freon (or whatever is politically correct these days)? May be
> > thermal, in which case you run it until you back it up, then warranty it.
>
> it does not respond at all (right after I boot up the computer),
> doesn't seem to be heat related. It is completely unreadable, I ran
> dd_rescue on it for a long time, it didn't read absolutely anything. It
> requires a star-shaped screwdriver, are those available somewhere?

Those are Torx drivers. You may need the 'security' version if the screws have
a pin in the middle (utterly pointless since both types of driver are
publicly available).

Mark.

- --
Mark Watts
Senior Systems Engineer
QinetiQ Trusted Information Management
Trusted Solutions and Services group
GPG Public Key ID: 455420ED

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFB64IGBn4EFUVUIO0RAugkAJ4kmCDOsILhZLISR75ml2gch528AQCbB56r
UJWFiujxQxI95TZEhIOKoWc=
=7AkY
-----END PGP SIGNATURE-----

2005-01-18 01:01:24

by Eric D. Mudama

Subject: Re: SATA disk dead? ATA: abnormal status 0x59 on port 0xE407

we don't use security torx screws, we use normal ones on our boards.

I wouldn't recommend swapping boards, since the code stored on the
physical media, the opti tables, and the asic on the board were all
processed together at one point and are specific to each other. The
new board may not work properly with the heads in the other drive, and
could even cause damage, if both drives were several sigma to opposite
sides of each other in the spectrum of passing drives, or had a
different head vendor, etc.

If the data already appears lost and you've run out of other options,
it may prove useful to attempt writing to the entire device without
attempting reads. If the drive then reads normally after that, the
damage was probably incurred in some transient fashion (excessive
vibration or heat, etc) and the replacement data may eliminate the
failures.

Either way, however, I would probably recommend just RMA'ing the
drives. We should be able to get you a replacement in a few days from
the time you fill out the form.

--eric
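
As a rough illustration of the "write to the entire device without attempting reads" idea, here is a minimal sketch. It is an assumption-laden example, not a recovery tool: the function name `overwrite_with_zeros` and its parameters are hypothetical, it destroys whatever is on the target, and on a real block device the kernel may still issue reads of its own (for instance for unaligned writes), which this code cannot prevent.

```python
import os


def overwrite_with_zeros(path, total_bytes, chunk_size=1024 * 1024):
    """Sequentially write zeros over the first total_bytes of path.

    DESTRUCTIVE: everything in that range is lost. The target is opened
    write-only, so this code itself never issues a read.
    Returns the number of bytes written before finishing or failing.
    """
    zeros = bytes(chunk_size)
    written = 0
    fd = os.open(path, os.O_WRONLY)
    try:
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            try:
                # os.write may write fewer than n bytes; count what it reports.
                written += os.write(fd, zeros[:n])
            except OSError as exc:
                print(f"write failed at byte offset {written}: {exc}")
                break
        try:
            os.fsync(fd)  # push buffered data out so errors surface here
        except OSError as exc:
            print(f"fsync failed: {exc}")
    finally:
        os.close(fd)
    return written
```

Trying it on a scratch file first is a sensible sanity check. If the drive then reads back cleanly, that would fit the transient-damage explanation; if the writes fail with the same errors, it would not.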


On Mon, 17 Jan 2005 09:14:46 +0000, Mark Watts wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> > Bill Davidsen wrote:
> > > Erik Steffl wrote:
> > >> Alan Cox wrote:
> > >>> On Sad, 2005-01-15 at 20:25, Erik Steffl wrote:
> > >>>> I got these errors when accessing SATA disk (via scsi):
> > >>>>
> > >>>> Jan 15 11:56:50 jojda kernel: ata2: command 0x25 timeout, stat 0x59
> > >>>> host_stat 0x21
> > >>>> Jan 15 11:56:50 jojda kernel: ata2: status=0x59 { DriveReady
> > >>>> SeekComplete DataRequest Error }
> > >>>> Jan 15 11:56:50 jojda kernel: ata2: error=0x40 { UncorrectableError }
> > >>>
> > >>> Bad sector - the disk has lost the data on some blocks. That's a
> > >>> physical disk failure.
> > >>
> > >> what's somewhat weird is that the disk _seemed_ OK (i.e. no errors
> > >> that I would notice, nothing in the syslog) and then suddenly the disk
> > >> does not respond at all, I tried dd_rescue and it ran for hours (more
> > >> than a day) and it rescued absolutely nothing. Is it possible that the
> > >> disk surface is OK but the electronics went bad? Is there anything
> > >> that can be done if that's the case? (I have another disk, same model).
> > >
> > > You probably void your warranty on both drives if you swap the control
> > > board, it may require special tools you don't have, and I have done it
> > > in the past. Can you get to the point where it fails and cool it with a
> > > shot of freon (or whatever is politically correct these days)? May be
> > > thermal, in which case you run it until you back it up, then warranty it.
> >
> > it does not respond at all (right after I boot up the computer),
> > doesn't seem to be heat related. It is completely unreadable, I ran
> > dd_rescue on it for a long time, it didn't read absolutely anything. It
> > requires a star-shaped screwdriver, are those available somewhere?
>
> Those are Torx drivers. You may need the 'security' version if the screws have
> a pin in the middle (utterly pointless since both types of driver are
> publicly available).
>
> Mark.

2005-01-18 06:26:46

by Erik Steffl

Subject: Re: SATA disk dead? ATA: abnormal status 0x59 on port 0xE407

Eric Mudama wrote:
> we don't use security torx screws, we use normal ones on our boards.
>
> I wouldn't recommend swapping boards, since the code stored on the
> physical media, the opti tables, and the asic on the board were all
> processed together at one point and are specific to each other. The
> new board may not work properly with the heads in the other drive, and
> could even cause damage, if both drives were several sigma to opposite
> sides of each other in the spectrum of passing drives, or had a
> different head vendor, etc.
>
> If the data already appears lost and you've run out of other options,
> it may prove useful to attempt writing to the entire device without
> attempting reads. If the drive then reads normally after that, the
> damage was probably incurred in some transient fashion (excessive
> vibration or heat, etc) and the replacement data may eliminate the
> failures.
>
> Either way, however, I would probably recommend just RMA'ing the
> drives. We should be able to get you a replacement in a few days from
> the time you fill out the form.

it's a DiamondMax 9 (manufactured June 13, 2003), and those had only a
one-year warranty, so unfortunately I can't return it (just checked it on
maxtor.com).

trying to write to it (cat /dev/hdb6 > /dev/sda) but getting exactly
the same messages (ATA: abnormal status 0x59 on port 0xE407). Looks like the
drive does not respond to anything at all (I tried turning the computer off
completely, even disconnecting it while powered off).

here's the full set of messages (the same set repeats every 30s or so):

Jan 17 22:22:48 jojda kernel: ata2: command 0x35 timeout, stat 0x59 host_stat 0x21
Jan 17 22:22:48 jojda kernel: ata2: status=0x59 { DriveReady SeekComplete DataRequest Error }
Jan 17 22:22:48 jojda kernel: ata2: error=0x40 { UncorrectableError }
Jan 17 22:22:48 jojda kernel: scsi1: ERROR on channel 0, id 0, lun 0, CDB: Write (10) 00 00 00 00 15 00 03 eb 00
Jan 17 22:22:48 jojda kernel: Current sda: sense key Medium Error
Jan 17 22:22:48 jojda kernel: Additional sense: Unrecovered read error - auto reallocate failed
Jan 17 22:22:48 jojda kernel: end_request: I/O error, dev sda, sector 21
Jan 17 22:22:48 jojda kernel: ATA: abnormal status 0x59 on port 0xE407
Jan 17 22:22:48 jojda last message repeated 2 times

erik
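
The CDB lines in these logs can be tied back to the failing sector numbers. The sketch below assumes the kernel printed the nine CDB bytes that follow the opcode (the opcode appearing only as the name "Read (10)" or "Write (10)"), and that they follow the usual 10-byte read/write layout: a flags byte, a 4-byte big-endian LBA, a group byte, a 2-byte big-endian transfer length, and a control byte. The function name `decode_rw10` is hypothetical.

```python
def decode_rw10(hex_bytes):
    """Decode the nine bytes printed after 'Read (10)' or 'Write (10)'.

    Assumed layout (opcode omitted): flags, LBA (4 bytes, big-endian),
    group, transfer length (2 bytes, big-endian), control.
    Returns (lba, length_in_blocks).
    """
    b = bytes.fromhex(hex_bytes)
    if len(b) != 9:
        raise ValueError("expected 9 bytes after the opcode")
    lba = int.from_bytes(b[1:5], "big")
    length = int.from_bytes(b[6:8], "big")
    return lba, length


print(decode_rw10("00 00 00 01 26 00 00 29 00"))  # (294, 41)
print(decode_rw10("00 00 00 00 15 00 03 eb 00"))  # (21, 1003)
```

Under that assumption the decoded LBAs, 294 and 21, agree with the "sector 294" and "sector 21" reported by the end_request lines, so each failure is reported at the very first block of the request.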

2005-01-19 00:19:06

by James Colannino

Subject: Re: SATA disk dead? ATA: abnormal status 0x59 on port 0xE407

Erik Steffl wrote:

> Eric Mudama wrote:
>
>> we don't use security torx screws, we use normal ones on our boards.
>>
>> I wouldn't recommend swapping boards, since the code stored on the
>> physical media, the opti tables, and the asic on the board were all
>> processed together at one point and are specific to each other. The
>> new board may not work properly with the heads in the other drive, and
>> could even cause damage, if both drives were several sigma to opposite
>> sides of each other in the spectrum of passing drives, or had a
>> different head vendor, etc.
>>
>> If the data already appears lost and you've run out of other options,
>> it may prove useful to attempt writing to the entire device without
>> attempting reads. If the drive then reads normally after that, the
>> damage was probably incurred in some transient fashion (excessive
>> vibration or heat, etc) and the replacement data may eliminate the
>> failures.
>>
>> Either way, however, I would probably recommend just RMA'ing the
>> drives. We should be able to get you a replacement in a few days from
>> the time you fill out the form.
>
>
> it's DiamondMax 9 (manufactured june 13 2003), those had only one
> year warranty so unfortunately I can't return it (just checked it on
> maxtor.com).
>

Sometimes, if you get a nice person from Maxtor on the phone, you can
get it RMA'd anyway. You just have to talk to the right person. If you
don't get someone willing to help out, try calling back until you get
someone else. I was able to return a drive that was 3 months out of
warranty. Yours is a bit more out of date, but you might as well give
it a shot ;)

James