2006-02-06 16:58:45

by Jerome Lacoste

[permalink] [raw]
Subject: EVMS & SATA failure

Hi,

Trying to recover from a disk failure [1], we:
- installed a fresh ubuntu 5.10 (modified kernel 2.6.12) on a new disk
- stopped machine
- connected failing SATA disk to board (in order to try to recover some data)
- rebooted

That doesn't work at all:

* Setting up LVM Volumne Groups
* Starting Enterprise Volume Management System...
ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
Buffer I/O error on device dm-3, logical block 1
ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
Buffer I/O error on device dm-3, logical block 2
ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
Buffer I/O error on device dm-3, logical block 3
ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
Buffer I/O error on device dm-3, logical block 4
ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
Buffer I/O error on device dm-3, logical block 5
ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
Buffer I/O error on device dm-3, logical block 6
....

And the boot doesn't go further, the error messages keep looping from
block 0 to 12.

As I am trying to minimize my number of reboots (to not stress my
failing disk), maybe someone has an idea what those messages could
mean? Is that due to my failing disk? Is it some sort of SATA
configuration issue? Or is there an EVMS issue?

Otherwise I will prevent the EVMS daemon from starting at boot and see
if that fixes my issue.

Cheers,

Jerome

[1] http://lkml.org/lkml/2006/1/10/397


2006-02-06 18:32:00

by Phillip Susi

[permalink] [raw]
Subject: Re: EVMS & SATA failure

Looks to me like those are IO errors caused by the defective disk. That
appears to confuse Ubuntu's init scripts when they try to detect the
EVMS stuff, and it goes into an infinite loop retrying instead of giving
up. You should probably report this bug to Ubuntu.


jerome lacoste wrote:
> Hi,
>
> Trying to recover from a disk failure [1], we:
> - installed a fresh ubuntu 5.10 (modified kernel 2.6.12) on a new disk
> - stopped machine
> - connected failing SATA disk to board (in order to try to recover some data)
> - rebooted
>
> That doesn't work at all:
>
> * Setting up LVM Volumne Groups
> * Starting Enterprise Volume Management System...
> ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
> Buffer I/O error on device dm-3, logical block 1
> ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
> Buffer I/O error on device dm-3, logical block 2
> ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
> Buffer I/O error on device dm-3, logical block 3
> ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
> Buffer I/O error on device dm-3, logical block 4
> ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
> Buffer I/O error on device dm-3, logical block 5
> ata4: translated ATA stat/err 0x25/00 to SCSI SK/ASC
> Buffer I/O error on device dm-3, logical block 6
> ....
>
> And the boot doesn't go further, the error messages keep looping from
> block 0 to 12.
>
> As I am trying to minimize my number of reboots (to not stress my
> failing disk), maybe someone has an idea what those messages could
> mean? Is that due to my failing disk? Is it some sort of SATA
> configuration issue? Or is there an EVMS issue?
>
> Otherwise I will prevent the EVMS daemon from starting at boot and see
> if that fixes my issue.
>
> Cheers,
>
> Jerome
>
>