2005-03-01 01:45:37

by Joerg Sommrey

Subject: 2.6.11-rc5: Promise SATA150 TX4 failure

Hi all,

a problem that was introduced between 2.6.10-ac9 and 2.6.10-ac11 made
its way into 2.6.11-rc5. While taking a backup onto a SCSI streamer, one
of my RAID1 arrays gets corrupted. Afterwards the system hangs and
isn't even bootable. I need to raidhotadd the failed partition in
single-user mode to get the box working again. Error messages:

Mar 1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:15 bear kernel: ata2: called with no error (51)!
Mar 1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:15 bear kernel: ata2: called with no error (51)!
Mar 1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:15 bear kernel: ata2: called with no error (51)!
Mar 1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:15 bear kernel: ata2: called with no error (51)!
Mar 1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:15 bear kernel: ata2: called with no error (51)!
Mar 1 01:46:15 bear kernel: SCSI error : <2 0 0 0> return code = 0x8000002
Mar 1 01:46:15 bear kernel: sdc: Current: sense key: Medium Error
Mar 1 01:46:15 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar 1 01:46:15 bear kernel: end_request: I/O error, dev sdc, sector 52694606
Mar 1 01:46:15 bear kernel: raid1: Disk failure on sdc2, disabling device.
Mar 1 01:46:15 bear kernel: ^IOperation continuing on 1 devices
Mar 1 01:46:15 bear kernel: raid1: sdc2: rescheduling sector 12499976
Mar 1 01:46:16 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata2: called with no error (51)!
Mar 1 01:46:16 bear kernel: SCSI error : <2 0 0 0> return code = 0x8000002
Mar 1 01:46:16 bear kernel: sdc: Current: sense key: Medium Error
Mar 1 01:46:16 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar 1 01:46:16 bear kernel: end_request: I/O error, dev sdc, sector 52694614
Mar 1 01:46:16 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata2: called with no error (51)!
Mar 1 01:46:16 bear kernel: SCSI error : <2 0 0 0> return code = 0x8000002
Mar 1 01:46:16 bear kernel: sdc: Current: sense key: Medium Error
Mar 1 01:46:16 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar 1 01:46:16 bear kernel: end_request: I/O error, dev sdc, sector 52694622
Mar 1 01:46:16 bear kernel: raid1: sdc2: rescheduling sector 12499984
Mar 1 01:46:16 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata2: called with no error (51)!
Mar 1 01:46:16 bear kernel: SCSI error : <2 0 0 0> return code = 0x8000002
Mar 1 01:46:16 bear kernel: sdc: Current: sense key: Medium Error
Mar 1 01:46:16 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar 1 01:46:16 bear kernel: end_request: I/O error, dev sdc, sector 52694630
Mar 1 01:46:16 bear kernel: raid1: sdc2: rescheduling sector 12500000
Mar 1 01:46:16 bear kernel: RAID1 conf printout:
Mar 1 01:46:16 bear kernel: --- wd:1 rd:2
Mar 1 01:46:16 bear kernel: disk 0, wo:0, o:1, dev:sdb2
Mar 1 01:46:16 bear kernel: disk 1, wo:1, o:0, dev:sdc2
Mar 1 01:46:16 bear kernel: RAID1 conf printout:
Mar 1 01:46:16 bear kernel: --- wd:1 rd:2
Mar 1 01:46:16 bear kernel: disk 0, wo:0, o:1, dev:sdb2
Mar 1 01:46:16 bear kernel: raid1: sdb2: redirecting sector 12499976 to another mirror
Mar 1 01:46:16 bear kernel: raid1: sdb2: redirecting sector 12499984 to another mirror
Mar 1 01:46:16 bear kernel: raid1: sdb2: redirecting sector 12500000 to another mirror
Mar 1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar 1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar 1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar 1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar 1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar 1 01:46:16 bear kernel: SCSI error : <1 0 0 0> return code = 0x8000002
Mar 1 01:46:16 bear kernel: sdb: Current: sense key: Medium Error

etc. until hard reboot.
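
For anyone reading the log above, the two hex values can be unpacked by hand. The following is only a minimal sketch, not code from the kernel: the function names `decode_ata_status` and `split_scsi_result` are made up for illustration. It assumes the standard ATA status register bit layout and the usual Linux SCSI mid-layer packing of the result word (status, message, host and driver bytes from low to high).

```python
# ATA status register bits (standard ATA bit positions). The labels
# mirror the ones printed inside the braces in the log lines above.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(status):
    """Return the names of the bits set in an ATA status byte.

    Hypothetical helper for illustration only.
    """
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


def split_scsi_result(result):
    """Split a SCSI result word into its four component bytes.

    Assumes the conventional layout: status in bits 0-7, message in
    bits 8-15, host in bits 16-23, driver in bits 24-31.
    Hypothetical helper for illustration only.
    """
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


if __name__ == "__main__":
    # Values taken from the log: status=0x51 and return code = 0x8000002
    print(decode_ata_status(0x51))
    print(split_scsi_result(0x8000002))
```

Under these assumptions, 0x51 breaks down into exactly the three flags the kernel prints, and 0x8000002 has a status byte of 0x02 (CHECK CONDITION) and a driver byte of 0x08 (sense data available in 2.6-era kernels), which fits the "Medium Error" sense key reported on the following lines.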

The failing array consists of two partitions of two SATA disks connected
to a Promise SATA150 TX4 controller.

-jo

--
-rw-r--r-- 1 jo users 63 2005-03-01 02:26 /home/jo/.signature


2005-03-01 23:38:16

by J.A. Magallon

Subject: Re: 2.6.11-rc5: Promise SATA150 TX4 failure


On 03.01, Joerg Sommrey wrote:
> Hi all,
>
> a problem that was introduced between 2.6.10-ac9 and 2.6.10-ac11 made
> its way into 2.6.11-rc5. While taking a backup onto a SCSI streamer, one
> of my RAID1 arrays gets corrupted. Afterwards the system hangs and
> isn't even bootable. I need to raidhotadd the failed partition in
> single-user mode to get the box working again. Error messages:
>

Me too :(. Just a slightly different case.
I have a server with 6x250GB SATA drives, hung off a pair of Promise
PDC20319 (FastTrak S150 TX4) (rev 02) controllers (each has 4 ports).
Main use for the box is as an smb/atalk/nfs server.

With 2.6.10-rc3-mm2+libata-dev2, the box is stable; we can drop
gigs of files through samba and it works.
Anything newer than that makes the box hang silently, no messages,
no oops. It also happened to me with just a local wget of a big
file (oofice-2.0-beta); after the download the box locked hard.

I tried to apply libata-dev1 on top of newer kernels, but part of it
is already there, and the rest drops too many rejects/offsets for
me.

I also have one other problem with flock, but that's a subject for another
post...

Any ideas about what changed wrt SATA?

--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandrakelinux release 10.2 (Cooker) for i586
Linux 2.6.10-jam12 (gcc 3.4.3 (Mandrakelinux 10.2 3.4.3-3mdk)) #1