2007-06-24 19:19:20

by Otto Meier

Subject: sata_promise disk error 2.6.22-rc5 with hrt1 patch

Hello,

I got the following in my logs:

Jun 24 19:06:40 gate2 kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr
0x0 action 0x0
Jun 24 19:06:40 gate2 kernel: ata4.00: (port_status 0x00001000)
Jun 24 19:06:40 gate2 kernel: ata4.00: cmd
35/00:00:c1:17:92/00:04:17:00:00/e0 tag 0 cdb 0x0 data 524288 out
Jun 24 19:06:40 gate2 kernel: res
50/00:00:c0:1b:92/00:00:17:00:00/e0 Emask 0x20 (host bus error)
Jun 24 19:06:40 gate2 kernel: ata4.00: ata_hpa_resize 1: sectors =
488397168, hpa_sectors = 488397168
Jun 24 19:06:40 gate2 kernel: ata4.00: ata_hpa_resize 1: sectors =
488397168, hpa_sectors = 488397168
Jun 24 19:06:40 gate2 kernel: ata4.00: configured for UDMA/133
Jun 24 19:06:40 gate2 kernel: ata4: EH complete
Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] 488397168 512-byte
hardware sectors (250059 MB)
Jun 24 19:06:40 gate2 kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr
0x0 action 0x0
Jun 24 19:06:40 gate2 kernel: ata4.00: (port_status 0x00001000)
Jun 24 19:06:40 gate2 kernel: ata4.00: cmd
25/00:00:c1:15:92/00:02:17:00:00/e0 tag 0 cdb 0x0 data 262144 in
Jun 24 19:06:40 gate2 kernel: res
50/00:00:c0:17:92/00:00:17:00:00/e0 Emask 0x20 (host bus error)
Jun 24 19:06:40 gate2 kernel: ata4.00: ata_hpa_resize 1: sectors =
488397168, hpa_sectors = 488397168
Jun 24 19:06:40 gate2 kernel: ata4.00: ata_hpa_resize 1: sectors =
488397168, hpa_sectors = 488397168
Jun 24 19:06:40 gate2 kernel: ata4.00: configured for UDMA/133
Jun 24 19:06:40 gate2 kernel: ata4: EH complete
Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Write Protect is off
Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Write cache: enabled,
read cache: enabled, doesn't support DPO or FUA
Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] 488397168 512-byte
hardware sectors (250059 MB)
Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Write Protect is off
Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Write cache: enabled,
read cache: enabled, doesn't support DPO or FUA
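
The "cmd" and "res" fields in the log above can be decoded by hand. The following is only a minimal sketch, assuming libata's usual taskfile dump order (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and 48-bit LBA addressing. The function name decode_taskfile and the dictionary keys are made up for illustration.

```python
def decode_taskfile(dump):
    """Decode a libata-style taskfile dump such as
    '35/00:00:c1:17:92/00:04:17:00:00/e0' (LBA48 layout assumed)."""
    # Four '/'-separated groups of ':'-separated hex bytes.
    g0, g1, g2, g3 = [[int(b, 16) for b in g.split(":")]
                      for g in dump.strip().split("/")]
    # g1 = feature, nsect, lbal, lbam, lbah (low-order bytes)
    # g2 = the same five fields, high-order ("hob") bytes
    lba = ((g2[4] << 40) | (g2[3] << 32) | (g2[2] << 24) |
           (g1[4] << 16) | (g1[3] << 8) | g1[2])
    # Raw 16-bit count field. In a command, 0 would mean 65536 sectors;
    # that special case is deliberately not applied here.
    sectors = (g2[1] << 8) | g1[1]
    return {
        "opcode_or_status": g0[0],   # opcode on "cmd", status on "res"
        "feature_or_error": g1[0],   # feature on "cmd", error on "res"
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * 512,      # assumes 512-byte sectors, as logged
        "device": g3[0],
    }


cmd = decode_taskfile("35/00:00:c1:17:92/00:04:17:00:00/e0")
print(hex(cmd["opcode_or_status"]), hex(cmd["lba"]), cmd["sectors"], cmd["bytes"])
```

Under these assumptions the first cmd line is opcode 0x35 (WRITE DMA EXT) for 1024 sectors starting at LBA 0x179217c1, which is 524288 bytes and agrees with "data 524288 out". The second is opcode 0x25 (READ DMA EXT) for 512 sectors, agreeing with "data 262144 in".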

The system kept running with no other visible effects.

It is running a 2.6.22-rc5 kernel with the hrt1 patch.
The drive is part of a RAID5 array with 4 identical drives.

lspci -vvv gives:

02:06.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA
300 TX4) (rev 02)
Subsystem: Promise Technology, Inc. PDC40718 (SATA 300 TX4)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 72 (1000ns min, 4500ns max), Cache Line Size: 4 bytes
Interrupt: pin A routed to IRQ 16
Region 0: I/O ports at bc00 [size=128]
Region 2: I/O ports at b800 [size=256]
Region 3: Memory at fdeff000 (32-bit, non-prefetchable) [size=4K]
Region 4: Memory at fdec0000 (32-bit, non-prefetchable) [size=128K]
[virtual] Expansion ROM at fdda0000 [disabled] [size=32K]
Capabilities: [60] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-


Does anyone have an idea what's going on?

Best regards





2007-06-24 21:31:01

by Mikael Pettersson

Subject: Re: sata_promise disk error 2.6.22-rc5 with hrt1 patch

(cc: linux-ide added)

On Sun, 24 Jun 2007 20:49:05 +0200, otto Meier wrote:
> I got the following in my logs:
>
> Jun 24 19:06:40 gate2 kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr
> 0x0 action 0x0
> Jun 24 19:06:40 gate2 kernel: ata4.00: (port_status 0x00001000)
> Jun 24 19:06:40 gate2 kernel: ata4.00: cmd
> 35/00:00:c1:17:92/00:04:17:00:00/e0 tag 0 cdb 0x0 data 524288 out
> Jun 24 19:06:40 gate2 kernel: res
> 50/00:00:c0:1b:92/00:00:17:00:00/e0 Emask 0x20 (host bus error)
> Jun 24 19:06:40 gate2 kernel: ata4.00: ata_hpa_resize 1: sectors =
> 488397168, hpa_sectors = 488397168
> Jun 24 19:06:40 gate2 kernel: ata4.00: ata_hpa_resize 1: sectors =
> 488397168, hpa_sectors = 488397168
> Jun 24 19:06:40 gate2 kernel: ata4.00: configured for UDMA/133
> Jun 24 19:06:40 gate2 kernel: ata4: EH complete
> Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] 488397168 512-byte
> hardware sectors (250059 MB)
> Jun 24 19:06:40 gate2 kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr
> 0x0 action 0x0
> Jun 24 19:06:40 gate2 kernel: ata4.00: (port_status 0x00001000)
> Jun 24 19:06:40 gate2 kernel: ata4.00: cmd
> 25/00:00:c1:15:92/00:02:17:00:00/e0 tag 0 cdb 0x0 data 262144 in
> Jun 24 19:06:40 gate2 kernel: res
> 50/00:00:c0:17:92/00:00:17:00:00/e0 Emask 0x20 (host bus error)
> Jun 24 19:06:40 gate2 kernel: ata4.00: ata_hpa_resize 1: sectors =
> 488397168, hpa_sectors = 488397168
> Jun 24 19:06:40 gate2 kernel: ata4.00: ata_hpa_resize 1: sectors =
> 488397168, hpa_sectors = 488397168
> Jun 24 19:06:40 gate2 kernel: ata4.00: configured for UDMA/133
> Jun 24 19:06:40 gate2 kernel: ata4: EH complete
> Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Write Protect is off
> Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
> Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Write cache: enabled,
> read cache: enabled, doesn't support DPO or FUA
> Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] 488397168 512-byte
> hardware sectors (250059 MB)
> Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Write Protect is off
> Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
> Jun 24 19:06:40 gate2 kernel: sd 3:0:0:0: [sdd] Write cache: enabled,
> read cache: enabled, doesn't support DPO or FUA
>
> The system kept running with no other visible effects.
>
> It is running a 2.6.22-rc5 kernel with the hrt1 patch.
> The drive is part of a RAID5 array with 4 identical drives.
>
> lspci -vvv gives:
>
> 02:06.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA
> 300 TX4) (rev 02)

The 300 TX4 model is causing transient errors for several people,
and we don't yet know why. In your case, port_status 0x00001000
means that "host bus is busy more than 256 clock cycles for every
ATA I/O transfer" (quoting the docs). Basically it's a timeout.
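
To spot this condition in a log automatically, something like the following could be used. It is only a sketch: the one fact taken from the explanation above is that the value 0x00001000 in port_status indicates the host bus timeout. The constant name HOST_BUS_TIMEOUT and both function names are hypothetical, and any other bits are reported without interpretation.

```python
import re

# Bit value taken from the log line "(port_status 0x00001000)".
# The name is illustrative, not necessarily the driver's own.
HOST_BUS_TIMEOUT = 0x00001000


def parse_port_status(line):
    """Return the port_status value from a kernel log line, or None."""
    m = re.search(r"port_status (0x[0-9a-fA-F]+)", line)
    return int(m.group(1), 16) if m else None


def describe(status):
    """List what can be said about a port_status value."""
    notes = []
    if status & HOST_BUS_TIMEOUT:
        notes.append("host bus timeout")
    other = status & ~HOST_BUS_TIMEOUT
    if other:
        # Meaning of the remaining bits is not covered here.
        notes.append("other bits set: 0x%08x" % other)
    return notes


line = "Jun 24 19:06:40 gate2 kernel: ata4.00: (port_status 0x00001000)"
print(describe(parse_port_status(line)))
```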

As long as it's transient, don't worry.

You may want to try without hrt1 to see if hrt1 is causing the issue.

/Mikael