2004-06-30 11:41:50

by Andre Costa

Subject: 2.4.26: IDE drives become unavailable randomly

(please cc me on any replies, because I am not subscribed to this list;
if I do need to subscribe, just let me know)

Hi,

I am using 2.4.26 SMP on an ABIT AT7 mobo, with a 2.8GHz P4 processor
with hyper-threading enabled. I have one 80GB Seagate IDE disk
as /dev/hda, and from time to time it seems to "disappear", usually
after these messages appear a couple of times in /var/log/messages:

Jun 27 17:15:00 dali kernel: hda: status timeout: status=0x80 { Busy }
Jun 27 17:15:00 dali kernel:
Jun 27 17:15:00 dali kernel: hda: drive not ready for command
Jun 27 17:15:03 dali kernel: ide0: reset: success

I already had some ide-related issues, namely the one mentioned here:

http://www.x86-64.org/lists/discuss/msg04679.html

Due to that, I am booting with:

hdc=ide-scsi apm=off acpi=ht noapic

Turning off APIC and keeping ACPI to a minimum seems to have fixed the
"dma status == 0x24" problem, but I still experience the "status
timeout" above, which is very frustrating because this is supposed to be
a server for our intranet.

I tried turning off APM for this disk with 'hdparm -B255 /dev/hda', but
it didn't work:

hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hda: drive_cmd: error=0x04 { DriveStatusError }
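
[Editorial aid, not part of the original message.] The hex values in these
kernel messages are bit fields, and the text in braces is just the names of
the bits that are set. A minimal sketch of that decoding follows; the tables
use the names the 2.4 IDE driver prints as far as I recall them, the error
table is deliberately partial, and the helper name decode() is made up for
this example. If I read the 'hdparm -I' listing correctly, it shows "Power
Management feature set" but no "Advanced Power Management" entry, which would
be consistent with the drive aborting the -B request (0x04), though that
interpretation is an assumption.

```python
# Status register bits, highest bit first (names as printed by the driver).
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Error register bits; partial table, only the entries I am sure of.
ERROR_BITS = [
    (0x40, "UncorrectableError"),
    (0x10, "SectorIdNotFound"),
    (0x04, "DriveStatusError"),   # the ATA "command aborted" bit
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


print(decode(0x80, STATUS_BITS))  # the "status timeout" message
print(decode(0x51, STATUS_BITS))  # the hdparm -B255 failure, status
print(decode(0x04, ERROR_BITS))   # the hdparm -B255 failure, error
```

The three calls reproduce the brace contents of the log lines quoted in this
message: Busy alone for the timeout, and a ready drive reporting an aborted
command for the hdparm attempt.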

I have turned off spindown with 'hdparm -S0 /dev/hda', but frankly I am
not sure this will help (besides being bad for harddisk lifetime).

So, given this scenario, I would really appreciate any suggestions on
how to work around this issue... Please let me know if you need
additional info. I am attaching below the output of 'hdparm -I /dev/hda'
in case it helps. I am running Fedora Core 1.

TIA

Andre

-------- output of 'hdparm -I /dev/hda' --------

/dev/hda:

ATA device, with non-removable media
Model Number: ST380011A
Serial Number: 3JV78385
Firmware Revision: 3.06
Standards:
Used: ATA/ATAPI-6 T13 1410D revision 2
Supported: 6 5 4 3
Configuration:
Logical         max     current
cylinders       16383   65535
heads           16      1
sectors/track   63      63
--
CHS current addressable sectors: 4128705
LBA user addressable sectors: 156301488
LBA48 user addressable sectors: 156301488
device size with M = 1024*1024: 76319 MBytes
device size with M = 1000*1000: 80026 MBytes (80 GB)
Capabilities:
LBA, IORDY(can be disabled)
bytes avail on r/w long: 4 Queue depth: 1
Standby timer values: spec'd by Standard
R/W multiple sector transfer: Max = 16 Current = 16
Recommended acoustic management value: 128, current value: 0
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=240ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* READ BUFFER cmd
* WRITE BUFFER cmd
* Host Protected Area feature set
* Look-ahead
* Write cache
* Power Management feature set
Security Mode feature set
* SMART feature set
* FLUSH CACHE EXT command
* Mandatory FLUSH CACHE command
* Device Configuration Overlay feature set
* 48-bit Address feature set
SET MAX security extension
* DOWNLOAD MICROCODE cmd
* SMART self-test
* SMART error logging
Security:
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
HW reset results:
CBLID- above Vih
Device num = 0 determined by the jumper
Checksum: correct
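
[Editorial aid, not part of the original message.] The size figures in the
listing above can be cross-checked with a little arithmetic. This is only a
sketch: the 512-byte sector size is an assumption (hdparm does not print it
here), and the variable names are made up for the example.

```python
SECTOR_BYTES = 512            # assumption: classic 512-byte sectors

lba_sectors = 156301488       # "LBA user addressable sectors"
size_bytes = lba_sectors * SECTOR_BYTES

# "device size with M = 1024*1024" and "M = 1000*1000", rounded down
size_mib = size_bytes // (1024 * 1024)
size_mb = size_bytes // (1000 * 1000)

# "CHS current addressable sectors" from the current geometry column
chs_current = 65535 * 1 * 63

print(size_mib, size_mb, chs_current)
```

Under that assumption the three values agree with the 76319 MBytes, 80026
MBytes and 4128705 sectors that hdparm reports, so the identify data at least
looks internally consistent.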


--
Andre Oliveira da Costa
([email protected])


2004-06-30 13:59:16

by tom st denis

Subject: Re: 2.4.26: IDE drives become unavailable randomly

--- Andre Costa <[email protected]> wrote:
> (please cc me on any replies, because I am not subscribed to this
> list;
> if I do need to subscribe, just let me know)
>
> Hi,
>
> I am using 2.4.26 SMP on an ABIT AT7 mobo, with a 2.8GHz P4 processor
> with hyper-threading enabled. I have one 80GB Seagate IDE disk
> as /dev/hda, and from time to time it seems to "disappear", usually
> after these messages appear a couple of times in /var/log/messages:

I get a similar problem on my Presario laptop. In my case "all of a
sudden" hda3 becomes write-only. Next time it happens I'll see if I
can capture a dmesg log or something. It only seems to happen when I
enable my wifi and do a lot of disk activity [but only once in a
while]. Could be that my wifi and IDE0 share an IRQ?

Of course I'm more apt to blame my laptop than Linux since the same
kernel [well diff build options but you know what I mean] works just
fine on my two desktops in the house...

Tom





2004-06-30 14:46:46

by Andre Costa

Subject: Re: 2.4.26: IDE drives become unavailable randomly

(please cc me on any replies, because I am not subscribed to this list)

Hi Tom,

On Wed, 30 Jun 2004 06:59:07 -0700 (PDT)
tom st denis <[email protected]> wrote:

> --- Andre Costa <[email protected]> wrote:
> > (please cc me on any replies, because I am not subscribed to this
> > list;
> > if I do need to subscribe, just let me know)
> >
> > Hi,
> >
> > I am using 2.4.26 SMP on an ABIT AT7 mobo, with a 2.8GHz P4 processor
> > with hyper-threading enabled. I have one 80GB Seagate IDE disk
> > as /dev/hda, and from time to time it seems to "disappear", usually
> > after these messages appear a couple of times in /var/log/messages:
>
> I get a similar problem on my Presario laptop. In my case "all of a
> sudden" hda3 becomes write-only. Next time it happens I'll see if I
> can capture a dmesg log or something. It only seems to happen when I
> enable my wifi and do a lot of disk activity [but only once in a
> while]. Could be that my wifi and IDE0 share an IRQ?

I can't say the situation is the same here -- actually, it seems to be
more related (in my case) to idle times: usually this happens when the
system is under light load (or under no load at all), like between
0:00am and 6:00am. This is why my primary suspect is APM (I could be
completely wrong, of course). Also, I don't have wifi.
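
[Editorial aid, not part of the original message.] One way to test the
idle-time hunch is to count the "status timeout" lines per hour of day. The
sketch below assumes the syslog line layout seen in the excerpt earlier in
this thread; the function name timeouts_by_hour() and the regular expression
are made up for the example.

```python
import re
from collections import Counter

# Matches e.g. "Jun 27 17:15:00 dali kernel: hda: status timeout: ..."
# and captures the hour field of the timestamp.
TIMEOUT_RE = re.compile(
    r"^\w{3}\s+\d+\s+(\d{2}):\d{2}:\d{2}\s+\S+\s+kernel:\s+hd\w+: status timeout"
)


def timeouts_by_hour(lines):
    """Count IDE status-timeout messages per hour of day (0-23)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# The four lines quoted in the first message of this thread.
sample = [
    "Jun 27 17:15:00 dali kernel: hda: status timeout: status=0x80 { Busy }",
    "Jun 27 17:15:00 dali kernel:",
    "Jun 27 17:15:00 dali kernel: hda: drive not ready for command",
    "Jun 27 17:15:03 dali kernel: ide0: reset: success",
]

for hour, count in sorted(timeouts_by_hour(sample).items()):
    print(f"{hour:02d}:00  {count}")

# To scan the real log instead (usually needs root):
#   with open("/var/log/messages") as log:
#       print(timeouts_by_hour(log))
```

If the counts pile up in the small hours over a few weeks of logs, that would
support the power-management theory; an even spread would argue against it.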

> Of course I'm more apt to blame my laptop than Linux since the same
> kernel [well diff build options but you know what I mean] works just
> fine on my two desktops in the house...

Yeah, I know what you mean: the same Linux distro has been running
flawlessly on other boxes around here for months (with different
hardware specs, though). Mine OTOH has a sad uptime record of 5 days...
=(

I agree Linux works (actually, it rocks =)), been using it for years
both at home and at work, but sometimes a specific hardware combination
(or buggy hardware/BIOS/firmware etc.) pushes it to the edge, reaching
some weak spots that need to be "hardened". Some hardware simply doesn't
work at all (I hope that's not my case...)

Best,

Andre

--
Andre Oliveira da Costa
([email protected])

2004-06-30 15:19:01

by Nick Warne

Subject: Re: 2.4.26: IDE drives become unavailable randomly

I was getting this problem, and advice from smartmontools people was
to clean out the box and reseat all cables etc. Seemed to work for
me on the box at work with this DMA timeout issue - BTW, always
happened at idle, like 2:15am in the middle of the night etc.

Reference:

http://sourceforge.net/mailarchive/message.php?msg_id=8660397

http://sourceforge.net/mailarchive/forum.php?thread_id=4908273&forum_id=12495

Nick

--
"When you're chewing on life's gristle,
Don't grumble, Give a whistle..."