2003-11-17 02:02:23

by wwp

Subject: possible IDE/ext3 fs corruption while playing w/ ACPI and/or 2.4.22?

Hi folks,


I wonder if playing w/ vanilla 2.4.22 or 2.4.22+ACPI patches can lead to
IDE/ext3 fs corruption..

I'm using SuSE 8.1 on my Dell Inspiron 8200, default SuSE 2.4.19-4GB
kernel (acpi 20020829). I've compiled 2.4.22 vanilla and 2.4.22 + latest
ACPI code + latest ieee1394 (gcc 3.2), and got IDE faults that led to ext3
corruption when (re)booting into those 2.4.22 kernels, in different ACPI modes
(w/ or w/o batteries, power supply).

Here is what /var/log/messages shows:
kernel: hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
kernel: hda: drive not ready for command
This error usually occurs right when booting a 2.4.22 kernel (vanilla or
w/ ACPI patches), and I sometimes end up with broken files and directories.
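For reference, the status byte in that message can be decoded by hand. The sketch below assumes the standard ATA status register layout and uses the bit names the 2.4 IDE driver prints in its status dump; the table and the function name decode_status are mine, for illustration only.

```python
# Assumed mapping: standard ATA status register bits, labelled with the
# names the 2.4 IDE driver uses when it dumps the status byte.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    # 0x58 = 0x40 + 0x10 + 0x08
    print(decode_status(0x58))
```

For 0x58 this gives DriveReady, SeekComplete and DataRequest, which matches the names in the log line: the Error bit itself is not set, the drive just still had DataRequest raised when the driver wanted to issue a command.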

I usually use SuSE's 2.4.19 kernel, and never have IDE/ext3 problems. Over the
past 3 months, I tried twice to compile and use 2.4.22, and each time I got fs
corruption.

So I double-checked my drive for bad sectors and also ran a 4-hour memory
check: nothing bad was found. I've also upgraded to modutils 2.4.26.

That's why I'm trying to figure out what element (kernel, acpi, ieee1394,
SuSE stuff, Dell hardware/BIOS) is leading to such IDE errors and to fs
data loss.

Any thoughts about such problems, or about how to investigate? Has anyone
experienced such problems w/ ACPI, Dell hardware or a fresh 2.4.22 kernel?


Regards,

--
wwp


2003-11-17 02:20:43

by wwp

Subject: Re: possible IDE/ext3 fs corruption while playing w/ ACPI and/or 2.4.22?

Hi folks,


more info:

I wanted to see whether switching from one kernel to another leads to
such errors. I ran the command below over all the /var/log/messages
files created since I started using this laptop:
grep -E "symbols from /boot/System.map|kernel: hda: " messages*
The output is bzip2'ed and attached.
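To make that output easier to read, here is a rough sketch that counts hda status errors per boot. It assumes the grep output is fed in chronological order, that each boot is marked by a "symbols from /boot/System.map" line, and that whatever follows "System.map" on that line is the kernel version suffix. The function name count_hda_errors_per_boot is hypothetical.

```python
import re

# Assumption: the text right after "System.map" is the kernel version suffix.
BOOT_RE = re.compile(r"symbols from /boot/System\.map(\S*)")
HDA_ERROR_MARK = "kernel: hda: status error"


def count_hda_errors_per_boot(lines):
    """Return a list of (kernel_label, error_count), one entry per boot."""
    boots = []
    for line in lines:
        match = BOOT_RE.search(line)
        if match:
            # Strip the leading "-" and a trailing "." from the suffix.
            label = match.group(1).lstrip("-").rstrip(".") or "unknown"
            boots.append([label, 0])
        elif HDA_ERROR_MARK in line and boots:
            # Attribute the error to the most recent boot seen so far.
            boots[-1][1] += 1
    return [tuple(boot) for boot in boots]


if __name__ == "__main__":
    import sys
    for label, count in count_hda_errors_per_boot(sys.stdin):
        print(label, count)
```

It reads the grep output on stdin and prints one line per boot, so boots with a non-zero count can be compared against the kernel that was running.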

OK, this shows nothing but kernel boots and hda errors, nothing about ACPI
actions or anything else. Nearly all the hda errors happen during the
startup process, almost always at the same point.
Maybe it's not relevant, and I don't know if there's any information to get
from such an observation. Does this help, or should I provide more info?


Regards,

--
wwp


Attachments:
hda-errors.log.bz2 (5.45 kB)