My laptop hard drive recently died (or is in the process of dying).
HP wanted me to do some more tests before sending me a replacement, so
I tried booting Linux again today. I got lots of DMA errors, which
was really to be expected, but then I got a kernel panic. While I'd
not blame the kernel when a panic occurs with broken RAM or CPU, I'm
not sure the kernel should panic just because of a broken IDE drive.
I posted a picture of the panic at http://cyrius.com/tmp/ide_panic.jpg
Is this something that can be fixed or is my hardware really so broken
that the kernel cannot deal with it?
--
Martin Michlmayr
http://www.cyrius.com/
Martin Michlmayr wrote:
> My laptop hard drive recently died (or is in the process of dying).
> HP wanted me to do some more tests before sending me a replacement, so
> I tried booting Linux again today. I got lots of DMA errors, which
> was really to be expected, but then I got a kernel panic. While I'd
> not blame the kernel when a panic occurs with broken RAM or CPU, I'm
> not sure the kernel should panic just because of a broken IDE drive.
>
> I posted a picture of the panic at http://cyrius.com/tmp/ide_panic.jpg
> Is this something that can be fixed or is my hardware really so broken
> that the kernel cannot deal with it?
This is probably a genuine bug. These kinds of reports have come up a
few times recently, as I recall. It seems some of the error handling in
the drivers/ide code isn't very robust.
--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/
* Robert Hancock <[email protected]> [2006-03-08 22:03]:
> Probably is a genuine bug. These kinds of reports have come up a few
> times recently as I recall - it seems some of the error handling in
> the drivers/ide code isn't quite so robust..
Was the traceback I posted enough so someone can find out what's going
on or do you need more information? I can hook up a serial console
and try to capture the full log, but I'm not sure I can reproduce this
kernel panic. The dying hard drive is quite arbitrary when it comes
to showing errors or working fine...
--
Martin Michlmayr
http://www.cyrius.com/
On Thu, 2006-03-09 at 15:14 +0000, Martin Michlmayr wrote:
> on or do you need more information? I can hook up a serial console
> and try to capture the full log, but I'm not sure I can reproduce this
> kernel panic. The dying hard drive is quite arbitrary when it comes
> to showing errors or working fine...
Ancient known problem. I'd be interested if you can however break libata
and the PATA IDE patches the same way.
* Alan Cox <[email protected]> [2006-03-09 16:45]:
> Ancient known problem. I'd be interested if you can however break
> libata and the PATA IDE patches the same way.
I can try, but like I said, the hard drive acts pretty arbitrarily and
won't always fail when I want it to. Do you know if there's a way to
trigger the problem? Otherwise I'll just try a couple of times,
but without a good way to trigger the problem I cannot really say if
it's gone with libata.
--
Martin Michlmayr
http://www.cyrius.com/
On Thu, 2006-03-09 at 16:53 +0000, Martin Michlmayr wrote:
> * Alan Cox <[email protected]> [2006-03-09 16:45]:
> > Ancient known problem. I'd be interested if you can however break
> > libata and the PATA IDE patches the same way.
>
> I can try, but like I said, the hard drive acts pretty arbitrarily and
> won't always fail when I want it to. Do you know if there's a way to
> trigger the problem? Otherwise I'll just try a couple of times,
> but without a good way to trigger the problem I cannot really say if
> it's gone with libata.
You could try heavy I/O (find / -print type stuff), or, if it's
specific problem blocks, cp /dev/hda /dev/null (use /dev/sda instead of
/dev/hda with libata). Libata should either error correctly or recover
cleanly from the problems.
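
If it helps to know which blocks are the bad ones, the cp run can be
approximated by a small script that reads the device in chunks and
notes the offsets where a read fails. This is only a rough sketch: the
function name scan_device and the 1 MiB chunk size are made up for
illustration, reads go through the page cache (so a second run may not
touch the disk at all), and nothing here is guaranteed to reproduce the
panic.

```python
import os


def scan_device(path, chunk_size=1024 * 1024):
    """Read `path` front to back; return (bytes_read, failed_offsets).

    Hypothetical helper, not part of any kernel tooling. A chunk whose
    read raises OSError (e.g. EIO from a bad sector) is recorded by its
    starting offset and skipped, so the scan continues past it.
    """
    failed_offsets = []
    bytes_read = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end reports the size of regular files and
        # of block devices alike.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, want, offset)
            except OSError:
                failed_offsets.append(offset)
                offset += want
                continue
            if not data:  # unexpected end of device/file
                break
            bytes_read += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return bytes_read, failed_offsets
```

Run as root, scan_device("/dev/hda") (or "/dev/sda" with libata) would
return the number of bytes read and the list of chunk offsets that
errored, which could then be re-read on their own.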
* Alan Cox <[email protected]> [2006-03-09 16:45]:
> > The dying hard drive is quite arbitrary when it comes to showing
> > errors or working fine...
>
> Ancient known problem. I'd be interested if you can however break
> libata and the PATA IDE patches the same way.
Sorry, but I'm not able to give you more information. I tried again
several times with PATA and never saw the oops again, so I don't think
trying libata would help since not seeing an oops wouldn't mean
anything at all. Unless there is a _specific_ way to trigger this bug
("cause much disk IO" isn't enough because it only led to an oops once
out of maybe something like 30-40 tries) I cannot do anything.
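
To put a rough number on that: assuming each boot is independent and
oopses with a fixed probability of about 1 in 30-40 (both of these are
assumptions, the drive is clearly not that well behaved), the sketch
below estimates how many clean runs in a row would be needed before
"no oops" says anything. The helper name tries_needed is made up.

```python
import math


def tries_needed(p_fail, confidence=0.95):
    """Smallest n such that (1 - p_fail)**n <= 1 - confidence.

    That is, the number of consecutive clean runs after which a bug
    that fires with probability p_fail per run would have shown up
    at least once with the given confidence.
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_fail))


for odds in (30, 35, 40):
    print(f"1 in {odds}: {tries_needed(1 / odds)} clean runs")
```

Under those assumptions it comes out at around a hundred clean runs,
far more than a couple of test boots.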
--
Martin Michlmayr
http://www.cyrius.com/