2001-03-27 05:12:59

by Steven Walter

Subject: Strange lockups on 2.4.2

This has happened twice now, though I don't believe it's completely
reproducible. What happens is an Oops, which drops me into kdb. I've
been in X both times, however, which makes kdb rather useless. I
blindly type "go", and interrupts get re-enabled, at least (I know
because my mp3 stops looping and begins playing again). This almost
certainly means at least part of userspace survives. Probably only X
dies, since VT switching and numlock toggling don't work. I can
Ctrl+SysRq S-U-B, though.

The thing I find most interesting about this is that only 4 lines of the
oops get into the log. 4 lines, both times. This time, those lines
were:

printing eip:
c0112e1f
Oops: 0002
CPU: 0

This corresponds to schedule according to System.map (that's the nearest
symbol without going over). The previous time, I believe it was
path_walk. If anyone's got an idea, it'd be helpful. Btw, this machine
consistently passes memtest; most recently it ran 2 passes of all tests
with no errors found.
--
-Steven
Freedom is the freedom to say that two plus two equals four.
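
The "nearest symbol without going over" lookup described above can be
sketched in a few lines. This is only an illustration, assuming the usual
System.map layout of "address type name" per line; the sample addresses
below are invented for the example and are not taken from the poster's
kernel, and the helper names are hypothetical.

```python
import bisect

# Invented sample entries in System.map format: "<hex address> <type> <name>".
# These addresses are placeholders, NOT values from a real 2.4.2 build.
SAMPLE_MAP = """\
c0112a10 T schedule_timeout
c0112b50 T schedule
c0113200 T __wake_up
"""


def load_system_map(lines):
    """Parse System.map-style lines into a list of (address, name), sorted by address."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def nearest_symbol(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, or None."""
    addresses = [a for a, _ in symbols]
    i = bisect.bisect_right(addresses, addr) - 1
    if i < 0:
        return None  # addr lies below every known symbol
    base, name = symbols[i]
    return name, addr - base


if __name__ == "__main__":
    table = load_system_map(SAMPLE_MAP.splitlines())
    hit = nearest_symbol(table, 0xC0112E1F)  # the EIP from the oops
    if hit is not None:
        print("%s+0x%x" % hit)
```

To run it against a real kernel, pass the lines of that kernel's own
System.map to load_system_map() instead of SAMPLE_MAP. The result is only
meaningful if the map matches the running kernel exactly.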


2001-03-27 08:06:14

by Keith Owens

Subject: Re: Strange lockups on 2.4.2

On Mon, 26 Mar 2001 23:16:27 -0600,
Steven Walter <[email protected]> wrote:
>This has happened twice now, though I don't believe it's completely
>reproducible. What happens is an Oops, which drops me into kdb. I've
>been in X both times, however, which makes kdb rather useless.

Documentation/serial-console.txt

>The thing I find most interesting about this is that only 4 lines of the
>oops get into the log. 4 lines, both times. This time, those lines
>were:
>
> printing eip:
>c0112e1f
>Oops: 0002
>CPU: 0

That is a symptom of a broken klogd. Always run klogd with the -x
switch. If that does not work, take a look at

ftp://ftp.<country>.kernel.org/pub/linux/utils/kernel/ksymoops/v2.4/patch-sysklogd-1-3-31-ksymoops-1.gz

One day the sysklogd maintainers might just fix this bug; that bug fix
is almost 2 years old.

2001-03-27 08:16:54

by Steven Walter

Subject: Re: Strange lockups on 2.4.2

On Tue, Mar 27, 2001 at 06:05:05PM +1000, Keith Owens wrote:
> On Mon, 26 Mar 2001 23:16:27 -0600,
> Steven Walter <[email protected]> wrote:
> >This has happened twice now, though I don't believe it's completely
> >reproducible. What happens is an Oops, which drops me into kdb. I've
> >been in X both times, however, which makes kdb rather useless.
>
> Documentation/serial-console.txt
>

Unfortunately, I don't have the money to go and buy a dumb terminal, and
the nearest other computer is ~30 feet away. I've actually looked into
writing code that allows the kernel to return to VGA text mode for this
reason.

> >The thing I find most interesting about this is that only 4 lines of the
> >oops get into the log. 4 lines, both times. This time, those lines
> >were:
> >
> > printing eip:
> >c0112e1f
> >Oops: 0002
> >CPU: 0
>
> That is a symptom of a broken klogd. Always run klogd with the -x
> switch. If that does not work, take a look at
>
> ftp://ftp.<country>.kernel.org/pub/linux/utils/kernel/ksymoops/v2.4/patch-sysklogd-1-3-31-ksymoops-1.gz
>
> One day the sysklogd maintainers might just fix this bug; that bug fix
> is almost 2 years old.

I actually already run klogd with -x due to earlier threads on lkml, so
it can't be that /particular/ problem, but klogd/syslogd may still be to
blame. I'm usually lucky to get anything related to the crash in my log
between "--MARK--" and "klogd restart".

--
-Steven
Freedom is the freedom to say that two plus two equals four.