2001-07-27 19:50:39

by Kent Hunt

Subject: Longstanding sudden reboots with 2.4 smp kernels

Hi,

I have a problem that has persisted from the
2.4 test kernels through the latest 2.4.7.
The machine suddenly reboots once in a while when I
click some action button in the gnomeicu program. I am
ruling out hardware problems since the box is rock
solid except in the above mentioned situation. It is
frustrating since no messages are left in the kernel
logs when these reboots happen.
The box is an ASUS P2BD main board.
Video card Matrox G400.

I have run X from 3.3.6 to 4.0.3 and all versions have this
problem. Can anyone tell me whether userland software
like gnomeicu can trigger a reboot in the kernel? I
think it has to do either with the code X uses to handle
pointer events or with some networking problem, since it is
instant messenger software. But it is intriguing
that this only happens with gnomeicu. Does anyone have good
suggestions for solving this problem?

Kent

Please CC me; I would be happy to provide any further details.



2001-07-28 00:08:38

by Kurt Garloff

Subject: Re: Longstanding sudden reboots with 2.4 smp kernels

On Fri, Jul 27, 2001 at 12:50:31PM -0700, Kent Hunt wrote:
> I have a problem that continues to persist since the
> 2.4 test kernels until the latest 2.4.7.
> The machine suddenly reboots once in a while when I
> click some action button in the gnomeicu program. I am
> ruling out hardware problems since the box is rock
> solid except in the above mentioned situation. It is
> frustrating since no messages are left in the kernel
> logs when these reboots happen.
> The box is an ASUS P2BD main board.
> Video card Matrox G400.

Power supply?
You may also want to try to reduce AGP to 2x.

Just out of curiosity: Do you also see APIC errors in your syslog?

Regards,
--
Kurt Garloff <[email protected]> [Eindhoven, NL]
Physics: Plasma simulations <[email protected]> [TU Eindhoven, NL]
Linux: SCSI, Security <[email protected]> [SuSE Nuernberg, DE]
(See mail header or public key servers for PGP2 and GPG public keys.)



2001-07-31 14:32:04

by Kent Hunt

Subject: Re: Longstanding sudden reboots with 2.4 smp kernels

Mark,

--- Mark Hahn <[email protected]>
wrote:
> > click some action button in the gnomeicu program. I am
> > ruling out hardware problems since the box is rock
> > solid except in the above mentioned situation. It is
>
> that is not a valid argument.

Okay. I also don't take it as strong evidence that
it must be a hardware problem; however, if it were a
hardware problem, I would expect it to happen
in other similar situations as well (using GUIs in
X). Now I have more correlated data. I tried a
different instant messaging program, licq, and it
does the same. Perhaps the problem is the combination
of a GUI in X with networking?

I had other problems in the past that many people
assured me were hardware problems. For example,
under high network traffic on ppp0 the box
rebooted. There was one special condition, though:
both CPUs had to be busy with a demanding process. The
upgrade from kernel 2.2.16 to 2.4 solved the problem
completely. You may argue that this is because 2.4
might have introduced workarounds for buggy
hardware. Perhaps; I don't know exactly what has
changed in the kernel's ppp code. As a side note,
I believe this problem has been solved in the later
2.2 kernels.

For a long time I ran a bunch of hardware tests to
track down the above problem: CPU burn, memory burn,
anything burn. I watched CPU and case temperature (it was
summer and very hot), and all tests came back negative.
This is why I also have some trust in the hardware. I
bought only good parts and I'm doing no overclocking
or other hardware tweaks that could make the box unstable.

I think each of the components (X, GUI and network)
works properly on its own: I never have crashes with X,
all GUIs work, and I do a lot of heavy networking. The
exception, of course, is the IM programs.

> > frustrating since no messages are left in the kernel
> > logs when these reboots happen.
>
> there are some fault scenarios that would cause instant
> reboots (maybe triple faults), but of course that implies
> a bug making it possible (in this case, probably the
> X code).

I'm not ruling out an X problem (I would be thankful
if it turned out to be a software problem that someone
is able to solve).

> equally possible is a hardware fault - being triggered by
> some obscure sequence doesn't mean it's not hardware.

I was hoping someone could point out a clever way
of isolating and detecting the problem.
The trouble is that the machine simply reboots, with no
oops or logs. X doesn't leave any logs either. I have tried
disabling the hardware pointer in X, but that doesn't
help either.
Perhaps I should try recompiling gnomeicu so that it only
handles events and does no networking.

Kent

