2001-10-05 18:42:55

by Igor Mozetic

Subject: 2.4.10-ac4 (SMP, highmem) complete freeze

The same story as with 2.4.10, only faster:

After one day of uptime under load 2-3 (highmem),
the box froze completely. Only hard reboot (actually power unplug)
brought it back. Nothing in logs, nothing over netconsole-C2 ...

Hardware:
dual Xeon 550 MHz, C440GX+, 2GB RAM, 1GB swap, SCSI AIC-7896/7

-Igor Mozetic


2001-10-05 19:08:30

by Marcelo Tosatti

Subject: Re: 2.4.10-ac4 (SMP, highmem) complete freeze



On Fri, 5 Oct 2001, Igor Mozetic wrote:

> The same story as with 2.4.10, only faster:
>
> After one day of uptime under load 2-3 (highmem),
> the box froze completely. Only hard reboot (actually power unplug)
> brought it back. Nothing in logs, nothing over netconsole-C2 ...

Can you try to get any backtraces the next time the machine locks up?

You can use the SysRq keys for that (documented in
Documentation/sysrq.txt): Alt+SysRq+T and Alt+SysRq+P give traces.

Thanks

2001-10-05 19:24:31

by Igor Mozetic

Subject: Re: 2.4.10-ac4 (SMP, highmem) complete freeze

Marcelo Tosatti writes:

> Can you try to get any backtraces the next time the machine locks up?
>
> You can use the SysRq keys for that (documented in
> Documentation/sysrq.txt): Alt+SysRq+T and Alt+SysRq+P give traces.

Well, at the time of this lockup I was in almost on-line contact
with Ingo and we couldn't get anything at all! Screen was dead blank,
no keystroke worked, NumLock didn't work, even Power Off didn't work.
Also, nothing over netconsole. If you have any other suggestion,
please ...

I'm now back to 2.4.3 (which worked reliably for months)
just to make sure that the hardware is OK.
But I strongly suspect the kernel, because these deadlocks
coincide with 2.4.10 and I don't believe in coincidences.

-Igor Mozetic

2001-10-07 18:26:01

by George Anzinger

Subject: Re: 2.4.10-ac4 (SMP, highmem) complete freeze

Igor Mozetic wrote:
>
> Marcelo Tosatti writes:
>
> > Can you try to get any backtraces the next time the machine locks up?
> >
> > You can use the SysRq keys for that (documented in
> > Documentation/sysrq.txt): Alt+SysRq+T and Alt+SysRq+P give traces.
>
> Well, at the time of this lockup I was in almost on-line contact
> with Ingo and we couldn't get anything at all! Screen was dead blank,
> no keystroke worked, NumLock didn't work, even Power Off didn't work.
> Also, nothing over netconsole. If you have any other suggestion,
> please ...

Did you turn on the NMI watchdog? See
.../linux/Documentation/nmi_watchdog.txt in your kernel tree.
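
One way to confirm the watchdog is actually running, rather than just
configured, is to watch the NMI line in /proc/interrupts. The sketch below
is only an illustration: the helper names (nmi_counts, watchdog_ticking,
check) are made up here, and it assumes an IO-APIC style watchdog where the
per-CPU NMI counts rise steadily. On setups where NMIs only fire while a
CPU is busy, an unchanged count does not prove the watchdog is off.

```python
import time


def nmi_counts(interrupts_text):
    """Return per-CPU NMI counts parsed from /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            # Keep only numeric columns; some kernels append a text label.
            return [int(f) for f in fields[1:] if f.isdigit()]
    return []  # no NMI line found


def watchdog_ticking(before, after):
    """True if every CPU's NMI count increased between two snapshots."""
    return (
        bool(before)
        and len(before) == len(after)
        and all(a > b for b, a in zip(before, after))
    )


def check(path="/proc/interrupts", interval=10):
    """Take two snapshots `interval` seconds apart and compare them."""
    with open(path) as f:
        first = nmi_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = nmi_counts(f.read())
    return first, second, watchdog_ticking(first, second)
```

Calling check() on the running box returns both snapshots plus a boolean.
If the counts stay at zero on all CPUs, the watchdog is most likely not
active, whatever the boot parameters say.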

George
>
> I'm now back to 2.4.3 (which worked reliably for months)
> just to make sure that the hardware is OK.
> But I strongly suspect the kernel, because these deadlocks
> coincide with 2.4.10 and I don't believe in coincidences.
>
> -Igor Mozetic

2001-10-07 19:30:51

by Igor Mozetic

Subject: Re: 2.4.10-ac4 (SMP, highmem) complete freeze

George Anzinger writes:
>
> > Well, at the time of this lockup I was in almost on-line contact
> > with Ingo and we couldn't get anything at all! Screen was dead blank,
> > no keystroke worked, NumLock didn't work, even Power Off didn't work.
> > Also, nothing over netconsole. If you have any other suggestion,
> > please ...
>
> Did you turn on the NMI watchdog? See
> .../linux/Documentation/nmi_watchdog.txt in your kernel tree.

Yes.

-Igor