2006-01-30 09:08:13

by Jan Beulich

[permalink] [raw]
Subject: [PATCH] prevent nested panic from soft lockup detection

From: Jan Beulich <[email protected]>

Suppress triggering a nested panic due to soft lockup detection.

Signed-Off-By: Jan Beulich <[email protected]>



Attachments:
(No filename) (158.00 B)
linux-2.6.16-rc1-panic-softlockup.patch (900.00 B)
Download all attachments

2006-01-30 14:59:18

by Dave Jones

[permalink] [raw]
Subject: Re: [PATCH] prevent nested panic from soft lockup detection

On Mon, Jan 30, 2006 at 10:08:33AM +0100, Jan Beulich wrote:

> From: Jan Beulich <[email protected]>
>
> Suppress triggering a nested panic due to soft lockup detection.
>
> Signed-Off-By: Jan Beulich <[email protected]>
>
> diff -Npru /home/jbeulich/tmp/linux-2.6.16-rc1/kernel/panic.c 2.6.16-rc1-panic-softlockup/kernel/panic.c
> --- /home/jbeulich/tmp/linux-2.6.16-rc1/kernel/panic.c 2006-01-27 15:10:49.000000000 +0100
> +++ 2.6.16-rc1-panic-softlockup/kernel/panic.c 2006-01-25 09:55:53.000000000 +0100
> @@ -107,6 +107,7 @@ NORET_TYPE void panic(const char * fmt,
> printk(KERN_EMERG "Rebooting in %d seconds..",panic_timeout);
> for (i = 0; i < panic_timeout*1000; ) {
> touch_nmi_watchdog();
> + touch_softlockup_watchdog();
> i += panic_blink(i);
> mdelay(1);
> i++;
> @@ -130,6 +131,7 @@ NORET_TYPE void panic(const char * fmt,
> #endif
> local_irq_enable();
> for (i = 0;;) {
> + touch_softlockup_watchdog();
> i += panic_blink(i);
> mdelay(1);
> i++;

I've been wondering for a while why we don't just make touch_nmi_watchdog
do an implicit call to touch_softlockup_watchdog. I can't think of a situation
where we'd want to do one but not the other, and adding patches like this
seems to be an uphill battle (I know at least two other places that need
this off the top of my head).

Dave

2006-01-31 12:35:22

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] prevent nested panic from soft lockup detection

Dave Jones <[email protected]> writes:

> On Mon, Jan 30, 2006 at 10:08:33AM +0100, Jan Beulich wrote:
>
> > From: Jan Beulich <[email protected]>
> >
> > Suppress triggering a nested panic due to soft lockup detection.
> >
> > Signed-Off-By: Jan Beulich <[email protected]>
> >
> > diff -Npru /home/jbeulich/tmp/linux-2.6.16-rc1/kernel/panic.c 2.6.16-rc1-panic-softlockup/kernel/panic.c
> > --- /home/jbeulich/tmp/linux-2.6.16-rc1/kernel/panic.c 2006-01-27 15:10:49.000000000 +0100
> > +++ 2.6.16-rc1-panic-softlockup/kernel/panic.c 2006-01-25 09:55:53.000000000 +0100
> > @@ -107,6 +107,7 @@ NORET_TYPE void panic(const char * fmt,
> > printk(KERN_EMERG "Rebooting in %d seconds..",panic_timeout);
> > for (i = 0; i < panic_timeout*1000; ) {
> > touch_nmi_watchdog();
> > + touch_softlockup_watchdog();
> > i += panic_blink(i);
> > mdelay(1);
> > i++;
> > @@ -130,6 +131,7 @@ NORET_TYPE void panic(const char * fmt,
> > #endif
> > local_irq_enable();
> > for (i = 0;;) {
> > + touch_softlockup_watchdog();
> > i += panic_blink(i);
> > mdelay(1);
> > i++;
>
> I've been wondering for a while why we don't just make touch_nmi_watchdog
> do an implicit call to touch_softlockup_watchdog. I can't think of a situation
> where we'd want to do one but not the other, and adding patches like this
> seems to be an uphill battle (I know at least two other places that need
> this off the top of my head).

Very good idea.

Someone did it already in the SUSE kernel and it helped considerably
there.

-Andi

2006-01-31 13:46:44

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH] prevent nested panic from soft lockup detection

>> I've been wondering for a while why we don't just make touch_nmi_watchdog
>> do an implicit call to touch_softlockup_watchdog. I can't think of a situation
>> where we'd want to do one but not the other, and adding patches like this
>> seems to be an uphill battle (I know at least two other places that need
>> this off the top of my head).
>
>Very good idea.
>
>Someone did it already in the SUSE kernel and it helped considerably
>there.

Actually, plain 2.6.15 already has this (for i386 and x86-64 at least). Hence the first of the two hunks the patch
consists of is superfluous. The second hunk, however, is still necessary (as there's no pre-existing
touch_nmi_watchdog() call there, and there also shouldn't be one as interrupts get re-enabled before getting there).

Jan