2005-09-08 15:36:51

by Jan Beulich

Subject: [PATCH] fix i386 condition to call nmi_watchdog_tick

(Note: Patch also attached because the inline version is certain to get
line wrapped.)

Don't call nmi_watchdog_tick() when the NMI watchdog isn't active.

Signed-off-by: Jan Beulich <[email protected]>

diff -Npru 2.6.13/arch/i386/kernel/traps.c 2.6.13-i386-watchdog-active/arch/i386/kernel/traps.c
--- 2.6.13/arch/i386/kernel/traps.c 2005-08-29 01:41:01.000000000 +0200
+++ 2.6.13-i386-watchdog-active/arch/i386/kernel/traps.c 2005-09-01 14:04:35.000000000 +0200
@@ -611,7 +611,7 @@ static void default_do_nmi(struct pt_reg
 	 * Ok, so this is none of the documented NMI sources,
 	 * so it must be the NMI watchdog.
 	 */
-	if (nmi_watchdog) {
+	if (nmi_watchdog && nmi_active > 0) {
 		nmi_watchdog_tick(regs);
 		return;
 	}
diff -Npru 2.6.13/include/asm-i386/apic.h 2.6.13-i386-watchdog-active/include/asm-i386/apic.h
--- 2.6.13/include/asm-i386/apic.h 2005-08-29 01:41:01.000000000 +0200
+++ 2.6.13-i386-watchdog-active/include/asm-i386/apic.h 2005-09-01 11:32:11.000000000 +0200
@@ -125,6 +125,7 @@ extern void enable_APIC_timer(void);
 extern void enable_NMI_through_LVT0 (void * dummy);

 extern unsigned int nmi_watchdog;
+extern int nmi_active;
 #define NMI_NONE 0
 #define NMI_IO_APIC 1
 #define NMI_LOCAL_APIC 2
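
To make the effect of the one-line change in traps.c easier to see, here is a minimal user-space sketch of the condition before and after the patch. It is not kernel code: the enum and the two route_* helpers are hypothetical names invented for illustration, and the parameters merely stand in for the kernel's nmi_watchdog and nmi_active variables. It assumes only what the patch itself shows, namely that the watchdog path is taken when nmi_watchdog is non-zero and, after the patch, nmi_active is also positive.

```c
/* Illustrative model only; not kernel code. All names are hypothetical. */
enum nmi_route {
	ROUTE_WATCHDOG,		/* nmi_watchdog_tick() would be called */
	ROUTE_FALLTHROUGH	/* handler continues past the watchdog check */
};

/* Before the patch: any configured watchdog mode claims the NMI. */
static enum nmi_route route_before_patch(unsigned int nmi_watchdog,
					 int nmi_active)
{
	(void)nmi_active;	/* not consulted before the patch */
	return nmi_watchdog ? ROUTE_WATCHDOG : ROUTE_FALLTHROUGH;
}

/* After the patch: the watchdog must also be marked active (> 0). */
static enum nmi_route route_after_patch(unsigned int nmi_watchdog,
					int nmi_active)
{
	return (nmi_watchdog && nmi_active > 0) ? ROUTE_WATCHDOG
						: ROUTE_FALLTHROUGH;
}
```

The two helpers differ only when a watchdog mode is configured but nmi_active is zero or negative; in that case the patched condition lets the NMI fall through instead of treating it as a watchdog tick.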


Attachments:
(No filename) (1.20 kB)
linux-2.6.13-i386-watchdog-active.patch (1.20 kB)

2005-09-09 00:37:11

by Zwane Mwaikambo

Subject: Re: [PATCH] fix i386 condition to call nmi_watchdog_tick

On Thu, 8 Sep 2005, Jan Beulich wrote:

> diff -Npru 2.6.13/arch/i386/kernel/traps.c 2.6.13-i386-watchdog-active/arch/i386/kernel/traps.c
> --- 2.6.13/arch/i386/kernel/traps.c 2005-08-29 01:41:01.000000000 +0200
> +++ 2.6.13-i386-watchdog-active/arch/i386/kernel/traps.c 2005-09-01 14:04:35.000000000 +0200
> @@ -611,7 +611,7 @@ static void default_do_nmi(struct pt_reg
>  	 * Ok, so this is none of the documented NMI sources,
>  	 * so it must be the NMI watchdog.
>  	 */
> -	if (nmi_watchdog) {
> +	if (nmi_watchdog && nmi_active > 0) {
>  		nmi_watchdog_tick(regs);
>  		return;
>  	}

I dislike this patch, and it's not your fault. The reason is that there
are a few systems (I have one) which always report "CPU stuck" during
watchdog setup, but where the watchdog eventually starts ticking at
runtime. Unfortunately, if this gets in, you'll get lots of the
following:

Uhhuh. NMI received for unknown reason 00 on CPU 1.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
Uhhuh. NMI received for unknown reason 21 on CPU 0.

So, before the patch can go in, the "CPU stuck" systems probably need
looking at. Since I have one, I'll have a look.

Thanks,
Zwane

P.S. Why is the NMI watchdog perpetually broken?
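
The failure mode described in the reply can be sketched as a small simulation. This is a hypothetical user-space model, not kernel code: struct nmi_counts and deliver_ticks() are invented names, and the model assumes that on a "CPU stuck" system the setup check leaves nmi_active at zero or below while watchdog NMIs nevertheless keep arriving later. Under that assumption, every tick fails the patched condition and ends up on the path that prints the "unknown reason" messages quoted above.

```c
/* Illustrative model only; not kernel code. All names are hypothetical. */
struct nmi_counts {
	unsigned int handled;	/* claimed by the watchdog path */
	unsigned int unknown;	/* would be reported as "unknown reason" */
};

/*
 * Deliver `ticks` watchdog NMIs to a handler that uses the patched
 * condition. nmi_active is assumed to stay at whatever value the
 * setup-time check left it at.
 */
static struct nmi_counts deliver_ticks(unsigned int nmi_watchdog,
				       int nmi_active, unsigned int ticks)
{
	struct nmi_counts c = { 0, 0 };
	unsigned int i;

	for (i = 0; i < ticks; i++) {
		if (nmi_watchdog && nmi_active > 0)
			c.handled++;	/* nmi_watchdog_tick() path */
		else
			c.unknown++;	/* falls through to the error path */
	}
	return c;
}
```

With a watchdog mode configured and nmi_active left at 0, all delivered ticks are counted as unknown; with nmi_active at 1, all are handled. That is why the setup-time "CPU stuck" detection would need to be sorted out before the stricter condition is safe on such machines.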