2008-03-13 18:38:25

by Aristeu Rozanski

[permalink] [raw]
Subject: [RFC][PATCH] nmi watchdog: handle NMI_IO_APIC on nmi_watchdog

This patch includes code to handle NMI_IO_APIC enabling/disabling by
nmi_watchdog proc file.

Signed-off-by: Aristeu Rozanski <[email protected]>

---
arch/x86/kernel/nmi_32.c | 25 +++++++++++++++++++++++++
arch/x86/kernel/nmi_64.c | 25 +++++++++++++++++++++++++
2 files changed, 50 insertions(+)

--- nmi.orig/arch/x86/kernel/nmi_64.c 2008-03-12 17:02:36.000000000 -0400
+++ nmi/arch/x86/kernel/nmi_64.c 2008-03-12 17:47:35.000000000 -0400
@@ -274,6 +274,8 @@ void stop_apic_nmi_watchdog(void *unused
return;
if (nmi_watchdog == NMI_LOCAL_APIC)
lapic_watchdog_stop();
+ else
+ __acpi_nmi_disable(NULL);
__get_cpu_var(wd_enabled) = 0;
atomic_dec(&nmi_active);
}
@@ -405,6 +407,24 @@ void restart_nmi(void)

#ifdef CONFIG_SYSCTL

+static void enable_ioapic_nmi_watchdog_single(void *unused)
+{
+ __get_cpu_var(wd_enabled) = 1;
+ atomic_inc(&nmi_active);
+ __acpi_nmi_enable(NULL);
+}
+
+static void enable_ioapic_nmi_watchdog(void)
+{
+ on_each_cpu(enable_ioapic_nmi_watchdog_single, NULL, 0, 1);
+ touch_nmi_watchdog();
+}
+
+static void disable_ioapic_nmi_watchdog(void)
+{
+ on_each_cpu(stop_apic_nmi_watchdog, NULL, 0, 1);
+}
+
static int unknown_nmi_panic_callback(struct pt_regs *regs, int cpu)
{
unsigned char reason = get_nmi_reason();
@@ -442,6 +462,11 @@ int proc_nmi_enabled(struct ctl_table *t
enable_lapic_nmi_watchdog();
else
disable_lapic_nmi_watchdog();
+ } else if (nmi_watchdog == NMI_IO_APIC) {
+ if (nmi_watchdog_enabled)
+ enable_ioapic_nmi_watchdog();
+ else
+ disable_ioapic_nmi_watchdog();
} else {
printk( KERN_WARNING
"NMI watchdog doesn't know what hardware to touch\n");
--- nmi.orig/arch/x86/kernel/nmi_32.c 2008-03-12 15:16:30.000000000 -0400
+++ nmi/arch/x86/kernel/nmi_32.c 2008-03-12 17:50:55.000000000 -0400
@@ -270,6 +270,8 @@ void stop_apic_nmi_watchdog(void *unused
return;
if (nmi_watchdog == NMI_LOCAL_APIC)
lapic_watchdog_stop();
+ else
+ __acpi_nmi_disable(NULL);
__get_cpu_var(wd_enabled) = 0;
atomic_dec(&nmi_active);
}
@@ -390,6 +392,24 @@ __kprobes int nmi_watchdog_tick(struct p

#ifdef CONFIG_SYSCTL

+static void enable_ioapic_nmi_watchdog_single(void *unused)
+{
+ __get_cpu_var(wd_enabled) = 1;
+ atomic_inc(&nmi_active);
+ __acpi_nmi_enable(NULL);
+}
+
+static void enable_ioapic_nmi_watchdog(void)
+{
+ on_each_cpu(enable_ioapic_nmi_watchdog_single, NULL, 0, 1);
+ touch_nmi_watchdog();
+}
+
+static void disable_ioapic_nmi_watchdog(void)
+{
+ on_each_cpu(stop_apic_nmi_watchdog, NULL, 0, 1);
+}
+
static int unknown_nmi_panic_callback(struct pt_regs *regs, int cpu)
{
unsigned char reason = get_nmi_reason();
@@ -431,6 +451,11 @@ int proc_nmi_enabled(struct ctl_table *t
enable_lapic_nmi_watchdog();
else
disable_lapic_nmi_watchdog();
+ } else if (nmi_watchdog == NMI_IO_APIC) {
+ if (nmi_watchdog_enabled)
+ enable_ioapic_nmi_watchdog();
+ else
+ disable_ioapic_nmi_watchdog();
} else {
printk( KERN_WARNING
"NMI watchdog doesn't know what hardware to touch\n");


2008-03-21 11:48:18

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC][PATCH] nmi watchdog: handle NMI_IO_APIC on nmi_watchdog


* Aristeu Rozanski <[email protected]> wrote:

> This patch includes code to handle NMI_IO_APIC enabling/disabling by
> nmi_watchdog proc file.

hm:

> +static void disable_ioapic_nmi_watchdog(void)
> +{
> + on_each_cpu(stop_apic_nmi_watchdog, NULL, 0, 1);
> +}

stop_apic_nmi_watchdog() doesnt currently properly disable the
generation of NMIs when they come from an IO-APIC, so this will need
more fixes i believe. One approach would be to save the IO-APIC id and
pin when the watchdog is set up, and use it later on to poke that
IO-APIC register to disable NMI generation there.

Ingo

2008-03-26 15:24:41

by Aristeu Rozanski

[permalink] [raw]
Subject: Re: [RFC][PATCH] nmi watchdog: handle NMI_IO_APIC on nmi_watchdog

Hi Ingo,
> > This patch includes code to handle NMI_IO_APIC enabling/disabling by
> > nmi_watchdog proc file.
>
> hm:
>
> > +static void disable_ioapic_nmi_watchdog(void)
> > +{
> > + on_each_cpu(stop_apic_nmi_watchdog, NULL, 0, 1);
> > +}
>
> stop_apic_nmi_watchdog() doesnt currently properly disable the
> generation of NMIs when they come from an IO-APIC, so this will need
> more fixes i believe. One approach would be to save the IO-APIC id and
> pin when the watchdog is set up, and use it later on to poke that
> IO-APIC register to disable NMI generation there.
the patch I sent has this change:

@@ -270,6 +270,8 @@ void stop_apic_nmi_watchdog(void *unused
return;
if (nmi_watchdog == NMI_LOCAL_APIC)
lapic_watchdog_stop();
+ else
+ __acpi_nmi_disable(NULL);
__get_cpu_var(wd_enabled) = 0;
atomic_dec(&nmi_active);
}

and:
static void __acpi_nmi_disable(void *__unused)
{
apic_write(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED);
}

do you think this isn't enough?

--
Aristeu

2008-03-26 18:23:59

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC][PATCH] nmi watchdog: handle NMI_IO_APIC on nmi_watchdog


* Aristeu Rozanski <[email protected]> wrote:

> > stop_apic_nmi_watchdog() doesnt currently properly disable the
> > generation of NMIs when they come from an IO-APIC, so this will need
> > more fixes i believe. One approach would be to save the IO-APIC id and
> > pin when the watchdog is set up, and use it later on to poke that
> > IO-APIC register to disable NMI generation there.
> the patch I sent has this change:
>
> @@ -270,6 +270,8 @@ void stop_apic_nmi_watchdog(void *unused
> return;
> if (nmi_watchdog == NMI_LOCAL_APIC)
> lapic_watchdog_stop();
> + else
> + __acpi_nmi_disable(NULL);
> __get_cpu_var(wd_enabled) = 0;
> atomic_dec(&nmi_active);
> }
>
> and:
> static void __acpi_nmi_disable(void *__unused)
> {
> apic_write(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED);
> }
>
> do you think this isn't enough?

but this stops all NMIs, not just the IO-APIC generated ones, doesnt it?

Ingo

2008-03-26 18:44:46

by Aristeu Rozanski

[permalink] [raw]
Subject: Re: [RFC][PATCH] nmi watchdog: handle NMI_IO_APIC on nmi_watchdog

> > > stop_apic_nmi_watchdog() doesnt currently properly disable the
> > > generation of NMIs when they come from an IO-APIC, so this will need
> > > more fixes i believe. One approach would be to save the IO-APIC id and
> > > pin when the watchdog is set up, and use it later on to poke that
> > > IO-APIC register to disable NMI generation there.
> > the patch I sent has this change:
> >
> > @@ -270,6 +270,8 @@ void stop_apic_nmi_watchdog(void *unused
> > return;
> > if (nmi_watchdog == NMI_LOCAL_APIC)
> > lapic_watchdog_stop();
> > + else
> > + __acpi_nmi_disable(NULL);
> > __get_cpu_var(wd_enabled) = 0;
> > atomic_dec(&nmi_active);
> > }
> >
> > and:
> > static void __acpi_nmi_disable(void *__unused)
> > {
> > apic_write(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED);
> > }
> >
> > do you think this isn't enough?
>
> but this stops all NMIs, not just the IO-APIC generated ones, doesnt it?
This is the reverse of:

static void __init setup_nmi(void)
{
/*
* Dirty trick to enable the NMI watchdog ...
* We put the 8259A master into AEOI mode and
* unmask on all local APICs LVT0 as NMI.
*
* The idea to use the 8259A in AEOI mode ('8259A Virtual Wire')
* is from Maciej W. Rozycki - so we do not have to EOI from
* the NMI handler or the timer interrupt.
*/
apic_printk(APIC_VERBOSE, KERN_INFO "activating NMI Watchdog ...");

enable_NMI_through_LVT0();

apic_printk(APIC_VERBOSE, " done.\n");
}
where:
void __cpuinit enable_NMI_through_LVT0(void)
{
unsigned int v;

/* unmask and set to NMI */
v = APIC_DM_NMI;
apic_write(APIC_LVT0, v);
}

I must admit I don't really understand how the NMI watchdog thru IOAPIC
works. Couldn't find proper documentation on that. I just tried to
revert what's done when it's enabled and everything worked as expected,
including a customer who is using those development boxes with a NMI
button. First the NMI watchdog was disabled using a patch much like the
one I sent then NMI was generated by pushing the button and the box
crashed as it should (the old unknown_nmi_panic sysctl was set). So,
unless the NMI button does something different to deliver the NMI, other
NMIs should keep working.

--
Aristeu

2008-10-22 22:22:39

by Aristeu Rozanski

[permalink] [raw]
Subject: Re: [RFC][PATCH] nmi watchdog: handle NMI_IO_APIC on nmi_watchdog

> > > stop_apic_nmi_watchdog() doesnt currently properly disable the
> > > generation of NMIs when they come from an IO-APIC, so this will need
> > > more fixes i believe. One approach would be to save the IO-APIC id and
> > > pin when the watchdog is set up, and use it later on to poke that
> > > IO-APIC register to disable NMI generation there.
> > the patch I sent has this change:
> >
> > @@ -270,6 +270,8 @@ void stop_apic_nmi_watchdog(void *unused
> > return;
> > if (nmi_watchdog == NMI_LOCAL_APIC)
> > lapic_watchdog_stop();
> > + else
> > + __acpi_nmi_disable(NULL);
> > __get_cpu_var(wd_enabled) = 0;
> > atomic_dec(&nmi_active);
> > }
> >
> > and:
> > static void __acpi_nmi_disable(void *__unused)
> > {
> > apic_write(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED);
> > }
> >
> > do you think this isn't enough?
>
> but this stops all NMIs, not just the IO-APIC generated ones, doesnt it?
external NMIs would come by LINT1, no?
after reading a lot and talking with Vivek, what I understood was that 8259 is
also connected to LINT0, so timer interrupts would be reported both in
IOAPIC and on LINT0. LVT0 is being configured on setup_nmi() to generate
an NMI. Also, the same timer interrupt would generate an regular interrupt
coming from IOAPIC. Clearing the APIC_DM_NMI on LVT0 would disable the
NMI delivery, but not the regular interrupt.
or am I missing something here?

--
Aristeu

2008-10-22 22:45:52

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [RFC][PATCH] nmi watchdog: handle NMI_IO_APIC on nmi_watchdog

On Wed, 22 Oct 2008, Aristeu Rozanski wrote:

> > > static void __acpi_nmi_disable(void *__unused)
> > > {
> > > apic_write(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED);
> > > }
> > >
> > > do you think this isn't enough?
> >
> > but this stops all NMIs, not just the IO-APIC generated ones, doesnt it?
> external NMIs would come by LINT1, no?

Correct.

> after reading a lot and talking with Vivek, what I understood was that 8259 is
> also connected to LINT0, so timer interrupts would be reported both in
> IOAPIC and on LINT0. LVT0 is being configured on setup_nmi() to generate
> an NMI. Also, the same timer interrupt would generate an regular interrupt
> coming from IOAPIC. Clearing the APIC_DM_NMI on LVT0 would disable the
> NMI delivery, but not the regular interrupt.
> or am I missing something here?

You are correct. The routing of the 8254 timer interrupt may vary, but
the NMI watchdog it drives, if supported, always uses the LINT0 input of
all the local APICs.

You seem to be writing in reply to a very old thread -- please make sure
your concerns are still valid with current code. For example we have an
implementation of __acpi_nmi_disable() now, which is exactly your proposed
one.

Maciej

2008-10-23 03:36:06

by Aristeu Rozanski

[permalink] [raw]
Subject: Re: [RFC][PATCH] nmi watchdog: handle NMI_IO_APIC on nmi_watchdog

> > > but this stops all NMIs, not just the IO-APIC generated ones, doesnt it?
> > external NMIs would come by LINT1, no?
>
> Correct.
>
> > after reading a lot and talking with Vivek, what I understood was that 8259 is
> > also connected to LINT0, so timer interrupts would be reported both in
> > IOAPIC and on LINT0. LVT0 is being configured on setup_nmi() to generate
> > an NMI. Also, the same timer interrupt would generate an regular interrupt
> > coming from IOAPIC. Clearing the APIC_DM_NMI on LVT0 would disable the
> > NMI delivery, but not the regular interrupt.
> > or am I missing something here?
>
> You are correct. The routing of the 8254 timer interrupt may vary, but
> the NMI watchdog it drives, if supported, always uses the LINT0 input of
> all the local APICs.
thanks

> You seem to be writing in reply to a very old thread -- please make sure
> your concerns are still valid with current code. For example we have an
> implementation of __acpi_nmi_disable() now, which is exactly your proposed
> one.
didn't noticed that, sorry.

--
Aristeu

2008-10-23 21:59:48

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [RFC][PATCH] nmi watchdog: handle NMI_IO_APIC on nmi_watchdog

On Wed, 22 Oct 2008, Aristeu Rozanski wrote:

> > You seem to be writing in reply to a very old thread -- please make sure
> > your concerns are still valid with current code. For example we have an
> > implementation of __acpi_nmi_disable() now, which is exactly your proposed
> > one.
> didn't noticed that, sorry.

Well, as it is, it looks the rest of your patch may still be useful -- I
haven't investigated it any further, so if still interested, please do
have a look into it.

Maciej