In io_apic.c there is the following bit of code:
	if (nmi_watchdog) {
		printk(KERN_WARNING "timer doesn't work through the "
		       "IO-APIC - disabling NMI Watchdog!\n");
		nmi_watchdog = 0;
	}
On at least some systems, removing the store above (nmi_watchdog = 0)
leaves a working NMI watchdog timer.
In attempting to understand how the NMI watchdog works I
think I have found that:
a. the NMI interrupts are generated by the performance
counter in the cpu and
b. the test to see if the cpu is stalled is on a counter
that is bumped by the apic counter interrupt code.
If this is so (and help me to understand if it is not), then
what do the timer interrupts going thru the IO_APIC have to
do with the NMI watchdog?
Is it possible that the above code is a holdover from when
things were done differently?
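
A minimal userspace sketch of the stall test described in (b), for illustration only. The struct, the function names and the threshold of five silent NMIs are assumptions made for this example, not the kernel's actual code: the timer interrupt path bumps a counter, and each watchdog NMI checks whether that counter has moved since the previous NMI.

```c
#include <stdbool.h>

/* Hypothetical model of the watchdog state; all names are illustrative. */
struct wd_state {
    unsigned long timer_irqs;   /* bumped by the timer interrupt path */
    unsigned long last_seen;    /* counter value observed at the previous NMI */
    unsigned int  stalled_nmis; /* consecutive NMIs that saw no progress */
};

/* Assumed threshold: number of silent NMIs treated as a lockup. */
enum { WD_STALL_LIMIT = 5 };

/* Called from the timer interrupt: records that the CPU still takes interrupts. */
static void wd_timer_tick(struct wd_state *s)
{
    s->timer_irqs++;
}

/* Called from each watchdog NMI; returns true when a lockup is suspected. */
static bool wd_nmi_tick(struct wd_state *s)
{
    if (s->timer_irqs != s->last_seen) {
        /* The counter moved since the last NMI, so the CPU is alive. */
        s->last_seen = s->timer_irqs;
        s->stalled_nmis = 0;
        return false;
    }
    /* No timer interrupt was serviced between two NMIs. */
    return ++s->stalled_nmis >= WD_STALL_LIMIT;
}
```

Under this model the NMI source only has to keep firing while ordinary interrupts are blocked; where the NMI comes from does not matter to the test itself.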
--
George Anzinger [email protected]
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml
george anzinger writes:
> In attempting to understand how the NMI watchdog works I
> think I have found that:
>
> a. the NMI interrupts are generated by the performance
> counter in the cpu and
...
> If this is so (and help me to understand if it is not), then
> what do the timer interrupts going thru the IO_APIC have to
> do with the NMI watchdog.
Before 2.4, the NMI watchdog was only available for SMP boxes,
since it used the I/O APIC to send NMIs to the CPUs. Then the
ability to use the *local* APIC on UP machines was introduced,
and with it the ability to drive the NMI watchdog from the CPU
itself, via performance counter overflow interrupts.
The NMI watchdog still supports both these modes of operation.
Typically, the performance counter + local APIC mode kicks in
when (a) you asked for it, or (b) you asked for the I/O APIC
mode but it wasn't available.
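
The selection rule in the paragraph above can be sketched as a small function. This models only the description given here, not the kernel's setup code; the function name and the `ioapic_mode_available` parameter are hypothetical, and the mode values follow the nmi_watchdog=1 / nmi_watchdog=2 convention used elsewhere in this thread.

```c
#include <stdbool.h>

/* Watchdog modes; 1 and 2 match the nmi_watchdog= boot values. */
enum nmi_mode { NMI_NONE = 0, NMI_IO_APIC = 1, NMI_LOCAL_APIC = 2 };

/*
 * Hypothetical helper: returns the mode that ends up active for a given
 * request, following cases (a) and (b) in the text above.
 */
static enum nmi_mode effective_nmi_mode(enum nmi_mode requested,
                                        bool ioapic_mode_available)
{
    switch (requested) {
    case NMI_LOCAL_APIC:
        /* (a) the performance counter mode was asked for explicitly */
        return NMI_LOCAL_APIC;
    case NMI_IO_APIC:
        /* (b) fall back when the I/O APIC mode is not available */
        return ioapic_mode_available ? NMI_IO_APIC : NMI_LOCAL_APIC;
    default:
        return NMI_NONE;
    }
}
```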
/Mikael
Mikael Pettersson wrote:
>
> george anzinger writes:
> > In attempting to understand how the NMI watchdog works I
> > think I have found that:
> >
> > a. the NMI interrupts are generated by the performance
> > counter in the cpu and
> ...
> > If this is so (and help me to understand if it is not), then
> > what do the timer interrupts going thru the IO_APIC have to
> > do with the NMI watchdog.
>
> Before 2.4, the NMI watchdog was only available for SMP boxes,
> since it used the I/O APIC to send NMIs to the CPUs. Then the
> ability to use the *local* APIC on UP machines was introduced,
> and with it the ability to drive the NMI watchdog from the CPU
> itself, via performance counter overflow interrupts.
>
> The NMI watchdog still supports both these modes of operation.
> Typically, the performance counter + local APIC mode kicks in
> when (a) you asked for it, or (b) you asked for the I/O APIC
> mode but it wasn't available.
So then the NMI checks for timer interrupts being serviced
in this case? But, still, why turn it off if the timer
does not go thru the APIC? The case this came up in is an
SMP machine, but the test in apic.c shows that the PIT
interrupt does not go thru the APIC. Leaving NMI on seems
to work, so I am wondering if this is just old code.
--
George Anzinger [email protected]
On Wed, Nov 06, 2002 at 09:59:41AM -0800, george anzinger wrote:
> So then the NMI checks for timer interrupts being serviced
> in this case? But, still, why the turn off if the timer
> does not go thru the APIC? The case this came up in is an
> SMP machine, but the test in apic.c shows that the PIT
> interrupt does not go thru the APIC. Leaving NMI on seems
> to work, so I am wondering if this is just old code.
It seems that the test should be:
	if (nmi_watchdog == NMI_IO_APIC) {
		... disable it
	}
I don't think the perfctr watchdog would be affected by the code in
io_apic.c
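
A minimal sketch of the suggested test, written as a standalone model rather than as a patch to io_apic.c. The function name is hypothetical; the constant values are assumed from the nmi_watchdog=1 (I/O APIC) and nmi_watchdog=2 (local APIC) boot options.

```c
/* Watchdog modes, with values assumed from the nmi_watchdog= boot options. */
enum { NMI_NONE = 0, NMI_IO_APIC = 1, NMI_LOCAL_APIC = 2 };

/*
 * Hypothetical helper: the watchdog mode to keep once the timer has been
 * found not to work through the I/O APIC.
 */
static int nmi_mode_after_timer_failure(int nmi_watchdog)
{
    /* Only the I/O APIC watchdog depends on the timer routing. */
    if (nmi_watchdog == NMI_IO_APIC)
        return NMI_NONE;
    /* The perfctr (local APIC) watchdog is left as it was. */
    return nmi_watchdog;
}
```

With the original `if (nmi_watchdog)` test, the last case would also be reset to 0.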
(on a vaguely related note, booting with nmi_watchdog=2 on my SMP
machine gives high rates of nmis :
janus:~# cat /proc/interrupts | grep NMI ; sleep 1 ; cat /proc/interrupts | grep NMI
NMI: 88358 88358
NMI: 88432 88397
when the machine is compiling kernels. I dunno why ...)
regards
john
--
"When a man has nothing to say, the worst thing he can do is to say it
memorably."
- Calvin Trillin
On Wed, 6 Nov 2002, John Levon wrote:
> It seems that the test should be :
>
> if (nmi_watchdog == NMI_IO_APIC) {
> ... disable it
> }
Indeed. Good spotting.
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +
On Wed, 6 Nov 2002, george anzinger wrote:
> So then the NMI checks for timer interrupts being serviced
> in this case? But, still, why the turn off if the timer
> does not go thru the APIC? The case this came up in is an
> SMP machine, but the test in apic.c shows that the PIT
> interrupt does not go thru the APIC. Leaving NMI on seems
> to work, so I am wondering if this is just old code.
For the I/O APIC NMI watchdog the PIT timer is used as a source of NMI
interrupts as well as a source of timer interrupts. For this to work you
need to have two APIC interrupt inputs to receive timer ticks, one
programmed as an ordinary LoPri interrupt and the other one as an NMI one.
Our implementation supports two common variants:
1. The PIT timer is directly connected to an I/O APIC input (typically
INTIN2 of the first I/O APIC) *and* to the master i8259A PIC (hereafter
referred to as PIC). The output of the PIC is connected both to an I/O
APIC input (typically INTIN0 of the first I/O APIC) *and* to all LINT0
inputs of local APICs. For such a setup, the I/O APIC input INTIN2 is
programmed to send LoPri timer interrupts and the LINT0 inputs are
programmed to send NMIs -- for the latter to work the PIC is programmed to
behave transparently. Intel chipsets usually behave this way -- an
exception is the ancient EISA-only i82350.
2. The PIT timer is directly connected to the PIC only. The output of the
PIC is connected both to an I/O APIC input (typically INTIN0 of the first
I/O APIC) *and* to all LINT0 inputs of local APICs. For such a setup, the
I/O APIC input INTIN0 is programmed to send LoPri timer interrupts and the
LINT0 inputs are programmed to send NMIs -- for both interrupts to work
the PIC is programmed to behave transparently. The Intel i82350 chipset
and ServerWorks ones behave this way.
If neither of the variants works, which is rare, but happens in real life
-- for example some glue logic prevents the PIC from working transparently
-- then the case you are asking about happens. At this moment we only
have a single input for timer interrupts available (be it INTIN0 or LINT0)
and it has to be programmed for the ExtINTA PC/AT compatibility mode and
no input remains for the NMI. We choose LINT0 of the bootstrap CPU as it
offloads the inter-APIC bus a little and provides a slightly lower
latency. We could use INTIN0 as well, but LINT0 has never failed so far
(there is also a safeguard in the MP-table parser).
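
The two variants and the fallback can be summarised in a small model. This is only a sketch of the description above, not the kernel's timer check: the type names, the function name and its two boolean inputs (whether timer ticks arrive on INTIN2 directly, or on INTIN0 through the PIC) are assumptions made for illustration, and the separate question of whether the NMI on LINT0 really arrives is not modelled.

```c
#include <stdbool.h>

/* Interrupt inputs named in the description above. */
enum timer_input { INPUT_NONE, INPUT_INTIN0, INPUT_INTIN2, INPUT_LINT0 };

struct timer_routing {
    enum timer_input tick; /* input programmed for ordinary timer interrupts */
    enum timer_input nmi;  /* input programmed for watchdog NMIs, if any */
    bool extint;           /* true if the tick input runs in ExtINTA mode */
};

/* Hypothetical helper: pick a routing by trying variant 1, then variant 2. */
static struct timer_routing choose_timer_routing(bool intin2_ticks,
                                                 bool intin0_ticks)
{
    /* Fallback: LINT0 of the bootstrap CPU in ExtINTA mode, no NMI input. */
    struct timer_routing r = { INPUT_LINT0, INPUT_NONE, true };

    if (intin2_ticks) {
        /* Variant 1: PIT wired directly to INTIN2, NMIs via LINT0. */
        r.tick = INPUT_INTIN2;
        r.nmi = INPUT_LINT0;
        r.extint = false;
    } else if (intin0_ticks) {
        /* Variant 2: ticks through the transparent PIC on INTIN0. */
        r.tick = INPUT_INTIN0;
        r.nmi = INPUT_LINT0;
        r.extint = false;
    }
    return r;
}
```

In the fallback case the `nmi` field stays `INPUT_NONE`, which is the situation in which the I/O APIC watchdog gets disabled.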
Maciej
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +
On Wed, Nov 06, 2002 at 11:49:07AM -0800, george anzinger wrote:
> So the performance counters are only used on UP machines?
No.
nmi_watchdog=1 -> the I/O APIC is used iff it is available and it works
nmi_watchdog=2 -> the local APIC LVTPC is set to interrupt in NMI mode
                  when the perfctr overflows.
=2 can be used on both UP and SMP; on UP, =1 is available only on the
rare machines that have an I/O APIC on a UP motherboard (I believe there
are some, but I don't know if the code is set up to do so properly).
> Also, what is the point of turning off the nmi in this way
> (i.e. nmi_watchdog = 0;)? If the interrupts are not
> generated the test of the flag will not be done in traps.c.
> Is it tested else where?
NMIs can have other sources. In particular if we get an NMI from an
unknown source, we want to tell the user we're dazed and confused.
Currently, if we boot with nmi_watchdog=2 on SMP /and/ that io_apic.c
code sets nmi_watchdog to 0, it seems we will get an incorrect "dazed
and confused" every time the perfctr overflows (which will take a while
to overflow the full 40 bits, but ...)
[hmm, actually this would depend on exactly what order the setup is
done, I'm too lazy to check]
So I think that test definitely needs to be there, but it needs to be
if (nmi_watchdog == NMI_IO_APIC)
as Maciej ACKed.
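
To illustrate why the flag matters even when the NMIs keep coming, here is a simplified model of the decision made on each NMI. It is an assumption-laden sketch, not the traps.c code: the enum, the function name and the single `known_hw_reason` input are hypothetical stand-ins for the real reason-port checks.

```c
/* Watchdog modes, with values assumed from the nmi_watchdog= boot options. */
enum { NMI_NONE = 0, NMI_IO_APIC = 1, NMI_LOCAL_APIC = 2 };

/* What the handler does with an incoming NMI in this model. */
enum nmi_action { NMI_ACT_HW_ERROR, NMI_ACT_WATCHDOG_TICK, NMI_ACT_UNKNOWN };

/*
 * Hypothetical classifier: known_hw_reason is nonzero when the hardware
 * reports a known cause (for example a memory or I/O check error).
 */
static enum nmi_action classify_nmi(int known_hw_reason, int nmi_watchdog)
{
    if (known_hw_reason)
        return NMI_ACT_HW_ERROR;
    if (nmi_watchdog != NMI_NONE)
        return NMI_ACT_WATCHDOG_TICK; /* treated as a watchdog NMI */
    return NMI_ACT_UNKNOWN;           /* the "dazed and confused" path */
}
```

In this model, a perfctr NMI that arrives after nmi_watchdog has been cleared falls through to `NMI_ACT_UNKNOWN`, which is the case described above.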
regards
john
"Maciej W. Rozycki" wrote:
>
> On Wed, 6 Nov 2002, george anzinger wrote:
>
> > So then the NMI checks for timer interrupts being serviced
> > in this case? But, still, why the turn off if the timer
> > does not go thru the APIC? The case this came up in is an
> > SMP machine, but the test in apic.c shows that the PIT
> > interrupt does not go thru the APIC. Leaving NMI on seems
> > to work, so I am wondering if this is just old code.
>
> For the I/O APIC NMI watchdog the PIT timer is used as a source of NMI
> interrupts as well as a source of timer interrupts. For this to work you
> need to have two APIC interrupt inputs to receive timer ticks, one
> programmed as an ordinary LoPri interrupt and the other one as an NMI one.
So the performance counters are only used on UP machines?
Also, what is the point of turning off the nmi in this way
(i.e. nmi_watchdog = 0;)? If the interrupts are not
generated the test of the flag will not be done in traps.c.
Is it tested else where?
--
George Anzinger [email protected]
On Thu, 7 Nov 2002, Bill Davidsen wrote:
> By any chance, does this implementation imply that if I boot SMP with
> 'noapic' the NMI watchdog won't work? It doesn't, but I am not sure I had
> it on before I turned off the APIC.
The I/O APIC watchdog won't be enabled as the chip isn't used then. You
may still use the local APIC watchdog (i.e. "nmi_watchdog=2"), but that
requires at least a P6-class processor (while the I/O APIC watchdog works
even with the i486).
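
The prerequisites above can be written down as a small check. This is a sketch under stated assumptions: the function name and parameters are hypothetical, and "P6-class" is approximated here as CPU family 6 or higher, which is a simplification made for this example.

```c
#include <stdbool.h>

/* Watchdog modes, with values assumed from the nmi_watchdog= boot options. */
enum { NMI_NONE = 0, NMI_IO_APIC = 1, NMI_LOCAL_APIC = 2 };

/*
 * Hypothetical helper: can the requested watchdog mode be used?
 * ioapic_in_use is false when booting with 'noapic'.
 */
static bool nmi_mode_usable(int mode, bool ioapic_in_use, int cpu_family)
{
    switch (mode) {
    case NMI_IO_APIC:
        /* Needs the I/O APIC to be in use; the CPU type does not matter. */
        return ioapic_in_use;
    case NMI_LOCAL_APIC:
        /* Assumption: "P6-class" is modelled as family >= 6. */
        return cpu_family >= 6;
    default:
        return false;
    }
}
```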
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +
On Wed, 6 Nov 2002, John Levon wrote:
> On Wed, Nov 06, 2002 at 11:49:07AM -0800, george anzinger wrote:
>
> > So the performance counters are only used on UP machines?
>
> no. nmi_watchdog=1 -> I/O APIC is used iff available and it works
> nmi_watchdog=2 -> local APIC LVTPC set to interrupt in NMI mode when
> perfctr overflows.
>
> =2 can be used on both UP and SMP, =1 is only available on UP for the
> rare machines that have an I/O APIC on a UP motherboard (I believe there
> are some, but I don't know if the code is set up to do so properly).
By any chance, does this implementation imply that if I boot SMP with
'noapic' the NMI watchdog won't work? It doesn't, but I am not sure I had
it on before I turned off the APIC.
Clearly this would be desirable to work, as noapic is needed on a fairly
large minority of machines.
--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
On Thu, 7 Nov 2002, Bill Davidsen wrote:
> By any chance, does this implementation imply that if I boot SMP with
> 'noapic' the NMI watchdog won't work? It doesn't, but I am not sure I had
> it on before I turned off the APIC.
>
> Clearly this would be desirable to work, as noapic is needed on a fairly
> large minority of machines.
You're not using IO-APIC interrupt handling, so you can't use the I/O APIC
to deliver NMIs to the Local-APIC units. You're out of luck there; just use
the Local-APIC NMI watchdog.
Zwane
PS first time i've heard 'fairly large minority' ;)
--
function.linuxpower.ca