2017-03-06 20:12:22

by Mike Travis

[permalink] [raw]
Subject: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain

Add a new NMI call chain that is called last after all other NMI handlers
have been checked and did not "handle" the NMI. This mimics the current
NMI_UNKNOWN call chain except it eliminates the WARNING message about
multiple NMI handlers registering on this call chain.

This call chain dramatically lowers the NMI call frequency when high
frequency NMI tools are in use, notably the perf tools. It is required
for NMI handlers that cannot sustain a high NMI call rate without
ramifications to the system operability.


Signed-off-by: Mike Travis <[email protected]>
Reviewed-by: Russ Anderson <[email protected]>
---
arch/x86/include/asm/nmi.h | 1 +
arch/x86/kernel/nmi.c | 21 ++++++++++++++++++++-
2 files changed, 21 insertions(+), 1 deletion(-)

--- linux-4.4.orig/arch/x86/include/asm/nmi.h
+++ linux-4.4/arch/x86/include/asm/nmi.h
@@ -28,6 +28,7 @@ enum {
NMI_UNKNOWN,
NMI_SERR,
NMI_IO_CHECK,
+ NMI_LAST,
NMI_MAX
};

--- linux-4.4.orig/arch/x86/kernel/nmi.c
+++ linux-4.4/arch/x86/kernel/nmi.c
@@ -57,6 +57,10 @@ static struct nmi_desc nmi_desc[NMI_MAX]
.lock = __SPIN_LOCK_UNLOCKED(&nmi_desc[3].lock),
.head = LIST_HEAD_INIT(nmi_desc[3].head),
},
+ {
+ .lock = __SPIN_LOCK_UNLOCKED(&nmi_desc[4].lock),
+ .head = LIST_HEAD_INIT(nmi_desc[4].head),
+ },

};

@@ -65,6 +69,7 @@ struct nmi_stats {
unsigned int unknown;
unsigned int external;
unsigned int swallow;
+ unsigned int last;
};

static DEFINE_PER_CPU(struct nmi_stats, nmi_stats);
@@ -312,6 +317,20 @@ unknown_nmi_error(unsigned char reason,
}
NOKPROBE_SYMBOL(unknown_nmi_error);

+static void check_nmi_last(unsigned char reason, struct pt_regs *regs)
+{
+ int handled;
+
+ /* Check low frequency, multiple CPU NMI handlers */
+ handled = nmi_handle(NMI_LAST, regs);
+ __this_cpu_add(nmi_stats.last, handled);
+ if (handled)
+ return;
+
+ unknown_nmi_error(reason, regs);
+}
+NOKPROBE_SYMBOL(check_nmi_last);
+
static DEFINE_PER_CPU(bool, swallow_nmi);
static DEFINE_PER_CPU(unsigned long, last_nmi_rip);

@@ -423,7 +442,7 @@ static void default_do_nmi(struct pt_reg
if (b2b && __this_cpu_read(swallow_nmi))
__this_cpu_add(nmi_stats.swallow, 1);
else
- unknown_nmi_error(reason, regs);
+ check_nmi_last(reason, regs);
}
NOKPROBE_SYMBOL(default_do_nmi);


--


2017-03-07 07:42:50

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain


* Mike Travis <[email protected]> wrote:

> Add a new NMI call chain that is called last after all other NMI handlers
> have been checked and did not "handle" the NMI. This mimics the current
> NMI_UNKNOWN call chain except it eliminates the WARNING message about
> multiple NMI handlers registering on this call chain.
>
> This call chain dramatically lowers the NMI call frequency when high
> frequency NMI tools are in use, notably the perf tools. It is required
> for NMI handlers that cannot sustain a high NMI call rate without
> ramifications to the system operability.

So how about we just turn off that warning instead? I don't remember the last time
it actually _helped_ us find any kernel or hardware bug - and it has caused tons
of problems...

It's not like we warn about excess regular IRQs either - we either handle them or
at most increase a counter somewhere. We could do the same for NMIs: introduce a
counter somewhere that counts the number of seemingly unhandled NMIs.

But in any case, we should not spam the kernel log, neither with high, nor with
low frequency.

Thanks,

Ingo

2017-03-07 15:23:21

by Don Zickus

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain

On Tue, Mar 07, 2017 at 08:42:10AM +0100, Ingo Molnar wrote:
>
> * Mike Travis <[email protected]> wrote:
>
> > Add a new NMI call chain that is called last after all other NMI handlers
> > have been checked and did not "handle" the NMI. This mimics the current
> > NMI_UNKNOWN call chain except it eliminates the WARNING message about
> > multiple NMI handlers registering on this call chain.
> >
> > This call chain dramatically lowers the NMI call frequency when high
> > frequency NMI tools are in use, notably the perf tools. It is required
> > for NMI handlers that cannot sustain a high NMI call rate without
> > ramifications to the system operability.
>
> So how about we just turn off that warning instead? I don't remember the last time
> it actually _helped_ us find any kernel or hardware bug - and it has caused tons
> of problems...

Yeah, that is one way to solve it. :-)

Originally I put that in there because the unknown nmi handlers sometime do
not return, making it impossible for the second handler to run.

But you are right, it probably hasn't really helped find any problems. I
wasn't aware of problems it was causing (not that I was looking through
emails to find them either).

Cheers,
Don

>
> It's not like we warn about excess regular IRQs either - we either handle them or
> at most increase a counter somewhere. We could do the same for NMIs: introduce a
> counter somewhere that counts the number of seemingly unhandled NMIs.
>
> But in any case, we should not spam the kernel log, neither with high, nor with
> low frequency.
>
> Thanks,
>
> Ingo

2017-03-07 15:39:56

by Mike Travis

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain



On 3/6/2017 11:42 PM, Ingo Molnar wrote:
>
> * Mike Travis <[email protected]> wrote:
>
>> Add a new NMI call chain that is called last after all other NMI handlers
>> have been checked and did not "handle" the NMI. This mimics the current
>> NMI_UNKNOWN call chain except it eliminates the WARNING message about
>> multiple NMI handlers registering on this call chain.
>>
>> This call chain dramatically lowers the NMI call frequency when high
>> frequency NMI tools are in use, notably the perf tools. It is required
>> for NMI handlers that cannot sustain a high NMI call rate without
>> ramifications to the system operability.
>
> So how about we just turn off that warning instead? I don't remember the last time
> it actually _helped_ us find any kernel or hardware bug - and it has caused tons
> of problems...

I can do that, with an even simpler patch...

>
> It's not like we warn about excess regular IRQs either - we either handle them or
> at most increase a counter somewhere. We could do the same for NMIs: introduce a
> counter somewhere that counts the number of seemingly unhandled NMIs.

Really "unknown" NMI errors are reported by either the "dazed and
confused" message or if the panic on unknown nmi is set, then the
system will panic. So unknown NMI occurrences are already being
dealt with.

Plus the following stats are being collected, though I'm not sure of
any reporting facility:

struct nmi_stats {
unsigned int normal;
unsigned int unknown;
unsigned int external;
unsigned int swallow;
};

static DEFINE_PER_CPU(struct nmi_stats, nmi_stats);


>
> But in any case, we should not spam the kernel log, neither with high, nor with
> low frequency.
>
> Thanks,
>
> Ingo
>

2017-03-07 16:08:29

by Don Zickus

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain

On Tue, Mar 07, 2017 at 08:00:33AM -0800, Mike Travis wrote:
>
>
> On 3/7/2017 7:22 AM, Don Zickus wrote:
> > On Tue, Mar 07, 2017 at 08:42:10AM +0100, Ingo Molnar wrote:
> >>
> >> * Mike Travis <[email protected]> wrote:
> >>
> >>> Add a new NMI call chain that is called last after all other NMI handlers
> >>> have been checked and did not "handle" the NMI. This mimics the current
> >>> NMI_UNKNOWN call chain except it eliminates the WARNING message about
> >>> multiple NMI handlers registering on this call chain.
> >>>
> >>> This call chain dramatically lowers the NMI call frequency when high
> >>> frequency NMI tools are in use, notably the perf tools. It is required
> >>> for NMI handlers that cannot sustain a high NMI call rate without
> >>> ramifications to the system operability.
> >>
> >> So how about we just turn off that warning instead? I don't remember the last time
> >> it actually _helped_ us find any kernel or hardware bug - and it has caused tons
> >> of problems...
> >
> > Yeah, that is one way to solve it. :-)
>
> Actually just removing the WARNING indication and making it an
> INFO message would be enough to quiet objections. Is that enough,

That might be useful to at least log in, but as Ingo said it just might be
overkill in the real world.

> or should the message be completely removed for UNKNOWN NMI
> handlers, and left in place for IO_CHECK and SERR NMI handlers?
>
> >
> > Originally I put that in there because the unknown nmi handlers sometime do
> > not return, making it impossible for the second handler to run.
>
> The only two external unknown NMI handlers that I know of is the UV
> one and the KGDB one. The KGDB one appears to be only claimed if it
> is exiting an NMI_LOCAL event. And the UV one is only claimed if
> it as caused by a UV System NMI event. So truly unknown NMI events
> eventually get to the internal unknown nmi handler.

Good to know. I keep thinking of the hpwdt that wants to eat all NMIs, I
believe that still registers on the unknown nmi (drivers/watchdog/hpwdt.c).

Cheers,
Don

>
> > But you are right, it probably hasn't really helped find any problems. I
> > wasn't aware of problems it was causing (not that I was looking through
> > emails to find them either).
> >
> > Cheers,
> > Don
> >
> >>
> >> It's not like we warn about excess regular IRQs either - we either handle them or
> >> at most increase a counter somewhere. We could do the same for NMIs: introduce a
> >> counter somewhere that counts the number of seemingly unhandled NMIs.
> >>
> >> But in any case, we should not spam the kernel log, neither with high, nor with
> >> low frequency.
> >>
> >> Thanks,
> >>
> >> Ingo

2017-03-07 16:13:34

by Mike Travis

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain



On 3/7/2017 8:07 AM, Don Zickus wrote:
> Good to know. I keep thinking of the hpwdt that wants to eat all NMIs, I
> believe that still registers on the unknown nmi (drivers/watchdog/hpwdt.c).

Thanks for pointing that out. I'm guessing this happens only on
HPE systems, which are now us... :) I'll contact Thomas for more info.

2017-03-07 20:42:58

by Mike Travis

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain



On 3/7/2017 7:22 AM, Don Zickus wrote:
> On Tue, Mar 07, 2017 at 08:42:10AM +0100, Ingo Molnar wrote:
>>
>> * Mike Travis <[email protected]> wrote:
>>
>>> Add a new NMI call chain that is called last after all other NMI handlers
>>> have been checked and did not "handle" the NMI. This mimics the current
>>> NMI_UNKNOWN call chain except it eliminates the WARNING message about
>>> multiple NMI handlers registering on this call chain.
>>>
>>> This call chain dramatically lowers the NMI call frequency when high
>>> frequency NMI tools are in use, notably the perf tools. It is required
>>> for NMI handlers that cannot sustain a high NMI call rate without
>>> ramifications to the system operability.
>>
>> So how about we just turn off that warning instead? I don't remember the last time
>> it actually _helped_ us find any kernel or hardware bug - and it has caused tons
>> of problems...
>
> Yeah, that is one way to solve it. :-)

Actually just removing the WARNING indication and making it an
INFO message would be enough to quiet objections. Is that enough,
or should the message be completely removed for UNKNOWN NMI
handlers, and left in place for IO_CHECK and SERR NMI handlers?

>
> Originally I put that in there because the unknown nmi handlers sometime do
> not return, making it impossible for the second handler to run.

The only two external unknown NMI handlers that I know of is the UV
one and the KGDB one. The KGDB one appears to be only claimed if it
is exiting an NMI_LOCAL event. And the UV one is only claimed if
it as caused by a UV System NMI event. So truly unknown NMI events
eventually get to the internal unknown nmi handler.

> But you are right, it probably hasn't really helped find any problems. I
> wasn't aware of problems it was causing (not that I was looking through
> emails to find them either).
>
> Cheers,
> Don
>
>>
>> It's not like we warn about excess regular IRQs either - we either handle them or
>> at most increase a counter somewhere. We could do the same for NMIs: introduce a
>> counter somewhere that counts the number of seemingly unhandled NMIs.
>>
>> But in any case, we should not spam the kernel log, neither with high, nor with
>> low frequency.
>>
>> Thanks,
>>
>> Ingo

2017-03-08 10:35:55

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain


* Mike Travis <[email protected]> wrote:

>
>
> On 3/6/2017 11:42 PM, Ingo Molnar wrote:
> >
> > * Mike Travis <[email protected]> wrote:
> >
> >> Add a new NMI call chain that is called last after all other NMI handlers
> >> have been checked and did not "handle" the NMI. This mimics the current
> >> NMI_UNKNOWN call chain except it eliminates the WARNING message about
> >> multiple NMI handlers registering on this call chain.
> >>
> >> This call chain dramatically lowers the NMI call frequency when high
> >> frequency NMI tools are in use, notably the perf tools. It is required
> >> for NMI handlers that cannot sustain a high NMI call rate without
> >> ramifications to the system operability.
> >
> > So how about we just turn off that warning instead? I don't remember the last time
> > it actually _helped_ us find any kernel or hardware bug - and it has caused tons
> > of problems...
>
> I can do that, with an even simpler patch...
>
> >
> > It's not like we warn about excess regular IRQs either - we either handle them or
> > at most increase a counter somewhere. We could do the same for NMIs: introduce a
> > counter somewhere that counts the number of seemingly unhandled NMIs.
>
> Really "unknown" NMI errors are reported by either the "dazed and
> confused" message or if the panic on unknown nmi is set, then the
> system will panic. So unknown NMI occurrences are already being
> dealt with.

So I'd even remove the 'dazed and confused' message - has it ever helped us?

If NMIs are generated but not handled properly then developers and users will
notice it just like when IRQs are lost: either through bad system behavior or via
weird stats in procfs. The kernel log should not get spammed.

So if you could expose the lost NMI stats via procfs or debugfs then we could
remove both the warning and the dazed-and-confused spam on the system log.

This should make perf all around more usable on UV systems, right?

Thanks,

Ingo

2017-03-08 15:19:13

by Mike Travis

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain



On 3/8/2017 2:28 AM, Ingo Molnar wrote:
>
> * Mike Travis <[email protected]> wrote:
>
>>
>>
>> On 3/6/2017 11:42 PM, Ingo Molnar wrote:
>>>
>>> * Mike Travis <[email protected]> wrote:
>>>
>>>> Add a new NMI call chain that is called last after all other NMI handlers
>>>> have been checked and did not "handle" the NMI. This mimics the current
>>>> NMI_UNKNOWN call chain except it eliminates the WARNING message about
>>>> multiple NMI handlers registering on this call chain.
>>>>
>>>> This call chain dramatically lowers the NMI call frequency when high
>>>> frequency NMI tools are in use, notably the perf tools. It is required
>>>> for NMI handlers that cannot sustain a high NMI call rate without
>>>> ramifications to the system operability.
>>>
>>> So how about we just turn off that warning instead? I don't remember the last time
>>> it actually _helped_ us find any kernel or hardware bug - and it has caused tons
>>> of problems...
>>
>> I can do that, with an even simpler patch...
>>
>>>
>>> It's not like we warn about excess regular IRQs either - we either handle them or
>>> at most increase a counter somewhere. We could do the same for NMIs: introduce a
>>> counter somewhere that counts the number of seemingly unhandled NMIs.
>>
>> Really "unknown" NMI errors are reported by either the "dazed and
>> confused" message or if the panic on unknown nmi is set, then the
>> system will panic. So unknown NMI occurrences are already being
>> dealt with.
>
> So I'd even remove the 'dazed and confused' message - has it ever helped us?

I can remove it though it seems to have become an institution, or more
correctly, a common term of reference. :) It does precede the decision
to either attempt to continue system operation, or panic the system
immediately.

> If NMIs are generated but not handled properly then developers and users will
> notice it just like when IRQs are lost: either through bad system behavior or via
> weird stats in procfs. The kernel log should not get spammed.

Having some notice is probably a good thing even if for archaic reasons.
We recently discovered that an internal system error triggered an NMI
event. Without any notice, the system would not have been suspected of
acting strangely, but data could potentially have been silently lost.
(NMI seems by far the least standard standard in the x86 architecture.)

Also, I don't think IRQs and NMIs are in the same league. Missing an
IRQ means an expected I/O operation did not occur. Prudent drivers can
set a timeout to notice missing interrupts.

Missing an NMI usually means that something unexpected occurred but was
not dealt with. Losing perf interrupts is recoverable since there
will be another along shortly. But missing an NMI due to a system
failure event is not. (Why NMI is heavily overloaded, and not very
standardized.)

>
> So if you could expose the lost NMI stats via procfs or debugfs then we could
> remove both the warning and the dazed-and-confused spam on the system log.

I can add this.

>
> This should make perf all around more usable on UV systems, right?

I'm not sure this is accurate. Perf is currently very usable on UV.
But as we increase our online fault analysis procedures, this warning
message stood out as a glaring example of a false positive. Note it
is not warning of anything except there is more than one NMI handler
registering on this "call after all other handlers have been called
and did not claim the NMI" chain.

So let me know if I should go ahead with the above (remove some or
all indications that an unclaimed NMI event occurred, and add a
reporting facility for NMI stats.)

Thanks!
Mike