2022-05-09 06:24:51

by Ricardo Neri

Subject: [PATCH v6 21/29] x86/nmi: Add an NMI_WATCHDOG NMI handler category

Add NMI_WATCHDOG as a new category of NMI handler. This new category is
to be used with the HPET-based hardlockup detector. This detector does
not have a direct way of checking whether the HPET timer is the source
of the NMI. Instead, it estimates it indirectly using the time-stamp
counter.

Therefore, false positives are possible if another NMI occurs within the
estimated time window. For this reason, the handler of the detector
should be called after all the NMI_LOCAL handlers. A simple way of
achieving this is with a new NMI handler category.
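
As an illustration only (the helper names below are hypothetical and not
part of this patch), a detector handler registered under the new
category could look roughly like this:

#include <linux/init.h>
#include <asm/nmi.h>

static int hld_hpet_nmi_handler(unsigned int type, struct pt_regs *regs)
{
	/*
	 * The HPET gives no direct indication that it raised this NMI.
	 * Infer it from the TSC: if the current TSC value is outside the
	 * window expected for the next watchdog period, assume the NMI
	 * belongs to another source.
	 */
	if (!tsc_within_expected_hpet_window())		/* hypothetical */
		return NMI_DONE;

	inspect_for_hardlockups(regs);			/* hypothetical */
	return NMI_HANDLED;
}

static int __init hld_hpet_register(void)
{
	/* NMI_WATCHDOG handlers run after all NMI_LOCAL handlers. */
	return register_nmi_handler(NMI_WATCHDOG, hld_hpet_nmi_handler,
				    0, "hpet_hld");
}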

Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Tony Luck <[email protected]>
Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v5:
* Updated to call instrumentation_end() as per f051f6979550 ("x86/nmi:
Protect NMI entry against instrumentation")

Changes since v4:
* None

Changes since v3:
* None

Changes since v2:
* Introduced this patch.

Changes since v1:
* N/A
---
 arch/x86/include/asm/nmi.h |  1 +
 arch/x86/kernel/nmi.c      | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
index 1cb9c17a4cb4..4a0d5b562c91 100644
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -28,6 +28,7 @@ enum {
 	NMI_UNKNOWN,
 	NMI_SERR,
 	NMI_IO_CHECK,
+	NMI_WATCHDOG,
 	NMI_MAX
 };

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index e73f7df362f5..fde387e0812a 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -61,6 +61,10 @@ static struct nmi_desc nmi_desc[NMI_MAX] =
 		.lock = __RAW_SPIN_LOCK_UNLOCKED(&nmi_desc[3].lock),
 		.head = LIST_HEAD_INIT(nmi_desc[3].head),
 	},
+	{
+		.lock = __RAW_SPIN_LOCK_UNLOCKED(&nmi_desc[4].lock),
+		.head = LIST_HEAD_INIT(nmi_desc[4].head),
+	},
 
 };

@@ -168,6 +172,8 @@ int __register_nmi_handler(unsigned int type, struct nmiaction *action)
 	 */
 	WARN_ON_ONCE(type == NMI_SERR && !list_empty(&desc->head));
 	WARN_ON_ONCE(type == NMI_IO_CHECK && !list_empty(&desc->head));
+	WARN_ON_ONCE(type == NMI_WATCHDOG && !list_empty(&desc->head));
+
 
 	/*
 	 * some handlers need to be executed first otherwise a fake
@@ -379,6 +385,10 @@ static noinstr void default_do_nmi(struct pt_regs *regs)
 	}
 	raw_spin_unlock(&nmi_reason_lock);
 
+	handled = nmi_handle(NMI_WATCHDOG, regs);
+	if (handled == NMI_HANDLED)
+		goto out;
+
 	/*
 	 * Only one NMI can be latched at a time. To handle
 	 * this we may process multiple nmi handlers at once to
--
2.17.1



2022-05-09 14:09:38

by Thomas Gleixner

Subject: Re: [PATCH v6 21/29] x86/nmi: Add an NMI_WATCHDOG NMI handler category

On Thu, May 05 2022 at 17:00, Ricardo Neri wrote:
> Add NMI_WATCHDOG as a new category of NMI handler. This new category is
> to be used with the HPET-based hardlockup detector. This detector does
> not have a direct way of checking whether the HPET timer is the source
> of the NMI. Instead, it estimates it indirectly using the time-stamp
> counter.
>
> Therefore, false positives are possible if another NMI occurs within the
> estimated time window. For this reason, the handler of the detector
> should be called after all the NMI_LOCAL handlers. A simple way of
> achieving this is with a new NMI handler category.
>
> @@ -379,6 +385,10 @@ static noinstr void default_do_nmi(struct pt_regs *regs)
> }
> raw_spin_unlock(&nmi_reason_lock);
>
> + handled = nmi_handle(NMI_WATCHDOG, regs);
> + if (handled == NMI_HANDLED)
> + goto out;
> +

How is this supposed to work reliably?

If perf is active and the HPET NMI and the perf NMI come in around the
same time, then nmi_handle(LOCAL) can swallow the NMI and the watchdog
won't be checked. Because MSI is strictly edge and the message is only
sent once, this can result in a stale watchdog, no?
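
Spelled out against default_do_nmi() with the hunk above applied
(simplified, comments added here only for illustration):

	handled = nmi_handle(NMI_LOCAL, regs);	/* perf claims the NMI */
	if (handled)
		goto out;		/* NMI_WATCHDOG is never reached */

	/* ... NMI reason port handling ... */

	handled = nmi_handle(NMI_WATCHDOG, regs); /* added by this patch */

Because the HPET MSI is edge triggered and was already delivered, that
watchdog check is simply lost.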

Thanks,

tglx




2022-05-18 04:44:51

by Ricardo Neri

Subject: Re: [PATCH v6 21/29] x86/nmi: Add an NMI_WATCHDOG NMI handler category

On Mon, May 09, 2022 at 03:59:40PM +0200, Thomas Gleixner wrote:
> On Thu, May 05 2022 at 17:00, Ricardo Neri wrote:
> > Add NMI_WATCHDOG as a new category of NMI handler. This new category is
> > to be used with the HPET-based hardlockup detector. This detector does
> > not have a direct way of checking whether the HPET timer is the source
> > of the NMI. Instead, it estimates it indirectly using the time-stamp
> > counter.
> >
> > Therefore, false positives are possible if another NMI occurs within the
> > estimated time window. For this reason, the handler of the detector
> > should be called after all the NMI_LOCAL handlers. A simple way of
> > achieving this is with a new NMI handler category.
> >
> > @@ -379,6 +385,10 @@ static noinstr void default_do_nmi(struct pt_regs *regs)
> > }
> > raw_spin_unlock(&nmi_reason_lock);
> >
> > + handled = nmi_handle(NMI_WATCHDOG, regs);
> > + if (handled == NMI_HANDLED)
> > + goto out;
> > +
>
> How is this supposed to work reliably?
>
> If perf is active and the HPET NMI and the perf NMI come in around the
> same time, then nmi_handle(LOCAL) can swallow the NMI and the watchdog
> won't be checked. Because MSI is strictly edge and the message is only
> sent once, this can result in a stale watchdog, no?

This is true. Instead, at the end of each NMI I should _also_ check whether
the TSC is within the window expected by the HPET NMI watchdog. In this way,
unrelated NMIs (e.g., the perf NMI) are still handled and we don't miss the
NMI from the HPET channel.

Thanks and BR,
Ricardo