Hi Boris.
This is the 2nd iteration of a smallish series with some fixes/cleanups
for the handling of MCEs. It should apply cleanly to your ras/core branch.
The first iteration can be found here:
https://lore.kernel.org/linux-edac/[email protected]/T/
Changes v1 -> v2:
- dropped patches 3, 5, 6 as you already cherry-picked them into
ras/core or ras/urgent (this renumbers patch 4 in v1 to patch 3 in v2);
- addressed remaining comments on patches 1-3;
- added patch 5 as per Yazen's comment that the SRAO notifier shall
not be used on AMD for now;
- added patch 4 as a prerequisite for the given realization of patch 5;
- added patch 6 as an example, what else can be done due to patch 4.
See individual patches 1-3 for more detailed comments on changes.
I'm not yet convinced, that patch 6 is an entirely good idea. I've
still included it for discussion. If we end up not doing something
like it, we can as well rewrite patch 5 to be just another "if"
within srao_decode_notifier()/uc_decode_notifier().
Regards
Jan
Jan H. Schönherr (6):
x86/mce: Take action on UCNA/Deferred errors again
x86/mce: Make mce=nobootlog work again
x86/mce: Fix use of uninitialized MCE message string
x86/mce: Allow a variable number of internal MCE decode notifiers
x86/mce: Do not take action on SRAO/Deferred errors on AMD for now
x86/mce: Dynamically register default MCE handler
arch/x86/include/asm/mce.h | 2 +-
arch/x86/kernel/cpu/mce/core.c | 145 ++++++++++++++++++---------------
2 files changed, 81 insertions(+), 66 deletions(-)
--
2.22.0.3.gb49bb57c8208.dirty
Per Yazen Ghannam we should not use the UC notifier for the time
being on AMD.
Reported-by: Yazen Ghannam <[email protected]>
Signed-off-by: Jan H. Schönherr <[email protected]>
---
New in v2. This is due to a remark from Yazen on v1, that we shouldn't
be handling neither SRAO nor Deferred errors in that handler.
An alternative implementation would do the architecture "if" directly
within uc_decode_notifier(), in which case we could decide to not apply
patch 4.
---
arch/x86/kernel/cpu/mce/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d48deb127071..d8fe5b048ee7 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1970,7 +1970,8 @@ int __init mcheck_init(void)
{
mcheck_intel_therm_init();
mce_register_decode_chain_internal(&first_nb);
- mce_register_decode_chain_internal(&mce_uc_nb);
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
+ mce_register_decode_chain_internal(&mce_uc_nb);
mce_register_decode_chain_internal(&mce_default_nb);
mcheck_vendor_init_severity();
--
2.22.0.3.gb49bb57c8208.dirty
The default MCE handler takes action, when no external MCE handler is
registered. Instead of checking for external handlers within the default
MCE handler, only register the default MCE handler when there are no
external handlers in the first place.
Signed-off-by: Jan H. Schönherr <[email protected]>
---
New in v2. This is something that became possible due to patch 4.
I'm not entirely happy with it.
One the one hand, I'm wondering whether there's a nicer way to handle
(de-)registration races.
On the other hand, I'm starting to question the whole logic to "only print
the MCE if nothing else in the kernel has a handler registered". Is that
really how it should be? For example, there are handlers that filter for a
specific subset of MCEs. If one of those is registered, we're losing all
information for MCEs that don't match.
A possible solution to the latter would be to have a "handled" or "printed"
flag within "struct mce" and print the MCE based on that in the default
handler. What do you think?
---
arch/x86/kernel/cpu/mce/core.c | 90 ++++++++++++++++++++--------------
1 file changed, 52 insertions(+), 38 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d8fe5b048ee7..3b6e37c5252f 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -156,36 +156,6 @@ void mce_log(struct mce *m)
}
EXPORT_SYMBOL_GPL(mce_log);
-/*
- * We run the default notifier as long as we have no external
- * notifiers registered on the chain.
- */
-static atomic_t num_notifiers;
-
-static void mce_register_decode_chain_internal(struct notifier_block *nb)
-{
- if (WARN_ON(nb->priority > MCE_PRIO_MCELOG && nb->priority < MCE_PRIO_EDAC))
- return;
-
- blocking_notifier_chain_register(&x86_mce_decoder_chain, nb);
-}
-
-void mce_register_decode_chain(struct notifier_block *nb)
-{
- atomic_inc(&num_notifiers);
-
- mce_register_decode_chain_internal(nb);
-}
-EXPORT_SYMBOL_GPL(mce_register_decode_chain);
-
-void mce_unregister_decode_chain(struct notifier_block *nb)
-{
- atomic_dec(&num_notifiers);
-
- blocking_notifier_chain_unregister(&x86_mce_decoder_chain, nb);
-}
-EXPORT_SYMBOL_GPL(mce_unregister_decode_chain);
-
static inline u32 ctl_reg(int bank)
{
return MSR_IA32_MCx_CTL(bank);
@@ -606,18 +576,19 @@ static struct notifier_block mce_uc_nb = {
.priority = MCE_PRIO_UC,
};
+/*
+ * We run the default notifier as long as we have no external
+ * notifiers registered on the chain.
+ */
+static atomic_t num_notifiers;
+
static int mce_default_notifier(struct notifier_block *nb, unsigned long val,
void *data)
{
struct mce *m = (struct mce *)data;
- if (!m)
- return NOTIFY_DONE;
-
- if (atomic_read(&num_notifiers))
- return NOTIFY_DONE;
-
- __print_mce(m);
+ if (m)
+ __print_mce(m);
return NOTIFY_DONE;
}
@@ -628,6 +599,49 @@ static struct notifier_block mce_default_nb = {
.priority = MCE_PRIO_LOWEST,
};
+static void update_default_notifier_registration(void)
+{
+ bool has_notifiers = !!atomic_read(&num_notifiers);
+
+retry:
+ if (has_notifiers)
+ blocking_notifier_chain_unregister(&x86_mce_decoder_chain,
+ &mce_default_nb);
+ else
+ blocking_notifier_chain_cond_register(&x86_mce_decoder_chain,
+ &mce_default_nb);
+
+ if (has_notifiers != !!atomic_read(&num_notifiers)) {
+ has_notifiers = !has_notifiers;
+ goto retry;
+ }
+}
+
+static void mce_register_decode_chain_internal(struct notifier_block *nb)
+{
+ if (WARN_ON(nb->priority > MCE_PRIO_MCELOG &&
+ nb->priority < MCE_PRIO_EDAC))
+ return;
+
+ blocking_notifier_chain_register(&x86_mce_decoder_chain, nb);
+}
+
+void mce_register_decode_chain(struct notifier_block *nb)
+{
+ atomic_inc(&num_notifiers);
+ mce_register_decode_chain_internal(nb);
+ update_default_notifier_registration();
+}
+EXPORT_SYMBOL_GPL(mce_register_decode_chain);
+
+void mce_unregister_decode_chain(struct notifier_block *nb)
+{
+ atomic_dec(&num_notifiers);
+ update_default_notifier_registration();
+ blocking_notifier_chain_unregister(&x86_mce_decoder_chain, nb);
+}
+EXPORT_SYMBOL_GPL(mce_unregister_decode_chain);
+
/*
* Read ADDR and MISC registers.
*/
@@ -1972,7 +1986,7 @@ int __init mcheck_init(void)
mce_register_decode_chain_internal(&first_nb);
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
mce_register_decode_chain_internal(&mce_uc_nb);
- mce_register_decode_chain_internal(&mce_default_nb);
+ update_default_notifier_registration();
mcheck_vendor_init_severity();
INIT_WORK(&mce_work, mce_gen_pool_process);
--
2.22.0.3.gb49bb57c8208.dirty
Since commit 8b38937b7ab5 ("x86/mce: Do not enter deferred errors into
the generic pool twice") the mce=nobootlog option has become mostly
ineffective (after being only slightly ineffective before), as the
code is taking actions on MCEs left over from boot when they have a
usable address.
Move the check for MCP_DONTLOG a bit outward to make it effective again.
Also, since commit 011d82611172 ("RAS: Add a Corrected Errors Collector")
the two branches of the remaining "if" the bottom of machine_check_poll()
do same. Unify them.
Signed-off-by: Jan H. Schönherr <[email protected]>
---
Changes v1->v2:
- remove an indentation level in favor of a goto (requested by Boris)
- don't mention Linux version (per remark from Boris)
---
arch/x86/kernel/cpu/mce/core.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 16134ce587fd..0ccd6cf3402d 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -750,26 +750,22 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
log_it:
error_seen = true;
- mce_read_aux(&m, i);
+ if (flags & MCP_DONTLOG)
+ goto clear_it;
+ mce_read_aux(&m, i);
m.severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
-
/*
* Don't get the IP here because it's unlikely to
* have anything to do with the actual error location.
*/
- if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce)
- mce_log(&m);
- else if (mce_usable_address(&m)) {
- /*
- * Although we skipped logging this, we still want
- * to take action. Add to the pool so the registered
- * notifiers will see it.
- */
- if (!mce_gen_pool_add(&m))
- mce_schedule_work();
- }
+ if (mca_cfg.dont_log_ce && !mce_usable_address(&m))
+ goto clear_it;
+
+ mce_log(&m);
+
+clear_it:
/*
* Clear state for this bank.
*/
--
2.22.0.3.gb49bb57c8208.dirty
Get rid of the compile time constant of internal (or mandatory)
MCE decode notifiers in preparation for future changes. Instead,
distinguish explicitly between internal and external MCE decode
notifiers.
Signed-off-by: Jan H. Schönherr <[email protected]>
---
New in v2, preparation for patches 5 and 6.
---
arch/x86/kernel/cpu/mce/core.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 1d91ce956772..d48deb127071 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -157,21 +157,24 @@ void mce_log(struct mce *m)
EXPORT_SYMBOL_GPL(mce_log);
/*
- * We run the default notifier if we have only the UC, the first and the
- * default notifier registered. I.e., the mandatory NUM_DEFAULT_NOTIFIERS
+ * We run the default notifier as long as we have no external
* notifiers registered on the chain.
*/
-#define NUM_DEFAULT_NOTIFIERS 3
static atomic_t num_notifiers;
-void mce_register_decode_chain(struct notifier_block *nb)
+static void mce_register_decode_chain_internal(struct notifier_block *nb)
{
if (WARN_ON(nb->priority > MCE_PRIO_MCELOG && nb->priority < MCE_PRIO_EDAC))
return;
+ blocking_notifier_chain_register(&x86_mce_decoder_chain, nb);
+}
+
+void mce_register_decode_chain(struct notifier_block *nb)
+{
atomic_inc(&num_notifiers);
- blocking_notifier_chain_register(&x86_mce_decoder_chain, nb);
+ mce_register_decode_chain_internal(nb);
}
EXPORT_SYMBOL_GPL(mce_register_decode_chain);
@@ -611,7 +614,7 @@ static int mce_default_notifier(struct notifier_block *nb, unsigned long val,
if (!m)
return NOTIFY_DONE;
- if (atomic_read(&num_notifiers) > NUM_DEFAULT_NOTIFIERS)
+ if (atomic_read(&num_notifiers))
return NOTIFY_DONE;
__print_mce(m);
@@ -1966,9 +1969,9 @@ __setup("mce", mcheck_enable);
int __init mcheck_init(void)
{
mcheck_intel_therm_init();
- mce_register_decode_chain(&first_nb);
- mce_register_decode_chain(&mce_uc_nb);
- mce_register_decode_chain(&mce_default_nb);
+ mce_register_decode_chain_internal(&first_nb);
+ mce_register_decode_chain_internal(&mce_uc_nb);
+ mce_register_decode_chain_internal(&mce_default_nb);
mcheck_vendor_init_severity();
INIT_WORK(&mce_work, mce_gen_pool_process);
--
2.22.0.3.gb49bb57c8208.dirty
On Fri, Jan 03, 2020 at 04:07:22PM +0100, Jan H. Schönherr wrote:
> +retry:
> + if (has_notifiers)
> + blocking_notifier_chain_unregister(&x86_mce_decoder_chain,
> + &mce_default_nb);
> + else
> + blocking_notifier_chain_cond_register(&x86_mce_decoder_chain,
> + &mce_default_nb);
I get a compile error here:
CC arch/x86/kernel/cpu/mce/core.o
arch/x86/kernel/cpu/mce/core.c: In function ‘update_default_notifier_registration’:
arch/x86/kernel/cpu/mce/core.c:616:3: error: implicit declaration of function ‘blocking_notifier_chain_cond_register’; did you mean ‘blocking_notifier_chain_unregister’? [-Werror=implicit-function-declaration]
blocking_notifier_chain_cond_register(&x86_mce_decoder_chain,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
blocking_notifier_chain_unregister
-Tony
On Fri, Jan 03, 2020 at 04:07:22PM +0100, Jan H. Sch?nherr wrote:
> The default MCE handler takes action, when no external MCE handler is
> registered. Instead of checking for external handlers within the default
> MCE handler, only register the default MCE handler when there are no
> external handlers in the first place.
>
> Signed-off-by: Jan H. Sch?nherr <[email protected]>
> ---
> New in v2. This is something that became possible due to patch 4.
> I'm not entirely happy with it.
>
> One the one hand, I'm wondering whether there's a nicer way to handle
> (de-)registration races.
>
Instead of unregistering/registering the default notifier depending
on whether there are other notifiers, couldn't you just make the
default notifier check to see if it should print. E.g.
static int mce_default_notifier(struct notifier_block *nb, unsigned long val,
void *data)
{
struct mce *m = (struct mce *)data;
if (m && !atomic_read(&num_notifiers))
__print_mce(m);
return NOTIFY_DONE;
}
> On the other hand, I'm starting to question the whole logic to "only print
> the MCE if nothing else in the kernel has a handler registered". Is that
> really how it should be? For example, there are handlers that filter for a
> specific subset of MCEs. If one of those is registered, we're losing all
> information for MCEs that don't match.
Maybe put control of this into the hands of the notifiers ... if
a notifier thinks that it does something useful with the log
that makes the console log redundant, then it could call a function
to bump the count to suppress the __print_mce(). Simple filter functions
on the chain wouldn't do that.
If we go this path the variable should be named something like
"suppress_console_mce" rather than num_notifiers.
Might also be useful to have some command line option or debugfs knob
to force printing for those cases where bad stuff is happening and we
don't see what was logged before a crash drops all the bits on the floor.
-Tony
On Fri, Jan 03, 2020 at 04:07:22PM +0100, Jan H. Schönherr wrote:
> On the other hand, I'm starting to question the whole logic to "only print
> the MCE if nothing else in the kernel has a handler registered". Is that
> really how it should be?
Yes, it should be this way: if there are no consumers, all error
information should go to dmesg so that it gets printed at least.
> For example, there are handlers that filter for a specific subset of
> MCEs. If one of those is registered, we're losing all information for
> MCEs that don't match.
Probably but I don't think there's an example of an actual system where
there are no other MCE consumers registered. Not if its users care about
RAS. This default fallback was added for the hypothetical case anyway.
> A possible solution to the latter would be to have a "handled" or "printed"
> flag within "struct mce" and print the MCE based on that in the default
> handler. What do you think?
Before we go and fix whatever, we need to define what exactly we're
fixing. Is there an actual system you're seeing this on or is this
something that would never happen in reality? Because if the latter, I
don't really care TBH. As in, there's more important stuff to take care
of first.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Hi "Jan,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on tip/auto-latest]
[also build test ERROR on v5.5-rc4 next-20191220]
[cannot apply to tip/x86/core]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Jan-H-Sch-nherr/x86-mce-Various-fixes-and-cleanups-for-MCE-handling/20200104-183135
base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git acfe9d882f586288f663aa73209f40e034003c13
config: x86_64-defconfig (attached as .config)
compiler: gcc-7 (Debian 7.5.0-3) 7.5.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <[email protected]>
All errors (new ones prefixed by >>):
arch/x86/kernel/cpu/mce/core.c: In function 'update_default_notifier_registration':
>> arch/x86/kernel/cpu/mce/core.c:616:3: error: implicit declaration of function 'blocking_notifier_chain_cond_register'; did you mean 'blocking_notifier_chain_unregister'? [-Werror=implicit-function-declaration]
blocking_notifier_chain_cond_register(&x86_mce_decoder_chain,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
blocking_notifier_chain_unregister
cc1: some warnings being treated as errors
vim +616 arch/x86/kernel/cpu/mce/core.c
606
607 static void update_default_notifier_registration(void)
608 {
609 bool has_notifiers = !!atomic_read(&num_notifiers);
610
611 retry:
612 if (has_notifiers)
613 blocking_notifier_chain_unregister(&x86_mce_decoder_chain,
614 &mce_default_nb);
615 else
> 616 blocking_notifier_chain_cond_register(&x86_mce_decoder_chain,
617 &mce_default_nb);
618
619 if (has_notifiers != !!atomic_read(&num_notifiers)) {
620 has_notifiers = !has_notifiers;
621 goto retry;
622 }
623 }
624
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation
> -----Original Message-----
> From: Borislav Petkov <[email protected]>
> Sent: Friday, January 3, 2020 5:03 PM
> To: Jan H. Schönherr <[email protected]>
> Cc: Ghannam, Yazen <[email protected]>; [email protected]; [email protected]; Tony Luck
> <[email protected]>; Thomas Gleixner <[email protected]>; Ingo Molnar <[email protected]>; H. Peter Anvin <[email protected]>;
> [email protected]
> Subject: Re: [PATCH v2 6/6] x86/mce: Dynamically register default MCE handler
>
> On Fri, Jan 03, 2020 at 04:07:22PM +0100, Jan H. Schönherr wrote:
> > On the other hand, I'm starting to question the whole logic to "only print
> > the MCE if nothing else in the kernel has a handler registered". Is that
> > really how it should be?
>
> Yes, it should be this way: if there are no consumers, all error
> information should go to dmesg so that it gets printed at least.
>
> > For example, there are handlers that filter for a specific subset of
> > MCEs. If one of those is registered, we're losing all information for
> > MCEs that don't match.
>
> Probably but I don't think there's an example of an actual system where
> there are no other MCE consumers registered. Not if its users care about
> RAS. This default fallback was added for the hypothetical case anyway.
>
> > A possible solution to the latter would be to have a "handled" or "printed"
> > flag within "struct mce" and print the MCE based on that in the default
> > handler. What do you think?
>
> Before we go and fix whatever, we need to define what exactly we're
> fixing. Is there an actual system you're seeing this on or is this
> something that would never happen in reality? Because if the latter, I
> don't really care TBH. As in, there's more important stuff to take care
> of first.
>
I've encountered an issue where EDAC didn't load (either due to a bug or
missing hardware enablement) and the MCE got swallowed by the mcelog notifier.
The mcelog utility wasn't in use, so there was no record of the MCE. This can
be considered a system configuration issue though that can be resolved with a
bit of tweaking. But maybe we can find a solution to prevent something like
this?
Thanks,
Yazen
On Wed, Jan 08, 2020 at 04:24:33AM +0000, Ghannam, Yazen wrote:
> I've encountered an issue where EDAC didn't load (either due to a
> bug or missing hardware enablement) and the MCE got swallowed by the
> mcelog notifier. The mcelog utility wasn't in use, so there was no
> record of the MCE.
Can you reproduce this using the injector?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
> -----Original Message-----
> From: Borislav Petkov <[email protected]>
> Sent: Wednesday, January 8, 2020 5:04 AM
> To: Ghannam, Yazen <[email protected]>
> Cc: Jan H. Schönherr <[email protected]>; [email protected]; [email protected]; Tony Luck
> <[email protected]>; Thomas Gleixner <[email protected]>; Ingo Molnar <[email protected]>; H. Peter Anvin <[email protected]>;
> [email protected]
> Subject: Re: [PATCH v2 6/6] x86/mce: Dynamically register default MCE handler
>
> On Wed, Jan 08, 2020 at 04:24:33AM +0000, Ghannam, Yazen wrote:
> > I've encountered an issue where EDAC didn't load (either due to a
> > bug or missing hardware enablement) and the MCE got swallowed by the
> > mcelog notifier. The mcelog utility wasn't in use, so there was no
> > record of the MCE.
>
> Can you reproduce this using the injector?
>
Yes, I was just able to do it. Here's what I did.
1) Unload EDAC decoder modules.
2) Inject a corrected MCE using mce-inject.
3) Observe "Machine check events logged" message and no other MCE info.
4) Manually load mcelog and find MCE info.
It seems to me that the issue is the mcelog notifier counts toward the number
of notifiers, so the default notifier doesn't print anything.
Of course, this can be avoided if the EDAC modules are loaded, or if mcelog is
disabled in the kernel config, or if rasdaemon is used, etc.
Thanks,
Yazen
> It seems to me that the issue is the mcelog notifier counts toward the number
> of notifiers, so the default notifier doesn't print anything.
If we gave a API to the notifiers to say whether to suppress printing, then the
dev_mcelog() code could do the suppression only if some process had
/dev/mcelog open. So if mcelog(8) wasn't running, you'd still see the console
message.
-Tony
On 09/01/2020 22.54, Luck, Tony wrote:
>> It seems to me that the issue is the mcelog notifier counts toward the number
>> of notifiers, so the default notifier doesn't print anything.
>
> If we gave a API to the notifiers to say whether to suppress printing, then the
> dev_mcelog() code could do the suppression only if some process had
> /dev/mcelog open. So if mcelog(8) wasn't running, you'd still see the console
> message.
I briefly looked into that.
There is the issue that mcelog code buffers MCEs unconditionally. And we probably
don't want to deactivate that, so that MCEs during boot can be queried
a bit later via /dev/mcelog.
We would get a bit of duplicate logging, if we let mcelog report "supress
printing" only if there is an actual consumer. (Or if there was a consumer
once, in case there are periodically polling consumers.)
Regards
Jan
--
Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879