Currently if mce_end() fails no_way_out is set equal to worst.
worst is the worst severirty that was found in the MCA banks
associated to the current CPU; however at this point no_way_out
could be already set by mca_start() by looking at all severities
of all CPUs that entered the MCE handler.
if mce_end() fails we first check if no_way_out is already set and
if so we stick to it, otherwise we use the local worst value
Signed-off-by: Gabriele Paoloni <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
arch/x86/kernel/cpu/mce/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 4102b866e7c0..b990892c6766 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1385,7 +1385,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
 	 */
 	if (!lmce) {
 		if (mce_end(order) < 0)
-			no_way_out = worst >= MCE_PANIC_SEVERITY;
+			no_way_out = no_way_out ? no_way_out : worst >= MCE_PANIC_SEVERITY;
 	} else {
 		/*
 		 * If there was a fatal machine check we should have
--
2.20.1
---------------------------------------------------------------------
INTEL CORPORATION ITALIA S.p.A. con unico socio
Sede: Milanofiori Palazzo E 4
CAP 20094 Assago (MI)
Capitale Sociale Euro 104.000,00 interamente versato
Partita I.V.A. e Codice Fiscale 04236760155
Repertorio Economico Amministrativo n. 997124
Registro delle Imprese di Milano nr. 183983/5281/33
Soggetta ad attivita' di direzione e coordinamento di
INTEL CORPORATION, USA
This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
On Wed, Nov 18, 2020 at 03:15:49PM +0000, Gabriele Paoloni wrote:
> Currently if mce_end() fails no_way_out is set equal to worst.
> worst is the worst severirty that was found in the MCA banks
                     ^^^^^^^^^
Please introduce a spellchecker into your patch creation workflow.
> associated to the current CPU; however at this point no_way_out
             ^
             with
> could be already set by mca_start() by looking at all severities
I think you mean "could have been already set" here
> of all CPUs that entered the MCE handler.
> if mce_end() fails we first check if no_way_out is already set and
Please use passive voice in your commit message: no "we" or "I", etc.
Also, pls start new sentences with a capital letter and end them with a
fullstop.
> if so we stick to it, otherwise we use the local worst value
So basically you're trying to say here that no_way_out might have been
already set and other CPUs could overwrite it and that should not
happen.
Is that what you mean?
> Signed-off-by: Gabriele Paoloni <[email protected]>
> Reviewed-by: Tony Luck <[email protected]>
> ---
> arch/x86/kernel/cpu/mce/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 4102b866e7c0..b990892c6766 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1385,7 +1385,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
> */
> if (!lmce) {
> if (mce_end(order) < 0)
> - no_way_out = worst >= MCE_PANIC_SEVERITY;
> + no_way_out = no_way_out ? no_way_out : worst >= MCE_PANIC_SEVERITY;
I had to stare at this a bit to figure out what you're doing. So how
about simplifying this:
	if (!no_way_out)
		no_way_out = worst >= MCE_PANIC_SEVERITY;
?
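
For comparison, here is a minimal standalone sketch of the two spellings side by side. It is not kernel code: DEMO_PANIC_SEVERITY, nwo_ternary(), nwo_guarded() and check_forms_agree() are hypothetical names made up for this illustration, and the threshold value is arbitrary. The ternary parses as no_way_out ? no_way_out : (worst >= ...), so both forms should yield the same result for every input.

```c
#include <assert.h>

/* Hypothetical threshold; the kernel uses its own MCE_PANIC_SEVERITY. */
#define DEMO_PANIC_SEVERITY 6

/* Spelling used in the posted patch. */
int nwo_ternary(int no_way_out, int worst)
{
	return no_way_out ? no_way_out : worst >= DEMO_PANIC_SEVERITY;
}

/* Spelling suggested in review: only touch the flag when it is clear. */
int nwo_guarded(int no_way_out, int worst)
{
	if (!no_way_out)
		no_way_out = worst >= DEMO_PANIC_SEVERITY;
	return no_way_out;
}

/* Compare both spellings over a small grid of inputs. */
void check_forms_agree(void)
{
	int nwo, worst;

	for (nwo = 0; nwo <= 2; nwo++)
		for (worst = 0; worst <= DEMO_PANIC_SEVERITY + 1; worst++)
			assert(nwo_ternary(nwo, worst) == nwo_guarded(nwo, worst));
}
```

The guarded form makes the intent (leave an already-set flag alone) visible without having to recall operator precedence.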
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Hi Boris
> -----Original Message-----
> From: Borislav Petkov <[email protected]>
> Sent: Friday, November 20, 2020 6:08 PM
> To: Paoloni, Gabriele <[email protected]>
> Cc: Luck, Tony <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH 1/4] x86/mce: do not overwrite no_way_out if
> mce_end() fails
>
> On Wed, Nov 18, 2020 at 03:15:49PM +0000, Gabriele Paoloni wrote:
> > Currently if mce_end() fails no_way_out is set equal to worst.
> > worst is the worst severirty that was found in the MCA banks
> ^^^^^^^^^
>
> Please introduce a spellchecker into your patch creation workflow.
>
> > associated to the current CPU; however at this point no_way_out
> ^
> with
>
>
> > could be already set by mca_start() by looking at all severities
>
> I think you mean "could have been already set" here
>
> > of all CPUs that entered the MCE handler.
> > if mce_end() fails we first check if no_way_out is already set and
>
> Please use passive voice in your commit message: no "we" or "I", etc.
>
> Also, pls start new sentences with a capital letter and end them with a
> fullstop.
Sorry about the grammar errors above; I'll pay more attention in future.
>
> > if so we stick to it, otherwise we use the local worst value
>
> So basically you're trying to say here that no_way_out might have been
> already set and other CPUs could overwrite it and that should not
> happen.
>
> Is that what you mean?
I mean that on this CPU thread at this point mce_start() already cached
global_nwo and hence could accumulate fatal severities of other CPUs.
Now here if mce_end() fails we only consider the local 'worst' severity
and we overwrite those already cached.
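
The scenario can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions, not the kernel implementation: struct demo_cpu, demo_rendezvous(), demo_end_failed_old(), demo_end_failed_new() and DEMO_PANIC_SEVERITY are hypothetical names, and the entry rendezvous is reduced to a plain loop with no atomics or timeouts.

```c
#include <assert.h>

/* Hypothetical threshold; the kernel uses its own MCE_PANIC_SEVERITY. */
#define DEMO_PANIC_SEVERITY 6

/* Hypothetical per-CPU view of the handler state. */
struct demo_cpu {
	int worst;      /* worst severity found in this CPU's own banks */
	int no_way_out; /* local flag, later refreshed from the shared total */
};

/*
 * Simplified stand-in for the entry rendezvous: every CPU adds its local
 * flag to a shared counter, then caches the total back into its local flag.
 */
void demo_rendezvous(struct demo_cpu *cpus, int ncpus)
{
	int global_nwo = 0;
	int i;

	assert(ncpus > 0);
	for (i = 0; i < ncpus; i++)
		global_nwo += cpus[i].no_way_out;
	for (i = 0; i < ncpus; i++)
		cpus[i].no_way_out = global_nwo;
}

/* Behaviour before the patch when the exit rendezvous fails. */
int demo_end_failed_old(const struct demo_cpu *c)
{
	return c->worst >= DEMO_PANIC_SEVERITY;
}

/* Behaviour with the patch: keep a flag that is already set. */
int demo_end_failed_new(const struct demo_cpu *c)
{
	if (c->no_way_out)
		return c->no_way_out;
	return c->worst >= DEMO_PANIC_SEVERITY;
}
```

With two CPUs, one that saw a fatal error (worst 7, flag set) and one that did not (worst 2, flag clear), the second CPU holds a non-zero flag after the rendezvous. The old path then resets it to 0 from the local worst value alone, while the patched path keeps it set.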
>
> > Signed-off-by: Gabriele Paoloni <[email protected]>
> > Reviewed-by: Tony Luck <[email protected]>
> > ---
> > arch/x86/kernel/cpu/mce/core.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> > index 4102b866e7c0..b990892c6766 100644
> > --- a/arch/x86/kernel/cpu/mce/core.c
> > +++ b/arch/x86/kernel/cpu/mce/core.c
> > @@ -1385,7 +1385,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
> > */
> > if (!lmce) {
> > if (mce_end(order) < 0)
> > - no_way_out = worst >= MCE_PANIC_SEVERITY;
> > + no_way_out = no_way_out ? no_way_out : worst >= MCE_PANIC_SEVERITY;
>
> I had to stare at this a bit to figure out what you're doing. So how
> about simplifying this:
>
> 	if (!no_way_out)
> 		no_way_out = worst >= MCE_PANIC_SEVERITY;
>
> ?
Yes that works as well improving readability.
If ok I will fix the grammar and rewrite this code in v2.
Many Thanks
Gab
On Wed, Nov 18, 2020 at 03:15:49PM +0000, Gabriele Paoloni wrote:
> Currently if mce_end() fails no_way_out is set equal to worst.
> worst is the worst severirty that was found in the MCA banks
> associated to the current CPU; however at this point no_way_out
> could be already set by mca_start() by looking at all severities
> of all CPUs that entered the MCE handler.
> if mce_end() fails we first check if no_way_out is already set and
> if so we stick to it, otherwise we use the local worst value
>
> Signed-off-by: Gabriele Paoloni <[email protected]>
> Reviewed-by: Tony Luck <[email protected]>
Also, this very likely wants Cc: stable, I'd say, considering the
severity.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Fri, Nov 20, 2020 at 05:31:32PM +0000, Paoloni, Gabriele wrote:
> I mean that on this CPU thread at this point mce_start() already cached
> global_nwo and hence could accumulate fatal severities of other CPUs.
>
> Now here if mce_end() fails we only consider the local 'worst' severity
> and we overwrite those already cached.
Yap, we're on the same page. :)
> If ok I will fix the grammar and rewrite this code in v2.
Sure, lemme go through the rest first.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
[...]
> Also, this very likely wants Cc: stable, I'd say, considering the
> severity.
Sure, will add stable in v2.
Thanks
Gab
On Fri, Nov 20, 2020 at 06:33:42PM +0100, Borislav Petkov wrote:
> Sure, lemme go through the rest first.
Done, thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette