2013-06-20 09:24:28

by Chen, Gong

Subject: [PATCH] x86/MCE: Update MCE severity condition check

Update some SRAR severity conditions check to make it clearer,
according to latest Intel SDM Vol 3(June 2013), table 15-20.

Signed-off-by: Chen Gong <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce-severity.c | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index beb1f16..1fa12ea 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -110,22 +110,17 @@ static struct severity {
/* known AR MCACODs: */
#ifdef CONFIG_MEMORY_FAILURE
MCESEV(
- KEEP, "HT thread notices Action required: data load error",
- SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
- MCGMASK(MCG_STATUS_EIPV, 0)
+ KEEP, "Action required but non-affected thread is continuable",
+ SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR),
+ MCGMASK(MCG_STATUS_RIPV, MCG_STATUS_RIPV)
),
MCESEV(
- AR, "Action required: data load error",
+ AR, "Action required: data load error on user land",
SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
USER
),
MCESEV(
- KEEP, "HT thread notices Action required: instruction fetch error",
- SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_INSTR),
- MCGMASK(MCG_STATUS_EIPV, 0)
- ),
- MCESEV(
- AR, "Action required: instruction fetch error",
+ AR, "Action required: instruction fetch error on user land",
SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_INSTR),
USER
),
--
1.7.10.4
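
For readers unfamiliar with the table this patch edits, here is a minimal standalone sketch of how a MASK()/MCGMASK() pair is matched against the status registers. It is not the kernel code: the bit positions and MCACOD values are written from memory of the SDM and mce-severity.c and should be treated as assumptions, the names `struct rule`, `grade()` and `selfcheck()` are hypothetical, and the USER context check is left out. The two entries are the data-load entries as they stood before the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions (IA32_MCi_STATUS / IA32_MCG_STATUS). */
#define MCI_STATUS_OVER   (1ULL << 62)
#define MCI_STATUS_UC     (1ULL << 61)
#define MCI_STATUS_MISCV  (1ULL << 59)
#define MCI_STATUS_ADDRV  (1ULL << 58)
#define MCI_STATUS_S      (1ULL << 56)
#define MCI_STATUS_AR     (1ULL << 55)
#define MCG_STATUS_RIPV   (1ULL << 0)
#define MCG_STATUS_EIPV   (1ULL << 1)

/* Assumed helper values, mirroring the names used in the patch. */
#define MCI_UC_SAR    (MCI_STATUS_UC | MCI_STATUS_S | MCI_STATUS_AR)
#define MCI_ADDR      (MCI_STATUS_ADDRV | MCI_STATUS_MISCV)
#define MCACOD        0xffffULL
#define MCACOD_DATA   0x0134ULL
#define MCACOD_INSTR  0x0150ULL

enum sev { SEV_NONE, SEV_KEEP, SEV_AR };

/* Hypothetical, simplified stand-in for one severity table entry. */
struct rule {
	enum sev sev;
	uint64_t mask, result;     /* compared against the bank status */
	uint64_t mcgmask, mcgres;  /* compared against MCG_STATUS */
};

static const struct rule rules[] = {
	/* Pre-patch: "HT thread notices Action required: data load error" */
	{ SEV_KEEP,
	  MCI_STATUS_OVER | MCI_UC_SAR | MCI_ADDR | MCACOD,
	  MCI_UC_SAR | MCI_ADDR | MCACOD_DATA,
	  MCG_STATUS_EIPV, 0 },
	/* Pre-patch: "Action required: data load error" (USER check omitted) */
	{ SEV_AR,
	  MCI_STATUS_OVER | MCI_UC_SAR | MCI_ADDR | MCACOD,
	  MCI_UC_SAR | MCI_ADDR | MCACOD_DATA,
	  0, 0 },
};

/* The first entry whose masked status and masked MCG_STATUS both match wins. */
enum sev grade(uint64_t status, uint64_t mcgstatus)
{
	for (size_t i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
		const struct rule *r = &rules[i];

		if ((status & r->mask) != r->result)
			continue;
		if ((mcgstatus & r->mcgmask) != r->mcgres)
			continue;
		return r->sev;
	}
	return SEV_NONE;
}

/* Small self-check showing how EIPV selects between the two entries. */
void selfcheck(void)
{
	uint64_t st = MCI_UC_SAR | MCI_ADDR | MCACOD_DATA;

	assert(grade(st, MCG_STATUS_RIPV) == SEV_KEEP); /* EIPV clear */
	assert(grade(st, MCG_STATUS_EIPV) == SEV_AR);   /* EIPV set */
}
```

Note that any bit present in the mask but absent from the result has to be clear for an entry to match. That is how MCI_STATUS_OVER is required to be 0 in both entries, and the same rule applies to the MCACOD field.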


2013-06-20 09:42:03

by Borislav Petkov

Subject: Re: [PATCH] x86/MCE: Update MCE severity condition check

On Thu, Jun 20, 2013 at 05:16:12AM -0400, Chen Gong wrote:
> Update some SRAR severity conditions check to make it clearer,
> according to latest Intel SDM Vol 3(June 2013), table 15-20.
>
> Signed-off-by: Chen Gong <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce-severity.c | 15 +++++----------
> 1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> index beb1f16..1fa12ea 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> @@ -110,22 +110,17 @@ static struct severity {
> /* known AR MCACODs: */
> #ifdef CONFIG_MEMORY_FAILURE
> MCESEV(
> - KEEP, "HT thread notices Action required: data load error",
> - SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
> - MCGMASK(MCG_STATUS_EIPV, 0)
> + KEEP, "Action required but non-affected thread is continuable",
> + SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR),
> + MCGMASK(MCG_STATUS_RIPV, MCG_STATUS_RIPV)
> ),
> MCESEV(
> - AR, "Action required: data load error",
> + AR, "Action required: data load error on user land",

You mean "data load error in a user process"?

> SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
> USER
> ),
> MCESEV(
> - KEEP, "HT thread notices Action required: instruction fetch error",
> - SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_INSTR),
> - MCGMASK(MCG_STATUS_EIPV, 0)
> - ),
> - MCESEV(
> - AR, "Action required: instruction fetch error",
> + AR, "Action required: instruction fetch error on user land",

ditto?

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-21 12:47:15

by Chen, Gong

Subject: Re: [PATCH] x86/MCE: Update MCE severity condition check

On Thu, Jun 20, 2013 at 11:41:52AM +0200, Borislav Petkov wrote:
> Date: Thu, 20 Jun 2013 11:41:52 +0200
> From: Borislav Petkov <[email protected]>
> To: Chen Gong <[email protected]>
> Cc: [email protected], [email protected]
> Subject: Re: [PATCH] x86/MCE: Update MCE severity condition check
> User-Agent: Mutt/1.5.21 (2010-09-15)
>
> On Thu, Jun 20, 2013 at 05:16:12AM -0400, Chen Gong wrote:
> > Update some SRAR severity conditions check to make it clearer,
> > according to latest Intel SDM Vol 3(June 2013), table 15-20.
> >
> > Signed-off-by: Chen Gong <[email protected]>
> > ---
> > arch/x86/kernel/cpu/mcheck/mce-severity.c | 15 +++++----------
> > 1 file changed, 5 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> > index beb1f16..1fa12ea 100644
> > --- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
> > +++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> > @@ -110,22 +110,17 @@ static struct severity {
> > /* known AR MCACODs: */
> > #ifdef CONFIG_MEMORY_FAILURE
> > MCESEV(
> > - KEEP, "HT thread notices Action required: data load error",
> > - SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
> > - MCGMASK(MCG_STATUS_EIPV, 0)
> > + KEEP, "Action required but non-affected thread is continuable",
> > + SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR),
> > + MCGMASK(MCG_STATUS_RIPV, MCG_STATUS_RIPV)
> > ),
> > MCESEV(
> > - AR, "Action required: data load error",
> > + AR, "Action required: data load error on user land",
>
> You mean "data load error in a user process"?

Yes, that is what I meant.
>
> > SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
> > USER
> > ),
> > MCESEV(
> > - KEEP, "HT thread notices Action required: instruction fetch error",
> > - SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_INSTR),
> > - MCGMASK(MCG_STATUS_EIPV, 0)
> > - ),
> > - MCESEV(
> > - AR, "Action required: instruction fetch error",
> > + AR, "Action required: instruction fetch error on user land",
>
> ditto?
>
ditto

> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --



2013-06-25 06:33:14

by Naveen N. Rao

Subject: Re: [PATCH] x86/MCE: Update MCE severity condition check

On 2013/06/20 05:16AM, Chen Gong wrote:
> Update some SRAR severity conditions check to make it clearer,
> according to latest Intel SDM Vol 3(June 2013), table 15-20.
>
> Signed-off-by: Chen Gong <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce-severity.c | 15 +++++----------
> 1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> index beb1f16..1fa12ea 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> @@ -110,22 +110,17 @@ static struct severity {
> /* known AR MCACODs: */
> #ifdef CONFIG_MEMORY_FAILURE
> MCESEV(
> - KEEP, "HT thread notices Action required: data load error",
> - SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
> - MCGMASK(MCG_STATUS_EIPV, 0)
> + KEEP, "Action required but non-affected thread is continuable",

The SDM talks about "non-affected" logical processors, but perhaps we
can call this an "unaffected" thread?

- Naveen

2013-06-25 16:31:27

by Luck, Tony

Subject: RE: [PATCH] x86/MCE: Update MCE severity condition check

> The SDM talks about "non-affected" logical processors, but perhaps we
> can call this an "unaffected" thread?

"unaffected" sounds a bit more natural (but close enough to the wording in
the SDM that people should see the connection).

-Tony

2013-06-25 20:08:32

by Naveen N. Rao

Subject: Re: [PATCH] x86/MCE: Update MCE severity condition check

On 06/25/2013 10:01 PM, Luck, Tony wrote:
>> The SDM talks about "non-affected" logical processors, but perhaps we
>> can call this an "unaffected" thread?
>
> "unaffected" sounds a bit more natural (but close enough to the wording in
> the SDM that people should see the connection).

Yup - "unnatural" is precisely the term that describes my feeling when I
read the original description :)

Thanks,
Naveen

2013-06-26 09:27:08

by Chen, Gong

Subject: Re: [PATCH] x86/MCE: Update MCE severity condition check

On Tue, Jun 25, 2013 at 04:31:23PM +0000, Luck, Tony wrote:
> Date: Tue, 25 Jun 2013 16:31:23 +0000
> From: "Luck, Tony" <[email protected]>
> To: "Naveen N. Rao" <[email protected]>, Chen Gong
> <[email protected]>
> CC: "[email protected]" <[email protected]>, "[email protected]"
> <[email protected]>
> Subject: RE: [PATCH] x86/MCE: Update MCE severity condition check
>
> > The SDM talks about "non-affected" logical processors, but perhaps we
> > can call this an "unaffected" thread?
>
> "unaffected" sounds a bit more natural (but close enough to the wording in
> the SDM that people should see the connection).
>
> -Tony

If this patch is otherwise OK, could you please update the wording to
"unaffected" when merging it? Thanks very much.

