2012-10-17 10:59:45

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

On Fri, Sep 21, 2012 at 05:09:04PM +0530, Naveen N. Rao wrote:
> Hi Tony,
> Can you kindly take in this patch if there are no further comments?
>
> Thanks,
> Naveen
>
> On 09/12/2012 05:55 PM, Naveen N. Rao wrote:
> >The ACPI spec doesn't provide for a way for the bios to pass down
> >recommended thresholds to the OS on a _per-bank_ basis. This patch adds
> >a new boot option, which if passed, allows bios to initialize the CMCI
> >threshold. In such a case, we simply skip programming any threshold
> >value.
> >
> >As fail-safe, we initialize threshold to 1 if some banks have not been
> >initialized by the bios and warn the user.
> >
> >v3: Updated messages as per Tony's inputs.
> >v2: Just separating out the patch. I will send a separate patch for
> >consolidating the MCE boot flags.
> >
> >Signed-off-by: Naveen N. Rao <[email protected]>
> >---
> > Documentation/x86/x86_64/boot-options.txt | 5 ++++
> > arch/x86/include/asm/mce.h | 1 +
> > arch/x86/kernel/cpu/mcheck/mce.c | 10 ++++++++
> > arch/x86/kernel/cpu/mcheck/mce_intel.c | 35 +++++++++++++++++++++++++++--
> > 4 files changed, 48 insertions(+), 3 deletions(-)
> >
> >diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
> >index c54b4f5..ec92540 100644
> >--- a/Documentation/x86/x86_64/boot-options.txt
> >+++ b/Documentation/x86/x86_64/boot-options.txt
> >@@ -50,6 +50,11 @@ Machine check
> > monarchtimeout:
> > Sets the time in us to wait for other CPUs on machine checks. 0
> > to disable.
> >+ mce=bios_cmci_threshold
> >+ Don't overwrite the bios-set CMCI threshold. This boot option
> >+ prevents Linux from overwriting the CMCI threshold set by the
> >+ bios. Without this option, Linux always sets the CMCI
> >+ threshold to 1.
> >
> > nomce (for compatibility with i386): same as mce=off
> >
> >diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
> >index a3ac52b..8ad5078 100644
> >--- a/arch/x86/include/asm/mce.h
> >+++ b/arch/x86/include/asm/mce.h
> >@@ -171,6 +171,7 @@ DECLARE_PER_CPU(struct device *, mce_device);
> > #ifdef CONFIG_X86_MCE_INTEL
> > extern int mce_cmci_disabled;
> > extern int mce_ignore_ce;
> >+extern int mce_bios_cmci_threshold;
> > void mce_intel_feature_init(struct cpuinfo_x86 *c);
> > void cmci_clear(void);
> > void cmci_reenable(void);
> >diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> >index c311122..29e87d3 100644
> >--- a/arch/x86/kernel/cpu/mcheck/mce.c
> >+++ b/arch/x86/kernel/cpu/mcheck/mce.c
> >@@ -83,6 +83,7 @@ static int mce_dont_log_ce __read_mostly;
> > int mce_cmci_disabled __read_mostly;
> > int mce_ignore_ce __read_mostly;
> > int mce_ser __read_mostly;
> >+int mce_bios_cmci_threshold __read_mostly;
> >
> > struct mce_bank *mce_banks __read_mostly;
> >
> >@@ -1946,6 +1947,7 @@ static struct miscdevice mce_chrdev_device = {
> > * check, or 0 to not wait
> > * mce=bootlog Log MCEs from before booting. Disabled by default on AMD.
> > * mce=nobootlog Don't log MCEs from before booting.
> >+ * mce=bios_cmci_threshold Don't program the CMCI threshold
> > */
> > static int __init mcheck_enable(char *str)
> > {
> >@@ -1965,6 +1967,8 @@ static int __init mcheck_enable(char *str)
> > mce_ignore_ce = 1;
> > else if (!strcmp(str, "bootlog") || !strcmp(str, "nobootlog"))
> > mce_bootlog = (str[0] == 'b');
> >+ else if (!strcmp(str, "bios_cmci_threshold"))
> >+ mce_bios_cmci_threshold = 1;
> > else if (isdigit(str[0])) {
> > get_option(&str, &tolerant);
> > if (*str == ',') {
> >@@ -2205,6 +2209,11 @@ static struct dev_ext_attribute dev_attr_cmci_disabled = {
> > &mce_cmci_disabled
> > };
> >
> >+static struct dev_ext_attribute dev_attr_bios_cmci_threshold = {
> >+ __ATTR(bios_cmci_threshold, 0444, device_show_int, NULL),
> >+ &mce_bios_cmci_threshold

Ok, I just noticed this (we must've missed it during review) but why is
this read-only? If it has to be read-only, why do we have a node for
this in sysfs instead of simply issuing the printk statements below and
people who are interested in this, can grep dmesg?

If there's no apparent reason, I'll remove this chunk adding the sysfs
attribute.

Thanks.

--
Regards/Gruss,
Boris.


2012-10-17 11:27:42

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

On 10/17/2012 04:29 PM, Borislav Petkov wrote:
>>>
>>> +static struct dev_ext_attribute dev_attr_bios_cmci_threshold = {
>>> + __ATTR(bios_cmci_threshold, 0444, device_show_int, NULL),
>>> + &mce_bios_cmci_threshold
>
> Ok, I just noticed this (we must've missed it during review) but why is
> this read-only? If it has to be read-only, why do we have a node for
> this in sysfs instead of simply issuing the printk statements below and
> people who are interested in this, can grep dmesg?

This was added so that user-space tools could find out if we're using
thresholds for CMCI.

Regards,
Naveen

2012-10-17 13:09:38

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

On Wed, Oct 17, 2012 at 04:57:30PM +0530, Naveen N. Rao wrote:
> On 10/17/2012 04:29 PM, Borislav Petkov wrote:
> >>>
> >>>+static struct dev_ext_attribute dev_attr_bios_cmci_threshold = {
> >>>+ __ATTR(bios_cmci_threshold, 0444, device_show_int, NULL),
> >>>+ &mce_bios_cmci_threshold
> >
> >Ok, I just noticed this (we must've missed it during review) but why is
> >this read-only? If it has to be read-only, why do we have a node for
> >this in sysfs instead of simply issuing the printk statements below and
> >people who are interested in this, can grep dmesg?
>
> This was added so that user-space tools could find out if we're
> using thresholds for CMCI.

That I figured out.

What I can't figure out is why userspace tools need to know that - the
fact that some MCI_CTL2 has a 0 CMCI threshold because BIOS forgot to
set it correctly? IOW, this is a rather evolved workaround for b0rked
BIOS (the gazillionth BIOS f*ckup, btw if someone is counting :)) and,
on top of that, we have a read-only, special sysfs node which is pretty
useless to me.

Why?

--
Regards/Gruss,
Boris.

2012-10-17 15:48:30

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

On 10/17/2012 06:39 PM, Borislav Petkov wrote:
> On Wed, Oct 17, 2012 at 04:57:30PM +0530, Naveen N. Rao wrote:
>> On 10/17/2012 04:29 PM, Borislav Petkov wrote:
>>>>>
>>>>> +static struct dev_ext_attribute dev_attr_bios_cmci_threshold = {
>>>>> + __ATTR(bios_cmci_threshold, 0444, device_show_int, NULL),
>>>>> + &mce_bios_cmci_threshold
>>>
>>> Ok, I just noticed this (we must've missed it during review) but why is
>>> this read-only? If it has to be read-only, why do we have a node for
>>> this in sysfs instead of simply issuing the printk statements below and
>>> people who are interested in this, can grep dmesg?
>>
>> This was added so that user-space tools could find out if we're
>> using thresholds for CMCI.
>
> That I figured out.
>
> What I can't figure out is why userspace tools need to know that - the
> fact that some MCI_CTL2 has a 0 CMCI threshold because BIOS forgot to
> set it correctly? IOW, this is a rather evolved workaround for b0rked
> BIOS (the gazillionth BIOS f*ckup, btw if someone is counting :)) and,
> on top of that, we have a read-only, special sysfs node which is pretty
> useless to me.
>
> Why?
>

This is not about the CMCI threshold being 0, but rather about the
threshold not being 1. Linux used to program a threshold of 1 always but
with this boot option, a firmware-set non-zero threshold is honoured.

Userspace tools need this sysfs attribute so they know how to react on
receipt of a corrected error event: whether this is the first event or
if such events have already been threshold-ed.


Regards,
Naveen

2012-10-17 16:40:14

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

On Wed, Oct 17, 2012 at 09:17:39PM +0530, Naveen N. Rao wrote:
> Userspace tools need this sysfs attribute so they know how to react on
> receipt of a corrected error event: whether this is the first event or
> if such events have already been threshold-ed.

What's wrong with userspace tools parsing /proc/cmdline and seeing that
mce_bios_cmci_threshold has been set since this is the only way to set
it anyway?

I still don't see any reason for that read-only sysfs attribute.

Thanks.

--
Regards/Gruss,
Boris.

2012-10-17 17:28:13

by Tony Luck

[permalink] [raw]
Subject: RE: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

> What's wrong with userspace tools parsing /proc/cmdline and seeing that
> mce_bios_cmci_threshold has been set since this is the only way to set
> it anyway?

The argument might be on the command line, but may have been rejected
because the BIOS didn't set the thresholds? So then you'd have to look at
the command line, *and* check /var/log/messages to make sure we hadn't
printed the message saying the BIOS was unsupportive.

BUT ... I don't think that knowing this is sufficient. A userspace tool would
want to know what value had been set for each bank. So if it really wants to
do something interesting, just knowing that "bios set some thresholds" doesn't
sound like enough information.

BUT (squared) do you even really need to know that thresholds were set? You
could look at bits {52:38} in the MCi_STATUS information for the bank to see
how many corrected errors had been logged.

-Tony
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2012-10-17 18:09:16

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

On Wed, Oct 17, 2012 at 05:28:11PM +0000, Luck, Tony wrote:
> > What's wrong with userspace tools parsing /proc/cmdline and seeing that
> > mce_bios_cmci_threshold has been set since this is the only way to set
> > it anyway?
>
> The argument might be on the command line, but may have been rejected
> because the BIOS didn't set the thresholds? So then you'd have to look at
> the command line, *and* check /var/log/messages to make sure we hadn't
> printed the message saying the BIOS was unsupportive.

Right, but the sysfs read-only thing doesn't solve that either - it only
dumps the mce_bios_cmci_threshold value so it doesn't help.

The stuff you're talking about is not reflected in that variable but
saved in some other like bios_wrong_thresh and bios_zero_thresh.

> BUT ... I don't think that knowing this is sufficient. A userspace
> tool would want to know what value had been set for each bank. So if
> it really wants to do something interesting, just knowing that "bios
> set some thresholds" doesn't sound like enough information.

Yes. And I still haven't gotten a real use case of what userspace tool
will do with this information.

> BUT (squared) do you even really need to know that thresholds were
> set? You could look at bits {52:38} in the MCi_STATUS information for
> the bank to see how many corrected errors had been logged.

Right, a real use case which shows why we need that information should
be good.

Thanks.

> -Tony
> �{.n�+�������+%��lzwm��b�맲��r��zX���iȧ���ܨ}���Ơz�&j:+v�������zZ+��+zf���h���~����i���z��w���?����&�)ߢf

What happened here? It seems your mail client wanted to say something
too but it has been censored.

:-)

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

2012-10-18 05:43:28

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

On 10/17/2012 10:58 PM, Luck, Tony wrote:
> BUT (squared) do you even really need to know that thresholds were set? You
> could look at bits {52:38} in the MCi_STATUS information for the bank to see
> how many corrected errors had been logged.

Ah, nice. I think we should be able to use this instead of the sysfs
attribute.

Thanks,
Naveen

2012-10-18 13:25:01

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

On Thu, Oct 18, 2012 at 11:13:14AM +0530, Naveen N. Rao wrote:
> On 10/17/2012 10:58 PM, Luck, Tony wrote:
> >BUT (squared) do you even really need to know that thresholds were set? You
> >could look at bits {52:38} in the MCi_STATUS information for the bank to see
> >how many corrected errors had been logged.
>
> Ah, nice. I think we should be able to use this instead of the sysfs
> attribute.

Ok, patch below removes it.

@Tony: I'll send it upwards soonish in case there are no objections.
This way no stable backport will be needed.

Thanks.

--
From: Borislav Petkov <[email protected]>
Date: Thu, 18 Oct 2012 15:10:56 +0200
Subject: [PATCH] x86, MCE: Remove bios_cmci_threshold sysfs attribute

450cc201038f3 ("x86/mce: Provide boot argument to honour bios-set CMCI
threshold") added the bios_cmci_threshold attribute which was supposed
to communicate to userspace tools that BIOS CMCI threshold has been
honoured.

However, this info is not of any importance to userspace - rather it
can get the actual error count it has been thresholded already from
MCi_STATUS[38:52].

So drop this before it becomes a used interface (good thing is we caught
this early in 3.7-rc1, right after the merge window closed).

Cc: Naveen N. Rao <[email protected]>
Cc: Tony Luck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 29e87d3b2843..46cbf8689692 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2209,11 +2209,6 @@ static struct dev_ext_attribute dev_attr_cmci_disabled = {
&mce_cmci_disabled
};

-static struct dev_ext_attribute dev_attr_bios_cmci_threshold = {
- __ATTR(bios_cmci_threshold, 0444, device_show_int, NULL),
- &mce_bios_cmci_threshold
-};
-
static struct device_attribute *mce_device_attrs[] = {
&dev_attr_tolerant.attr,
&dev_attr_check_interval.attr,
@@ -2222,7 +2217,6 @@ static struct device_attribute *mce_device_attrs[] = {
&dev_attr_dont_log_ce.attr,
&dev_attr_ignore_ce.attr,
&dev_attr_cmci_disabled.attr,
- &dev_attr_bios_cmci_threshold.attr,
NULL
};

--
1.8.0.rc2.4.g42e55a5

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

2012-10-18 16:46:14

by Tony Luck

[permalink] [raw]
Subject: RE: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

> @Tony: I'll send it upwards soonish in case there are no objections.
> This way no stable backport will be needed.

Acked-by: Tony Luck <[email protected]>

Subject: [tip:x86/urgent] x86, MCE: Remove bios_cmci_threshold sysfs attribute

Commit-ID: 5bc66170dc486556a1e36fd384463536573f4b82
Gitweb: http://git.kernel.org/tip/5bc66170dc486556a1e36fd384463536573f4b82
Author: Borislav Petkov <[email protected]>
AuthorDate: Thu, 18 Oct 2012 15:10:56 +0200
Committer: Borislav Petkov <[email protected]>
CommitDate: Fri, 19 Oct 2012 15:22:29 +0200

x86, MCE: Remove bios_cmci_threshold sysfs attribute

450cc201038f3 ("x86/mce: Provide boot argument to honour bios-set CMCI
threshold") added the bios_cmci_threshold sysfs attribute which was
supposed to communicate to userspace tools that BIOS CMCI threshold has
been honoured.

However, this info is not of any importance to userspace - it should
rather get the actual error count it has been thresholded already from
MCi_STATUS[38:52].

So drop this before it becomes a used interface (good thing we caught
this early in 3.7-rc1, right after the merge window closed).

Cc: Naveen N. Rao <[email protected]>
Acked-by: Tony Luck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce.c | 6 ------
1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 29e87d3..46cbf86 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2209,11 +2209,6 @@ static struct dev_ext_attribute dev_attr_cmci_disabled = {
&mce_cmci_disabled
};

-static struct dev_ext_attribute dev_attr_bios_cmci_threshold = {
- __ATTR(bios_cmci_threshold, 0444, device_show_int, NULL),
- &mce_bios_cmci_threshold
-};
-
static struct device_attribute *mce_device_attrs[] = {
&dev_attr_tolerant.attr,
&dev_attr_check_interval.attr,
@@ -2222,7 +2217,6 @@ static struct device_attribute *mce_device_attrs[] = {
&dev_attr_dont_log_ce.attr,
&dev_attr_ignore_ce.attr,
&dev_attr_cmci_disabled.attr,
- &dev_attr_bios_cmci_threshold.attr,
NULL
};