by Hidetoshi Seto

Subject: [PATCH -tip 2/3] x86, mce: Revert "add mce=nopoll option to disable timer polling"

Disabling only polling but not CMCI is a pointless setting.
Instead of "mce=nopoll", which tends to be paired with CMCI disablement,
it makes more sense to have an "mce=ignore_ce" option that disables
both polling and CMCI at once. A patch implementing this new option
follows this revert.

OTOH, once booted, we can disable polling by setting check_interval
to 0, but this is not documented anywhere. Andi will later post
updated documentation that addresses this issue.
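
For illustration only, here is a userspace sketch of the zero-interval
check that this revert leaves in mce_init_timer(). The struct, the
function name, and the HZ value are hypothetical stand-ins, not kernel
code, and round_jiffies() is left out for simplicity.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ 250 /* assumed tick rate, for illustration only */

/* Hypothetical stand-in for the per-CPU timer, not struct timer_list. */
struct poll_timer {
	bool armed;
	unsigned long expires;
};

/*
 * Same ordering as the patched mce_init_timer(): derive the interval
 * first, then skip arming the timer when the interval is zero.
 */
static void init_poll_timer(struct poll_timer *t, int check_interval,
			    unsigned long *next_interval, unsigned long now)
{
	assert(t && next_interval);

	t->armed = false;
	t->expires = 0;

	if (!*next_interval)
		*next_interval = (unsigned long)check_interval * HZ;
	if (!*next_interval)
		return; /* check_interval == 0: polling stays off */

	t->expires = now + *next_interval;
	t->armed = true;
}
```

With check_interval set to 0 the timer is never armed; with the default
of 5 * 60 it expires check_interval * HZ ticks after "now".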

Signed-off-by: Hidetoshi Seto <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
Documentation/x86/x86_64/boot-options.txt | 2 --
arch/x86/kernel/cpu/mcheck/mce_64.c | 10 ++--------
2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
index 5d55158..34c1304 100644
--- a/Documentation/x86/x86_64/boot-options.txt
+++ b/Documentation/x86/x86_64/boot-options.txt
@@ -13,8 +13,6 @@ Machine check
in a reboot. On Intel systems it is enabled by default.
mce=nobootlog
Disable boot machine check logging.
- mce=nopoll
- Disable timer polling for corrected errors
mce=tolerancelevel (number)
0: always panic on uncorrected errors, log corrected errors
1: panic or SIGBUS on uncorrected errors, log corrected errors
diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c b/arch/x86/kernel/cpu/mcheck/mce_64.c
index 80ec191..33d612e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -449,8 +449,6 @@ void mce_log_therm_throt_event(__u64 status)
* Periodic polling timer for "silent" machine check errors. If the
* poller finds an MCE, poll 2x faster. When the poller finds no more
* errors, poll 2x slower (up to check_interval seconds).
- *
- * If check_interval is 0, polling is disabled.
*/

static int check_interval = 5 * 60; /* 5 minutes */
@@ -635,12 +633,11 @@ static void mce_init_timer(void)
{
struct timer_list *t = &__get_cpu_var(mce_timer);

- /* Disable polling if check_interval is 0 */
- if (!check_interval)
- return;
/* data race harmless because everyone sets to the same value */
if (!next_interval)
next_interval = check_interval * HZ;
+ if (!next_interval)
+ return;
setup_timer(t, mcheck_timer, smp_processor_id());
t->expires = round_jiffies(jiffies + next_interval);
add_timer(t);
@@ -848,14 +845,11 @@ __setup("nomce", mcheck_disable);
* mce=TOLERANCELEVEL (number, see above)
* mce=bootlog Log MCEs from before booting. Disabled by default on AMD.
* mce=nobootlog Don't log MCEs from before booting.
- * mce=nopoll Disable timer polling for corrected errors
*/
static int __init mcheck_enable(char *str)
{
if (!strcmp(str, "off"))
mce_dont_init = 1;
- else if (!strcmp(str, "nopoll"))
- check_interval = 0;
else if (!strcmp(str, "bootlog") || !strcmp(str, "nobootlog"))
mce_bootlog = (str[0] == 'b');
else if (isdigit(str[0]))
--
1.6.2.1

2009-04-02 04:58:34

by Hidetoshi Seto

Subject: [PATCH -tip 3/3] x86, mce: Add new option mce=no_cmci and mce=ignore_ce

This patch introduces a couple of boot options for x86_64 MCE.

The "mce=no_cmci" boot option disables the CMCI feature.
Since CMCI is a new feature, having a boot control to disable
it will help if the hardware is misbehaving.
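
As a rough illustration, the way the two new strings after "mce=" map
to flags can be modelled in userspace as below. The struct and function
names are hypothetical; the patch itself uses the global ints
cmci_disabled and ignore_ce and extends mcheck_enable().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the two flags the patch adds as globals. */
struct mce_boot_flags {
	bool cmci_disabled;
	bool ignore_ce;
};

/*
 * Parse the text that follows "mce=". Returns true if the option was
 * recognised. Only the two options added by this patch are modelled.
 */
static bool parse_mce_option(const char *str, struct mce_boot_flags *flags)
{
	assert(str && flags);

	if (!strcmp(str, "no_cmci"))
		flags->cmci_disabled = true;
	else if (!strcmp(str, "ignore_ce"))
		flags->ignore_ce = true;
	else
		return false;

	return true;
}
```

In this model, an option such as the reverted "nopoll" is simply not
recognised.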

The "mce=ignore_ce" boot option disables the features for corrected
errors, i.e. the polling timer and CMCI. Disabling these is usually
not recommended, but it will help if there is some conflict with
the BIOS or hardware monitoring applications etc.
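
The combined effect of the two flags can be summarised with the small
userspace sketch below. The names are hypothetical; in the patch the
checks live in cmci_supported() and mce_init_timer().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view of the two boot flags. */
struct ce_policy {
	bool cmci_disabled; /* set by mce=no_cmci */
	bool ignore_ce;     /* set by mce=ignore_ce */
};

/* CMCI is skipped when either flag is set. */
static bool cmci_may_be_used(const struct ce_policy *p)
{
	assert(p);
	return !(p->cmci_disabled || p->ignore_ce);
}

/* The polling timer is skipped only by ignore_ce. */
static bool poll_timer_may_start(const struct ce_policy *p)
{
	assert(p);
	return !p->ignore_ce;
}
```

So "mce=no_cmci" leaves polling in place as a fallback, while
"mce=ignore_ce" turns off both sources of corrected-error reports.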

A trivial cleanup (space -> tab) of the documentation is also included.

Signed-off-by: Hidetoshi Seto <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
Documentation/x86/x86_64/boot-options.txt | 22 ++++++++++++++++------
arch/x86/include/asm/mce.h | 2 ++
arch/x86/kernel/cpu/mcheck/mce_64.c | 11 +++++++++++
arch/x86/kernel/cpu/mcheck/mce_intel_64.c | 3 +++
4 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
index 34c1304..730b09b 100644
--- a/Documentation/x86/x86_64/boot-options.txt
+++ b/Documentation/x86/x86_64/boot-options.txt
@@ -5,12 +5,22 @@ only the AMD64 specific ones are listed here.

Machine check

- mce=off disable machine check
- mce=bootlog Enable logging of machine checks left over from booting.
- Disabled by default on AMD because some BIOS leave bogus ones.
- If your BIOS doesn't do that it's a good idea to enable though
- to make sure you log even machine check events that result
- in a reboot. On Intel systems it is enabled by default.
+ mce=off
+ Disable machine check
+ mce=no_cmci
+ Disable CMCI(Corrected Machine Check Interrupt) that
+ Intel processor supports.
+ mce=ignore_ce
+ Disable features for corrected errors, e.g. polling timer
+ and CMCI. Usually this disablement is not recommended,
+ however it will be a help if there are some conflict with
+ BIOS or hardware monitoring applications etc.
+ mce=bootlog
+ Enable logging of machine checks left over from booting.
+ Disabled by default on AMD because some BIOS leave bogus ones.
+ If your BIOS doesn't do that it's a good idea to enable though
+ to make sure you log even machine check events that result
+ in a reboot. On Intel systems it is enabled by default.
mce=nobootlog
Disable boot machine check logging.
mce=tolerancelevel (number)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 563933e..065858c 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -104,6 +104,8 @@ extern void (*threshold_cpu_callback)(unsigned long action, unsigned int cpu);
#define MAX_NR_BANKS (MCE_EXTENDED_BANK - 1)

#ifdef CONFIG_X86_MCE_INTEL
+extern int cmci_disabled;
+extern int ignore_ce;
void mce_intel_feature_init(struct cpuinfo_x86 *c);
void cmci_clear(void);
void cmci_reenable(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c b/arch/x86/kernel/cpu/mcheck/mce_64.c
index 33d612e..bd0fce3 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -41,6 +41,8 @@
atomic_t mce_entry;

static int mce_dont_init;
+int cmci_disabled;
+int ignore_ce;

/*
* Tolerant levels:
@@ -633,6 +635,9 @@ static void mce_init_timer(void)
{
struct timer_list *t = &__get_cpu_var(mce_timer);

+ if (ignore_ce)
+ return;
+
/* data race harmless because everyone sets to the same value */
if (!next_interval)
next_interval = check_interval * HZ;
@@ -842,6 +847,8 @@ __setup("nomce", mcheck_disable);

/*
* mce=off disables machine check
+ * mce=no_cmci disables CMCI
+ * mce=ignore_ce disables polling for corrected errors and also CMCI
* mce=TOLERANCELEVEL (number, see above)
* mce=bootlog Log MCEs from before booting. Disabled by default on AMD.
* mce=nobootlog Don't log MCEs from before booting.
@@ -850,6 +857,10 @@ static int __init mcheck_enable(char *str)
{
if (!strcmp(str, "off"))
mce_dont_init = 1;
+ else if (!strcmp(str, "no_cmci"))
+ cmci_disabled = 1;
+ else if (!strcmp(str, "ignore_ce"))
+ ignore_ce = 1;
else if (!strcmp(str, "bootlog") || !strcmp(str, "nobootlog"))
mce_bootlog = (str[0] == 'b');
else if (isdigit(str[0]))
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
index d6b72df..64c0dd9 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
@@ -109,6 +109,9 @@ static int cmci_supported(int *banks)
{
u64 cap;

+ if (cmci_disabled | ignore_ce)
+ return 0;
+
/*
* Vendor check is not strictly needed, but the initial
* initialization is vendor keyed and this
--
1.6.2.1