Ingo Molnar wrote:
> > * Hidetoshi Seto <[email protected]> wrote:
> >
>> >> Hi Ingo,
>> >>
>> >> Sorry, could you drop patch 1/3 and 3/3 from your -tip tree? Andi
>> >> and I discussed some and agreed disabling CMCI need to be done in
>> >> different way (at least parameter name).
> >
> > Could you please send a patch that changes the parameter name?
I'm going to post three patches from now, heading two for revert
and last one for re-implement.
Thanks,
H.Seto
After having some discussion with Andi Kleen, we have concluded
that setting threshold >1 will not work properly. This patch
reverts the previous patch that introduces mce_threshold option.
However as Ingo pointed out, cmci is a new feature so having boot
controls to disable it is generally a good idea, and it might be
handy if the hw is misbehaving.
So instead of mce_threshold, I will introduce "mce=no_cmci" option
to support cmci disablement in later patch.
Compare with mce_threshold, it lacks threshold >1 support but it
does not matter because it will not work.
Signed-off-by: Hidetoshi Seto <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
Documentation/kernel-parameters.txt | 5 ---
arch/x86/include/asm/msr-index.h | 1 -
arch/x86/kernel/cpu/mcheck/mce_intel_64.c | 56 ++---------------------------
3 files changed, 3 insertions(+), 59 deletions(-)
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 5e966a9..6172e43 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1208,11 +1208,6 @@ and is between 256 and 4096 characters. It is defined in the file
mce=option [X86-64] See Documentation/x86/x86_64/boot-options.txt
- mce_threshold= [X86-64,intel] Default CMCI threshold
- Should be unsigned integer. Setting 0 disables cmci.
- Format: <integer>
- Default: 1
-
md= [HW] RAID subsystems devices and level
See Documentation/md.txt.
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index bb144e4..ec41fc1 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -85,7 +85,6 @@
#define MSR_IA32_MC0_CTL2 0x00000280
#define CMCI_EN (1ULL << 30)
#define CMCI_THRESHOLD_MASK 0xffffULL
-#define CMCI_THRESHOLD_VAL_MASK 0x7fffULL
#define MSR_P6_PERFCTR0 0x000000c1
#define MSR_P6_PERFCTR1 0x000000c2
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
index 8bd5d0c..d6b72df 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
@@ -103,6 +103,8 @@ static DEFINE_PER_CPU(mce_banks_t, mce_banks_owned);
*/
static DEFINE_SPINLOCK(cmci_discover_lock);
+#define CMCI_THRESHOLD 1
+
static int cmci_supported(int *banks)
{
u64 cap;
@@ -133,51 +135,6 @@ static void intel_threshold_interrupt(void)
mce_notify_user();
}
-/*
- * Default threshold, setting 0 disables cmci
- */
-static unsigned long threshold_limit = 1;
-
-static int __init mcheck_threshold(char *str)
-{
- int val;
-
- get_option(&str, &val);
- if (val < 0) {
- printk(KERN_INFO "mce_threshold argument ignored.\n");
- return 0;
- }
- threshold_limit = val;
-
- return 1;
-}
-__setup("mce_threshold=", mcheck_threshold);
-
-void static cmci_set_threshold(int bank)
-{
- u64 val, limit, max, new;
-
- rdmsrl(MSR_IA32_MC0_CTL2 + bank, val);
- limit = val & CMCI_THRESHOLD_VAL_MASK;
-
- /* Thresholding available? */
- if (!limit)
- return;
- /* Return if no need to change */
- if (limit == threshold_limit)
- return;
-
- /* Find the maximum threshold value */
- max = (val & ~CMCI_THRESHOLD_MASK) | CMCI_THRESHOLD_VAL_MASK;
- wrmsrl(MSR_IA32_MC0_CTL2 + bank, max);
- rdmsrl(MSR_IA32_MC0_CTL2 + bank, max);
- max &= CMCI_THRESHOLD_VAL_MASK;
- max = (threshold_limit > max ? max : threshold_limit);
-
- new = (val & ~CMCI_THRESHOLD_MASK) | max;
- wrmsrl(MSR_IA32_MC0_CTL2 + bank, new);
-}
-
static void print_update(char *type, int *hdr, int num)
{
if (*hdr == 0)
@@ -186,9 +143,6 @@ static void print_update(char *type, int *hdr, int num)
printk(KERN_CONT " %s:%d", type, num);
}
-/* Used to determine whether thresholding is available or not */
-#define CMCI_THRESHOLD_FIRST 1
-
/*
* Enable CMCI (Corrected Machine Check Interrupt) for available MCE banks
* on this CPU. Use the algorithm recommended in the SDM to discover shared
@@ -200,9 +154,6 @@ static void cmci_discover(int banks, int boot)
int hdr = 0;
int i;
- if (!threshold_limit)
- return;
-
spin_lock(&cmci_discover_lock);
for (i = 0; i < banks; i++) {
u64 val;
@@ -220,7 +171,7 @@ static void cmci_discover(int banks, int boot)
continue;
}
- val |= CMCI_EN | CMCI_THRESHOLD_FIRST;
+ val |= CMCI_EN | CMCI_THRESHOLD;
wrmsrl(MSR_IA32_MC0_CTL2 + i, val);
rdmsrl(MSR_IA32_MC0_CTL2 + i, val);
@@ -229,7 +180,6 @@ static void cmci_discover(int banks, int boot)
if (!test_and_set_bit(i, owned) || boot)
print_update("CMCI", &hdr, i);
__clear_bit(i, __get_cpu_var(mce_poll_banks));
- cmci_set_threshold(i);
} else {
WARN_ON(!test_bit(i, __get_cpu_var(mce_poll_banks)));
}
--
1.6.2.2
Disabling only polling but not cmci is pointless setting.
Instead of "mce=nopoll" which tend to be paired with cmci disablement,
it rather make sense to have a "mce=ignore_ce" option that disable
both of polling and cmci at once. A patch for this new implementation
will follow this reverting patch.
OTOH, once booted, we can disable polling by setting check_interval
to 0, but there are no mention about the fact. Later Andi will post
updated documents that can respond this issue.
Signed-off-by: Hidetoshi Seto <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
Documentation/x86/x86_64/boot-options.txt | 2 --
arch/x86/kernel/cpu/mcheck/mce_64.c | 10 ++--------
2 files changed, 2 insertions(+), 10 deletions(-)
diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
index 5d55158..34c1304 100644
--- a/Documentation/x86/x86_64/boot-options.txt
+++ b/Documentation/x86/x86_64/boot-options.txt
@@ -13,8 +13,6 @@ Machine check
in a reboot. On Intel systems it is enabled by default.
mce=nobootlog
Disable boot machine check logging.
- mce=nopoll
- Disable timer polling for corrected errors
mce=tolerancelevel (number)
0: always panic on uncorrected errors, log corrected errors
1: panic or SIGBUS on uncorrected errors, log corrected errors
diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c b/arch/x86/kernel/cpu/mcheck/mce_64.c
index 80ec191..33d612e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -449,8 +449,6 @@ void mce_log_therm_throt_event(__u64 status)
* Periodic polling timer for "silent" machine check errors. If the
* poller finds an MCE, poll 2x faster. When the poller finds no more
* errors, poll 2x slower (up to check_interval seconds).
- *
- * If check_interval is 0, polling is disabled.
*/
static int check_interval = 5 * 60; /* 5 minutes */
@@ -635,12 +633,11 @@ static void mce_init_timer(void)
{
struct timer_list *t = &__get_cpu_var(mce_timer);
- /* Disable polling if check_interval is 0 */
- if (!check_interval)
- return;
/* data race harmless because everyone sets to the same value */
if (!next_interval)
next_interval = check_interval * HZ;
+ if (!next_interval)
+ return;
setup_timer(t, mcheck_timer, smp_processor_id());
t->expires = round_jiffies(jiffies + next_interval);
add_timer(t);
@@ -848,14 +845,11 @@ __setup("nomce", mcheck_disable);
* mce=TOLERANCELEVEL (number, see above)
* mce=bootlog Log MCEs from before booting. Disabled by default on AMD.
* mce=nobootlog Don't log MCEs from before booting.
- * mce=nopoll Disable timer polling for corrected errors
*/
static int __init mcheck_enable(char *str)
{
if (!strcmp(str, "off"))
mce_dont_init = 1;
- else if (!strcmp(str, "nopoll"))
- check_interval = 0;
else if (!strcmp(str, "bootlog") || !strcmp(str, "nobootlog"))
mce_bootlog = (str[0] == 'b');
else if (isdigit(str[0]))
--
1.6.2.2
This patch introduces a couple of boot option for x86_64 mce.
The "mce=no_cmci" boot option disables cmci feature.
Since cmci is a new feature so having boot controls to disable
it will be a help if the hardware is misbehaving.
The "mce=ignore_ce" boot option disables features for corrected
errors, i.e. polling timer and cmci. Usually this disablement
is not recommended, however it will be a help if there are some
conflict with the BIOS or hardware monitoring applications etc.
And trivial cleanup (space -> tab) for doc is included.
Signed-off-by: Hidetoshi Seto <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
Documentation/x86/x86_64/boot-options.txt | 22 ++++++++++++++++------
arch/x86/include/asm/mce.h | 2 ++
arch/x86/kernel/cpu/mcheck/mce_64.c | 11 +++++++++++
arch/x86/kernel/cpu/mcheck/mce_intel_64.c | 3 +++
4 files changed, 32 insertions(+), 6 deletions(-)
diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
index 34c1304..730b09b 100644
--- a/Documentation/x86/x86_64/boot-options.txt
+++ b/Documentation/x86/x86_64/boot-options.txt
@@ -5,12 +5,22 @@ only the AMD64 specific ones are listed here.
Machine check
- mce=off disable machine check
- mce=bootlog Enable logging of machine checks left over from booting.
- Disabled by default on AMD because some BIOS leave bogus ones.
- If your BIOS doesn't do that it's a good idea to enable though
- to make sure you log even machine check events that result
- in a reboot. On Intel systems it is enabled by default.
+ mce=off
+ Disable machine check
+ mce=no_cmci
+ Disable CMCI(Corrected Machine Check Interrupt) that
+ Intel processor supports.
+ mce=ignore_ce
+ Disable features for corrected errors, e.g. polling timer
+ and CMCI. Usually this disablement is not recommended,
+ however it will be a help if there are some conflict with
+ BIOS or hardware monitoring applications etc.
+ mce=bootlog
+ Enable logging of machine checks left over from booting.
+ Disabled by default on AMD because some BIOS leave bogus ones.
+ If your BIOS doesn't do that it's a good idea to enable though
+ to make sure you log even machine check events that result
+ in a reboot. On Intel systems it is enabled by default.
mce=nobootlog
Disable boot machine check logging.
mce=tolerancelevel (number)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 563933e..065858c 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -104,6 +104,8 @@ extern void (*threshold_cpu_callback)(unsigned long action, unsigned int cpu);
#define MAX_NR_BANKS (MCE_EXTENDED_BANK - 1)
#ifdef CONFIG_X86_MCE_INTEL
+extern int cmci_disabled;
+extern int ignore_ce;
void mce_intel_feature_init(struct cpuinfo_x86 *c);
void cmci_clear(void);
void cmci_reenable(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c b/arch/x86/kernel/cpu/mcheck/mce_64.c
index 33d612e..bd0fce3 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -41,6 +41,8 @@
atomic_t mce_entry;
static int mce_dont_init;
+int cmci_disabled;
+int ignore_ce;
/*
* Tolerant levels:
@@ -633,6 +635,9 @@ static void mce_init_timer(void)
{
struct timer_list *t = &__get_cpu_var(mce_timer);
+ if (ignore_ce)
+ return;
+
/* data race harmless because everyone sets to the same value */
if (!next_interval)
next_interval = check_interval * HZ;
@@ -842,6 +847,8 @@ __setup("nomce", mcheck_disable);
/*
* mce=off disables machine check
+ * mce=no_cmci disables CMCI
+ * mce=ignore_ce disables polling for corrected errors and also CMCI
* mce=TOLERANCELEVEL (number, see above)
* mce=bootlog Log MCEs from before booting. Disabled by default on AMD.
* mce=nobootlog Don't log MCEs from before booting.
@@ -850,6 +857,10 @@ static int __init mcheck_enable(char *str)
{
if (!strcmp(str, "off"))
mce_dont_init = 1;
+ else if (!strcmp(str, "no_cmci"))
+ cmci_disabled = 1;
+ else if (!strcmp(str, "ignore_ce"))
+ ignore_ce = 1;
else if (!strcmp(str, "bootlog") || !strcmp(str, "nobootlog"))
mce_bootlog = (str[0] == 'b');
else if (isdigit(str[0]))
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
index d6b72df..64c0dd9 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
@@ -109,6 +109,9 @@ static int cmci_supported(int *banks)
{
u64 cap;
+ if (cmci_disabled | ignore_ce)
+ return 0;
+
/*
* Vendor check is not strictly needed, but the initial
* initialization is vendor keyed and this
--
1.6.2.2
Hidetoshi Seto <[email protected]> writes:
> After having some discussion with Andi Kleen, we have concluded
> that setting threshold >1 will not work properly. This patch
> reverts the previous patch that introduces mce_threshold option.
>
> However as Ingo pointed out, cmci is a new feature so having boot
> controls to disable it is generally a good idea, and it might be
> handy if the hw is misbehaving.
>
> So instead of mce_threshold, I will introduce "mce=no_cmci" option
> to support cmci disablement in later patch.
> Compare with mce_threshold, it lacks threshold >1 support but it
> does not matter because it will not work.
>
> Signed-off-by: Hidetoshi Seto <[email protected]>
> Cc: Andi Kleen <[email protected]>
Acked-by: Andi Kleen <[email protected]>
-Andi
--
[email protected] -- Speaking for myself only.
Hidetoshi Seto <[email protected]> writes:
> Disabling only polling but not cmci is pointless setting.
> Instead of "mce=nopoll" which tend to be paired with cmci disablement,
> it rather make sense to have a "mce=ignore_ce" option that disable
> both of polling and cmci at once. A patch for this new implementation
> will follow this reverting patch.
>
> OTOH, once booted, we can disable polling by setting check_interval
> to 0, but there are no mention about the fact. Later Andi will post
> updated documents that can respond this issue.
I still think that patch has bad semantics because you leave around
the events in the machine check registers and never clear
them. Especially with MCA recovery that has very unfortunate side
effects -- it means the OVER bit will be set and a in principle
recoverable MCA will require a panic. Even without MCA recovery it has
similar problems and will lead to confusing log output for non CE
MCAs.
I think a patch to not log corrected errors would be reasonable,
but you still need to clear the events from the machine check
banks at least.
So I would recommend you add a mce=dont_log_ce or somesuch
that just guards the mce_log() call in machine_check_poll()
Also for your use case really the better way would be to use
some way to let the firmware communicate that it doesn't want the OS
to log.
Also BTW before adding new features like this it would be a good
idea to first add the bug fixes I posted two weeks ago.
-Andi
--
[email protected] -- Speaking for myself only.
Hidetoshi Seto <[email protected]> writes:
> This patch introduces a couple of boot option for x86_64 mce.
>
> The "mce=no_cmci" boot option disables cmci feature.
> Since cmci is a new feature so having boot controls to disable
> it will be a help if the hardware is misbehaving.
Acked-by: Andi Kleen <[email protected]>
Although I expect on Nehalem you'll get more problems without CMCI
than with due to the shared banks. Perhaps the documentation
should make that clearer.
>
> The "mce=ignore_ce" boot option disables features for corrected
> errors, i.e. polling timer and cmci. Usually this disablement
> is not recommended, however it will be a help if there are some
> conflict with the BIOS or hardware monitoring applications etc.
Same problem as with the earlier patch. You need a point
somewhere where the event is cleared, otherwise all hell
breaks loose.
> static int mce_dont_init;
> +int cmci_disabled;
> +int ignore_ce;
For global variables you should use a mce_ prefix to not pollute
global name space.
> diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
> index d6b72df..64c0dd9 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
> @@ -109,6 +109,9 @@ static int cmci_supported(int *banks)
> {
> u64 cap;
>
> + if (cmci_disabled | ignore_ce)
I presume you meant || here
-Amndi
--
[email protected] -- Speaking for myself only.
Andi Kleen wrote:
> Hidetoshi Seto <[email protected]> writes:
>
>> Disabling only polling but not cmci is pointless setting.
>> Instead of "mce=nopoll" which tend to be paired with cmci disablement,
>> it rather make sense to have a "mce=ignore_ce" option that disable
>> both of polling and cmci at once. A patch for this new implementation
>> will follow this reverting patch.
>>
>> OTOH, once booted, we can disable polling by setting check_interval
>> to 0, but there are no mention about the fact. Later Andi will post
>> updated documents that can respond this issue.
>
> I still think that patch has bad semantics because you leave around
> the events in the machine check registers and never clear
> them. Especially with MCA recovery that has very unfortunate side
> effects -- it means the OVER bit will be set and a in principle
> recoverable MCA will require a panic. Even without MCA recovery it has
> similar problems and will lead to confusing log output for non CE
> MCAs.
>
> I think a patch to not log corrected errors would be reasonable,
> but you still need to clear the events from the machine check
> banks at least.
>
> So I would recommend you add a mce=dont_log_ce or somesuch
> that just guards the mce_log() call in machine_check_poll()
I suppose there are two possible situations:
1) There is a agent checking/clearing corrected errors
(such as BIOS) other than OS.
In this case, clearing MSRs by OS is not applicable.
So ignore_ce is better option here.
2) There is no agent checking/clearing corrected errors.
User just want to suppress logs of corrected errors.
In this case, dont_log_ce would be better option.
(Or adding filter to mcelog would be another solution)
I don't mind adding three options (no_cmci/ignore_ce/dont_log_ce)
at once. I'll rework 3/3 of this series to do so.
> Also for your use case really the better way would be to use
> some way to let the firmware communicate that it doesn't want the OS
> to log.
Yes. However AFAIK there is no way to do it yet.
> Also BTW before adding new features like this it would be a good
> idea to first add the bug fixes I posted two weeks ago.
>
> -Andi
The original of this repost were posted about three weeks ago (Apr.2)...
I think your patches will go smoothly if my revert patches added before
them.
BTW, could you give me your Acked-by on this 2/3 too?
Thanks,
H.Seto
Andi Kleen wrote:
> Hidetoshi Seto <[email protected]> writes:
>
>> This patch introduces a couple of boot option for x86_64 mce.
>>
>> The "mce=no_cmci" boot option disables cmci feature.
>> Since cmci is a new feature so having boot controls to disable
>> it will be a help if the hardware is misbehaving.
>
> Acked-by: Andi Kleen <[email protected]>
>
> Although I expect on Nehalem you'll get more problems without CMCI
> than with due to the shared banks. Perhaps the documentation
> should make that clearer.
Good idea. I'll add it to boot-options.txt
>> The "mce=ignore_ce" boot option disables features for corrected
>> errors, i.e. polling timer and cmci. Usually this disablement
>> is not recommended, however it will be a help if there are some
>> conflict with the BIOS or hardware monitoring applications etc.
>
> Same problem as with the earlier patch. You need a point
> somewhere where the event is cleared, otherwise all hell
> breaks loose.
Adding "dont_log_ce" and its description will be a help.
It need to describe how it different from ignore_ce and how events
are cleared.
>> static int mce_dont_init;
>> +int cmci_disabled;
>> +int ignore_ce;
>
> For global variables you should use a mce_ prefix to not pollute
> global name space.
OK. I'll fix.
>> diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
>> index d6b72df..64c0dd9 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
>> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
>> @@ -109,6 +109,9 @@ static int cmci_supported(int *banks)
>> {
>> u64 cap;
>>
>> + if (cmci_disabled | ignore_ce)
>
> I presume you meant || here
Yes, should be fixed.
Thanks,
H.Seto
> 1) There is a agent checking/clearing corrected errors
> (such as BIOS) other than OS.
>
> In this case, clearing MSRs by OS is not applicable.
> So ignore_ce is better option here.
Yes, but how do you make sure the option is set when the agent is active?
I think relying on a kernel boot option here is quite fragile.
Also it's something the administrator shouldn't be involved in.
>
> 2) There is no agent checking/clearing corrected errors.
> User just want to suppress logs of corrected errors.
>
> In this case, dont_log_ce would be better option.
> (Or adding filter to mcelog would be another solution)
>
> I don't mind adding three options (no_cmci/ignore_ce/dont_log_ce)
> at once. I'll rework 3/3 of this series to do so.
>
> > Also for your use case really the better way would be to use
> > some way to let the firmware communicate that it doesn't want the OS
> > to log.
>
> Yes. However AFAIK there is no way to do it yet.
We can just define one. We had Linux specific extensions before.
Just needs a new bit somewhere. I think that would give a better
experience for your customers.
There might be something forthcomming that could be usable in fact.
> The original of this repost were posted about three weeks ago (Apr.2)...
>
> I think your patches will go smoothly if my revert patches added before
> them.
The bug fixes should go in 2.6.30. After all that is what the rc stage
is for -- applying bug fixed.
> BTW, could you give me your Acked-by on this 2/3 too?
Not yet sorry.
-Andi
--
[email protected] -- Speaking for myself only.
Andi Kleen wrote:
>> 1) There is a agent checking/clearing corrected errors
>> (such as BIOS) other than OS.
>>
>> In this case, clearing MSRs by OS is not applicable.
>> So ignore_ce is better option here.
>
> Yes, but how do you make sure the option is set when the agent is active?
> I think relying on a kernel boot option here is quite fragile.
> Also it's something the administrator shouldn't be involved in.
Chicken and egg?
If there is a agent, and if the administrator cannot deactivate the agent,
then use this option. For example, there are no updated BIOS.
>>> Also for your use case really the better way would be to use
>>> some way to let the firmware communicate that it doesn't want the OS
>>> to log.
>> Yes. However AFAIK there is no way to do it yet.
>
> We can just define one. We had Linux specific extensions before.
> Just needs a new bit somewhere. I think that would give a better
> experience for your customers.
You are right in part.
But even if I defined one, it doesn't work until it actually used by
someone. In other words, for example, it will require update of BIOS.
> There might be something forthcomming that could be usable in fact.
Some standard interface will be appreciated.
Having linux specific thing is not happy for hardware vendors, I guess.
>> The original of this repost were posted about three weeks ago (Apr.2)...
>>
>> I think your patches will go smoothly if my revert patches added before
>> them.
>
> The bug fixes should go in 2.6.30. After all that is what the rc stage
> is for -- applying bug fixed.
Maybe you overlooked that patches here are against -tip tree, don't you?
I'd like to fix -tip tree, that will be merged in next window, 2.6.31-rc1.
>> BTW, could you give me your Acked-by on this 2/3 too?
>
> Not yet sorry.
You have acked on 3/3... So you mean "keep mce=nopoll option in -tip" ?
Thanks,
H.Seto
Here is the updated one.
Thanks in advance!
H.Seto
This patch introduces three boot options to control handling
for corrected errors.
The "mce=no_cmci" boot option disables cmci feature.
Since cmci is a new feature so having boot controls to disable
it will be a help if the hardware is misbehaving.
The "mce=dont_log_ce" boot option disables logging for corrected
errors. All reported corrected errors will be cleared silently.
This option will be useful if you never care corrected errors.
The "mce=ignore_ce" boot option disables features for corrected
errors, i.e. polling timer and cmci. All corrected events are
not cleared and kept in bank MSRs. Usually this disablement
is not recommended, however it will be a help if there are some
conflict with the BIOS or hardware monitoring applications etc.,
that clears corrected events in banks instead of OS.
And trivial cleanup (space -> tab) for doc is included.
Signed-off-by: Hidetoshi Seto <[email protected]>
---
Documentation/x86/x86_64/boot-options.txt | 36 ++++++++++++++++++++++++-----
arch/x86/include/asm/mce.h | 2 +
arch/x86/kernel/cpu/mcheck/mce_64.c | 20 ++++++++++++++-
arch/x86/kernel/cpu/mcheck/mce_intel_64.c | 3 ++
4 files changed, 53 insertions(+), 8 deletions(-)
diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
index 34c1304..04834ad 100644
--- a/Documentation/x86/x86_64/boot-options.txt
+++ b/Documentation/x86/x86_64/boot-options.txt
@@ -5,12 +5,36 @@ only the AMD64 specific ones are listed here.
Machine check
- mce=off disable machine check
- mce=bootlog Enable logging of machine checks left over from booting.
- Disabled by default on AMD because some BIOS leave bogus ones.
- If your BIOS doesn't do that it's a good idea to enable though
- to make sure you log even machine check events that result
- in a reboot. On Intel systems it is enabled by default.
+ mce=off
+ Disable machine check
+ mce=no_cmci
+ Disable CMCI(Corrected Machine Check Interrupt) that
+ Intel processor supports. Usually this disablement is
+ not recommended, but it might be handy if your hardware
+ is misbehaving.
+ Note that you'll get more problems without CMCI than with
+ due to the shared banks, i.e. you might get duplicated
+ error logs.
+ mce=dont_log_ce
+ Don't make logs for corrected errors. All events reported
+ as corrected are silently cleared by OS.
+ This option will be useful if you have no interest in any
+ of corrected errors.
+ mce=ignore_ce
+ Disable features for corrected errors, e.g. polling timer
+ and CMCI. All events reported as corrected are not cleared
+ by OS and remained in its error banks.
+ Usually this disablement is not recommended, however if
+ there is an agent checking/clearing corrected errors
+ (e.g. BIOS or hardware monitoring applications), conflicting
+ with OS's error handling, and you cannot deactivate the agent,
+ then this option will be a help.
+ mce=bootlog
+ Enable logging of machine checks left over from booting.
+ Disabled by default on AMD because some BIOS leave bogus ones.
+ If your BIOS doesn't do that it's a good idea to enable though
+ to make sure you log even machine check events that result
+ in a reboot. On Intel systems it is enabled by default.
mce=nobootlog
Disable boot machine check logging.
mce=tolerancelevel (number)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 563933e..a8a6cd5 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -104,6 +104,8 @@ extern void (*threshold_cpu_callback)(unsigned long action, unsigned int cpu);
#define MAX_NR_BANKS (MCE_EXTENDED_BANK - 1)
#ifdef CONFIG_X86_MCE_INTEL
+extern int mce_cmci_disabled;
+extern int mce_ignore_ce;
void mce_intel_feature_init(struct cpuinfo_x86 *c);
void cmci_clear(void);
void cmci_reenable(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c b/arch/x86/kernel/cpu/mcheck/mce_64.c
index 33d612e..6abefe3 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -41,6 +41,9 @@
atomic_t mce_entry;
static int mce_dont_init;
+static int mce_dont_log_ce;
+int mce_cmci_disabled;
+int mce_ignore_ce;
/*
* Tolerant levels:
@@ -240,7 +243,8 @@ void machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
* have anything to do with the actual error location.
*/
- mce_log(&m);
+ if (!mce_dont_log_ce)
+ mce_log(&m);
add_taint(TAINT_MACHINE_CHECK);
/*
@@ -633,6 +637,9 @@ static void mce_init_timer(void)
{
struct timer_list *t = &__get_cpu_var(mce_timer);
+ if (mce_ignore_ce)
+ return;
+
/* data race harmless because everyone sets to the same value */
if (!next_interval)
next_interval = check_interval * HZ;
@@ -841,7 +848,10 @@ static int __init mcheck_disable(char *str)
__setup("nomce", mcheck_disable);
/*
- * mce=off disables machine check
+ * mce=off Disables machine check
+ * mce=no_cmci Disables CMCI
+ * mce=dont_log_ce Clears corrected events silently, no log created for CEs.
+ * mce=ignore_ce Disables polling and CMCI, corrected events are not cleared.
* mce=TOLERANCELEVEL (number, see above)
* mce=bootlog Log MCEs from before booting. Disabled by default on AMD.
* mce=nobootlog Don't log MCEs from before booting.
@@ -850,6 +860,12 @@ static int __init mcheck_enable(char *str)
{
if (!strcmp(str, "off"))
mce_dont_init = 1;
+ else if (!strcmp(str, "no_cmci"))
+ mce_cmci_disabled = 1;
+ else if (!strcmp(str, "dont_log_ce"))
+ mce_dont_log_ce = 1;
+ else if (!strcmp(str, "ignore_ce"))
+ mce_ignore_ce = 1;
else if (!strcmp(str, "bootlog") || !strcmp(str, "nobootlog"))
mce_bootlog = (str[0] == 'b');
else if (isdigit(str[0]))
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
index d6b72df..a88bad9 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
@@ -109,6 +109,9 @@ static int cmci_supported(int *banks)
{
u64 cap;
+ if (mce_cmci_disabled || mce_ignore_ce)
+ return 0;
+
/*
* Vendor check is not strictly needed, but the initial
* initialization is vendor keyed and this
--
1.6.2.2
On Wed, Apr 22, 2009 at 12:25:04PM +0900, Hidetoshi Seto wrote:
> Here is the updated one.
Patch is ok for me.
Reviewed-by: Andi Kleen <[email protected]>
-Andi