2009-09-30 14:09:06

by Andi Kleen

[permalink] [raw]
Subject: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9


Can someone please revert this incorrect commit that's in mainline
now?

Obviously kernels compiled with AMD support can still run on non
AMD systems, so messages like this can never be removed at compile time.

-andi

Commit 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9
Author: Borislav Petkov <[email protected]>
Date: Tue Jul 28 14:47:10 2009 +0200

x86, mce: do not compile mcelog message on AMD

Now that decoding is done in-kernel, suppress mcelog message part.

CC: Andi Kleen <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index b82866f..9bfe9d2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -222,7 +222,10 @@ static void print_mce_head(void)
static void print_mce_tail(void)
{
printk(KERN_EMERG "This is not a software problem!\n"
- "Run through mcelog --ascii to decode and contact your hardware vendor\n");
+#if (!defined(CONFIG_EDAC) || !defined(CONFIG_CPU_SUP_AMD))
+ "Run through mcelog --ascii to decode and contact your hardware vendor\n"
+#endif
+ );
}

#define PANIC_TIMEOUT 5 /* 5 seconds */

--
[email protected] -- Speaking for myself only.


2009-09-30 19:40:57

by Borislav Petkov

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9

On Wed, Sep 30, 2009 at 04:09:04PM +0200, Andi Kleen wrote:
>
> Can someone please revert this incorrect commit that's in mainline
> now?
>
> Obviously kernels compiled with AMD support can still run on non
> AMD systems, so messages like this can never be removed at compile time.
>
> -andi
>
> Commit 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9
> Author: Borislav Petkov <[email protected]>
> Date: Tue Jul 28 14:47:10 2009 +0200
>
> x86, mce: do not compile mcelog message on AMD
>
> Now that decoding is done in-kernel, suppress mcelog message part.
>
> CC: Andi Kleen <[email protected]>
> Signed-off-by: Borislav Petkov <[email protected]>
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index b82866f..9bfe9d2 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -222,7 +222,10 @@ static void print_mce_head(void)
> static void print_mce_tail(void)
> {
> printk(KERN_EMERG "This is not a software problem!\n"
> - "Run through mcelog --ascii to decode and contact your hardware vendor\n");
> +#if (!defined(CONFIG_EDAC) || !defined(CONFIG_CPU_SUP_AMD))

how about

if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
pr_emerg("Run through mcelog --ascii to decode and contact your hardware vendor\n");

instead?

> + "Run through mcelog --ascii to decode and contact your hardware vendor\n"
> +#endif
> + );
> }
>
> #define PANIC_TIMEOUT 5 /* 5 seconds */

--
Regards/Gruss,
Boris.

2009-09-30 20:47:17

by Ingo Molnar

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9


* Borislav Petkov <[email protected]> wrote:

> On Wed, Sep 30, 2009 at 04:09:04PM +0200, Andi Kleen wrote:
> >
> > Can someone please revert this incorrect commit that's in mainline
> > now?
> >
> > Obviously kernels compiled with AMD support can still run on non
> > AMD systems, so messages like this can never be removed at compile time.
> >
> > -andi
> >
> > Commit 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9
> > Author: Borislav Petkov <[email protected]>
> > Date: Tue Jul 28 14:47:10 2009 +0200
> >
> > x86, mce: do not compile mcelog message on AMD
> >
> > Now that decoding is done in-kernel, suppress mcelog message part.
> >
> > CC: Andi Kleen <[email protected]>
> > Signed-off-by: Borislav Petkov <[email protected]>
> >
> > diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> > index b82866f..9bfe9d2 100644
> > --- a/arch/x86/kernel/cpu/mcheck/mce.c
> > +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> > @@ -222,7 +222,10 @@ static void print_mce_head(void)
> > static void print_mce_tail(void)
> > {
> > printk(KERN_EMERG "This is not a software problem!\n"
> > - "Run through mcelog --ascii to decode and contact your hardware vendor\n");
> > +#if (!defined(CONFIG_EDAC) || !defined(CONFIG_CPU_SUP_AMD))
>
> how about
>
> if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> pr_emerg("Run through mcelog --ascii to decode and contact your hardware vendor\n");
>
> instead?

Yeah, a runtime check like that would be fine - but i'd suggest
something more clearly and more specifically connected to in-kernel
decoding: please define a new x86_mce_can_decode_errors capability flag
or so.

Obviously the Intel CPU side should be fixed and improved to decode MCE
errors in the kernel too.

Please also fix that printk to say something like:

"MCE error decoding not supported on this CPU: run through mcelog --ascii to decode\n"

Thanks,

Ingo

2009-09-30 21:49:21

by Ingo Molnar

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9


* Ingo Molnar <[email protected]> wrote:

>
> * Borislav Petkov <[email protected]> wrote:
>
> > On Wed, Sep 30, 2009 at 04:09:04PM +0200, Andi Kleen wrote:
> > >
> > > Can someone please revert this incorrect commit that's in mainline
> > > now?
> > >
> > > Obviously kernels compiled with AMD support can still run on non
> > > AMD systems, so messages like this can never be removed at compile time.
> > >
> > > -andi
> > >
> > > Commit 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9
> > > Author: Borislav Petkov <[email protected]>
> > > Date: Tue Jul 28 14:47:10 2009 +0200
> > >
> > > x86, mce: do not compile mcelog message on AMD
> > >
> > > Now that decoding is done in-kernel, suppress mcelog message part.
> > >
> > > CC: Andi Kleen <[email protected]>
> > > Signed-off-by: Borislav Petkov <[email protected]>
> > >
> > > diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> > > index b82866f..9bfe9d2 100644
> > > --- a/arch/x86/kernel/cpu/mcheck/mce.c
> > > +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> > > @@ -222,7 +222,10 @@ static void print_mce_head(void)
> > > static void print_mce_tail(void)
> > > {
> > > printk(KERN_EMERG "This is not a software problem!\n"
> > > - "Run through mcelog --ascii to decode and contact your hardware vendor\n");
> > > +#if (!defined(CONFIG_EDAC) || !defined(CONFIG_CPU_SUP_AMD))
> >
> > how about
> >
> > if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> > pr_emerg("Run through mcelog --ascii to decode and contact your hardware vendor\n");
> >
> > instead?
>
> Yeah, a runtime check like that would be fine - but i'd suggest
> something more clearly and more specifically connected to in-kernel
> decoding: please define a new x86_mce_can_decode_errors capability
> flag or so.
>
> Obviously the Intel CPU side should be fixed and improved to decode
> MCE errors in the kernel too.
>
> Please also fix that printk to say something like:
>
> "MCE error decoding not supported on this CPU: run through mcelog --ascii to decode\n"
>
> Thanks,

I.e. something like the patch below. Completely untested.

Note, while looking at the interaction of decode_mce() with the other
MCE code i also noticed a few other things and made the following
cleanups/fixes:

- Fixed the mce_decode() weak alias - a weak alias is really not good
here, it should be a proper callback. A weak alias will be overriden
if a piece of code is built into the kernel - not good, obviously.

- The patch initializes the callback on AMD family 10h and 11h - a
quick glance suggests that decoding of earlier models isnt supported?

- Added the more correct fallback printk of:

No support for human readable MCE decoding on this CPU type.
Transcribe the message and run it through 'mcelog --ascii' to decode.

On CPUs that dont have a decoder.

- Made the surrounding code more readable.

Note that the callback allows us to have a default fallback - without
having to check the CPU versions during the printout itself. When an
EDAC module registers itself, it can install the decode-print function.

(there's no unregister needed as this is core code.)

Ingo

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index b608a64..f52d219 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -133,6 +133,8 @@ static inline void winchip_mcheck_init(struct cpuinfo_x86 *c) {}
static inline void enable_p5_mce(void) {}
#endif

+extern void (*x86_decode_mce_callback)(struct mce *m);
+
void mce_setup(struct mce *m);
void mce_log(struct mce *m);
DECLARE_PER_CPU(struct sys_device, mce_dev);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 183c345..adc8e2a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -85,6 +85,18 @@ static DECLARE_WAIT_QUEUE_HEAD(mce_wait);
static DEFINE_PER_CPU(struct mce, mces_seen);
static int cpu_missing;

+static void default_decode_mce(struct mce *m)
+{
+ pr_emerg("No support for human readable MCE decoding on this CPU type.\n");
+ pr_emerg("Transcribe the message and run it through 'mcelog --ascii' to decode.\n");
+}
+
+/*
+ * CPU/chipset specific EDAC code can register a callback here to print
+ * MCE errors in a human-readable form:
+ */
+void (*x86_decode_mce_callback)(struct mce *m) = default_decode_mce;
+EXPORT_SYMBOL(x86_decode_mce_callback);

/* MCA banks polled by the period polling timer for corrected events */
DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = {
@@ -165,46 +177,47 @@ void mce_log(struct mce *mce)
set_bit(0, &mce_need_notify);
}

-void __weak decode_mce(struct mce *m)
-{
- return;
-}
-
static void print_mce(struct mce *m)
{
- printk(KERN_EMERG
- "CPU %d: Machine Check Exception: %16Lx Bank %d: %016Lx\n",
+ pr_emerg("CPU %d: Machine Check Exception: %16Lx Bank %d: %016Lx\n",
m->extcpu, m->mcgstatus, m->bank, m->status);
+
if (m->ip) {
- printk(KERN_EMERG "RIP%s %02x:<%016Lx> ",
+ pr_emerg("RIP%s %02x:<%016Lx> ",
!(m->mcgstatus & MCG_STATUS_EIPV) ? " !INEXACT!" : "",
m->cs, m->ip);
+
if (m->cs == __KERNEL_CS)
print_symbol("{%s}", m->ip);
- printk(KERN_CONT "\n");
+ pr_cont("\n");
}
- printk(KERN_EMERG "TSC %llx ", m->tsc);
+
+ pr_emerg("TSC %llx ", m->tsc);
if (m->addr)
- printk(KERN_CONT "ADDR %llx ", m->addr);
+ pr_cont("ADDR %llx ", m->addr);
if (m->misc)
- printk(KERN_CONT "MISC %llx ", m->misc);
- printk(KERN_CONT "\n");
- printk(KERN_EMERG "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x\n",
+ pr_cont("MISC %llx ", m->misc);
+
+ pr_cont("\n");
+ pr_emerg("PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x\n",
m->cpuvendor, m->cpuid, m->time, m->socketid,
m->apicid);

- decode_mce(m);
+ /*
+ * Print out human-readable details about the MCE error,
+ * (if the CPU has an implementation for that):
+ */
+ x86_decode_mce_callback(m);
}

static void print_mce_head(void)
{
- printk(KERN_EMERG "\nHARDWARE ERROR\n");
+ pr_emerg("\nHARDWARE ERROR\n");
}

static void print_mce_tail(void)
{
- printk(KERN_EMERG "This is not a software problem!\n"
- "Run through mcelog --ascii to decode and contact your hardware vendor\n");
+ pr_emerg("This is not a software problem!\n");
}

#define PANIC_TIMEOUT 5 /* 5 seconds */
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 0c21c37..4fee380 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -362,7 +362,7 @@ static inline void amd_decode_err_code(unsigned int ec)
pr_warning("Huh? Unknown MCE error 0x%x\n", ec);
}

-void decode_mce(struct mce *m)
+static void amd_decode_mce(struct mce *m)
{
struct err_regs regs;
int node, ecc;
@@ -420,3 +420,15 @@ void decode_mce(struct mce *m)

amd_decode_err_code(m->status & 0xffff);
}
+
+static __init int mce_amd_init(void)
+{
+ /*
+ * We can decode MCEs for Opteron and later CPUs:
+ */
+ if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) && (boot_cpu_data.x86 >= 0x10))
+ x86_mce_decode_callback = amd_decode;
+
+ return 0;
+}
+early_initcall(mce_amd_init);

2009-09-30 22:39:59

by Borislav Petkov

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9

On Wed, Sep 30, 2009 at 11:48:59PM +0200, Ingo Molnar wrote:
> I.e. something like the patch below. Completely untested.

[..]

> Note, while looking at the interaction of decode_mce() with the other
> MCE code i also noticed a few other things and made the following
> cleanups/fixes:
>
> - Fixed the mce_decode() weak alias - a weak alias is really not good
> here, it should be a proper callback. A weak alias will be overriden
> if a piece of code is built into the kernel - not good, obviously.

The original idea was for the edac_mce_amd.o module to override the weak
symbol but the problem with that is that when CONFIG_CPU_SUP_AMD is set,
MCE decoding is built in by default thus overriding the weak symbol even
on non-AMD systems running distro kernels with multiple x86 CPU types
support. Your patch solves that.

However, currently we can't get rid of the decoding code when booted on
non-AMD boxes - it amounts to ~ 15K of object code on non-debug builds.
Is it worth the trouble to add it to initmem on such boxes and remove it
during boot?

> - The patch initializes the callback on AMD family 10h and 11h - a
> quick glance suggests that decoding of earlier models isnt supported?

Actually, the K8 error types and messages should be decoded just fine
too. I'll doublecheck and adjust the family check accordingly.

[..]

> Note that the callback allows us to have a default fallback - without
> having to check the CPU versions during the printout itself. When an
> EDAC module registers itself, it can install the decode-print function.

Cool. It is much cleaner that way.

Thanks, will give it a run tomorrow.

--
Regards/Gruss,
Boris.

2009-09-30 23:10:01

by Ingo Molnar

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9


* Borislav Petkov <[email protected]> wrote:

> On Wed, Sep 30, 2009 at 11:48:59PM +0200, Ingo Molnar wrote:
> > I.e. something like the patch below. Completely untested.
>
> [..]
>
> > Note, while looking at the interaction of decode_mce() with the other
> > MCE code i also noticed a few other things and made the following
> > cleanups/fixes:
> >
> > - Fixed the mce_decode() weak alias - a weak alias is really not good
> > here, it should be a proper callback. A weak alias will be overriden
> > if a piece of code is built into the kernel - not good, obviously.
>
> The original idea was for the edac_mce_amd.o module to override the
> weak symbol but the problem with that is that when CONFIG_CPU_SUP_AMD
> is set, MCE decoding is built in by default thus overriding the weak
> symbol even on non-AMD systems running distro kernels with multiple
> x86 CPU types support. Your patch solves that.
>
> However, currently we can't get rid of the decoding code when booted
> on non-AMD boxes - it amounts to ~ 15K of object code on non-debug
> builds. Is it worth the trouble to add it to initmem on such boxes and
> remove it during boot?

it's ~5K here. Btw., how would that work? Initmem cannot generally be
discarded optionally like that.

But even if it's 15K, as long as it's in well-concentrated code staying
out of the hotpath (which it is here) is not nearly as much of a problem
as forms of bloat affecting the hotpath. We do have other forms of CPU
support code too which could remove an equal amount of code.

Generally, if your hardware is so constrained that it cannot keep
general x86 compatibility code around, you will build a specific bzImage
tailored specifically to that hardware environment. Such a box can
twiddle the existing config options just fine to include (or exclude)
this code.

> > - The patch initializes the callback on AMD family 10h and 11h - a
> > quick glance suggests that decoding of earlier models isnt supported?
>
> Actually, the K8 error types and messages should be decoded just fine
> too. I'll doublecheck and adjust the family check accordingly.

Ok.

Thanks,

Ingo

Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9

On Thu, Oct 01, 2009 at 01:09:47AM +0200, Ingo Molnar wrote:
>
> * Borislav Petkov <[email protected]> wrote:
>
> > On Wed, Sep 30, 2009 at 11:48:59PM +0200, Ingo Molnar wrote:
> > > I.e. something like the patch below. Completely untested.

Ok, here it is, tested on two Fam10 machines here with injecting MCEs.
The decoding code is now built-in by default (early_initcall requires
!MODULE).

I dunno, it looks like we could just as well move the decoding stuff to
<arch/x86/kernel/cpu/mcheck/> since its now built-in and doesn't have
any connection to EDAC except the NB decoder which is a function ptr
registration thing.

Please take a look and let me know if there are any objections.

Thanks.

---
>From 2c494ebc4ddbdb386a3ffa1a30339d3cec4072e5 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <[email protected]>
Date: Thu, 1 Oct 2009 11:43:48 +0200
Subject: [PATCH] x86, mce: Fix MCE decoding callback logic

Make decoding of MCEs happen only on AMD hardware by registering a
non-default callback only on CPU families which support it.

While looking at the interaction of decode_mce() with the other MCE code
i also noticed a few other things and made the following cleanups/fixes:

- Fixed the mce_decode() weak alias - a weak alias is really not good
here, it should be a proper callback. A weak alias will be overriden
if a piece of code is built into the kernel - not good, obviously.

- The patch initializes the callback on AMD family 10h and 11h - a
quick glance suggests that decoding of earlier models isnt supported?

- Added the more correct fallback printk of:

No support for human readable MCE decoding on this CPU type.
Transcribe the message and run it through 'mcelog --ascii' to decode.

On CPUs that dont have a decoder.

- Made the surrounding code more readable.

Note that the callback allows us to have a default fallback - without
having to check the CPU versions during the printout itself. When an
EDAC module registers itself, it can install the decode-print function.

(there's no unregister needed as this is core code.)

Borislav:

- add K8 to the set of supported CPUs

- always build in edac_mce_amd since we use an early_initcall now

- fixup checkpatch warnings

Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/mce.h | 2 +
arch/x86/kernel/cpu/mcheck/mce.c | 49 ++++++++++++++++++++++++--------------
drivers/edac/Makefile | 4 +--
drivers/edac/edac_mce_amd.c | 15 +++++++++++-
4 files changed, 48 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index b608a64..f1363b7 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -133,6 +133,8 @@ static inline void winchip_mcheck_init(struct cpuinfo_x86 *c) {}
static inline void enable_p5_mce(void) {}
#endif

+extern void (*x86_mce_decode_callback)(struct mce *m);
+
void mce_setup(struct mce *m);
void mce_log(struct mce *m);
DECLARE_PER_CPU(struct sys_device, mce_dev);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 183c345..614c849 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -85,6 +85,18 @@ static DECLARE_WAIT_QUEUE_HEAD(mce_wait);
static DEFINE_PER_CPU(struct mce, mces_seen);
static int cpu_missing;

+static void default_decode_mce(struct mce *m)
+{
+ pr_emerg("No human readable MCE decoding support on this CPU type.\n");
+ pr_emerg("Run the message through 'mcelog --ascii' to decode.\n");
+}
+
+/*
+ * CPU/chipset specific EDAC code can register a callback here to print
+ * MCE errors in a human-readable form:
+ */
+void (*x86_mce_decode_callback)(struct mce *m) = default_decode_mce;
+EXPORT_SYMBOL(x86_mce_decode_callback);

/* MCA banks polled by the period polling timer for corrected events */
DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = {
@@ -165,46 +177,47 @@ void mce_log(struct mce *mce)
set_bit(0, &mce_need_notify);
}

-void __weak decode_mce(struct mce *m)
-{
- return;
-}
-
static void print_mce(struct mce *m)
{
- printk(KERN_EMERG
- "CPU %d: Machine Check Exception: %16Lx Bank %d: %016Lx\n",
+ pr_emerg("CPU %d: Machine Check Exception: %16Lx Bank %d: %016Lx\n",
m->extcpu, m->mcgstatus, m->bank, m->status);
+
if (m->ip) {
- printk(KERN_EMERG "RIP%s %02x:<%016Lx> ",
+ pr_emerg("RIP%s %02x:<%016Lx> ",
!(m->mcgstatus & MCG_STATUS_EIPV) ? " !INEXACT!" : "",
m->cs, m->ip);
+
if (m->cs == __KERNEL_CS)
print_symbol("{%s}", m->ip);
- printk(KERN_CONT "\n");
+ pr_cont("\n");
}
- printk(KERN_EMERG "TSC %llx ", m->tsc);
+
+ pr_emerg("TSC %llx ", m->tsc);
if (m->addr)
- printk(KERN_CONT "ADDR %llx ", m->addr);
+ pr_cont("ADDR %llx ", m->addr);
if (m->misc)
- printk(KERN_CONT "MISC %llx ", m->misc);
- printk(KERN_CONT "\n");
- printk(KERN_EMERG "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x\n",
+ pr_cont("MISC %llx ", m->misc);
+
+ pr_cont("\n");
+ pr_emerg("PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x\n",
m->cpuvendor, m->cpuid, m->time, m->socketid,
m->apicid);

- decode_mce(m);
+ /*
+ * Print out human-readable details about the MCE error,
+ * (if the CPU has an implementation for that):
+ */
+ x86_mce_decode_callback(m);
}

static void print_mce_head(void)
{
- printk(KERN_EMERG "\nHARDWARE ERROR\n");
+ pr_emerg("\nHARDWARE ERROR\n");
}

static void print_mce_tail(void)
{
- printk(KERN_EMERG "This is not a software problem!\n"
- "Run through mcelog --ascii to decode and contact your hardware vendor\n");
+ pr_emerg("This is not a software problem!\n");
}

#define PANIC_TIMEOUT 5 /* 5 seconds */
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 7a473bb..74fc527 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -17,9 +17,7 @@ ifdef CONFIG_PCI
edac_core-objs += edac_pci.o edac_pci_sysfs.o
endif

-ifdef CONFIG_CPU_SUP_AMD
-edac_core-objs += edac_mce_amd.o
-endif
+obj-$(CONFIG_CPU_SUP_AMD) += edac_mce_amd.o

obj-$(CONFIG_EDAC_AMD76X) += amd76x_edac.o
obj-$(CONFIG_EDAC_CPC925) += cpc925_edac.o
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 0c21c37..83a01a1 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -362,7 +362,7 @@ static inline void amd_decode_err_code(unsigned int ec)
pr_warning("Huh? Unknown MCE error 0x%x\n", ec);
}

-void decode_mce(struct mce *m)
+static void amd_decode_mce(struct mce *m)
{
struct err_regs regs;
int node, ecc;
@@ -420,3 +420,16 @@ void decode_mce(struct mce *m)

amd_decode_err_code(m->status & 0xffff);
}
+
+static int __init mce_amd_init(void)
+{
+ /*
+ * We can decode MCEs for Opteron and later CPUs:
+ */
+ if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
+ (boot_cpu_data.x86 >= 0xf))
+ x86_mce_decode_callback = amd_decode_mce;
+
+ return 0;
+}
+early_initcall(mce_amd_init);
--
1.6.4.3

--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. M?nchen, Germany
Research | Gesch?ftsf?hrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis M?nchen
(OSRC) | Registergericht M?nchen, HRB Nr. 43632

2009-10-01 14:36:02

by Linus Torvalds

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9



On Thu, 1 Oct 2009, Borislav Petkov wrote:
>
> Ok, here it is, tested on two Fam10 machines here with injecting MCEs.
> The decoding code is now built-in by default (early_initcall requires
> !MODULE).

I don't think it has to require !MODULE. We could do what we do for the
other initcalls, ie if MODULE we turn it into just a regular initcall. If
that allows something like the EDAC MCE to be built as a module, and
people want to, then just go ahead and add the one-liner to <linux/init.h>

Of course, if it _requires_ being loaded early for some other reason, then
that's a different issue. But don't do it just because we don't have the

#define early_initcall(fn) module_init(fn)

line.

Linus

Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9

On Thu, Oct 01, 2009 at 07:34:11AM -0700, Linus Torvalds wrote:
> > Ok, here it is, tested on two Fam10 machines here with injecting MCEs.
> > The decoding code is now built-in by default (early_initcall requires
> > !MODULE).
>
> I don't think it has to require !MODULE. We could do what we do for the
> other initcalls, ie if MODULE we turn it into just a regular initcall. If
> that allows something like the EDAC MCE to be built as a module, and
> people want to, then just go ahead and add the one-liner to <linux/init.h>
>
> Of course, if it _requires_ being loaded early for some other reason, then

the only reason I can think of is if you want to be able to decode MCEs
happening as early as possible before the modules have been init-ted.

Question really is, what is more important to us? But we definitely
can do the module stuff too and get rid of ~5K core kernel fat by
relocating the whole functionality completely into the module. Hmm...

--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. M?nchen, Germany
Research | Gesch?ftsf?hrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis M?nchen
(OSRC) | Registergericht M?nchen, HRB Nr. 43632

2009-10-01 14:55:24

by Ingo Molnar

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9


* Linus Torvalds <[email protected]> wrote:

> On Thu, 1 Oct 2009, Borislav Petkov wrote:
> >
> > Ok, here it is, tested on two Fam10 machines here with injecting
> > MCEs. The decoding code is now built-in by default (early_initcall
> > requires !MODULE).
>
> I don't think it has to require !MODULE. We could do what we do for
> the other initcalls, ie if MODULE we turn it into just a regular
> initcall. If that allows something like the EDAC MCE to be built as a
> module, and people want to, then just go ahead and add the one-liner
> to <linux/init.h>
>
> Of course, if it _requires_ being loaded early for some other reason,
> then that's a different issue. [...]

i think it's borderline.

The one issue that makes it nice to be core code is the fact that many
hardware problems hit early during bootup, so having the human-readable
decoder there has practical advantages.

I'd still like to see it nicely abstracted out, and not hardwired into
lowlevel code. I.e. we should slowly move towards having proper chipset
drivers in the long run. (the fact that northbridges now sit on the CPU
die make this easier - there's fewer variations in practice.)

( One detail: if the MODULE=y case is allowed, the unregister side of
the callback has to be implemented. That could be .33 material i
suspect. )

Ingo

2009-10-01 15:00:59

by Ingo Molnar

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9


* Borislav Petkov <[email protected]> wrote:

> On Thu, Oct 01, 2009 at 07:34:11AM -0700, Linus Torvalds wrote:
> > > Ok, here it is, tested on two Fam10 machines here with injecting MCEs.
> > > The decoding code is now built-in by default (early_initcall requires
> > > !MODULE).
> >
> > I don't think it has to require !MODULE. We could do what we do for the
> > other initcalls, ie if MODULE we turn it into just a regular initcall. If
> > that allows something like the EDAC MCE to be built as a module, and
> > people want to, then just go ahead and add the one-liner to <linux/init.h>
> >
> > Of course, if it _requires_ being loaded early for some other reason, then
>
> the only reason I can think of is if you want to be able to decode
> MCEs happening as early as possible before the modules have been
> init-ted.
>
> Question really is, what is more important to us? But we definitely
> can do the module stuff too and get rid of ~5K core kernel fat by
> relocating the whole functionality completely into the module. Hmm...

I'd focus on the built-in-side initially, which is the most
practical/debuggable/serviceable solution. It's of course not a problem
(at all) if it is _also_ module loadable but that's not the primary
goal.

Ingo

Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9

On Thu, Oct 01, 2009 at 05:00:46PM +0200, Ingo Molnar wrote:
> > > Of course, if it _requires_ being loaded early for some other reason, then
> >
> > the only reason I can think of is if you want to be able to decode
> > MCEs happening as early as possible before the modules have been
> > init-ted.
> >
> > Question really is, what is more important to us? But we definitely
> > can do the module stuff too and get rid of ~5K core kernel fat by
> > relocating the whole functionality completely into the module. Hmm...
>
> I'd focus on the built-in-side initially, which is the most
> practical/debuggable/serviceable solution. It's of course not a problem
> (at all) if it is _also_ module loadable but that's not the primary
> goal.

How about a CONFIG_EDAC_DECODE_MCE which depends on CONFIG_CPU_SUP_AMD and
defaults to yes with a config text saying something in the likes of

"... Select Y here if you want to decode MCEs happening early before the
modules framework has been initialized...."

?

--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. M?nchen, Germany
Research | Gesch?ftsf?hrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis M?nchen
(OSRC) | Registergericht M?nchen, HRB Nr. 43632

2009-10-01 15:26:25

by Andi Kleen

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9

On Thu, Oct 01, 2009 at 07:34:11AM -0700, Linus Torvalds wrote:
> Of course, if it _requires_ being loaded early for some other reason, then
> that's a different issue. But don't do it just because we don't have the
>
> #define early_initcall(fn) module_init(fn)
>
> line.

Or just drop the patch like I proposed. I don't think it serves
a very useful purpose anyways.

-Andi
--
[email protected] -- Speaking for myself only.

2009-10-01 15:33:03

by Ingo Molnar

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9


* Borislav Petkov <[email protected]> wrote:

> On Thu, Oct 01, 2009 at 05:00:46PM +0200, Ingo Molnar wrote:
> > > > Of course, if it _requires_ being loaded early for some other reason, then
> > >
> > > the only reason I can think of is if you want to be able to decode
> > > MCEs happening as early as possible before the modules have been
> > > init-ted.
> > >
> > > Question really is, what is more important to us? But we definitely
> > > can do the module stuff too and get rid of ~5K core kernel fat by
> > > relocating the whole functionality completely into the module. Hmm...
> >
> > I'd focus on the built-in-side initially, which is the most
> > practical/debuggable/serviceable solution. It's of course not a problem
> > (at all) if it is _also_ module loadable but that's not the primary
> > goal.
>
> How about a CONFIG_EDAC_DECODE_MCE which depends on CONFIG_CPU_SUP_AMD
> and defaults to yes with a config text saying something in the likes
> of
>
> "... Select Y here if you want to decode MCEs happening early before
> the modules framework has been initialized...."

Yeah, that would be good. It should also obviously depend on
CONFIG_X86_MCE - so only visible if MCE support is enabled.

Ingo

Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9

On Thu, Oct 01, 2009 at 05:32:50PM +0200, Ingo Molnar wrote:
> Yeah, that would be good. It should also obviously depend on
> CONFIG_X86_MCE - so only visible if MCE support is enabled.

Ok, here are the patches with the discussed changes added. By the way,
do we want that lot in .32 still or should I queue them up for the merge
window next year :) ?

--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. M?nchen, Germany
Research | Gesch?ftsf?hrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis M?nchen
(OSRC) | Registergericht M?nchen, HRB Nr. 43632

Subject: [PATCH 1/3] x86, mce, edac: Fix MCE decoding callback logic

>From 43ded574702ff1f367acb3d790c69b2687a666dd Mon Sep 17 00:00:00 2001
From: Ingo Molnar <[email protected]>
Date: Thu, 1 Oct 2009 16:14:32 +0200
Subject: [PATCH 1/3] x86, mce, edac: Fix MCE decoding callback logic

Make decoding of MCEs happen only on AMD hardware by registering a
non-default callback only on CPU families which support it.

While looking at the interaction of decode_mce() with the other MCE
code i also noticed a few other things and made the following
cleanups/fixes:

- Fixed the mce_decode() weak alias - a weak alias is really not
good here, it should be a proper callback. A weak alias will be
overriden if a piece of code is built into the kernel - not
good, obviously.

- The patch initializes the callback on AMD family 10h and 11h.

- Added the more correct fallback printk of:

No support for human readable MCE decoding on this CPU type.
Transcribe the message and run it through 'mcelog --ascii' to decode.

On CPUs that dont have a decoder.

- Made the surrounding code more readable.

Note that the callback allows us to have a default fallback -
without having to check the CPU versions during the printout
itself. When an EDAC module registers itself, it can install the
decode-print function.

(there's no unregister needed as this is core code.)

version -v2 by Borislav Petkov:

- add K8 to the set of supported CPUs

- always build in edac_mce_amd since we use an early_initcall now

- fix checkpatch warnings

Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Linus Torvalds <[email protected]>
LKML-Reference: <20091001141432.GA11410@aftab>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/mce.h | 2 +
arch/x86/kernel/cpu/mcheck/mce.c | 58 +++++++++++++++++++++++--------------
drivers/edac/Makefile | 2 +-
drivers/edac/edac_mce_amd.c | 15 +++++++++-
4 files changed, 53 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index b608a64..f1363b7 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -133,6 +133,8 @@ static inline void winchip_mcheck_init(struct cpuinfo_x86 *c) {}
static inline void enable_p5_mce(void) {}
#endif

+extern void (*x86_mce_decode_callback)(struct mce *m);
+
void mce_setup(struct mce *m);
void mce_log(struct mce *m);
DECLARE_PER_CPU(struct sys_device, mce_dev);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 183c345..b1598a9 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -85,6 +85,18 @@ static DECLARE_WAIT_QUEUE_HEAD(mce_wait);
static DEFINE_PER_CPU(struct mce, mces_seen);
static int cpu_missing;

+static void default_decode_mce(struct mce *m)
+{
+ pr_emerg("No human readable MCE decoding support on this CPU type.\n");
+ pr_emerg("Run the message through 'mcelog --ascii' to decode.\n");
+}
+
+/*
+ * CPU/chipset specific EDAC code can register a callback here to print
+ * MCE errors in a human-readable form:
+ */
+void (*x86_mce_decode_callback)(struct mce *m) = default_decode_mce;
+EXPORT_SYMBOL(x86_mce_decode_callback);

/* MCA banks polled by the period polling timer for corrected events */
DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = {
@@ -165,46 +177,46 @@ void mce_log(struct mce *mce)
set_bit(0, &mce_need_notify);
}

-void __weak decode_mce(struct mce *m)
-{
- return;
-}
-
static void print_mce(struct mce *m)
{
- printk(KERN_EMERG
- "CPU %d: Machine Check Exception: %16Lx Bank %d: %016Lx\n",
+ pr_emerg("CPU %d: Machine Check Exception: %16Lx Bank %d: %016Lx\n",
m->extcpu, m->mcgstatus, m->bank, m->status);
+
if (m->ip) {
- printk(KERN_EMERG "RIP%s %02x:<%016Lx> ",
- !(m->mcgstatus & MCG_STATUS_EIPV) ? " !INEXACT!" : "",
- m->cs, m->ip);
+ pr_emerg("RIP%s %02x:<%016Lx> ",
+ !(m->mcgstatus & MCG_STATUS_EIPV) ? " !INEXACT!" : "",
+ m->cs, m->ip);
+
if (m->cs == __KERNEL_CS)
print_symbol("{%s}", m->ip);
- printk(KERN_CONT "\n");
+ pr_cont("\n");
}
- printk(KERN_EMERG "TSC %llx ", m->tsc);
+
+ pr_emerg("TSC %llx ", m->tsc);
if (m->addr)
- printk(KERN_CONT "ADDR %llx ", m->addr);
+ pr_cont("ADDR %llx ", m->addr);
if (m->misc)
- printk(KERN_CONT "MISC %llx ", m->misc);
- printk(KERN_CONT "\n");
- printk(KERN_EMERG "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x\n",
- m->cpuvendor, m->cpuid, m->time, m->socketid,
- m->apicid);
+ pr_cont("MISC %llx ", m->misc);
+
+ pr_cont("\n");
+ pr_emerg("PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x\n",
+ m->cpuvendor, m->cpuid, m->time, m->socketid, m->apicid);

- decode_mce(m);
+ /*
+ * Print out human-readable details about the MCE error,
+ * (if the CPU has an implementation for that):
+ */
+ x86_mce_decode_callback(m);
}

static void print_mce_head(void)
{
- printk(KERN_EMERG "\nHARDWARE ERROR\n");
+ pr_emerg("\nHARDWARE ERROR\n");
}

static void print_mce_tail(void)
{
- printk(KERN_EMERG "This is not a software problem!\n"
- "Run through mcelog --ascii to decode and contact your hardware vendor\n");
+ pr_emerg("This is not a software problem!\n");
}

#define PANIC_TIMEOUT 5 /* 5 seconds */
@@ -218,6 +230,7 @@ static atomic_t mce_fake_paniced;
static void wait_for_panic(void)
{
long timeout = PANIC_TIMEOUT*USEC_PER_SEC;
+
preempt_disable();
local_irq_enable();
while (timeout-- > 0)
@@ -285,6 +298,7 @@ static void mce_panic(char *msg, struct mce *final, char *exp)
static int msr_to_offset(u32 msr)
{
unsigned bank = __get_cpu_var(injectm.bank);
+
if (msr == rip_msr)
return offsetof(struct mce, ip);
if (msr == MSR_IA32_MCx_STATUS(bank))
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 7a473bb..8701cd7 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -18,7 +18,7 @@ edac_core-objs += edac_pci.o edac_pci_sysfs.o
endif

ifdef CONFIG_CPU_SUP_AMD
-edac_core-objs += edac_mce_amd.o
+obj-$(CONFIG_X86_MCE) += edac_mce_amd.o
endif

obj-$(CONFIG_EDAC_AMD76X) += amd76x_edac.o
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 0c21c37..83a01a1 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -362,7 +362,7 @@ static inline void amd_decode_err_code(unsigned int ec)
pr_warning("Huh? Unknown MCE error 0x%x\n", ec);
}

-void decode_mce(struct mce *m)
+static void amd_decode_mce(struct mce *m)
{
struct err_regs regs;
int node, ecc;
@@ -420,3 +420,16 @@ void decode_mce(struct mce *m)

amd_decode_err_code(m->status & 0xffff);
}
+
+static int __init mce_amd_init(void)
+{
+ /*
+ * We can decode MCEs for Opteron and later CPUs:
+ */
+ if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
+ (boot_cpu_data.x86 >= 0xf))
+ x86_mce_decode_callback = amd_decode_mce;
+
+ return 0;
+}
+early_initcall(mce_amd_init);
--
1.6.4.3

--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. M?nchen, Germany
Research | Gesch?ftsf?hrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis M?nchen
(OSRC) | Registergericht M?nchen, HRB Nr. 43632

Subject: [PATCH 2/3] initcalls: add early_initcall for modules

>From 52cc05f71bac10dff8a3aacf1cad16a0bf6c375b Mon Sep 17 00:00:00 2001
From: Borislav Petkov <[email protected]>
Date: Thu, 1 Oct 2009 18:52:45 +0200
Subject: [PATCH 2/3] initcalls: add early_initcall for modules

Signed-off-by: Borislav Petkov <[email protected]>
---
include/linux/init.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/init.h b/include/linux/init.h
index 400adbb..ff8bde5 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -271,6 +271,7 @@ void __init parse_early_options(char *cmdline);
#else /* MODULE */

/* Don't use these in modules, but some people do... */
+#define early_initcall(fn) module_init(fn)
#define core_initcall(fn) module_init(fn)
#define postcore_initcall(fn) module_init(fn)
#define arch_initcall(fn) module_init(fn)
--
1.6.4.3

--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. M?nchen, Germany
Research | Gesch?ftsf?hrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis M?nchen
(OSRC) | Registergericht M?nchen, HRB Nr. 43632

2009-10-02 13:26:28

by Ingo Molnar

[permalink] [raw]
Subject: Re: x86: mce: Please revert 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9


* Borislav Petkov <[email protected]> wrote:

> On Thu, Oct 01, 2009 at 05:32:50PM +0200, Ingo Molnar wrote:
> > Yeah, that would be good. It should also obviously depend on
> > CONFIG_X86_MCE - so only visible if MCE support is enabled.
>
> Ok, here are the patches with the discussed changes added. By the way,
> do we want that lot in .32 still or should I queue them up for the
> merge window next year :) ?

I think it might be fine for this window still. The edac-mce angle is
new code in this cycle and we want to fix it. I'll test it and we'll
see.

Ingo

Subject: [PATCH 3/3] EDAC: carve out AMD MCE decoding logic

>From 7fa3435b6fdf73e8dfeb9153fe7cec95b88d6196 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <[email protected]>
Date: Fri, 2 Oct 2009 11:59:20 +0200
Subject: [PATCH 3/3] EDAC: carve out AMD MCE decoding logic

This converts the MCE decoding logic into a standalone config option
which can be built-in or a module, the first one being the default for
MCEs happening early on in the boot process.

Signed-off-by: Borislav Petkov <[email protected]>
---
drivers/edac/Kconfig | 14 +++++++++++++-
drivers/edac/Makefile | 5 +----
drivers/edac/edac_mce_amd.c | 19 ++++++++++++++++++-
3 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 02127e5..55c9c59 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -47,6 +47,18 @@ config EDAC_DEBUG_VERBOSE
Source file name and line number where debugging message
printed will be added to debugging message.

+ config EDAC_DECODE_MCE
+ tristate "Decode MCEs in human-readable form (only on AMD for now)"
+ depends on CPU_SUP_AMD && X86_MCE
+ default y
+ ---help---
+ Enable this option if you want to decode Machine Check Exceptions
+ occuring on your machine in human-readable form.
+
+ You should definitely say Y here in case you want to decode MCEs
+ which occur really early upon boot, before the module infrastructure
+ has been initialized.
+
config EDAC_MM_EDAC
tristate "Main Memory EDAC (Error Detection And Correction) reporting"
help
@@ -59,7 +71,7 @@ config EDAC_MM_EDAC

config EDAC_AMD64
tristate "AMD64 (Opteron, Athlon64) K8, F10h, F11h"
- depends on EDAC_MM_EDAC && K8_NB && X86_64 && PCI && CPU_SUP_AMD
+ depends on EDAC_MM_EDAC && K8_NB && X86_64 && PCI && EDAC_DECODE_MCE
help
Support for error detection and correction on the AMD 64
Families of Memory Controllers (K8, F10h and F11h)
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 8701cd7..bc5dc23 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -6,7 +6,6 @@
# GNU General Public License.
#

-
obj-$(CONFIG_EDAC) := edac_stub.o
obj-$(CONFIG_EDAC_MM_EDAC) += edac_core.o

@@ -17,9 +16,7 @@ ifdef CONFIG_PCI
edac_core-objs += edac_pci.o edac_pci_sysfs.o
endif

-ifdef CONFIG_CPU_SUP_AMD
-obj-$(CONFIG_X86_MCE) += edac_mce_amd.o
-endif
+obj-$(CONFIG_EDAC_DECODE_MCE) += edac_mce_amd.o

obj-$(CONFIG_EDAC_AMD76X) += amd76x_edac.o
obj-$(CONFIG_EDAC_CPC925) += cpc925_edac.o
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 83a01a1..713ed7d 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -3,6 +3,7 @@

static bool report_gart_errors;
static void (*nb_bus_decoder)(int node_id, struct err_regs *regs);
+static void (*orig_mce_callback)(struct mce *m);

void amd_report_gart_errors(bool v)
{
@@ -427,9 +428,25 @@ static int __init mce_amd_init(void)
* We can decode MCEs for Opteron and later CPUs:
*/
if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
- (boot_cpu_data.x86 >= 0xf))
+ (boot_cpu_data.x86 >= 0xf)) {
+ /* safe the default decode mce callback */
+ orig_mce_callback = x86_mce_decode_callback;
+
x86_mce_decode_callback = amd_decode_mce;
+ }

return 0;
}
early_initcall(mce_amd_init);
+
+#ifdef MODULE
+static void __exit mce_amd_exit(void)
+{
+ x86_mce_decode_callback = orig_mce_callback;
+}
+
+MODULE_DESCRIPTION("AMD MCE decoder");
+MODULE_ALIAS("edac-mce-amd");
+MODULE_LICENSE("GPL");
+module_exit(mce_amd_exit);
+#endif
--
1.6.4.3

--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. M?nchen, Germany
Research | Gesch?ftsf?hrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis M?nchen
(OSRC) | Registergericht M?nchen, HRB Nr. 43632

2009-10-02 13:40:09

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 3/3] EDAC: carve out AMD MCE decoding logic


* Borislav Petkov <[email protected]> wrote:

> +static void (*orig_mce_callback)(struct mce *m);
>
> void amd_report_gart_errors(bool v)
> {
> @@ -427,9 +428,25 @@ static int __init mce_amd_init(void)
> * We can decode MCEs for Opteron and later CPUs:
> */
> if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
> - (boot_cpu_data.x86 >= 0xf))
> + (boot_cpu_data.x86 >= 0xf)) {
> + /* safe the default decode mce callback */
> + orig_mce_callback = x86_mce_decode_callback;
> +
> x86_mce_decode_callback = amd_decode_mce;
> + }
>
> return 0;
> }
> early_initcall(mce_amd_init);
> +
> +#ifdef MODULE
> +static void __exit mce_amd_exit(void)
> +{
> + x86_mce_decode_callback = orig_mce_callback;
> +}

I suspect this is fine currently because no two EDAC modules should be
active at the same time. A followup cleanup patch would be nice
nevertheless that uses a notifier chain here with proper
register/unregister locking.

Ingo

2009-10-02 14:02:19

by Ingo Molnar

[permalink] [raw]
Subject: [tip:x86/urgent] x86: EDAC: MCE: Fix MCE decoding callback logic

Commit-ID: f436f8bb73138bc74eb1c6527723e00988ad8a8a
Gitweb: http://git.kernel.org/tip/f436f8bb73138bc74eb1c6527723e00988ad8a8a
Author: Ingo Molnar <[email protected]>
AuthorDate: Thu, 1 Oct 2009 16:14:32 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 2 Oct 2009 15:42:18 +0200

x86: EDAC: MCE: Fix MCE decoding callback logic

Make decoding of MCEs happen only on AMD hardware by registering a
non-default callback only on CPU families which support it.

While looking at the interaction of decode_mce() with the other MCE
code i also noticed a few other things and made the following
cleanups/fixes:

- Fixed the mce_decode() weak alias - a weak alias is really not
good here, it should be a proper callback. A weak alias will be
overriden if a piece of code is built into the kernel - not
good, obviously.

- The patch initializes the callback on AMD family 10h and 11h.

- Added the more correct fallback printk of:

No support for human readable MCE decoding on this CPU type.
Transcribe the message and run it through 'mcelog --ascii' to decode.

On CPUs that dont have a decoder.

- Made the surrounding code more readable.

Note that the callback allows us to have a default fallback -
without having to check the CPU versions during the printout
itself. When an EDAC module registers itself, it can install the
decode-print function.

(there's no unregister needed as this is core code.)

version -v2 by Borislav Petkov:

- add K8 to the set of supported CPUs

- always build in edac_mce_amd since we use an early_initcall now

- fix checkpatch warnings

Signed-off-by: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Andi Kleen <[email protected]>
LKML-Reference: <20091001141432.GA11410@aftab>
Signed-off-by: Ingo Molnar <[email protected]>


---
arch/x86/include/asm/mce.h | 2 +
arch/x86/kernel/cpu/mcheck/mce.c | 58 +++++++++++++++++++++++--------------
drivers/edac/Makefile | 2 +-
drivers/edac/edac_mce_amd.c | 15 +++++++++-
4 files changed, 53 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index b608a64..f1363b7 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -133,6 +133,8 @@ static inline void winchip_mcheck_init(struct cpuinfo_x86 *c) {}
static inline void enable_p5_mce(void) {}
#endif

+extern void (*x86_mce_decode_callback)(struct mce *m);
+
void mce_setup(struct mce *m);
void mce_log(struct mce *m);
DECLARE_PER_CPU(struct sys_device, mce_dev);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 183c345..b1598a9 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -85,6 +85,18 @@ static DECLARE_WAIT_QUEUE_HEAD(mce_wait);
static DEFINE_PER_CPU(struct mce, mces_seen);
static int cpu_missing;

+static void default_decode_mce(struct mce *m)
+{
+ pr_emerg("No human readable MCE decoding support on this CPU type.\n");
+ pr_emerg("Run the message through 'mcelog --ascii' to decode.\n");
+}
+
+/*
+ * CPU/chipset specific EDAC code can register a callback here to print
+ * MCE errors in a human-readable form:
+ */
+void (*x86_mce_decode_callback)(struct mce *m) = default_decode_mce;
+EXPORT_SYMBOL(x86_mce_decode_callback);

/* MCA banks polled by the period polling timer for corrected events */
DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = {
@@ -165,46 +177,46 @@ void mce_log(struct mce *mce)
set_bit(0, &mce_need_notify);
}

-void __weak decode_mce(struct mce *m)
-{
- return;
-}
-
static void print_mce(struct mce *m)
{
- printk(KERN_EMERG
- "CPU %d: Machine Check Exception: %16Lx Bank %d: %016Lx\n",
+ pr_emerg("CPU %d: Machine Check Exception: %16Lx Bank %d: %016Lx\n",
m->extcpu, m->mcgstatus, m->bank, m->status);
+
if (m->ip) {
- printk(KERN_EMERG "RIP%s %02x:<%016Lx> ",
- !(m->mcgstatus & MCG_STATUS_EIPV) ? " !INEXACT!" : "",
- m->cs, m->ip);
+ pr_emerg("RIP%s %02x:<%016Lx> ",
+ !(m->mcgstatus & MCG_STATUS_EIPV) ? " !INEXACT!" : "",
+ m->cs, m->ip);
+
if (m->cs == __KERNEL_CS)
print_symbol("{%s}", m->ip);
- printk(KERN_CONT "\n");
+ pr_cont("\n");
}
- printk(KERN_EMERG "TSC %llx ", m->tsc);
+
+ pr_emerg("TSC %llx ", m->tsc);
if (m->addr)
- printk(KERN_CONT "ADDR %llx ", m->addr);
+ pr_cont("ADDR %llx ", m->addr);
if (m->misc)
- printk(KERN_CONT "MISC %llx ", m->misc);
- printk(KERN_CONT "\n");
- printk(KERN_EMERG "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x\n",
- m->cpuvendor, m->cpuid, m->time, m->socketid,
- m->apicid);
+ pr_cont("MISC %llx ", m->misc);
+
+ pr_cont("\n");
+ pr_emerg("PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x\n",
+ m->cpuvendor, m->cpuid, m->time, m->socketid, m->apicid);

- decode_mce(m);
+ /*
+ * Print out human-readable details about the MCE error,
+ * (if the CPU has an implementation for that):
+ */
+ x86_mce_decode_callback(m);
}

static void print_mce_head(void)
{
- printk(KERN_EMERG "\nHARDWARE ERROR\n");
+ pr_emerg("\nHARDWARE ERROR\n");
}

static void print_mce_tail(void)
{
- printk(KERN_EMERG "This is not a software problem!\n"
- "Run through mcelog --ascii to decode and contact your hardware vendor\n");
+ pr_emerg("This is not a software problem!\n");
}

#define PANIC_TIMEOUT 5 /* 5 seconds */
@@ -218,6 +230,7 @@ static atomic_t mce_fake_paniced;
static void wait_for_panic(void)
{
long timeout = PANIC_TIMEOUT*USEC_PER_SEC;
+
preempt_disable();
local_irq_enable();
while (timeout-- > 0)
@@ -285,6 +298,7 @@ static void mce_panic(char *msg, struct mce *final, char *exp)
static int msr_to_offset(u32 msr)
{
unsigned bank = __get_cpu_var(injectm.bank);
+
if (msr == rip_msr)
return offsetof(struct mce, ip);
if (msr == MSR_IA32_MCx_STATUS(bank))
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 7a473bb..8701cd7 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -18,7 +18,7 @@ edac_core-objs += edac_pci.o edac_pci_sysfs.o
endif

ifdef CONFIG_CPU_SUP_AMD
-edac_core-objs += edac_mce_amd.o
+obj-$(CONFIG_X86_MCE) += edac_mce_amd.o
endif

obj-$(CONFIG_EDAC_AMD76X) += amd76x_edac.o
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 0c21c37..83a01a1 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -362,7 +362,7 @@ static inline void amd_decode_err_code(unsigned int ec)
pr_warning("Huh? Unknown MCE error 0x%x\n", ec);
}

-void decode_mce(struct mce *m)
+static void amd_decode_mce(struct mce *m)
{
struct err_regs regs;
int node, ecc;
@@ -420,3 +420,16 @@ void decode_mce(struct mce *m)

amd_decode_err_code(m->status & 0xffff);
}
+
+static int __init mce_amd_init(void)
+{
+ /*
+ * We can decode MCEs for Opteron and later CPUs:
+ */
+ if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
+ (boot_cpu_data.x86 >= 0xf))
+ x86_mce_decode_callback = amd_decode_mce;
+
+ return 0;
+}
+early_initcall(mce_amd_init);

Subject: [tip:x86/urgent] initcalls: Add early_initcall() for modules

Commit-ID: 329bd4119c8a0afea95f9db6d6b402a2f2b40e84
Gitweb: http://git.kernel.org/tip/329bd4119c8a0afea95f9db6d6b402a2f2b40e84
Author: Borislav Petkov <[email protected]>
AuthorDate: Fri, 2 Oct 2009 15:23:21 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 2 Oct 2009 15:42:19 +0200

initcalls: Add early_initcall() for modules

Complete the early_initcall() API by making it available in modules
too. To be used by the EDAC/MCE code.

Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Andi Kleen <[email protected]>
LKML-Reference: <20091002132321.GC28682@aftab>
Signed-off-by: Ingo Molnar <[email protected]>


---
include/linux/init.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/init.h b/include/linux/init.h
index 400adbb..ff8bde5 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -271,6 +271,7 @@ void __init parse_early_options(char *cmdline);
#else /* MODULE */

/* Don't use these in modules, but some people do... */
+#define early_initcall(fn) module_init(fn)
#define core_initcall(fn) module_init(fn)
#define postcore_initcall(fn) module_init(fn)
#define arch_initcall(fn) module_init(fn)

Subject: [tip:x86/urgent] x86: EDAC: carve out AMD MCE decoding logic

Commit-ID: 0d18b2e34bd1ad8f5bd3f3a17b5e7df132e511a9
Gitweb: http://git.kernel.org/tip/0d18b2e34bd1ad8f5bd3f3a17b5e7df132e511a9
Author: Borislav Petkov <[email protected]>
AuthorDate: Fri, 2 Oct 2009 15:31:48 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 2 Oct 2009 15:42:19 +0200

x86: EDAC: carve out AMD MCE decoding logic

This converts the MCE decoding logic into a standalone config
option which can be built-in or a module, the first one being the
default for MCEs happening early on in the boot process.

This, beyond being separated in a cleaner way, also saves RAM by
making the decoding logic modular.

Signed-off-by: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Andi Kleen <[email protected]>
LKML-Reference: <20091002133148.GD28682@aftab>
Signed-off-by: Ingo Molnar <[email protected]>


---
drivers/edac/Kconfig | 14 +++++++++++++-
drivers/edac/Makefile | 5 +----
drivers/edac/edac_mce_amd.c | 19 ++++++++++++++++++-
3 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 02127e5..55c9c59 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -47,6 +47,18 @@ config EDAC_DEBUG_VERBOSE
Source file name and line number where debugging message
printed will be added to debugging message.

+ config EDAC_DECODE_MCE
+ tristate "Decode MCEs in human-readable form (only on AMD for now)"
+ depends on CPU_SUP_AMD && X86_MCE
+ default y
+ ---help---
+ Enable this option if you want to decode Machine Check Exceptions
+ occuring on your machine in human-readable form.
+
+ You should definitely say Y here in case you want to decode MCEs
+ which occur really early upon boot, before the module infrastructure
+ has been initialized.
+
config EDAC_MM_EDAC
tristate "Main Memory EDAC (Error Detection And Correction) reporting"
help
@@ -59,7 +71,7 @@ config EDAC_MM_EDAC

config EDAC_AMD64
tristate "AMD64 (Opteron, Athlon64) K8, F10h, F11h"
- depends on EDAC_MM_EDAC && K8_NB && X86_64 && PCI && CPU_SUP_AMD
+ depends on EDAC_MM_EDAC && K8_NB && X86_64 && PCI && EDAC_DECODE_MCE
help
Support for error detection and correction on the AMD 64
Families of Memory Controllers (K8, F10h and F11h)
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 8701cd7..bc5dc23 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -6,7 +6,6 @@
# GNU General Public License.
#

-
obj-$(CONFIG_EDAC) := edac_stub.o
obj-$(CONFIG_EDAC_MM_EDAC) += edac_core.o

@@ -17,9 +16,7 @@ ifdef CONFIG_PCI
edac_core-objs += edac_pci.o edac_pci_sysfs.o
endif

-ifdef CONFIG_CPU_SUP_AMD
-obj-$(CONFIG_X86_MCE) += edac_mce_amd.o
-endif
+obj-$(CONFIG_EDAC_DECODE_MCE) += edac_mce_amd.o

obj-$(CONFIG_EDAC_AMD76X) += amd76x_edac.o
obj-$(CONFIG_EDAC_CPC925) += cpc925_edac.o
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 83a01a1..713ed7d 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -3,6 +3,7 @@

static bool report_gart_errors;
static void (*nb_bus_decoder)(int node_id, struct err_regs *regs);
+static void (*orig_mce_callback)(struct mce *m);

void amd_report_gart_errors(bool v)
{
@@ -427,9 +428,25 @@ static int __init mce_amd_init(void)
* We can decode MCEs for Opteron and later CPUs:
*/
if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
- (boot_cpu_data.x86 >= 0xf))
+ (boot_cpu_data.x86 >= 0xf)) {
+ /* safe the default decode mce callback */
+ orig_mce_callback = x86_mce_decode_callback;
+
x86_mce_decode_callback = amd_decode_mce;
+ }

return 0;
}
early_initcall(mce_amd_init);
+
+#ifdef MODULE
+static void __exit mce_amd_exit(void)
+{
+ x86_mce_decode_callback = orig_mce_callback;
+}
+
+MODULE_DESCRIPTION("AMD MCE decoder");
+MODULE_ALIAS("edac-mce-amd");
+MODULE_LICENSE("GPL");
+module_exit(mce_amd_exit);
+#endif

Subject: Re: [PATCH 3/3] EDAC: carve out AMD MCE decoding logic

On Fri, Oct 02, 2009 at 03:39:54PM +0200, Ingo Molnar wrote:
> I suspect this is fine currently because no two EDAC modules should be
> active at the same time. A followup cleanup patch would be nice
> nevertheless that uses a notifier chain here with proper
> register/unregister locking.

Please do elaborate for I can't seem to follow.

AFAICT, we are going to have max one MCE decoder at any one time so no
need to notify it.

The x86_mce_decode_callback assignment probably needs locking just to be
on the safe side. We can do a

mce_set_decode_callback(void (*f)(struct mce*m))

helper which falls back to default_decode_mce() whenever its being
passed a NULL for *f using proper locking...

I'm pretty sure I'm missing something though...

--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. M?nchen, Germany
Research | Gesch?ftsf?hrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis M?nchen
(OSRC) | Registergericht M?nchen, HRB Nr. 43632

2009-10-02 18:47:40

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 3/3] EDAC: carve out AMD MCE decoding logic


* Borislav Petkov <[email protected]> wrote:

> On Fri, Oct 02, 2009 at 03:39:54PM +0200, Ingo Molnar wrote:
> > I suspect this is fine currently because no two EDAC modules should be
> > active at the same time. A followup cleanup patch would be nice
> > nevertheless that uses a notifier chain here with proper
> > register/unregister locking.
>
> Please do elaborate for I can't seem to follow.
>
> AFAICT, we are going to have max one MCE decoder at any one time so no
> need to notify it.
>
> The x86_mce_decode_callback assignment probably needs locking just to
> be on the safe side. We can do a
>
> mce_set_decode_callback(void (*f)(struct mce*m))
>
> helper which falls back to default_decode_mce() whenever its being
> passed a NULL for *f using proper locking...
>
> I'm pretty sure I'm missing something though...

No, the locking was all that i meant. Using atomic_notifier would solve
that. Make the default decoder low-prio, that way there's no need to do
the callback save/restore sequence either.

Ingo

2009-10-03 06:57:57

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] EDAC: carve out AMD MCE decoding logic

On Fri, Oct 02, 2009 at 08:47:14PM +0200, Ingo Molnar wrote:
> No, the locking was all that i meant. Using atomic_notifier would solve
> that. Make the default decoder low-prio, that way there's no need to do
> the callback save/restore sequence either.

Ok, how's that for starters, it has been only compile-tested and it
looks straight-forward enough to me...

--
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index f1363b7..9bb1756 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -214,5 +214,7 @@ void intel_init_thermal(struct cpuinfo_x86 *c);

void mce_log_therm_throt_event(__u64 status);

+int mce_register_decoder_notifier(struct notifier_block *nb);
+int mce_unregister_decoder_notifier(struct notifier_block *nb);
#endif /* __KERNEL__ */
#endif /* _ASM_X86_MCE_H */
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index b1598a9..e767cce 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -85,18 +85,36 @@ static DECLARE_WAIT_QUEUE_HEAD(mce_wait);
static DEFINE_PER_CPU(struct mce, mces_seen);
static int cpu_missing;

-static void default_decode_mce(struct mce *m)
+static ATOMIC_NOTIFIER_HEAD(mce_decoder_chain);
+
+/*
+ * CPU/chipset specific EDAC code can register a notifier call here to print
+ * MCE errors in a human-readable form.
+ */
+int mce_register_decoder_notifier(struct notifier_block *nb)
+{
+ return atomic_notifier_chain_register(&mce_decoder_chain, nb);
+}
+EXPORT_SYMBOL_GPL(mce_register_decoder_notifier);
+
+int mce_unregister_decoder_notifier(struct notifier_block *nb)
+{
+ return atomic_notifier_chain_unregister(&mce_decoder_chain, nb);
+}
+EXPORT_SYMBOL_GPL(mce_unregister_decoder_notifier);
+
+static int default_decode_mce(struct notifier_block *nb, unsigned long val,
+ void *data)
{
pr_emerg("No human readable MCE decoding support on this CPU type.\n");
pr_emerg("Run the message through 'mcelog --ascii' to decode.\n");
+
+ return NOTIFY_STOP;
}

-/*
- * CPU/chipset specific EDAC code can register a callback here to print
- * MCE errors in a human-readable form:
- */
-void (*x86_mce_decode_callback)(struct mce *m) = default_decode_mce;
-EXPORT_SYMBOL(x86_mce_decode_callback);
+static struct notifier_block mce_dec_nb = {
+ .notifier_call = default_decode_mce,
+};

/* MCA banks polled by the period polling timer for corrected events */
DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = {
@@ -204,9 +222,9 @@ static void print_mce(struct mce *m)

/*
* Print out human-readable details about the MCE error,
- * (if the CPU has an implementation for that):
+ * (if the CPU has an implementation for that)
*/
- x86_mce_decode_callback(m);
+ atomic_notifier_call_chain(&mce_decoder_chain, 0, m);
}

static void print_mce_head(void)
@@ -1420,6 +1438,8 @@ void __cpuinit mcheck_init(struct cpuinfo_x86 *c)
mce_cpu_features(c);
mce_init_timer();
INIT_WORK(&__get_cpu_var(mce_work), mce_process_work);
+
+ mce_register_decoder_notifier(&mce_dec_nb);
}

/*
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 713ed7d..aa0061e 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -363,8 +363,10 @@ static inline void amd_decode_err_code(unsigned int ec)
pr_warning("Huh? Unknown MCE error 0x%x\n", ec);
}

-static void amd_decode_mce(struct mce *m)
+static int amd_decode_mce(struct notifier_block *nb, unsigned long val,
+ void *data)
{
+ struct mce *m = (struct mce *)data;
struct err_regs regs;
int node, ecc;

@@ -420,20 +422,23 @@ static void amd_decode_mce(struct mce *m)
}

amd_decode_err_code(m->status & 0xffff);
+
+ return NOTIFY_STOP;
}

+static struct notifier_block amd_mce_dec_nb = {
+ .notifier_call = amd_decode_mce,
+ .priority = 100,
+};
+
static int __init mce_amd_init(void)
{
/*
* We can decode MCEs for Opteron and later CPUs:
*/
if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
- (boot_cpu_data.x86 >= 0xf)) {
- /* safe the default decode mce callback */
- orig_mce_callback = x86_mce_decode_callback;
-
- x86_mce_decode_callback = amd_decode_mce;
- }
+ (boot_cpu_data.x86 >= 0xf))
+ mce_register_decoder_notifier(&amd_mce_dec_nb);

return 0;
}
@@ -442,7 +447,7 @@ early_initcall(mce_amd_init);
#ifdef MODULE
static void __exit mce_amd_exit(void)
{
- x86_mce_decode_callback = orig_mce_callback;
+ mce_unregister_decoder_notifier(&amd_mce_dec_nb);
}

MODULE_DESCRIPTION("AMD MCE decoder");


--
Regards/Gruss,
Boris.

2009-10-03 07:18:22

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 3/3] EDAC: carve out AMD MCE decoding logic


* Borislav Petkov <[email protected]> wrote:

> On Fri, Oct 02, 2009 at 08:47:14PM +0200, Ingo Molnar wrote:
> > No, the locking was all that i meant. Using atomic_notifier would solve
> > that. Make the default decoder low-prio, that way there's no need to do
> > the callback save/restore sequence either.
>
> Ok, how's that for starters, it has been only compile-tested and it
> looks straight-forward enough to me...

looks good at first sight. There's two further simplifications you could
do:

> +int mce_register_decoder_notifier(struct notifier_block *nb);
> +int mce_unregister_decoder_notifier(struct notifier_block *nb);

expose the notifier list itself and dont create new register/unregister
APIs. Doing:

atomic_notifier_chain_register(&x86_mce_decoder_chain, nb);

straight from the EDAC code is clean enough.

plus:

> +static struct notifier_block amd_mce_dec_nb = {
> + .notifier_call = amd_decode_mce,
> + .priority = 100,
> +};
> +

it's better to set the default notifier to negative priority, and let
the EDAC notifier(s) be at the default 0 priority. The 100 is arbitrary
and ugly - that line can then be left out altogether.

Ingo

Subject: Re: [PATCH 3/3] EDAC: carve out AMD MCE decoding logic

On Sat, Oct 03, 2009 at 09:18:07AM +0200, Ingo Molnar wrote:
> > > No, the locking was all that i meant. Using atomic_notifier would solve
> > > that. Make the default decoder low-prio, that way there's no need to do
> > > the callback save/restore sequence either.
> >
> > Ok, how's that for starters, it has been only compile-tested and it
> > looks straight-forward enough to me...
>
> looks good at first sight. There's two further simplifications you could
> do:

Ok, here we go, had to make sure we register the notifier only once
(done on the BSP now) since mcheck_init() runs on each CPU.

--
>From 294358680f02b2a8998d2f6b04d99e343fc3b475 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <[email protected]>
Date: Sat, 3 Oct 2009 20:16:25 +0200
Subject: [PATCH] mce, edac: use an atomic notifier for MCEs decoding

Add an atomic notifier which ensures proper locking when conveying MCE
info to EDAC for decoding. The actual notifier call overrides a default,
negative priority notifier.

Note: make sure we register the default decoder only once since
mcheck_init() runs on each CPU.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/mce.h | 3 ++-
arch/x86/kernel/cpu/mcheck/mce.c | 30 +++++++++++++++++++++---------
drivers/edac/edac_mce_amd.c | 23 ++++++++++++++---------
3 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index f1363b7..227a72d 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -108,6 +108,8 @@ struct mce_log {
#define K8_MCE_THRESHOLD_BANK_5 (MCE_THRESHOLD_BASE + 5 * 9)
#define K8_MCE_THRESHOLD_DRAM_ECC (MCE_THRESHOLD_BANK_4 + 0)

+extern struct atomic_notifier_head x86_mce_decoder_chain;
+
#ifdef __KERNEL__

#include <linux/percpu.h>
@@ -213,6 +215,5 @@ extern void (*threshold_cpu_callback)(unsigned long action, unsigned int cpu);
void intel_init_thermal(struct cpuinfo_x86 *c);

void mce_log_therm_throt_event(__u64 status);
-
#endif /* __KERNEL__ */
#endif /* _ASM_X86_MCE_H */
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index b1598a9..09c2e2e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -85,18 +85,26 @@ static DECLARE_WAIT_QUEUE_HEAD(mce_wait);
static DEFINE_PER_CPU(struct mce, mces_seen);
static int cpu_missing;

-static void default_decode_mce(struct mce *m)
+/*
+ * CPU/chipset specific EDAC code can register a notifier call here to print
+ * MCE errors in a human-readable form.
+ */
+ATOMIC_NOTIFIER_HEAD(x86_mce_decoder_chain);
+EXPORT_SYMBOL_GPL(x86_mce_decoder_chain);
+
+static int default_decode_mce(struct notifier_block *nb, unsigned long val,
+ void *data)
{
pr_emerg("No human readable MCE decoding support on this CPU type.\n");
pr_emerg("Run the message through 'mcelog --ascii' to decode.\n");
+
+ return NOTIFY_STOP;
}

-/*
- * CPU/chipset specific EDAC code can register a callback here to print
- * MCE errors in a human-readable form:
- */
-void (*x86_mce_decode_callback)(struct mce *m) = default_decode_mce;
-EXPORT_SYMBOL(x86_mce_decode_callback);
+static struct notifier_block mce_dec_nb = {
+ .notifier_call = default_decode_mce,
+ .priority = -1,
+};

/* MCA banks polled by the period polling timer for corrected events */
DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = {
@@ -204,9 +212,9 @@ static void print_mce(struct mce *m)

/*
* Print out human-readable details about the MCE error,
- * (if the CPU has an implementation for that):
+ * (if the CPU has an implementation for that)
*/
- x86_mce_decode_callback(m);
+ atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
}

static void print_mce_head(void)
@@ -1420,6 +1428,10 @@ void __cpuinit mcheck_init(struct cpuinfo_x86 *c)
mce_cpu_features(c);
mce_init_timer();
INIT_WORK(&__get_cpu_var(mce_work), mce_process_work);
+
+ if (raw_smp_processor_id() == 0)
+ atomic_notifier_chain_register(&x86_mce_decoder_chain,
+ &mce_dec_nb);
}

/*
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 713ed7d..6ee7529 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -3,7 +3,6 @@

static bool report_gart_errors;
static void (*nb_bus_decoder)(int node_id, struct err_regs *regs);
-static void (*orig_mce_callback)(struct mce *m);

void amd_report_gart_errors(bool v)
{
@@ -363,8 +362,10 @@ static inline void amd_decode_err_code(unsigned int ec)
pr_warning("Huh? Unknown MCE error 0x%x\n", ec);
}

-static void amd_decode_mce(struct mce *m)
+static int amd_decode_mce(struct notifier_block *nb, unsigned long val,
+ void *data)
{
+ struct mce *m = (struct mce *)data;
struct err_regs regs;
int node, ecc;

@@ -420,20 +421,23 @@ static void amd_decode_mce(struct mce *m)
}

amd_decode_err_code(m->status & 0xffff);
+
+ return NOTIFY_STOP;
}

+static struct notifier_block amd_mce_dec_nb = {
+ .notifier_call = amd_decode_mce,
+};
+
static int __init mce_amd_init(void)
{
/*
* We can decode MCEs for Opteron and later CPUs:
*/
if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
- (boot_cpu_data.x86 >= 0xf)) {
- /* safe the default decode mce callback */
- orig_mce_callback = x86_mce_decode_callback;
-
- x86_mce_decode_callback = amd_decode_mce;
- }
+ (boot_cpu_data.x86 >= 0xf))
+ atomic_notifier_chain_register(&x86_mce_decoder_chain,
+ &amd_mce_dec_nb);

return 0;
}
@@ -442,7 +446,8 @@ early_initcall(mce_amd_init);
#ifdef MODULE
static void __exit mce_amd_exit(void)
{
- x86_mce_decode_callback = orig_mce_callback;
+ atomic_notifier_chain_unregister(&x86_mce_decoder_chain,
+ &amd_mce_dec_nb);
}

MODULE_DESCRIPTION("AMD MCE decoder");
--
1.6.4.3


--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. M?nchen, Germany
Research | Gesch?ftsf?hrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis M?nchen
(OSRC) | Registergericht M?nchen, HRB Nr. 43632

Subject: [tip:perf/mce] mce, edac: Use an atomic notifier for MCEs decoding

Commit-ID: fb2531953fd8855abdcf458459020fd382c5deca
Gitweb: http://git.kernel.org/tip/fb2531953fd8855abdcf458459020fd382c5deca
Author: Borislav Petkov <[email protected]>
AuthorDate: Wed, 7 Oct 2009 13:20:38 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 12 Oct 2009 12:24:45 +0200

mce, edac: Use an atomic notifier for MCEs decoding

Add an atomic notifier which ensures proper locking when conveying
MCE info to EDAC for decoding. The actual notifier call overrides a
default, negative priority notifier.

Note: make sure we register the default decoder only once since
mcheck_init() runs on each CPU.

Signed-off-by: Borislav Petkov <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/mce.h | 3 ++-
arch/x86/kernel/cpu/mcheck/mce.c | 29 ++++++++++++++++++++---------
drivers/edac/edac_mce_amd.c | 21 ++++++++++++---------
3 files changed, 34 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index f1363b7..227a72d 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -108,6 +108,8 @@ struct mce_log {
#define K8_MCE_THRESHOLD_BANK_5 (MCE_THRESHOLD_BASE + 5 * 9)
#define K8_MCE_THRESHOLD_DRAM_ECC (MCE_THRESHOLD_BANK_4 + 0)

+extern struct atomic_notifier_head x86_mce_decoder_chain;
+
#ifdef __KERNEL__

#include <linux/percpu.h>
@@ -213,6 +215,5 @@ extern void (*threshold_cpu_callback)(unsigned long action, unsigned int cpu);
void intel_init_thermal(struct cpuinfo_x86 *c);

void mce_log_therm_throt_event(__u64 status);
-
#endif /* __KERNEL__ */
#endif /* _ASM_X86_MCE_H */
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index b1598a9..15ba9c9 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -85,18 +85,26 @@ static DECLARE_WAIT_QUEUE_HEAD(mce_wait);
static DEFINE_PER_CPU(struct mce, mces_seen);
static int cpu_missing;

-static void default_decode_mce(struct mce *m)
+/*
+ * CPU/chipset specific EDAC code can register a notifier call here to print
+ * MCE errors in a human-readable form.
+ */
+ATOMIC_NOTIFIER_HEAD(x86_mce_decoder_chain);
+EXPORT_SYMBOL_GPL(x86_mce_decoder_chain);
+
+static int default_decode_mce(struct notifier_block *nb, unsigned long val,
+ void *data)
{
pr_emerg("No human readable MCE decoding support on this CPU type.\n");
pr_emerg("Run the message through 'mcelog --ascii' to decode.\n");
+
+ return NOTIFY_STOP;
}

-/*
- * CPU/chipset specific EDAC code can register a callback here to print
- * MCE errors in a human-readable form:
- */
-void (*x86_mce_decode_callback)(struct mce *m) = default_decode_mce;
-EXPORT_SYMBOL(x86_mce_decode_callback);
+static struct notifier_block mce_dec_nb = {
+ .notifier_call = default_decode_mce,
+ .priority = -1,
+};

/* MCA banks polled by the period polling timer for corrected events */
DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = {
@@ -204,9 +212,9 @@ static void print_mce(struct mce *m)

/*
* Print out human-readable details about the MCE error,
- * (if the CPU has an implementation for that):
+ * (if the CPU has an implementation for that)
*/
- x86_mce_decode_callback(m);
+ atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
}

static void print_mce_head(void)
@@ -1420,6 +1428,9 @@ void __cpuinit mcheck_init(struct cpuinfo_x86 *c)
mce_cpu_features(c);
mce_init_timer();
INIT_WORK(&__get_cpu_var(mce_work), mce_process_work);
+
+ if (raw_smp_processor_id() == 0)
+ atomic_notifier_chain_register(&x86_mce_decoder_chain, &mce_dec_nb);
}

/*
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 713ed7d..689cc6a 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -3,7 +3,6 @@

static bool report_gart_errors;
static void (*nb_bus_decoder)(int node_id, struct err_regs *regs);
-static void (*orig_mce_callback)(struct mce *m);

void amd_report_gart_errors(bool v)
{
@@ -363,8 +362,10 @@ static inline void amd_decode_err_code(unsigned int ec)
pr_warning("Huh? Unknown MCE error 0x%x\n", ec);
}

-static void amd_decode_mce(struct mce *m)
+static int amd_decode_mce(struct notifier_block *nb, unsigned long val,
+ void *data)
{
+ struct mce *m = (struct mce *)data;
struct err_regs regs;
int node, ecc;

@@ -420,20 +421,22 @@ static void amd_decode_mce(struct mce *m)
}

amd_decode_err_code(m->status & 0xffff);
+
+ return NOTIFY_STOP;
}

+static struct notifier_block amd_mce_dec_nb = {
+ .notifier_call = amd_decode_mce,
+};
+
static int __init mce_amd_init(void)
{
/*
* We can decode MCEs for Opteron and later CPUs:
*/
if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
- (boot_cpu_data.x86 >= 0xf)) {
- /* safe the default decode mce callback */
- orig_mce_callback = x86_mce_decode_callback;
-
- x86_mce_decode_callback = amd_decode_mce;
- }
+ (boot_cpu_data.x86 >= 0xf))
+ atomic_notifier_chain_register(&x86_mce_decoder_chain, &amd_mce_dec_nb);

return 0;
}
@@ -442,7 +445,7 @@ early_initcall(mce_amd_init);
#ifdef MODULE
static void __exit mce_amd_exit(void)
{
- x86_mce_decode_callback = orig_mce_callback;
+ atomic_notifier_chain_unregister(&x86_mce_decoder_chain, &amd_mce_dec_nb);
}

MODULE_DESCRIPTION("AMD MCE decoder");