LinuxLists.cc - [RFC PATCH 0/14] amd64_edac: marry mcheck to amd64 edac

2009-07-20 16:14:36

Subject: [RFC PATCH 0/14] amd64_edac: marry mcheck to amd64 edac

Hi all,

This is the first version of an attempt to forward MCE information to
the amd64 EDAC module for further decoding. When the MCE handler is
invoked and the EDAC module is loaded, here's what a decoded MCE looks
like:

Disabling lock debugging due to kernel taint

<0>HARDWARE ERROR
CPU 3: Machine Check Exception: 4 Bank 0: b20040001c000175
TSC 714e9b73cf
PROCESSOR 2:100f22 TIME 1247237579 SOCKET 0 APIC 3
MC0_STATUS: Uncorrected error, report: yes, MiscV: invalid, CPU context corrupt: yes
Data Cache Error: Data/Tag Evict error.
Transaction: Evict, Type: Data, Cache Level: L1
This is not a software problem!
<0>Run through mcelog --ascii to decode and contact your hardware vendor
Machine check: Processor context corrupt
Kernel panic - not syncing: Fatal machine check on current CPU
Pid: 4817, comm: cc1 Tainted: G M 2.6.31-rc2-00218-g78848b0-dirty #42
Call Trace:
<#MC> [<ffffffff8134a17a>] panic+0xaf/0x178
[<ffffffff812b5d9e>] ? decode_mce+0x47e/0x540
[<ffffffff81019210>] ? print_mce+0x90/0x110
[<ffffffff810193e7>] mce_panic+0x157/0x180
[<ffffffff81019de7>] do_machine_check+0x757/0x930
[<ffffffff8134d96d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[<ffffffff8134e9cb>] machine_check+0x1b/0x20
<EOE>
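
As a cross-check of the decoded lines in the log above, the status word b20040001c000175 can be unpacked by hand. The following is a minimal userspace sketch, not code from the patch set: the flag positions and the 0000 0001 RRRR TTLL memory-error layout follow the architectural MCi_STATUS definition, while every function name and the exact name strings are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural MCi_STATUS flag bits. */
#define MCI_STATUS_VAL   (1ULL << 63)  /* register holds a valid error */
#define MCI_STATUS_OVER  (1ULL << 62)  /* an earlier error was overwritten */
#define MCI_STATUS_UC    (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_EN    (1ULL << 60)  /* error reporting enabled */
#define MCI_STATUS_MISCV (1ULL << 59)  /* MCi_MISC is valid */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* MCi_ADDR is valid */
#define MCI_STATUS_PCC   (1ULL << 57)  /* processor context corrupt */

/* The error code lives in the low 16 bits of the status word. */
static inline unsigned int mce_error_code(uint64_t status)
{
	return (unsigned int)(status & 0xffff);
}

/* Memory-hierarchy errors use the pattern 0000 0001 RRRR TTLL. */
static inline int ec_is_mem_error(unsigned int ec)
{
	return (ec & 0xff00) == 0x0100;
}

static inline unsigned int ec_rrrr(unsigned int ec) { return (ec >> 4) & 0xf; }
static inline unsigned int ec_tt(unsigned int ec)   { return (ec >> 2) & 0x3; }
static inline unsigned int ec_ll(unsigned int ec)   { return ec & 0x3; }

/* Name tables; the strings are illustrative, not the module's own. */
const char *rrrr_name(unsigned int r)
{
	static const char *const names[] = {
		"Generic", "Read", "Write", "Data Read", "Data Write",
		"Instruction Fetch", "Prefetch", "Evict", "Snoop",
	};
	return r < 9 ? names[r] : "Reserved";
}

const char *tt_name(unsigned int t)
{
	static const char *const names[] = {
		"Instruction", "Data", "Generic", "Reserved",
	};
	return names[t & 0x3];
}

const char *ll_name(unsigned int l)
{
	static const char *const names[] = { "L0", "L1", "L2", "Generic" };
	return names[l & 0x3];
}

/*
 * Format the memory-error fields of a status word into buf.
 * Returns the snprintf() result, or -1 if the status is not valid
 * or does not carry a memory-hierarchy error code.
 */
int format_mem_error(uint64_t status, char *buf, size_t len)
{
	unsigned int ec = mce_error_code(status);

	if (!(status & MCI_STATUS_VAL) || !ec_is_mem_error(ec))
		return -1;

	return snprintf(buf, len, "Transaction: %s, Type: %s, Cache Level: %s",
			rrrr_name(ec_rrrr(ec)), tt_name(ec_tt(ec)),
			ll_name(ec_ll(ec)));
}
```

For the value in the log, the top byte 0xb2 has VAL, UC, EN and PCC set and MISCV clear, and the low word 0x0175 splits into RRRR = 7, TT = 1, LL = 1, which agrees with the "Uncorrected error", "MiscV: invalid", "CPU context corrupt: yes" and "Transaction: Evict, Type: Data, Cache Level: L1" lines.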

Clearly, the "Run through mcelog ..." line is now redundant :) since
userspace decoding is no longer needed. The original EDAC functionality
(polling workqueue) is still preserved. The code currently uses EDAC to
decode DRAM ECC errors, but this could be extended to handle all valid
addresses acquired from the MCi_ADDR registers.
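
The hand-off described above (decode in the kernel when a decoder module is loaded, otherwise fall back to the mcelog hint) can be pictured with a small userspace sketch. Everything here is hypothetical: struct mce_sketch, the register/unregister helpers and example_decoder are illustrative stand-ins, and the actual interface in patch 07 may look quite different.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the kernel's MCE record. */
struct mce_sketch {
	unsigned int cpu;
	unsigned int bank;
	uint64_t status;
};

/* Decoder callback; NULL means no decoder module is loaded. */
typedef void (*mce_decoder_fn)(const struct mce_sketch *m);

static mce_decoder_fn mce_decoder;

/* Called by the (hypothetical) decoder module on load and unload. */
void mce_register_decoder(mce_decoder_fn fn) { mce_decoder = fn; }
void mce_unregister_decoder(void)            { mce_decoder = NULL; }

/* Example decoder: counts invocations and prints a placeholder line. */
static int example_decoded;

void example_decoder(const struct mce_sketch *m)
{
	example_decoded++;
	printf("MC%u_STATUS: decoded in place\n", m->bank);
}

/*
 * Report one machine check. Returns 1 if a registered decoder
 * handled it, 0 if the userspace-decoding hint was printed instead.
 */
int report_mce(const struct mce_sketch *m)
{
	printf("CPU %u: Machine Check Exception: Bank %u: %016llx\n",
	       m->cpu, m->bank, (unsigned long long)m->status);

	if (mce_decoder) {
		mce_decoder(m);
		return 1;
	}

	printf("Run through mcelog --ascii to decode and contact your hardware vendor\n");
	return 0;
}
```

With no decoder registered, report_mce() prints the mcelog hint; after mce_register_decoder(example_decoder) the same record is handed to the callback and the hint is skipped, which is the sense in which that line becomes redundant.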

Comments and further suggestions are most welcome.

Thanks,
Boris.

 arch/x86/kernel/cpu/mcheck/mce.c    |    7 +
 drivers/edac/amd64_edac.c           |  484 +++++++++++++++++++++--------------
 drivers/edac/amd64_edac.h           |   67 ++---
 drivers/edac/amd64_edac_dbg.c       |    2 +-
 drivers/edac/amd64_edac_err_types.c |  126 +++++-----
 5 files changed, 382 insertions(+), 304 deletions(-)

Patches in this series:

[PATCH 01/14] amd64_edac: simplify error type bits extractors
[PATCH 02/14] amd64_edac: cleanup amd64_process_error_info
[PATCH 03/14] amd64_edac: cleanup/complete NB MCE decoding
[PATCH 04/14] amd64_edac: fixup ExtError decoding
[PATCH 05/14] amd64_edac: remove memory and GART TLB error decoders
[PATCH 06/14] amd64_edac: cleanup amd64_decode_bus_error
[PATCH 07/14] mce3: pass mce info to EDAC for decoding
[PATCH 08/14] amd64_edac: carve out MCi_STATUS decoding
[PATCH 09/14] amd64_edac: carve out decoding of MCi_STATUS ErrorCode
[PATCH 10/14] amd64_edac: decode data cache MCEs
[PATCH 11/14] amd64_edac: decode instruction cache MCEs
[PATCH 12/14] amd64_edac: decode bus unit MCEs
[PATCH 13/14] amd64_edac: decode load store MCEs
[PATCH 14/14] amd64_edac: decode FR MCEs
