UCNA errors share the same handler with CMCI. But it doesn't
need extra operation to save error record in genpool. Remove
these uselss codes.
Signed-off-by: Chen, Gong <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index c5b0d562dbf5..1ad3fb4f99b7 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -609,20 +609,6 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
/*
- * In the cases where we don't have a valid address after all,
- * do not add it into the ring buffer.
- */
- if (severity == MCE_DEFERRED_SEVERITY && memory_error(&m)) {
- if (m.status & MCI_STATUS_ADDRV) {
- m.severity = severity;
- m.usable_addr = mce_usable_address(&m);
-
- if (!mce_gen_pool_add(&m))
- mce_schedule_work();
- }
- }
-
- /*
* Don't get the IP here because it's unlikely to
* have anything to do with the actual error location.
*/
--
2.3.2
On Wed, Nov 11, 2015 at 10:16:45AM -0500, Chen, Gong wrote:
> UCNA errors share the same handler with CMCI. But it doesn't
> need extra operation to save error record in genpool. Remove
> these uselss codes.
I'd have emphasised that this same mce is being added to the genpool
*twice* (once here, and again when we call mce_log() just below).
Though there may be some corner cases depending on flags and
mca_cfg.dont_log_ce
-Tony
>
> Signed-off-by: Chen, Gong <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce.c | 14 --------------
> 1 file changed, 14 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index c5b0d562dbf5..1ad3fb4f99b7 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -609,20 +609,6 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
> severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
>
> /*
> - * In the cases where we don't have a valid address after all,
> - * do not add it into the ring buffer.
> - */
> - if (severity == MCE_DEFERRED_SEVERITY && memory_error(&m)) {
> - if (m.status & MCI_STATUS_ADDRV) {
> - m.severity = severity;
> - m.usable_addr = mce_usable_address(&m);
> -
> - if (!mce_gen_pool_add(&m))
> - mce_schedule_work();
> - }
> - }
> -
> - /*
> * Don't get the IP here because it's unlikely to
> * have anything to do with the actual error location.
> */
> --
> 2.3.2
>
We used to have a special ring buffer for deferred errors that
was used to mark problem pages. We replaced that with a genpool.
Then later converted mce_log() to also use the same genpool. As
a result we end up adding all deferred errors to the genpool twice.
Rearrange this code. Make sure to set the m.severity and m.usable_addr
fields for deferred errors. Then if flags and mca_cfg.dont_log_ce mean
we call mce_log() we are done, because that will add this entry to the
genpool.
If we skipped mce_log(), then we still want to take action for the
deferred error, so add to the genpool.
Changed the name of the boolean "error_logged" to "error_seen", we
should set it whether of not we logged an error because the return
value from machine_check_poll() is used to decide whether storms
have subsided or not.
Reported-by: Chen, Gong <gong.chen.linux.intel.com>
Signed-off-by: Tony Luck <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index c5b0d562dbf5..6531cb46803c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -567,7 +567,7 @@ DEFINE_PER_CPU(unsigned, mce_poll_count);
*/
bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
{
- bool error_logged = false;
+ bool error_seen = false;
struct mce m;
int severity;
int i;
@@ -601,6 +601,8 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
(m.status & (mca_cfg.ser ? MCI_STATUS_S : MCI_STATUS_UC)))
continue;
+ error_seen = true;
+
mce_read_aux(&m, i);
if (!(flags & MCP_TIMESTAMP))
@@ -608,17 +610,10 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
- /*
- * In the cases where we don't have a valid address after all,
- * do not add it into the ring buffer.
- */
if (severity == MCE_DEFERRED_SEVERITY && memory_error(&m)) {
if (m.status & MCI_STATUS_ADDRV) {
m.severity = severity;
m.usable_addr = mce_usable_address(&m);
-
- if (!mce_gen_pool_add(&m))
- mce_schedule_work();
}
}
@@ -626,9 +621,16 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
* Don't get the IP here because it's unlikely to
* have anything to do with the actual error location.
*/
- if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce) {
- error_logged = true;
+ if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce)
mce_log(&m);
+ else if (m.usable_addr) {
+ /*
+ * Although we skipped logging this, we still want
+ * to take action. Add to the pool so the registered
+ * notifiers will see it.
+ */
+ if (!mce_gen_pool_add(&m))
+ mce_schedule_work();
}
/*
@@ -644,7 +646,7 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
sync_core();
- return error_logged;
+ return error_seen;
}
EXPORT_SYMBOL_GPL(machine_check_poll);
--
2.1.4
On Wed, Nov 11, 2015 at 02:01:51PM -0800, Luck, Tony wrote:
> Date: Wed, 11 Nov 2015 14:01:51 -0800
> From: Tony Luck <[email protected]>
> To: "Chen, Gong" <[email protected]>
> Cc: [email protected], [email protected], [email protected]
> Subject: [UNTESTED PATCH] x86, mce: Avoid double entry of deferred errors
> into the genpool.
>
> We used to have a special ring buffer for deferred errors that
> was used to mark problem pages. We replaced that with a genpool.
> Then later converted mce_log() to also use the same genpool. As
> a result we end up adding all deferred errors to the genpool twice.
>
> Rearrange this code. Make sure to set the m.severity and m.usable_addr
> fields for deferred errors. Then if flags and mca_cfg.dont_log_ce mean
> we call mce_log() we are done, because that will add this entry to the
> genpool.
>
> If we skipped mce_log(), then we still want to take action for the
> deferred error, so add to the genpool.
>
> Changed the name of the boolean "error_logged" to "error_seen", we
> should set it whether of not we logged an error because the return
> value from machine_check_poll() is used to decide whether storms
> have subsided or not.
>
> Reported-by: Chen, Gong <gong.chen.linux.intel.com>
> Signed-off-by: Tony Luck <[email protected]>
> ---
It's much better than my original version.
On Wed, Nov 11, 2015 at 02:01:51PM -0800, Tony Luck wrote:
> We used to have a special ring buffer for deferred errors that
> was used to mark problem pages. We replaced that with a genpool.
> Then later converted mce_log() to also use the same genpool. As
> a result we end up adding all deferred errors to the genpool twice.
>
> Rearrange this code. Make sure to set the m.severity and m.usable_addr
> fields for deferred errors. Then if flags and mca_cfg.dont_log_ce mean
> we call mce_log() we are done, because that will add this entry to the
> genpool.
>
> If we skipped mce_log(), then we still want to take action for the
> deferred error, so add to the genpool.
>
> Changed the name of the boolean "error_logged" to "error_seen", we
> should set it whether of not we logged an error because the return
> value from machine_check_poll() is used to decide whether storms
> have subsided or not.
>
> Reported-by: Chen, Gong <gong.chen.linux.intel.com>
> Signed-off-by: Tony Luck <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce.c | 24 +++++++++++++-----------
> 1 file changed, 13 insertions(+), 11 deletions(-)
Applied, thanks.
Btw, looking at that mce.usable_addr, it doesn't make a whole lotta
sense to me and we can use mce_usable_address() directly instead and use
the byte in struct mce for something more important. So how about I kill
it (diff ontop of yours):
---
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 03429da2fa80..2184943341bf 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -16,7 +16,7 @@ struct mce {
__u8 cpuvendor; /* cpu vendor as encoded in system.h */
__u8 inject_flags; /* software inject flags */
__u8 severity;
- __u8 usable_addr;
+ __u8 pad;
__u32 cpuid; /* CPUID 1 EAX */
__u8 cs; /* code segment */
__u8 bank; /* machine check bank */
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 6531cb46803c..fb8b1db7b150 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -484,7 +484,7 @@ static int srao_decode_notifier(struct notifier_block *nb, unsigned long val,
if (!mce)
return NOTIFY_DONE;
- if (mce->usable_addr && (mce->severity == MCE_AO_SEVERITY)) {
+ if (mce_usable_address(mce) && (mce->severity == MCE_AO_SEVERITY)) {
pfn = mce->addr >> PAGE_SHIFT;
memory_failure(pfn, MCE_VECTOR, 0);
}
@@ -610,12 +610,9 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
- if (severity == MCE_DEFERRED_SEVERITY && memory_error(&m)) {
- if (m.status & MCI_STATUS_ADDRV) {
+ if (severity == MCE_DEFERRED_SEVERITY && memory_error(&m))
+ if (m.status & MCI_STATUS_ADDRV)
m.severity = severity;
- m.usable_addr = mce_usable_address(&m);
- }
- }
/*
* Don't get the IP here because it's unlikely to
@@ -623,7 +620,7 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
*/
if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce)
mce_log(&m);
- else if (m.usable_addr) {
+ else if (mce_usable_address(&m)) {
/*
* Although we skipped logging this, we still want
* to take action. Add to the pool so the registered
@@ -1091,7 +1088,6 @@ void do_machine_check(struct pt_regs *regs, long error_code)
/* assuming valid severity level != 0 */
m.severity = severity;
- m.usable_addr = mce_usable_address(&m);
mce_log(&m);
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
> Applied, thanks.
Did you test it (note the "UNTESTED" in the subject!). My usual system for this is getting upgrades and being
flaky at the moment.
> Btw, looking at that mce.usable_addr, it doesn't make a whole lotta
> sense to me and we can use mce_usable_address() directly instead and use
> the byte in struct mce for something more important. So how about I kill
> it (diff ontop of yours):
Sure. "struct mce" is visible to user space via /dev/mcelog. But the only user is the
mcelog(8) daemon ... and it was never updated to look at the usable_addr field. So
returning it to "pad" status is fine.
-Tony
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?
On Thu, Nov 19, 2015 at 07:33:58PM +0000, Luck, Tony wrote:
> > Applied, thanks.
>
> Did you test it (note the "UNTESTED" in the subject!). My usual system for this is getting upgrades and being
> flaky at the moment.
Bah, it builds, should be enough. Ship it. :-)
Lemme get a box...
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
On Wed, Nov 11, 2015 at 02:01:51PM -0800, Tony Luck wrote:
> We used to have a special ring buffer for deferred errors that
> was used to mark problem pages. We replaced that with a genpool.
> Then later converted mce_log() to also use the same genpool. As
> a result we end up adding all deferred errors to the genpool twice.
>
> Rearrange this code. Make sure to set the m.severity and m.usable_addr
> fields for deferred errors. Then if flags and mca_cfg.dont_log_ce mean
> we call mce_log() we are done, because that will add this entry to the
> genpool.
>
> If we skipped mce_log(), then we still want to take action for the
> deferred error, so add to the genpool.
>
> Changed the name of the boolean "error_logged" to "error_seen", we
> should set it whether of not we logged an error because the return
> value from machine_check_poll() is used to decide whether storms
> have subsided or not.
>
> Reported-by: Chen, Gong <gong.chen.linux.intel.com>
> Signed-off-by: Tony Luck <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce.c | 24 +++++++++++++-----------
> 1 file changed, 13 insertions(+), 11 deletions(-)
...
> @@ -626,9 +621,16 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
> * Don't get the IP here because it's unlikely to
> * have anything to do with the actual error location.
> */
> - if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce) {
> - error_logged = true;
> + if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce)
> mce_log(&m);
> + else if (m.usable_addr) {
> + /*
> + * Although we skipped logging this, we still want
> + * to take action. Add to the pool so the registered
> + * notifiers will see it.
> + */
> + if (!mce_gen_pool_add(&m))
> + mce_schedule_work();
Right, this still causes the error to come out on AMD because the
notifier calls amd_decode_mce().
I guess we can extend the "if (m.usable_addr)" check above with "if
error is not CE" too and only add it to the generic pool when its
severity is anything stronger than MCE_KEEP_SEVERITY...
Also, two more fixes I've done while injecting in a kvm guest I'm
sending as a reply to this message. Will inject on a real box too.
Thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
From: Borislav Petkov <[email protected]>
Date: Sat, 21 Nov 2015 11:29:05 +0100
Subject: [PATCH 1/2] x86/mce: Add the missing memory error check on AMD
We simply need to look at the extended error code when detecting whether
the error is of type memory.
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index fb8b1db7b150..e00e85ab7387 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -522,10 +522,10 @@ static bool memory_error(struct mce *m)
struct cpuinfo_x86 *c = &boot_cpu_data;
if (c->x86_vendor == X86_VENDOR_AMD) {
- /*
- * coming soon
- */
- return false;
+ /* ErrCodeExt[20:16] */
+ u8 xec = (m->status >> 16) & 0x1f;
+
+ return (xec == 0x0 || xec == 0x8);
} else if (c->x86_vendor == X86_VENDOR_INTEL) {
/*
* Intel SDM Volume 3B - 15.9.2 Compound Error Codes
--
2.3.5
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
From: Borislav Petkov <[email protected]>
Date: Sat, 21 Nov 2015 19:52:39 +0100
Subject: [PATCH 2/2] x86/mce: Make usable address checks Intel-only
The MCi_MISC bitfield definitions mce_usable_address() checks are
Intel-only. Make them so.
While at it, move mce_usable_address() up, before all its callers and
get rid of the forward declaration.
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce.c | 40 ++++++++++++++++++++++------------------
1 file changed, 22 insertions(+), 18 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e00e85ab7387..3865e95cc5ec 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -114,7 +114,6 @@ static struct work_struct mce_work;
static struct irq_work mce_irq_work;
static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
-static int mce_usable_address(struct mce *m);
/*
* CPU/chipset specific EDAC code can register a notifier call here to print
@@ -475,6 +474,28 @@ static void mce_report_event(struct pt_regs *regs)
irq_work_queue(&mce_irq_work);
}
+/*
+ * Check if the address reported by the CPU is in a format we can parse.
+ * It would be possible to add code for most other cases, but all would
+ * be somewhat complicated (e.g. segment offset would require an instruction
+ * parser). So only support physical addresses up to page granuality for now.
+ */
+static int mce_usable_address(struct mce *m)
+{
+ if (!(m->status & MCI_STATUS_MISCV) || !(m->status & MCI_STATUS_ADDRV))
+ return 0;
+
+ /* Checks after this one are Intel-specific: */
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+ return 1;
+
+ if (MCI_MISC_ADDR_LSB(m->misc) > PAGE_SHIFT)
+ return 0;
+ if (MCI_MISC_ADDR_MODE(m->misc) != MCI_MISC_ADDR_PHYS)
+ return 0;
+ return 1;
+}
+
static int srao_decode_notifier(struct notifier_block *nb, unsigned long val,
void *data)
{
@@ -930,23 +951,6 @@ reset:
return ret;
}
-/*
- * Check if the address reported by the CPU is in a format we can parse.
- * It would be possible to add code for most other cases, but all would
- * be somewhat complicated (e.g. segment offset would require an instruction
- * parser). So only support physical addresses up to page granuality for now.
- */
-static int mce_usable_address(struct mce *m)
-{
- if (!(m->status & MCI_STATUS_MISCV) || !(m->status & MCI_STATUS_ADDRV))
- return 0;
- if (MCI_MISC_ADDR_LSB(m->misc) > PAGE_SHIFT)
- return 0;
- if (MCI_MISC_ADDR_MODE(m->misc) != MCI_MISC_ADDR_PHYS)
- return 0;
- return 1;
-}
-
static void mce_clear_state(unsigned long *toclear)
{
int i;
--
2.3.5
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
On Thu, Nov 19, 2015 at 09:39:20PM +0100, Borislav Petkov wrote:
> On Thu, Nov 19, 2015 at 07:33:58PM +0000, Luck, Tony wrote:
> > > Applied, thanks.
> >
> > Did you test it (note the "UNTESTED" in the subject!). My usual system for this is getting upgrades and being
> > flaky at the moment.
>
> Bah, it builds, should be enough. Ship it. :-)
>
> Lemme get a box...
Here some results:
# grep . /sys/kernel/debug/apei/einj/*
/sys/kernel/debug/apei/einj/available_error_type:0x00000002 Processor Uncorrectable non-fatal
/sys/kernel/debug/apei/einj/available_error_type:0x00000008 Memory Correctable
/sys/kernel/debug/apei/einj/available_error_type:0x00000010 Memory Uncorrectable non-fatal
grep: /sys/kernel/debug/apei/einj/error_inject: Permission denied
/sys/kernel/debug/apei/einj/error_type:0x0
Looks like some old EINJ without all the features. Oh well, let's see
what'll happen anyway:
# echo 0x8 > error_type
# echo 1 > error_inject
[ 840.461666] mce: [Hardware Error]: Machine check events logged
[ 840.476221] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 840.489214] EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 5: 8c00004000010090
[ 840.507685] EDAC sbridge MC0: TSC 0
[ 840.515223] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86
[ 840.532477] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1448299322 SOCKET 0 APIC 0
[ 840.551279] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 840.563872] EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 8: 8800004100800090
[ 840.581970] EDAC sbridge MC0: TSC 0
[ 840.589513] EDAC sbridge MC0: ADDR 0 EDAC sbridge MC0: MISC 4908400040004200
[ 840.606267] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1448299322 SOCKET 0 APIC 0
[ 841.499090] EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
So yeah, mce_notify_irq() is visible there, i.e. we did mce_log() here
which sets mce_need_notify.
# echo 0x2 > error_type
# echo 1 > error_inject
bash: echo: write error: Invalid argument
[ 885.272000] [Firmware Warn]: APEI: Invalid action table, unknown instruction type: 5
ACPI_EINJ_FLUSH_CACHELINE??
Yeah, we're missing some functionality.
# echo 0x10 > error_type
# echo 1 > error_inject
That went BOOM:
[ 1296.233435] Disabling lock debugging due to kernel taint
[ 1296.248010] mce: [Hardware Error]: CPU 6: Machine Check Exception: 5 Bank 5: be00000000010090
[ 1296.269245] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8136260f> {intel_idle+0xbf/0x130}
[ 1296.290735] mce: [Hardware Error]: TSC 37c1fb53beb ADDR bb68f400 MISC 20401a9a86
[ 1296.309772] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1448299778 SOCKET 0 APIC c microcode 710
[ 1296.332058] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 1296.346094] EDAC sbridge MC0: CPU 6: Machine Check Exception: 5 Bank 5: be00000000010090
[ 1296.366517] EDAC sbridge MC0: TSC 37c1fb53beb
[ 1296.375974] EDAC sbridge MC0: ADDR bb68f400 EDAC sbridge MC0: MISC 20401a9a86
[ 1296.394493] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1448299778 SOCKET 0 APIC c
[ 1296.416153] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68f offset:
0x400 grain:32 - area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
...
judging by the CPU numbers, looks like node 0 got that error in the shared bank:
.... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7
.... node #0, CPUs: #32 #33 #34 #35 #36 #37 #38 #39
finishing with
[ 1299.907994] mce: [Hardware Error]: Machine check: Processor context corrupt
[ 1299.926783] Kernel panic - not syncing: Fatal machine check
[ 1299.959632] Kernel Offset: disabled
[ 1299.984254] Rebooting in 100 seconds..
dont_log_ce:
$ for i in $(seq 0 63); do echo 1 > /sys/devices/system/machinecheck/machinecheck$i/dont_log_ce; cat /sys/devices/system/machinecheck/machinecheck$i/dont_log_ce; done | uniq
1
# echo 0x8 > error_type
# echo 1 > error_inject
[ 318.263797] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 318.277029] EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 5: 8c00004000010090
[ 318.295631] EDAC sbridge MC0: TSC 0
[ 318.303143] EDAC sbridge MC0: ADDR bb68f000 EDAC sbridge MC0: MISC 2040262686
[ 318.320473] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1448300397 SOCKET 0 APIC 0
[ 318.809112] EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68f offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
This looks ok, we're missing the mce_notify_irq() line "mce: [Hardware
Error]: Machine check events logged" which is as expected but the EDAC
lines are there because we sent the error on the notify chain.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
> Also, two more fixes I've done while injecting in a kvm guest I'm
> sending as a reply to this message. Will inject on a real box too.
Ok ... applied those two on top of my "UNTESTED" patch and injected an error to force a UCNA log.
Everything looked ok. Just one copy on the console and in /var/log/mcelog (actually logs from
bank7 and bank3 ... but that was expected from this test).
So my patch is tested, and take this
Acked-by: Tony Luck <[email protected]> for your two additional patches.
-Tony
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?
On Tue, Nov 24, 2015 at 12:19:18AM +0000, Luck, Tony wrote:
> > Also, two more fixes I've done while injecting in a kvm guest I'm
> > sending as a reply to this message. Will inject on a real box too.
>
> Ok ... applied those two on top of my "UNTESTED" patch and injected an error to force a UCNA log.
Ok, what error type is that in EINJ nomenclature? I had only
/sys/kernel/debug/apei/einj/available_error_type:0x00000002 Processor Uncorrectable non-fatal
/sys/kernel/debug/apei/einj/available_error_type:0x00000008 Memory Correctable
/sys/kernel/debug/apei/einj/available_error_type:0x00000010 Memory Uncorrectable non-fatal
and I would've guessed it is the 0x10 type, i.e., the memory
uncorrectable which is non-fatal - assuming here - but that one got
promoted to a #MC on my box.
The processor uncorrectable didn't want to inject due to missing EINJ
instruction 0x5 or so...
> Everything looked ok. Just one copy on the console and in
> /var/log/mcelog (actually logs from bank7 and bank3 ... but that was
> expected from this test).
Good.
> So my patch is tested, and take this
>
> Acked-by: Tony Luck <[email protected]> for your two additional patches.
Thanks!
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
>> Ok ... applied those two on top of my "UNTESTED" patch and injected an error to force a UCNA log.
>
> Ok, what error type is that in EINJ nomenclature? I had only
>
> /sys/kernel/debug/apei/einj/available_error_type:0x00000002 Processor Uncorrectable non-fatal
> /sys/kernel/debug/apei/einj/available_error_type:0x00000008 Memory Correctable
> /sys/kernel/debug/apei/einj/available_error_type:0x00000010 Memory Uncorrectable non-fatal
>
> and I would've guessed it is the 0x10 type, i.e., the memory
> uncorrectable which is non-fatal - assuming here - but that one got
> promoted to a #MC on my box.
I juggled with the type of the injection and the instruction sequence to access the target
location. I used 0x10 to inject an uncorrected memory error with "# echo 1 > notrigger"
to make sure the EINJ driver skipped the trigger actions. Then I had a user mode test program
write a byte to the cache line. That pulled the uncorrected data into the cache (which logged
the UCNA error signaled with CMCI). But the processor didn't actually consume the poison
(no registers had corrupted data), so there was no machine check.
Sneaky, huh?
-Tony
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?
On Tue, Nov 24, 2015 at 03:51:21PM +0000, Luck, Tony wrote:
> >> Ok ... applied those two on top of my "UNTESTED" patch and injected an error to force a UCNA log.
> >
> > Ok, what error type is that in EINJ nomenclature? I had only
> >
> > /sys/kernel/debug/apei/einj/available_error_type:0x00000002 Processor Uncorrectable non-fatal
> > /sys/kernel/debug/apei/einj/available_error_type:0x00000008 Memory Correctable
> > /sys/kernel/debug/apei/einj/available_error_type:0x00000010 Memory Uncorrectable non-fatal
> >
> > and I would've guessed it is the 0x10 type, i.e., the memory
> > uncorrectable which is non-fatal - assuming here - but that one got
> > promoted to a #MC on my box.
>
> I juggled with the type of the injection and the instruction sequence to access the target
> location. I used 0x10 to inject an uncorrected memory error with "# echo 1 > notrigger"
> to make sure the EINJ driver skipped the trigger actions. Then I had a user mode test program
> write a byte to the cache line. That pulled the uncorrected data into the cache (which logged
> the UCNA error signaled with CMCI). But the processor didn't actually consume the poison
> (no registers had corrupted data), so there was no machine check.
>
> Sneaky, huh?
That reminds me of the whitepaper:
https://software.intel.com/sites/default/files/managed/b3/d1/MCA_Recovery_Validation_Guide.pdf
Btw, should we take those tools here:
https://git.kernel.org/cgit/linux/kernel/git/aegl/ras-tools.git
and glue them together with a python or a shell script or so which
goes and automatically takes care of loading einj.ko and injects the
proper error type and thus abstracts away all that detail which makes me
everytime look at Documentation/acpi/apei/einj.txt?
Something like
./einject.py --ucna
which would do all the fun?
That would simplify our testing a lot, methinks. Hmmm?
Oh, and btw, the box here didn't have the notrigger node, which means,
it'll always do the trigger actions. :-\
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.