2015-05-06 17:01:59

by Aravind Gopalakrishnan

[permalink] [raw]
Subject: [PATCH V2 0/6] Enable deferred error interrupts

Deferred errors indicate error conditions that were not corrected, but
require no action from S/W (or action is optional).These errors provide
info about a latent UC MCE that can occur when a poisoned data is
consumed by the processor.

Newer AMD processors can generate deferred errors and can be configured
to generate APIC interrupts on such events.

This patchset introduces a new interrupt handler for deferred errors and
configures the HW if the feature is present.

Patch1: Factor out logging mechanism so we can reuse for deferred errors
No functional change.
Patch2: Read MCx_ADDR(bank) before calling mce_log(). This fixes an issue
as currently, amd_decode_mce() will always only print error address
as 0x0 even if a valid address exists.
Patch3: Defines SUCCOR cpuid bit. This indicates prescence of features
such as data poisoning and deferred error interrupts in hardware.
Patch4: Implement the interrupt handler.
- setup vector number, build the interrupt and implement handler
function in this patch.
Patch5, Patch 6: Cleanups in the code. No functional changes are introduced.

Changes from V1:
- Two Prepatches-
* Factor out logging mechanism so we can reuse for deferred errors
* Read MCx_ADDR(bank) before calling mce_log() so we get relevant
error address printed out on kernel logs
- Providing short description of Deferred errors here as well as in commit
message of patch2 (per Ingo, Boris)
- Adding comments around mce_flags to define the bitfields better (per Boris)
- Assign truth values using double negation and 'BIT' macros. Vertically
align statements while at it. (per Boris)
- Change definitions of 'deferred_interrupt' to 'deferred_error_interrupt';
DEFERRED_APIC_VECTOR to DEFERRED_ERROR_VECTOR and irq_deferred_count
to irq_deferred_error_count (per Andy, Boris)
- Do the BIOS workaround check for all families as we are behind a cpuid
bit anyway. And print a FW_BUG message as needed. (per Boris)
- Updating the timestamp of patch to May 2015 in mce_amd.c

Aravind Gopalakrishnan (6):
x86/MCE/AMD: Factor out logging mechanism
x86/MCE/AMD: Read MCx_ADDR(bank) before we log the error
x86/mce: Define 'SUCCOR' cpuid bit
x86/MCE/AMD: Introduce deferred error interrupt handler
x86, irq: Cleanup ordering of vector numbers
x86/MCE/AMD: Rename setup_APIC_mce

arch/x86/include/asm/entry_arch.h | 3 +
arch/x86/include/asm/hardirq.h | 3 +
arch/x86/include/asm/hw_irq.h | 2 +
arch/x86/include/asm/irq_vectors.h | 11 +--
arch/x86/include/asm/mce.h | 20 ++++-
arch/x86/include/asm/trace/irq_vectors.h | 6 ++
arch/x86/include/asm/traps.h | 3 +-
arch/x86/kernel/cpu/mcheck/mce.c | 3 +-
arch/x86/kernel/cpu/mcheck/mce_amd.c | 132 ++++++++++++++++++++++++++++---
arch/x86/kernel/entry_64.S | 5 ++
arch/x86/kernel/irq.c | 6 ++
arch/x86/kernel/irqinit.c | 4 +
arch/x86/kernel/traps.c | 5 ++
13 files changed, 182 insertions(+), 21 deletions(-)

--
1.9.1


2015-05-06 17:03:18

by Aravind Gopalakrishnan

[permalink] [raw]
Subject: [PATCH V2 1/6] x86/MCE/AMD: Factor out logging mechanism

Refactoring the code here to setup struct mce and call
mce_log() to log the error.

No functional change is introduced.

Doing this so we can use it later to log error when
deferred error interrupts happen.

Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Aravind Gopalakrishnan <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 55ad9b3..60ae315 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -264,6 +264,24 @@ init:
}
}

+static void __log_error(unsigned int bank, bool is_thr, u64 misc)
+{
+ struct mce m;
+
+ mce_setup(&m);
+ rdmsrl(MSR_IA32_MCx_STATUS(bank), m.status);
+ if (!(m.status & MCI_STATUS_VAL))
+ return;
+
+ if (is_thr)
+ m.misc = misc;
+
+ m.bank = bank;
+ mce_log(&m);
+
+ wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
+}
+
/*
* APIC Interrupt Handler
*/
@@ -273,12 +291,12 @@ init:
* the interrupt goes off when error_count reaches threshold_limit.
* the handler will simply log mcelog w/ software defined bank number.
*/
+
static void amd_threshold_interrupt(void)
{
u32 low = 0, high = 0, address = 0;
int cpu = smp_processor_id();
unsigned int bank, block;
- struct mce m;

/* assume first bank caused it */
for (bank = 0; bank < mca_cfg.banks; ++bank) {
@@ -321,15 +339,7 @@ static void amd_threshold_interrupt(void)
return;

log:
- mce_setup(&m);
- rdmsrl(MSR_IA32_MCx_STATUS(bank), m.status);
- if (!(m.status & MCI_STATUS_VAL))
- return;
- m.misc = ((u64)high << 32) | low;
- m.bank = bank;
- mce_log(&m);
-
- wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
+ __log_error(bank, true, ((u64)high << 32) | low);
}

/*
--
1.9.1

2015-05-06 17:03:14

by Aravind Gopalakrishnan

[permalink] [raw]
Subject: [PATCH V2 2/6] x86/MCE/AMD: Read MCx_ADDR(bank) before we log the error

amd_decode_mce() needs value in m->addr so it can report the error
address correctly on kernel logs. This should be setup in
__log_error() before we call mce_log(). Else, m.addr = 0x0 and
amd_decode_mce() just ends up printing 0x0.

Like this:
[76916.275587] [Hardware Error]: Corrected error, no action required.
[76916.279576] [Hardware Error]: CPU:0 (15:60:0) MC0_STATUS
[-|CE|-|-|AddrV|-|-|CECC]: 0x840041000028017b
[76916.279576] [Hardware Error]: MC0 Error Address: 0x0000000000000000

We correct it in this patch by reading MSR_IA32_MCx_ADDR(bank) if
it is valid before calling mce_log().

Corrected error output:
[ 102.623490] [Hardware Error]: Corrected error, no action required.
[ 102.623668] [Hardware Error]: CPU:0 (15:60:0) MC0_STATUS
[-|CE|-|-|AddrV|-|-|CECC]: 0x840041000028017b
[ 102.623930] [Hardware Error]: MC0 Error Address: 0x00001f808f0ff040

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 60ae315..769f5cd 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -277,8 +277,10 @@ static void __log_error(unsigned int bank, bool is_thr, u64 misc)
m.misc = misc;

m.bank = bank;
- mce_log(&m);
+ if (m.status & MCI_STATUS_ADDRV)
+ rdmsrl(MSR_IA32_MCx_ADDR(bank), m.addr);

+ mce_log(&m);
wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
}

--
1.9.1

2015-05-06 17:01:50

by Aravind Gopalakrishnan

[permalink] [raw]
Subject: [PATCH V2 3/6] x86/mce: Define 'SUCCOR' cpuid bit

SUCCOR stands for S/W UnCorrectable error COntainment and Recovery.
It indicates support for data poisoning in HW and deferred error
interrupts.

Add new bitfield in mce_vendor_flags for this.
We use this to verify prescence of deferred error interrupts
before we enable them in mce_amd.c

While at it, clarify comments in mce_vendor_flags to provide an
indication of usages of the bitfields.

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
---
arch/x86/include/asm/mce.h | 17 +++++++++++++++--
arch/x86/kernel/cpu/mcheck/mce.c | 3 ++-
2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 1f5a86d..ab0cdac 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -117,8 +117,21 @@ struct mca_config {
};

struct mce_vendor_flags {
- __u64 overflow_recov : 1, /* cpuid_ebx(80000007) */
- __reserved_0 : 63;
+ /* Flags from cpuid_ebx(80000007) */
+
+ /*
+ * overflow recovery cpuid bit indicates that overflow
+ * conditions are not fatal
+ */
+ __u64 overflow_recov : 1,
+ /*
+ * SUCCOR stands for S/W UnCorrectable error
+ * COntainment and Recovery.
+ * It indicates support for data poisoning in HW and
+ * deferred error interrupts.
+ */
+ succor : 1,
+ __reserved_0 : 62;
};
extern struct mce_vendor_flags mce_flags;

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e535533..6d218ac 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1639,7 +1639,8 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
break;
case X86_VENDOR_AMD:
mce_amd_feature_init(c);
- mce_flags.overflow_recov = cpuid_ebx(0x80000007) & 0x1;
+ mce_flags.overflow_recov = !!(cpuid_ebx(0x80000007) & BIT(0));
+ mce_flags.succor = !!(cpuid_ebx(0x80000007) & BIT(1));
break;
default:
break;
--
1.9.1

2015-05-06 17:01:54

by Aravind Gopalakrishnan

[permalink] [raw]
Subject: [PATCH V2 4/6] x86/MCE/AMD: Introduce deferred error interrupt handler

Deferred errors indicate error conditions that were not corrected, but
require no action from S/W (or action is optional).These errors provide
info about a latent UC MCE that can occur when a poisoned data is
consumed by the processor.

Processors that report these errors can be configured to generate APIC
interrupts to notify OS about the error.

Providing interrupt handlers in this patch so that OS can catch these
errors as and when they happen. Currently, we simply log the errors and
exit the handler as S/W action is not mandated.

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
---
arch/x86/include/asm/entry_arch.h | 3 +
arch/x86/include/asm/hardirq.h | 3 +
arch/x86/include/asm/hw_irq.h | 2 +
arch/x86/include/asm/irq_vectors.h | 1 +
arch/x86/include/asm/mce.h | 3 +
arch/x86/include/asm/trace/irq_vectors.h | 6 ++
arch/x86/include/asm/traps.h | 3 +-
arch/x86/kernel/cpu/mcheck/mce_amd.c | 96 ++++++++++++++++++++++++++++++++
arch/x86/kernel/entry_64.S | 5 ++
arch/x86/kernel/irq.c | 6 ++
arch/x86/kernel/irqinit.c | 4 ++
arch/x86/kernel/traps.c | 5 ++
12 files changed, 136 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/entry_arch.h b/arch/x86/include/asm/entry_arch.h
index dc5fa66..6da46db 100644
--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -50,4 +50,7 @@ BUILD_INTERRUPT(thermal_interrupt,THERMAL_APIC_VECTOR)
BUILD_INTERRUPT(threshold_interrupt,THRESHOLD_APIC_VECTOR)
#endif

+#ifdef CONFIG_X86_MCE_AMD
+BUILD_INTERRUPT(deferred_error_interrupt, DEFERRED_ERROR_VECTOR)
+#endif
#endif
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 0f5fb6b..db9f536 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -33,6 +33,9 @@ typedef struct {
#ifdef CONFIG_X86_MCE_THRESHOLD
unsigned int irq_threshold_count;
#endif
+#ifdef CONFIG_X86_MCE_AMD
+ unsigned int irq_deferred_error_count;
+#endif
#if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN)
unsigned int irq_hv_callback_count;
#endif
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index e9571dd..f71e489 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -73,6 +73,7 @@ extern asmlinkage void invalidate_interrupt31(void);
extern asmlinkage void irq_move_cleanup_interrupt(void);
extern asmlinkage void reboot_interrupt(void);
extern asmlinkage void threshold_interrupt(void);
+extern asmlinkage void deferred_error_interrupt(void);

extern asmlinkage void call_function_interrupt(void);
extern asmlinkage void call_function_single_interrupt(void);
@@ -87,6 +88,7 @@ extern void trace_spurious_interrupt(void);
extern void trace_thermal_interrupt(void);
extern void trace_reschedule_interrupt(void);
extern void trace_threshold_interrupt(void);
+extern void trace_deferred_error_interrupt(void);
extern void trace_call_function_interrupt(void);
extern void trace_call_function_single_interrupt(void);
#define trace_irq_move_cleanup_interrupt irq_move_cleanup_interrupt
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 666c89e..026fc1e 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -113,6 +113,7 @@
#define IRQ_WORK_VECTOR 0xf6

#define UV_BAU_MESSAGE 0xf5
+#define DEFERRED_ERROR_VECTOR 0xf4

/* Vector on which hypervisor callbacks will be delivered */
#define HYPERVISOR_CALLBACK_VECTOR 0xf3
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index ab0cdac..4d4a141 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -236,6 +236,9 @@ void do_machine_check(struct pt_regs *, long);
extern void (*mce_threshold_vector)(void);
extern void (*threshold_cpu_callback)(unsigned long action, unsigned int cpu);

+/* Deferred error interrupt handler */
+extern void (*deferred_error_int_vector)(void);
+
/*
* Thermal handler
*/
diff --git a/arch/x86/include/asm/trace/irq_vectors.h b/arch/x86/include/asm/trace/irq_vectors.h
index 4cab890..38a09a1 100644
--- a/arch/x86/include/asm/trace/irq_vectors.h
+++ b/arch/x86/include/asm/trace/irq_vectors.h
@@ -101,6 +101,12 @@ DEFINE_IRQ_VECTOR_EVENT(call_function_single);
DEFINE_IRQ_VECTOR_EVENT(threshold_apic);

/*
+ * deferred_error_apic - called when entering/exiting a deferred apic interrupt
+ * vector handler
+ */
+DEFINE_IRQ_VECTOR_EVENT(deferred_error_apic);
+
+/*
* thermal_apic - called when entering/exiting a thermal apic interrupt
* vector handler
*/
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 4e49d7d..c5380be 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -108,7 +108,8 @@ extern int panic_on_unrecovered_nmi;
void math_emulate(struct math_emu_info *);
#ifndef CONFIG_X86_32
asmlinkage void smp_thermal_interrupt(void);
-asmlinkage void mce_threshold_interrupt(void);
+asmlinkage void smp_threshold_interrupt(void);
+asmlinkage void smp_deferred_error_interrupt(void);
#endif

extern enum ctx_state ist_enter(struct pt_regs *regs);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 769f5cd..a7ac1af 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -12,6 +12,8 @@
* - added support for AMD Family 0x10 processors
* May 2012
* - major scrubbing
+ * May 2015
+ * - add support for deferred error interrupts
*
* All MC4_MISCi registers are shared between multi-cores
*/
@@ -32,6 +34,7 @@
#include <asm/idle.h>
#include <asm/mce.h>
#include <asm/msr.h>
+#include <asm/trace/irq_vectors.h>

#define NR_BLOCKS 9
#define THRESHOLD_MAX 0xFFF
@@ -47,6 +50,13 @@
#define MASK_BLKPTR_LO 0xFF000000
#define MCG_XBLK_ADDR 0xC0000400

+/* Deferred error settings */
+#define MSR_CU_DEF_ERR 0xC0000410
+#define MASK_DEF_LVTOFF 0x000000F0
+#define MASK_DEF_INT_TYPE 0x00000006
+#define DEF_LVT_OFF 0x2
+#define DEF_INT_TYPE_APIC 0x2
+
static const char * const th_names[] = {
"load_store",
"insn_fetch",
@@ -60,6 +70,15 @@ static DEFINE_PER_CPU(struct threshold_bank **, threshold_banks);
static DEFINE_PER_CPU(unsigned char, bank_map); /* see which banks are on */

static void amd_threshold_interrupt(void);
+static void amd_deferred_error_interrupt(void);
+
+/* Setup default deferred error interrupt handler */
+static void default_deferred_error_interrupt(void)
+{
+ pr_err("Unexpected deferred interrupt at vector %x\n",
+ DEFERRED_ERROR_VECTOR);
+}
+void (*deferred_error_int_vector)(void) = default_deferred_error_interrupt;

/*
* CPU Initialization
@@ -205,6 +224,40 @@ static int setup_APIC_mce(int reserved, int new)
return reserved;
}

+static int setup_APIC_deferred_error(int reserved, int new)
+{
+ if (reserved < 0 && !setup_APIC_eilvt(new, DEFERRED_ERROR_VECTOR,
+ APIC_EILVT_MSG_FIX, 0))
+ return new;
+
+ return reserved;
+}
+
+static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
+{
+ u32 low = 0, high = 0;
+ int def_offset = -1, def_new;
+
+ if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
+ return;
+
+ def_new = (low & MASK_DEF_LVTOFF) >> 4;
+ if (!(low & MASK_DEF_LVTOFF)) {
+ pr_err(FW_BUG "Your BIOS is not setting up LVT offset 0x2"
+ " for deferred error IRQs correctly.\n");
+ def_new = DEF_LVT_OFF;
+ low = (low & ~MASK_DEF_LVTOFF) | (DEF_LVT_OFF << 4);
+ }
+
+ def_offset = setup_APIC_deferred_error(def_offset, def_new);
+ if ((def_offset == def_new) &&
+ (deferred_error_int_vector != amd_deferred_error_interrupt))
+ deferred_error_int_vector = amd_deferred_error_interrupt;
+
+ low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;
+ wrmsr(MSR_CU_DEF_ERR, low, high);
+}
+
/* cpu init entry point, called from mce.c with preempt off */
void mce_amd_feature_init(struct cpuinfo_x86 *c)
{
@@ -262,6 +315,9 @@ init:
mce_threshold_block_init(&b, offset);
}
}
+
+ if (mce_flags.succor)
+ deferred_error_interrupt_enable(c);
}

static void __log_error(unsigned int bank, bool is_thr, u64 misc)
@@ -284,6 +340,46 @@ static void __log_error(unsigned int bank, bool is_thr, u64 misc)
wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
}

+static inline void __smp_deferred_error_interrupt(void)
+{
+ inc_irq_stat(irq_deferred_error_count);
+ deferred_error_int_vector();
+}
+
+asmlinkage __visible void smp_deferred_error_interrupt(void)
+{
+ entering_irq();
+ __smp_deferred_error_interrupt();
+ exiting_ack_irq();
+}
+
+asmlinkage __visible void smp_trace_deferred_error_interrupt(void)
+{
+ entering_irq();
+ trace_deferred_error_apic_entry(DEFERRED_ERROR_VECTOR);
+ __smp_deferred_error_interrupt();
+ trace_deferred_error_apic_exit(DEFERRED_ERROR_VECTOR);
+ exiting_ack_irq();
+}
+
+/* APIC interrupt handler for deferred errors */
+static void amd_deferred_error_interrupt(void)
+{
+ u64 status;
+ unsigned int bank;
+
+ for (bank = 0; bank < mca_cfg.banks; ++bank) {
+ rdmsrl(MSR_IA32_MCx_STATUS(bank), status);
+
+ if (!(status & MCI_STATUS_VAL) ||
+ !(status & MCI_STATUS_DEFERRED))
+ continue;
+
+ __log_error(bank, false, 0);
+ break;
+ }
+}
+
/*
* APIC Interrupt Handler
*/
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 10e1a2d..c5bcf7c 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -923,6 +923,11 @@ apicinterrupt THRESHOLD_APIC_VECTOR \
threshold_interrupt smp_threshold_interrupt
#endif

+#ifdef CONFIG_X86_MCE_AMD
+apicinterrupt DEFERRED_ERROR_VECTOR \
+ deferred_error_interrupt smp_deferred_error_interrupt
+#endif
+
#ifdef CONFIG_X86_THERMAL_VECTOR
apicinterrupt THERMAL_APIC_VECTOR \
thermal_interrupt smp_thermal_interrupt
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index e5952c2..cca1124 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -116,6 +116,12 @@ int arch_show_interrupts(struct seq_file *p, int prec)
seq_printf(p, "%10u ", irq_stats(j)->irq_threshold_count);
seq_puts(p, " Threshold APIC interrupts\n");
#endif
+#ifdef CONFIG_X86_MCE_AMD
+ seq_printf(p, "%*s: ", prec, "DEF");
+ for_each_online_cpu(j)
+ seq_printf(p, "%10u ", irq_stats(j)->irq_deferred_error_count);
+ seq_puts(p, " Deferred Error APIC interrupts\n");
+#endif
#ifdef CONFIG_X86_MCE
seq_printf(p, "%*s: ", prec, "MCE");
for_each_online_cpu(j)
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index cd10a64..d7ec6e7 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -135,6 +135,10 @@ static void __init apic_intr_init(void)
alloc_intr_gate(THRESHOLD_APIC_VECTOR, threshold_interrupt);
#endif

+#ifdef CONFIG_X86_MCE_AMD
+ alloc_intr_gate(DEFERRED_ERROR_VECTOR, deferred_error_interrupt);
+#endif
+
#ifdef CONFIG_X86_LOCAL_APIC
/* self generated IPI for local APIC timer */
alloc_intr_gate(LOCAL_TIMER_VECTOR, apic_timer_interrupt);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 324ab52..68b1d59 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -827,6 +827,11 @@ asmlinkage __visible void __attribute__((weak)) smp_threshold_interrupt(void)
{
}

+asmlinkage __visible void __attribute__((weak))
+smp_deferred_error_interrupt(void)
+{
+}
+
/*
* 'math_state_restore()' saves the current math information in the
* old math state array, and gets the new ones from the current task
--
1.9.1

2015-05-06 17:02:06

by Aravind Gopalakrishnan

[permalink] [raw]
Subject: [PATCH V2 5/6] x86, irq: Cleanup ordering of vector numbers

Enforcing proper descending order of vector number assignments here.
No functional change.

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
---
arch/x86/include/asm/irq_vectors.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 026fc1e..8900ba4 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -102,11 +102,6 @@
*/
#define X86_PLATFORM_IPI_VECTOR 0xf7

-/* Vector for KVM to deliver posted interrupt IPI */
-#ifdef CONFIG_HAVE_KVM
-#define POSTED_INTR_VECTOR 0xf2
-#endif
-
/*
* IRQ work vector:
*/
@@ -118,6 +113,11 @@
/* Vector on which hypervisor callbacks will be delivered */
#define HYPERVISOR_CALLBACK_VECTOR 0xf3

+/* Vector for KVM to deliver posted interrupt IPI */
+#ifdef CONFIG_HAVE_KVM
+#define POSTED_INTR_VECTOR 0xf2
+#endif
+
/*
* Local APIC timer IRQ vector is on a different priority level,
* to work around the 'lost local interrupt if more than 2 IRQ
--
1.9.1

2015-05-06 17:02:10

by Aravind Gopalakrishnan

[permalink] [raw]
Subject: [PATCH V2 6/6] x86/MCE/AMD: Rename setup_APIC_mce

'setup_APIC_mce' doesn't give us an indication of why we are
going to program LVT. Make that explicit by renaming it to
setup_APIC_mce_threshold so we know.

No functional change is introduced.

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index a7ac1af..8d5cfe1 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -215,7 +215,7 @@ static void mce_threshold_block_init(struct threshold_block *b, int offset)
threshold_restart_bank(&tr);
};

-static int setup_APIC_mce(int reserved, int new)
+static int setup_APIC_mce_threshold(int reserved, int new)
{
if (reserved < 0 && !setup_APIC_eilvt(new, THRESHOLD_APIC_VECTOR,
APIC_EILVT_MSG_FIX, 0))
@@ -305,7 +305,7 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)

b.interrupt_enable = 1;
new = (high & MASK_LVTOFF_HI) >> 20;
- offset = setup_APIC_mce(offset, new);
+ offset = setup_APIC_mce_threshold(offset, new);

if ((offset == new) &&
(mce_threshold_vector != amd_threshold_interrupt))
--
1.9.1

2015-05-06 17:42:00

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH V2 1/6] x86/MCE/AMD: Factor out logging mechanism

On Wed, May 06, 2015 at 06:58:53AM -0500, Aravind Gopalakrishnan wrote:
> Refactoring the code here to setup struct mce and call
> mce_log() to log the error.
>
> No functional change is introduced.
>
> Doing this so we can use it later to log error when
> deferred error interrupts happen.
>
> Suggested-by: Borislav Petkov <[email protected]>
> Signed-off-by: Aravind Gopalakrishnan <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce_amd.c | 30 ++++++++++++++++++++----------
> 1 file changed, 20 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
> index 55ad9b3..60ae315 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
> @@ -264,6 +264,24 @@ init:
> }
> }
>
> +static void __log_error(unsigned int bank, bool is_thr, u64 misc)
> +{
> + struct mce m;
> +
> + mce_setup(&m);
> + rdmsrl(MSR_IA32_MCx_STATUS(bank), m.status);
> + if (!(m.status & MCI_STATUS_VAL))
> + return;

Yeah, let's save us the mce_setup() work in the !MCI_STATUS_VAL case:

---
From: Aravind Gopalakrishnan <[email protected]>
Date: Wed, 6 May 2015 06:58:53 -0500
Subject: [PATCH] x86/mce/amd: Factor out logging mechanism

Refactor the code here to setup struct mce and call mce_log() to log
the error. We're going to reuse this in a later patch as part of the
deferred error interrupt enablement.

No functional change is introduced.

Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Aravind Gopalakrishnan <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: x86-ml <[email protected]>
Cc: linux-edac <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 55ad9b37cae8..5f25de20db36 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -264,6 +264,27 @@ init:
}
}

+static void __log_error(unsigned int bank, bool threshold_err, u64 misc)
+{
+ struct mce m;
+ u64 status;
+
+ rdmsrl(MSR_IA32_MCx_STATUS(bank), status);
+ if (!(status & MCI_STATUS_VAL))
+ return;
+
+ mce_setup(&m);
+
+ m.status = status;
+ m.bank = bank;
+ if (threshold_err)
+ m.misc = misc;
+
+ mce_log(&m);
+
+ wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
+}
+
/*
* APIC Interrupt Handler
*/
@@ -273,12 +294,12 @@ init:
* the interrupt goes off when error_count reaches threshold_limit.
* the handler will simply log mcelog w/ software defined bank number.
*/
+
static void amd_threshold_interrupt(void)
{
u32 low = 0, high = 0, address = 0;
int cpu = smp_processor_id();
unsigned int bank, block;
- struct mce m;

/* assume first bank caused it */
for (bank = 0; bank < mca_cfg.banks; ++bank) {
@@ -321,15 +342,7 @@ static void amd_threshold_interrupt(void)
return;

log:
- mce_setup(&m);
- rdmsrl(MSR_IA32_MCx_STATUS(bank), m.status);
- if (!(m.status & MCI_STATUS_VAL))
- return;
- m.misc = ((u64)high << 32) | low;
- m.bank = bank;
- mce_log(&m);
-
- wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
+ __log_error(bank, true, ((u64)high << 32) | low);
}

/*
--
2.3.5

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-06 17:47:58

by Aravind Gopalakrishnan

[permalink] [raw]
Subject: Re: [PATCH V2 1/6] x86/MCE/AMD: Factor out logging mechanism

On 5/6/2015 12:41 PM, Borislav Petkov wrote:
>
> Yeah, let's save us the mce_setup() work in the !MCI_STATUS_VAL case:
>

Sure, thanks for fixing it.

> +static void __log_error(unsigned int bank, bool threshold_err, u64 misc)
> +{
> + struct mce m;
> + u64 status;
> +
> + rdmsrl(MSR_IA32_MCx_STATUS(bank), status);
> + if (!(status & MCI_STATUS_VAL))
> + return;
> +
> + mce_setup(&m);
> +
>


-Aravind.

2015-05-06 18:49:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH V2 3/6] x86/mce: Define 'SUCCOR' cpuid bit

On Wed, May 06, 2015 at 06:58:55AM -0500, Aravind Gopalakrishnan wrote:
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index e535533..6d218ac 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -1639,7 +1639,8 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
> break;
> case X86_VENDOR_AMD:
> mce_amd_feature_init(c);
> - mce_flags.overflow_recov = cpuid_ebx(0x80000007) & 0x1;
> + mce_flags.overflow_recov = !!(cpuid_ebx(0x80000007) & BIT(0));
> + mce_flags.succor = !!(cpuid_ebx(0x80000007) & BIT(1));

Yuck, I thought gcc would be smart enough to recognize that we're
calling CPUID twice with the same params and save us the work but nooo.
asm looks yucky.

Oh well, I had to save us the second CPUID read and beef up the commit
message:

---
From: Aravind Gopalakrishnan <[email protected]>
Date: Wed, 6 May 2015 06:58:55 -0500
Subject: [PATCH] x86/mce: Add support for deferred errors on AMD

Deferred errors indicate error conditions that were not corrected, but
those errors have not been consumed yet. They require no action from
S/W (or action is optional). These errors provide info about a latent
uncorrectable MCE that can occur when a poisoned data is consumed by the
processor.

Newer AMD processors can generate deferred errors and can be configured
to generate APIC interrupts on such events.

SUCCOR stands for S/W UnCorrectable error COntainment and Recovery.
It indicates support for data poisoning in HW and deferred error
interrupts.

Add new bitfield to mce_vendor_flags for this. We use this to verify
presence of deferred error interrupts before we enable them in mce_amd.c

While at it, clarify comments in mce_vendor_flags to provide an
indication of usages of the bitfields.

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: x86-ml <[email protected]>
Cc: linux-edac <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ beef up commit message, do CPUID(8000_0007) only once. ]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/mce.h | 15 +++++++++++++--
arch/x86/kernel/cpu/mcheck/mce.c | 10 ++++++++--
2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 1f5a86d518db..407ced642ac1 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -117,8 +117,19 @@ struct mca_config {
};

struct mce_vendor_flags {
- __u64 overflow_recov : 1, /* cpuid_ebx(80000007) */
- __reserved_0 : 63;
+ /*
+ * overflow recovery cpuid bit indicates that overflow
+ * conditions are not fatal
+ */
+ __u64 overflow_recov : 1,
+
+ /*
+ * SUCCOR stands for S/W UnCorrectable error COntainment
+ * and Recovery. It indicates support for data poisoning
+ * in HW and deferred error interrupts.
+ */
+ succor : 1,
+ __reserved_0 : 62;
};
extern struct mce_vendor_flags mce_flags;

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e535533d5ab8..521e5016aca6 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1637,10 +1637,16 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
mce_intel_feature_init(c);
mce_adjust_timer = cmci_intel_adjust_timer;
break;
- case X86_VENDOR_AMD:
+
+ case X86_VENDOR_AMD: {
+ u32 ebx = cpuid_ebx(0x80000007);
+
mce_amd_feature_init(c);
- mce_flags.overflow_recov = cpuid_ebx(0x80000007) & 0x1;
+ mce_flags.overflow_recov = !!(ebx & BIT(0));
+ mce_flags.succor = !!(ebx & BIT(1));
break;
+ }
+
default:
break;
}
--
2.3.5

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-07 08:24:59

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH V2 4/6] x86/MCE/AMD: Introduce deferred error interrupt handler

On Wed, May 06, 2015 at 06:58:56AM -0500, Aravind Gopalakrishnan wrote:
> diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
> index 769f5cd..a7ac1af 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
> @@ -12,6 +12,8 @@
> * - added support for AMD Family 0x10 processors
> * May 2012
> * - major scrubbing
> + * May 2015
> + * - add support for deferred error interrupts

I added your name here.

> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index e5952c2..cca1124 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -116,6 +116,12 @@ int arch_show_interrupts(struct seq_file *p, int prec)
> seq_printf(p, "%10u ", irq_stats(j)->irq_threshold_count);
> seq_puts(p, " Threshold APIC interrupts\n");
> #endif
> +#ifdef CONFIG_X86_MCE_AMD
> + seq_printf(p, "%*s: ", prec, "DEF");

Changed that to "DFR" as "DEF" looks like DEFAULT.

With that: applied, thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-07 08:29:42

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH V2 5/6] x86, irq: Cleanup ordering of vector numbers

On Wed, May 06, 2015 at 06:58:57AM -0500, Aravind Gopalakrishnan wrote:
> Enforcing proper descending order of vector number assignments here.
> No functional change.
>
> Signed-off-by: Aravind Gopalakrishnan <[email protected]>
> ---
> arch/x86/include/asm/irq_vectors.h | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)

Applied, thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-07 08:34:11

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH V2 6/6] x86/MCE/AMD: Rename setup_APIC_mce

On Wed, May 06, 2015 at 06:58:58AM -0500, Aravind Gopalakrishnan wrote:
> 'setup_APIC_mce' doesn't give us an indication of why we are
> going to program LVT. Make that explicit by renaming it to
> setup_APIC_mce_threshold so we know.
>
> No functional change is introduced.
>
> Signed-off-by: Aravind Gopalakrishnan <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce_amd.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

Applied, thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

Subject: [tip:x86/core] x86/mce/amd: Factor out logging mechanism

Commit-ID: afdf344e08fbec28ab2204a626fa1f260dcb68be
Gitweb: http://git.kernel.org/tip/afdf344e08fbec28ab2204a626fa1f260dcb68be
Author: Aravind Gopalakrishnan <[email protected]>
AuthorDate: Wed, 6 May 2015 06:58:53 -0500
Committer: Borislav Petkov <[email protected]>
CommitDate: Wed, 6 May 2015 19:49:20 +0200

x86/mce/amd: Factor out logging mechanism

Refactor the code here to setup struct mce and call mce_log() to log
the error. We're going to reuse this in a later patch as part of the
deferred error interrupt enablement.

No functional change is introduced.

Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Aravind Gopalakrishnan <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: x86-ml <[email protected]>
Cc: linux-edac <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 55ad9b3..5f25de2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -264,6 +264,27 @@ init:
}
}

+static void __log_error(unsigned int bank, bool threshold_err, u64 misc)
+{
+ struct mce m;
+ u64 status;
+
+ rdmsrl(MSR_IA32_MCx_STATUS(bank), status);
+ if (!(status & MCI_STATUS_VAL))
+ return;
+
+ mce_setup(&m);
+
+ m.status = status;
+ m.bank = bank;
+ if (threshold_err)
+ m.misc = misc;
+
+ mce_log(&m);
+
+ wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
+}
+
/*
* APIC Interrupt Handler
*/
@@ -273,12 +294,12 @@ init:
* the interrupt goes off when error_count reaches threshold_limit.
* the handler will simply log mcelog w/ software defined bank number.
*/
+
static void amd_threshold_interrupt(void)
{
u32 low = 0, high = 0, address = 0;
int cpu = smp_processor_id();
unsigned int bank, block;
- struct mce m;

/* assume first bank caused it */
for (bank = 0; bank < mca_cfg.banks; ++bank) {
@@ -321,15 +342,7 @@ static void amd_threshold_interrupt(void)
return;

log:
- mce_setup(&m);
- rdmsrl(MSR_IA32_MCx_STATUS(bank), m.status);
- if (!(m.status & MCI_STATUS_VAL))
- return;
- m.misc = ((u64)high << 32) | low;
- m.bank = bank;
- mce_log(&m);
-
- wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
+ __log_error(bank, true, ((u64)high << 32) | low);
}

/*

Subject: [tip:x86/core] x86/mce/amd: Collect valid address before logging an error

Commit-ID: 6e6e746e33e9555a7fce159d25314c9df3bcda93
Gitweb: http://git.kernel.org/tip/6e6e746e33e9555a7fce159d25314c9df3bcda93
Author: Aravind Gopalakrishnan <[email protected]>
AuthorDate: Wed, 6 May 2015 06:58:54 -0500
Committer: Borislav Petkov <[email protected]>
CommitDate: Wed, 6 May 2015 19:49:31 +0200

x86/mce/amd: Collect valid address before logging an error

amd_decode_mce() needs value in m->addr so it can report the error
address correctly. This should be setup in __log_error() before we call
mce_log(). We do this because the error address is an important bit of
information which should be conveyed to userspace.

The correct output then reports proper address, like this:

[Hardware Error]: Corrected error, no action required.
[Hardware Error]: CPU:0 (15:60:0) MC0_STATUS [-|CE|-|-|AddrV|-|-|CECC]: 0x840041000028017b
[Hardware Error]: MC0 Error Address: 0x00001f808f0ff040

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: x86-ml <[email protected]>
Cc: linux-edac <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 5f25de2..6070757 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -277,11 +277,14 @@ static void __log_error(unsigned int bank, bool threshold_err, u64 misc)

m.status = status;
m.bank = bank;
+
if (threshold_err)
m.misc = misc;

- mce_log(&m);
+ if (m.status & MCI_STATUS_ADDRV)
+ rdmsrl(MSR_IA32_MCx_ADDR(bank), m.addr);

+ mce_log(&m);
wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
}

Subject: [tip:x86/core] x86/mce: Add support for deferred errors on AMD

Commit-ID: 7559e13fb4abe7880dfaf985d6a1630ca90a67ce
Gitweb: http://git.kernel.org/tip/7559e13fb4abe7880dfaf985d6a1630ca90a67ce
Author: Aravind Gopalakrishnan <[email protected]>
AuthorDate: Wed, 6 May 2015 06:58:55 -0500
Committer: Borislav Petkov <[email protected]>
CommitDate: Wed, 6 May 2015 20:34:31 +0200

x86/mce: Add support for deferred errors on AMD

Deferred errors indicate error conditions that were not corrected, but
those errors have not been consumed yet. They require no action from
S/W (or action is optional). These errors provide info about a latent
uncorrectable MCE that can occur when a poisoned data is consumed by the
processor.

Newer AMD processors can generate deferred errors and can be configured
to generate APIC interrupts on such events.

SUCCOR stands for S/W UnCorrectable error COntainment and Recovery.
It indicates support for data poisoning in HW and deferred error
interrupts.

Add new bitfield to mce_vendor_flags for this. We use this to verify
presence of deferred error interrupts before we enable them in mce_amd.c

While at it, clarify comments in mce_vendor_flags to provide an
indication of usages of the bitfields.

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: x86-ml <[email protected]>
Cc: linux-edac <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ beef up commit message, do CPUID(8000_0007) only once. ]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/mce.h | 15 +++++++++++++--
arch/x86/kernel/cpu/mcheck/mce.c | 10 ++++++++--
2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 1f5a86d..407ced6 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -117,8 +117,19 @@ struct mca_config {
};

struct mce_vendor_flags {
- __u64 overflow_recov : 1, /* cpuid_ebx(80000007) */
- __reserved_0 : 63;
+ /*
+ * overflow recovery cpuid bit indicates that overflow
+ * conditions are not fatal
+ */
+ __u64 overflow_recov : 1,
+
+ /*
+ * SUCCOR stands for S/W UnCorrectable error COntainment
+ * and Recovery. It indicates support for data poisoning
+ * in HW and deferred error interrupts.
+ */
+ succor : 1,
+ __reserved_0 : 62;
};
extern struct mce_vendor_flags mce_flags;

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e535533..521e501 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1637,10 +1637,16 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
mce_intel_feature_init(c);
mce_adjust_timer = cmci_intel_adjust_timer;
break;
- case X86_VENDOR_AMD:
+
+ case X86_VENDOR_AMD: {
+ u32 ebx = cpuid_ebx(0x80000007);
+
mce_amd_feature_init(c);
- mce_flags.overflow_recov = cpuid_ebx(0x80000007) & 0x1;
+ mce_flags.overflow_recov = !!(ebx & BIT(0));
+ mce_flags.succor = !!(ebx & BIT(1));
break;
+ }
+
default:
break;
}

Subject: [tip:x86/core] x86/mce/amd: Introduce deferred error interrupt handler

Commit-ID: 24fd78a81f6d3fe7f7a440c8629f9c52cd5f830e
Gitweb: http://git.kernel.org/tip/24fd78a81f6d3fe7f7a440c8629f9c52cd5f830e
Author: Aravind Gopalakrishnan <[email protected]>
AuthorDate: Wed, 6 May 2015 06:58:56 -0500
Committer: Borislav Petkov <[email protected]>
CommitDate: Thu, 7 May 2015 10:23:32 +0200

x86/mce/amd: Introduce deferred error interrupt handler

Deferred errors indicate error conditions that were not corrected, but
require no action from S/W (or action is optional).These errors provide
info about a latent UC MCE that can occur when a poisoned data is
consumed by the processor.

Processors that report these errors can be configured to generate APIC
interrupts to notify OS about the error.

Provide an interrupt handler in this patch so that OS can catch these
errors as and when they happen. Currently, we simply log the errors and
exit the handler as S/W action is not mandated.

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: x86-ml <[email protected]>
Cc: linux-edac <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/entry_arch.h | 3 ++
arch/x86/include/asm/hardirq.h | 3 ++
arch/x86/include/asm/hw_irq.h | 2 +
arch/x86/include/asm/irq_vectors.h | 1 +
arch/x86/include/asm/mce.h | 3 ++
arch/x86/include/asm/trace/irq_vectors.h | 6 +++
arch/x86/include/asm/traps.h | 3 +-
arch/x86/kernel/cpu/mcheck/mce_amd.c | 93 ++++++++++++++++++++++++++++++++
arch/x86/kernel/entry_64.S | 5 ++
arch/x86/kernel/irq.c | 6 +++
arch/x86/kernel/irqinit.c | 4 ++
arch/x86/kernel/traps.c | 5 ++
12 files changed, 133 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/entry_arch.h b/arch/x86/include/asm/entry_arch.h
index dc5fa66..6da46db 100644
--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -50,4 +50,7 @@ BUILD_INTERRUPT(thermal_interrupt,THERMAL_APIC_VECTOR)
BUILD_INTERRUPT(threshold_interrupt,THRESHOLD_APIC_VECTOR)
#endif

+#ifdef CONFIG_X86_MCE_AMD
+BUILD_INTERRUPT(deferred_error_interrupt, DEFERRED_ERROR_VECTOR)
+#endif
#endif
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 0f5fb6b..db9f536 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -33,6 +33,9 @@ typedef struct {
#ifdef CONFIG_X86_MCE_THRESHOLD
unsigned int irq_threshold_count;
#endif
+#ifdef CONFIG_X86_MCE_AMD
+ unsigned int irq_deferred_error_count;
+#endif
#if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN)
unsigned int irq_hv_callback_count;
#endif
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index e9571dd..f71e489 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -73,6 +73,7 @@ extern asmlinkage void invalidate_interrupt31(void);
extern asmlinkage void irq_move_cleanup_interrupt(void);
extern asmlinkage void reboot_interrupt(void);
extern asmlinkage void threshold_interrupt(void);
+extern asmlinkage void deferred_error_interrupt(void);

extern asmlinkage void call_function_interrupt(void);
extern asmlinkage void call_function_single_interrupt(void);
@@ -87,6 +88,7 @@ extern void trace_spurious_interrupt(void);
extern void trace_thermal_interrupt(void);
extern void trace_reschedule_interrupt(void);
extern void trace_threshold_interrupt(void);
+extern void trace_deferred_error_interrupt(void);
extern void trace_call_function_interrupt(void);
extern void trace_call_function_single_interrupt(void);
#define trace_irq_move_cleanup_interrupt irq_move_cleanup_interrupt
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 666c89e..026fc1e 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -113,6 +113,7 @@
#define IRQ_WORK_VECTOR 0xf6

#define UV_BAU_MESSAGE 0xf5
+#define DEFERRED_ERROR_VECTOR 0xf4

/* Vector on which hypervisor callbacks will be delivered */
#define HYPERVISOR_CALLBACK_VECTOR 0xf3
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 407ced6..6a3034a 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -234,6 +234,9 @@ void do_machine_check(struct pt_regs *, long);
extern void (*mce_threshold_vector)(void);
extern void (*threshold_cpu_callback)(unsigned long action, unsigned int cpu);

+/* Deferred error interrupt handler */
+extern void (*deferred_error_int_vector)(void);
+
/*
* Thermal handler
*/
diff --git a/arch/x86/include/asm/trace/irq_vectors.h b/arch/x86/include/asm/trace/irq_vectors.h
index 4cab890..38a09a1 100644
--- a/arch/x86/include/asm/trace/irq_vectors.h
+++ b/arch/x86/include/asm/trace/irq_vectors.h
@@ -101,6 +101,12 @@ DEFINE_IRQ_VECTOR_EVENT(call_function_single);
DEFINE_IRQ_VECTOR_EVENT(threshold_apic);

/*
+ * deferred_error_apic - called when entering/exiting a deferred apic interrupt
+ * vector handler
+ */
+DEFINE_IRQ_VECTOR_EVENT(deferred_error_apic);
+
+/*
* thermal_apic - called when entering/exiting a thermal apic interrupt
* vector handler
*/
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 4e49d7d..c5380be 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -108,7 +108,8 @@ extern int panic_on_unrecovered_nmi;
void math_emulate(struct math_emu_info *);
#ifndef CONFIG_X86_32
asmlinkage void smp_thermal_interrupt(void);
-asmlinkage void mce_threshold_interrupt(void);
+asmlinkage void smp_threshold_interrupt(void);
+asmlinkage void smp_deferred_error_interrupt(void);
#endif

extern enum ctx_state ist_enter(struct pt_regs *regs);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 6070757..2e7ebe7 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -12,6 +12,8 @@
* - added support for AMD Family 0x10 processors
* May 2012
* - major scrubbing
+ * May 2015
+ * - add support for deferred error interrupts (Aravind Gopalakrishnan)
*
* All MC4_MISCi registers are shared between multi-cores
*/
@@ -32,6 +34,7 @@
#include <asm/idle.h>
#include <asm/mce.h>
#include <asm/msr.h>
+#include <asm/trace/irq_vectors.h>

#define NR_BLOCKS 9
#define THRESHOLD_MAX 0xFFF
@@ -47,6 +50,13 @@
#define MASK_BLKPTR_LO 0xFF000000
#define MCG_XBLK_ADDR 0xC0000400

+/* Deferred error settings */
+#define MSR_CU_DEF_ERR 0xC0000410
+#define MASK_DEF_LVTOFF 0x000000F0
+#define MASK_DEF_INT_TYPE 0x00000006
+#define DEF_LVT_OFF 0x2
+#define DEF_INT_TYPE_APIC 0x2
+
static const char * const th_names[] = {
"load_store",
"insn_fetch",
@@ -60,6 +70,13 @@ static DEFINE_PER_CPU(struct threshold_bank **, threshold_banks);
static DEFINE_PER_CPU(unsigned char, bank_map); /* see which banks are on */

static void amd_threshold_interrupt(void);
+static void amd_deferred_error_interrupt(void);
+
+static void default_deferred_error_interrupt(void)
+{
+ pr_err("Unexpected deferred interrupt at vector %x\n", DEFERRED_ERROR_VECTOR);
+}
+void (*deferred_error_int_vector)(void) = default_deferred_error_interrupt;

/*
* CPU Initialization
@@ -205,6 +222,39 @@ static int setup_APIC_mce(int reserved, int new)
return reserved;
}

+static int setup_APIC_deferred_error(int reserved, int new)
+{
+ if (reserved < 0 && !setup_APIC_eilvt(new, DEFERRED_ERROR_VECTOR,
+ APIC_EILVT_MSG_FIX, 0))
+ return new;
+
+ return reserved;
+}
+
+static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
+{
+ u32 low = 0, high = 0;
+ int def_offset = -1, def_new;
+
+ if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
+ return;
+
+ def_new = (low & MASK_DEF_LVTOFF) >> 4;
+ if (!(low & MASK_DEF_LVTOFF)) {
+ pr_err(FW_BUG "Your BIOS is not setting up LVT offset 0x2 for deferred error IRQs correctly.\n");
+ def_new = DEF_LVT_OFF;
+ low = (low & ~MASK_DEF_LVTOFF) | (DEF_LVT_OFF << 4);
+ }
+
+ def_offset = setup_APIC_deferred_error(def_offset, def_new);
+ if ((def_offset == def_new) &&
+ (deferred_error_int_vector != amd_deferred_error_interrupt))
+ deferred_error_int_vector = amd_deferred_error_interrupt;
+
+ low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;
+ wrmsr(MSR_CU_DEF_ERR, low, high);
+}
+
/* cpu init entry point, called from mce.c with preempt off */
void mce_amd_feature_init(struct cpuinfo_x86 *c)
{
@@ -262,6 +312,9 @@ init:
mce_threshold_block_init(&b, offset);
}
}
+
+ if (mce_flags.succor)
+ deferred_error_interrupt_enable(c);
}

static void __log_error(unsigned int bank, bool threshold_err, u64 misc)
@@ -288,6 +341,46 @@ static void __log_error(unsigned int bank, bool threshold_err, u64 misc)
wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
}

+static inline void __smp_deferred_error_interrupt(void)
+{
+ inc_irq_stat(irq_deferred_error_count);
+ deferred_error_int_vector();
+}
+
+asmlinkage __visible void smp_deferred_error_interrupt(void)
+{
+ entering_irq();
+ __smp_deferred_error_interrupt();
+ exiting_ack_irq();
+}
+
+asmlinkage __visible void smp_trace_deferred_error_interrupt(void)
+{
+ entering_irq();
+ trace_deferred_error_apic_entry(DEFERRED_ERROR_VECTOR);
+ __smp_deferred_error_interrupt();
+ trace_deferred_error_apic_exit(DEFERRED_ERROR_VECTOR);
+ exiting_ack_irq();
+}
+
+/* APIC interrupt handler for deferred errors */
+static void amd_deferred_error_interrupt(void)
+{
+ u64 status;
+ unsigned int bank;
+
+ for (bank = 0; bank < mca_cfg.banks; ++bank) {
+ rdmsrl(MSR_IA32_MCx_STATUS(bank), status);
+
+ if (!(status & MCI_STATUS_VAL) ||
+ !(status & MCI_STATUS_DEFERRED))
+ continue;
+
+ __log_error(bank, false, 0);
+ break;
+ }
+}
+
/*
* APIC Interrupt Handler
*/
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 02c2eff..12aea85 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -935,6 +935,11 @@ apicinterrupt THRESHOLD_APIC_VECTOR \
threshold_interrupt smp_threshold_interrupt
#endif

+#ifdef CONFIG_X86_MCE_AMD
+apicinterrupt DEFERRED_ERROR_VECTOR \
+ deferred_error_interrupt smp_deferred_error_interrupt
+#endif
+
#ifdef CONFIG_X86_THERMAL_VECTOR
apicinterrupt THERMAL_APIC_VECTOR \
thermal_interrupt smp_thermal_interrupt
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index e5952c2..590ed6c 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -116,6 +116,12 @@ int arch_show_interrupts(struct seq_file *p, int prec)
seq_printf(p, "%10u ", irq_stats(j)->irq_threshold_count);
seq_puts(p, " Threshold APIC interrupts\n");
#endif
+#ifdef CONFIG_X86_MCE_AMD
+ seq_printf(p, "%*s: ", prec, "DFR");
+ for_each_online_cpu(j)
+ seq_printf(p, "%10u ", irq_stats(j)->irq_deferred_error_count);
+ seq_puts(p, " Deferred Error APIC interrupts\n");
+#endif
#ifdef CONFIG_X86_MCE
seq_printf(p, "%*s: ", prec, "MCE");
for_each_online_cpu(j)
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index cd10a64..d7ec6e7 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -135,6 +135,10 @@ static void __init apic_intr_init(void)
alloc_intr_gate(THRESHOLD_APIC_VECTOR, threshold_interrupt);
#endif

+#ifdef CONFIG_X86_MCE_AMD
+ alloc_intr_gate(DEFERRED_ERROR_VECTOR, deferred_error_interrupt);
+#endif
+
#ifdef CONFIG_X86_LOCAL_APIC
/* self generated IPI for local APIC timer */
alloc_intr_gate(LOCAL_TIMER_VECTOR, apic_timer_interrupt);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 324ab52..68b1d59 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -827,6 +827,11 @@ asmlinkage __visible void __attribute__((weak)) smp_threshold_interrupt(void)
{
}

+asmlinkage __visible void __attribute__((weak))
+smp_deferred_error_interrupt(void)
+{
+}
+
/*
* 'math_state_restore()' saves the current math information in the
* old math state array, and gets the new ones from the current task

Subject: [tip:x86/core] x86/irq: Cleanup ordering of vector numbers

Commit-ID: 5c0d728e1a8ccbaf68ec37181e466627ba0a6efc
Gitweb: http://git.kernel.org/tip/5c0d728e1a8ccbaf68ec37181e466627ba0a6efc
Author: Aravind Gopalakrishnan <[email protected]>
AuthorDate: Wed, 6 May 2015 06:58:57 -0500
Committer: Borislav Petkov <[email protected]>
CommitDate: Thu, 7 May 2015 10:28:43 +0200

x86/irq: Cleanup ordering of vector numbers

Sort vector number assignments in proper descending order. No functional
change.

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: x86-ml <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/irq_vectors.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 026fc1e..8900ba4 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -102,11 +102,6 @@
*/
#define X86_PLATFORM_IPI_VECTOR 0xf7

-/* Vector for KVM to deliver posted interrupt IPI */
-#ifdef CONFIG_HAVE_KVM
-#define POSTED_INTR_VECTOR 0xf2
-#endif
-
/*
* IRQ work vector:
*/
@@ -118,6 +113,11 @@
/* Vector on which hypervisor callbacks will be delivered */
#define HYPERVISOR_CALLBACK_VECTOR 0xf3

+/* Vector for KVM to deliver posted interrupt IPI */
+#ifdef CONFIG_HAVE_KVM
+#define POSTED_INTR_VECTOR 0xf2
+#endif
+
/*
* Local APIC timer IRQ vector is on a different priority level,
* to work around the 'lost local interrupt if more than 2 IRQ

Subject: [tip:x86/core] x86/mce/amd: Rename setup_APIC_mce

Commit-ID: 868c00bb5980653c44d931384baa2c7f1bde81ef
Gitweb: http://git.kernel.org/tip/868c00bb5980653c44d931384baa2c7f1bde81ef
Author: Aravind Gopalakrishnan <[email protected]>
AuthorDate: Wed, 6 May 2015 06:58:58 -0500
Committer: Borislav Petkov <[email protected]>
CommitDate: Thu, 7 May 2015 10:33:40 +0200

x86/mce/amd: Rename setup_APIC_mce

'setup_APIC_mce' doesn't give us an indication of why we are
going to program LVT. Make that explicit by renaming it to
setup_APIC_mce_threshold so we know.

No functional change is introduced.

Signed-off-by: Aravind Gopalakrishnan <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: x86-ml <[email protected]>
Cc: linux-edac <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 2e7ebe7..70e1bf6 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -213,7 +213,7 @@ static void mce_threshold_block_init(struct threshold_block *b, int offset)
threshold_restart_bank(&tr);
};

-static int setup_APIC_mce(int reserved, int new)
+static int setup_APIC_mce_threshold(int reserved, int new)
{
if (reserved < 0 && !setup_APIC_eilvt(new, THRESHOLD_APIC_VECTOR,
APIC_EILVT_MSG_FIX, 0))
@@ -302,7 +302,7 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)

b.interrupt_enable = 1;
new = (high & MASK_LVTOFF_HI) >> 20;
- offset = setup_APIC_mce(offset, new);
+ offset = setup_APIC_mce_threshold(offset, new);

if ((offset == new) &&
(mce_threshold_vector != amd_threshold_interrupt))