From: Naoya Horiguchi
To: Tony Luck, Borislav Petkov
CC: Prarit Bhargava, Vivek Goyal, "linux-kernel@vger.kernel.org", Junichi Nomura, Kiyoshi Ueda
Subject: [PATCH v2 1/2] x86: mce: kexec: turn off MCE in kexec
Date: Fri, 27 Feb 2015 04:58:40 +0000
Message-ID: <1425013116-23581-1-git-send-email-n-horiguchi@ah.jp.nec.com>
X-Mailing-List: linux-kernel@vger.kernel.org

kexec disables (or "shoots down") all CPUs other than the crashing CPU before
entering the 2nd kernel. However, the MCE handler remains enabled after that.
If an MCE occurs and is broadcast to all CPUs after the main thread has started
the 2nd kernel (which might not have initialized MCE handling yet, or might
decide not to enable it at all), the MCE handler runs only on the other CPUs
and not on the main thread. The MCE synchronization then times out and the
kernel panics. The user-visible effect of this bug is kdump failure.

Note that this problem has existed since the current MCE handler was
implemented in 2.6.32. Commit 716079f66eac ("mce: Panic when a core has
reached a timeout") recently made it more visible by changing the default
behavior on synchronization timeout from "ignore" to "panic".
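To illustrate the failure mode described above, here is a simplified
user-space model of a broadcast rendezvous that times out when one CPU never
checks in. It is only a sketch and not the kernel's implementation: the names
mock_mce_rendezvous, mock_result, and the budget/spin parameters are
hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical outcome codes for this model; not kernel definitions. */
enum mock_result {
	MOCK_ALL_ARRIVED,	/* every expected CPU checked in */
	MOCK_TIMEOUT_IGNORED,	/* timed out, policy is "ignore" */
	MOCK_TIMEOUT_PANIC	/* timed out, policy is "panic" */
};

/*
 * Model of a broadcast rendezvous. `expected` CPUs should check in, but only
 * `responsive` of them still run the handler. Each loop iteration stands for
 * one spin of `spin` time units taken from the `budget`.
 */
static enum mock_result mock_mce_rendezvous(int expected, int responsive,
					    long budget, long spin,
					    bool panic_on_timeout)
{
	int callin = 0;

	while (budget > 0) {
		/* One more responsive CPU enters the handler per spin. */
		if (callin < responsive)
			callin++;
		if (callin >= expected)
			return MOCK_ALL_ARRIVED;
		budget -= spin;
	}

	/* Not everyone arrived in time; the policy decides what happens. */
	return panic_on_timeout ? MOCK_TIMEOUT_PANIC : MOCK_TIMEOUT_IGNORED;
}
```

In this model, a 4-CPU system where the main thread has already jumped to the
2nd kernel corresponds to expected = 4 and responsive = 3: the rendezvous can
never complete, so the result depends only on the timeout policy.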

This patch adds cpu_emergency_mce_disable(), which clears MSR_IA32_MCG_CTL,
MSR_IA32_MCx_CTL, and CR4.MCE, in order to turn off MCE handling in the kdump
context.

Signed-off-by: Naoya Horiguchi
Cc: [2.6.32+]
---
ChangeLog v1 -> v2
- clear MSR_IA32_MCG_CTL, MSR_IA32_MCx_CTL, and CR4.MCE instead of using
  global flag to ignore MCE events.
- fixed the description of the problem
---
 arch/x86/include/asm/mce.h       |  1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 17 +++++++++++++++++
 arch/x86/kernel/crash.c          |  8 ++++++++
 3 files changed, 26 insertions(+)

diff --git v3.19.orig/arch/x86/include/asm/mce.h v3.19/arch/x86/include/asm/mce.h
index 51b26e895933..7ae9927d781a 100644
--- v3.19.orig/arch/x86/include/asm/mce.h
+++ v3.19/arch/x86/include/asm/mce.h
@@ -175,6 +175,7 @@ static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
 #endif
 
 int mce_available(struct cpuinfo_x86 *c);
+void cpu_emergency_mce_disable(void);
 
 DECLARE_PER_CPU(unsigned, mce_exception_count);
 DECLARE_PER_CPU(unsigned, mce_poll_count);
diff --git v3.19.orig/arch/x86/kernel/cpu/mcheck/mce.c v3.19/arch/x86/kernel/cpu/mcheck/mce.c
index 3112b79ace8e..10359ae1f558 100644
--- v3.19.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ v3.19/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2105,6 +2105,23 @@ static void mce_syscore_shutdown(void)
 }
 
 /*
+ * Called from the kdump entry code to turn off MCE handling. We clear the
+ * global switch first to avoid the situation where only some of the CPUs
+ * respond to an MCE and the MCE causes a panic on synchronization timeout.
+ */
+void cpu_emergency_mce_disable(void)
+{
+	u64 cap;
+	int i;
+
+	rdmsrl(MSR_IA32_MCG_CAP, cap);
+	if (cap & MCG_CTL_P)
+		wrmsr(MSR_IA32_MCG_CTL, 0, 0);
+	mce_disable_error_reporting();
+	clear_in_cr4(X86_CR4_MCE);
+}
+
+/*
  * On resume clear all MCE state. Don't want to see leftovers from the BIOS.
  * Only one CPU is active at this time, the others get re-added later using
  * CPU hotplug:
diff --git v3.19.orig/arch/x86/kernel/crash.c v3.19/arch/x86/kernel/crash.c
index aceb2f90c716..22451c687fca 100644
--- v3.19.orig/arch/x86/kernel/crash.c
+++ v3.19/arch/x86/kernel/crash.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 
 /* Alignment required for elf header segment */
 #define ELF_CORE_HEADER_ALIGN   4096
@@ -112,6 +113,8 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
 #endif
 	crash_save_cpu(regs, cpu);
 
+	cpu_emergency_mce_disable();
+
 	/*
 	 * VMCLEAR VMCSs loaded on all cpus if needed.
 	 */
@@ -157,6 +160,11 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 	/* The kernel is broken so disable interrupts */
 	local_irq_disable();
 
+	/*
+	 * We can't expect MCE handling to work any more, so turn it off.
+	 */
+	cpu_emergency_mce_disable();
+
 	kdump_nmi_shootdown_cpus();
 
 	/*
-- 
1.9.3
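
For readers who want to trace the order of operations in
cpu_emergency_mce_disable() without kernel headers, the following is a minimal
user-space sketch over mock register state. Everything prefixed with mock_ or
MOCK_ is hypothetical and is not a kernel definition. The per-bank loop is an
assumption about what mce_disable_error_reporting() amounts to; the real
helper may skip banks that were never initialized.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for hardware state; not kernel definitions. */
#define MOCK_NUM_BANKS	4
#define MOCK_MCG_CTL_P	(1ULL << 8)	/* MCG_CAP bit 8: MCG_CTL is present */
#define MOCK_CR4_MCE	(1UL << 6)	/* CR4 bit 6: machine-check enable */

struct mock_cpu {
	uint64_t mcg_cap;			/* models IA32_MCG_CAP */
	uint64_t mcg_ctl;			/* models IA32_MCG_CTL */
	uint64_t bank_ctl[MOCK_NUM_BANKS];	/* models IA32_MCx_CTL */
	unsigned long cr4;			/* models CR4 */
};

/* Write one per-bank control register, with a bounds check. */
static void mock_bank_ctl_write(struct mock_cpu *cpu, int bank, uint64_t val)
{
	assert(bank >= 0 && bank < MOCK_NUM_BANKS);
	cpu->bank_ctl[bank] = val;
}

/* Put the mock CPU into a state with MCE reporting fully enabled. */
static void mock_cpu_enable_mce(struct mock_cpu *cpu)
{
	int i;

	cpu->mcg_cap = MOCK_MCG_CTL_P | MOCK_NUM_BANKS;
	cpu->mcg_ctl = ~0ULL;
	for (i = 0; i < MOCK_NUM_BANKS; i++)
		mock_bank_ctl_write(cpu, i, ~0ULL);
	cpu->cr4 = MOCK_CR4_MCE;
}

/* Same order as the patch: global switch, per-bank controls, then CR4.MCE. */
static void mock_emergency_mce_disable(struct mock_cpu *cpu)
{
	int i;

	if (cpu->mcg_cap & MOCK_MCG_CTL_P)
		cpu->mcg_ctl = 0;
	for (i = 0; i < MOCK_NUM_BANKS; i++)
		mock_bank_ctl_write(cpu, i, 0);
	cpu->cr4 &= ~MOCK_CR4_MCE;
}

/* Returns 1 when no MCE source is left enabled on the mock CPU. */
static int mock_mce_is_off(const struct mock_cpu *cpu)
{
	int i;

	if (cpu->mcg_ctl || (cpu->cr4 & MOCK_CR4_MCE))
		return 0;
	for (i = 0; i < MOCK_NUM_BANKS; i++)
		if (cpu->bank_ctl[i])
			return 0;
	return 1;
}
```

A caller would set up a struct mock_cpu with mock_cpu_enable_mce(), call
mock_emergency_mce_disable(), and then check mock_mce_is_off(). In the patch
the same sequence runs on every CPU: on the crashing CPU from
native_machine_crash_shutdown() and on the others from kdump_nmi_callback().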