Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261630AbVASH5d (ORCPT ); Wed, 19 Jan 2005 02:57:33 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261650AbVASH4a (ORCPT ); Wed, 19 Jan 2005 02:56:30 -0500 Received: from ebiederm.dsl.xmission.com ([166.70.28.69]:52415 "EHLO ebiederm.dsl.xmission.com") by vger.kernel.org with ESMTP id S261629AbVASHdZ (ORCPT ); Wed, 19 Jan 2005 02:33:25 -0500 From: "Eric W. Biederman" To: Andrew Morton Cc: , Subject: [PATCH 22/29] x86-crash_shutdown-nmi-shootdown Date: Wed, 19 Jan 2005 0:31:37 -0700 Message-ID: X-Mailer: patch-bomb.pl@ebiederm.dsl.xmission.com In-Reply-To: References: Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3096 Lines: 102 One of the dangers when switching from one kernel to another is what happens to all of the other cpus that were running in the crashed kernel. In an attempt to avoid that problem this patch adds a nmi handler and attempts to shoot down the other cpus by sending them non maskable interrupts. The code then waits for 1 second or until all known cpus have stopped running and then jumps from the running kernel that has crashed to the kernel in reserved memory. The kernel spin loop is used for the delay as that should behave continue to be safe even in after a crash. Signed-off-by: Eric Biederman --- crash.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 56 insertions(+) diff -uNr linux-2.6.11-rc1-mm1-nokexec-kexec-ppc-support/arch/i386/kernel/crash.c linux-2.6.11-rc1-mm1-nokexec-x86-crash_shutdown-nmi-shootdown/arch/i386/kernel/crash.c --- linux-2.6.11-rc1-mm1-nokexec-kexec-ppc-support/arch/i386/kernel/crash.c Tue Jan 18 22:58:15 2005 +++ linux-2.6.11-rc1-mm1-nokexec-x86-crash_shutdown-nmi-shootdown/arch/i386/kernel/crash.c Tue Jan 18 23:15:17 2005 @@ -23,12 +23,65 @@ #include #include #include +#include #define MAX_NOTE_BYTES 1024 typedef u32 note_buf_t[MAX_NOTE_BYTES/4]; note_buf_t crash_notes[NR_CPUS]; +#ifdef CONFIG_SMP +static atomic_t waiting_for_crash_ipi; + +static int crash_nmi_callback(struct pt_regs *regs, int cpu) +{ + local_irq_disable(); + atomic_dec(&waiting_for_crash_ipi); + /* Assume hlt works */ + __asm__("hlt"); + for(;;); + return 1; +} + +/* + * By using the NMI code instead of a vector we just sneak thru the + * word generator coming out with just what we want. AND it does + * not matter if clustered_apic_mode is set or not. + */ +static void smp_send_nmi_allbutself(void) +{ + send_IPI_allbutself(APIC_DM_NMI); +} + +static void nmi_shootdown_cpus(void) +{ + unsigned long msecs; + atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); + + /* Would it be better to replace the trap vector here? */ + set_nmi_callback(crash_nmi_callback); + /* Ensure the new callback function is set before sending + * out the NMI + */ + wmb(); + + smp_send_nmi_allbutself(); + + msecs = 1000; /* Wait at most a second for the other cpus to stop */ + while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) { + mdelay(1); + msecs--; + } + + /* Leave the nmi callback set */ +} +#else +static void nmi_shootdown_cpus(void) +{ + /* There are no cpus to shootdown */ +} +#endif + void machine_crash_shutdown(void) { /* This function is only called after the system @@ -39,4 +92,7 @@ * In practice this means shooting down the other cpus in * an SMP system. */ + /* The kernel is broken so disable interrupts */ + local_irq_disable(); + nmi_shootdown_cpus(); } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/