Date: Tue, 24 Nov 2015 10:05:10 -0500
From: Steven Rostedt
To: Hidehiro Kawai
Cc: Jonathan Corbet, Peter Zijlstra, Ingo Molnar, "Eric W. Biederman", "H. Peter Anvin", Andrew Morton, Thomas Gleixner, Vivek Goyal, Baoquan He, linux-doc@vger.kernel.org, x86@kernel.org, kexec@lists.infradead.org, linux-kernel@vger.kernel.org, Michal Hocko, Ingo Molnar, Borislav Petkov, Masami Hiramatsu
Subject: Re: [V5 PATCH 1/4] panic/x86: Fix re-entrance problem due to panic on NMI
Message-ID: <20151124150510.GA6100@home.goodmis.org>

On Fri, Nov 20, 2015 at 06:36:44PM +0900, Hidehiro Kawai wrote:
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 350dfb0..480a4fd 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -445,6 +445,19 @@ extern int sysctl_panic_on_stackoverflow;
>
>  extern bool crash_kexec_post_notifiers;
>
> +extern atomic_t panic_cpu;
> +
> +/*
> + * A variant of panic() called from NMI context.
> + * If we've already panicked on this cpu, return from here.
> + */
> +#define nmi_panic(fmt, ...)						\
> +	do {								\
> +		int this_cpu = raw_smp_processor_id();			\
> +		if (atomic_cmpxchg(&panic_cpu, -1, this_cpu) != this_cpu) \
> +			panic(fmt, ##__VA_ARGS__);			\

Hmm, what happens if:

	CPU 0:				CPU 1:
	------				------
	nmi_panic();

					nmi_panic();
					<external NMI>
					nmi_panic();

?

On CPU 0:

	cmpxchg(&panic_cpu, -1, 0) != 0

returns -1, thus -1 != 0, so CPU 0 panics, and panic_cpu is set to 0.

On CPU 1:

	cmpxchg(&panic_cpu, -1, 1) != 1

returns 0, thus 0 != 1, so it too panics, but panic_cpu is not set to 1.

Now you have your external NMI triggering on CPU 1:

	cmpxchg(&panic_cpu, -1, 1) != 1

returns 0 again, and you call panic() again within the panic of CPU 1.

Is this OK?

Perhaps you want a per-CPU bitmask, and do a test_and_set() on the bit for
the current CPU. That would prevent any CPU from running panic() twice.

-- Steve

> +	} while (0)
> +
>  /*
>   * Only to be used by arch init code. If the user over-wrote the default
>   * CONFIG_PANIC_TIMEOUT, honor it.
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 4579dbb..24ee2ea 100644
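
The interleaving raised in the reply (CPU 0 panics, CPU 1 panics, then an external NMI hits CPU 1) can be replayed in userspace. What follows is only a rough sketch using C11 atomics, not kernel code: cmpxchg_old(), patch_would_panic(), bitmask_would_panic() and replay_scenario() are hypothetical names made up for illustration, and the bitmask variant is one possible reading of the test_and_set() suggestion, assuming fewer CPUs than bits in an unsigned int.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for atomic_cmpxchg(): always returns the old value. */
static int cmpxchg_old(atomic_int *v, int expected, int desired)
{
	/* On failure C11 writes the observed value into 'expected';
	 * on success 'expected' already equals the old value. */
	atomic_compare_exchange_strong(v, &expected, desired);
	return expected;
}

/* Condition from the quoted nmi_panic(): true means panic() would be called. */
static bool patch_would_panic(atomic_int *panic_cpu, int this_cpu)
{
	return cmpxchg_old(panic_cpu, -1, this_cpu) != this_cpu;
}

/* Hypothetical alternative: one bit per CPU, set atomically on first entry. */
static bool bitmask_would_panic(atomic_uint *panicked, int this_cpu)
{
	unsigned int bit = 1u << this_cpu;

	return !(atomic_fetch_or(panicked, bit) & bit);
}

/* Replays the three nmi_panic() calls from the scenario, in order. */
static void replay_scenario(void)
{
	atomic_int panic_cpu = -1;
	atomic_uint panicked = 0;

	assert(patch_would_panic(&panic_cpu, 0));	/* CPU 0: old value -1 != 0 */
	assert(patch_would_panic(&panic_cpu, 1));	/* CPU 1: old value 0 != 1 */
	assert(patch_would_panic(&panic_cpu, 1));	/* external NMI on CPU 1: re-enters */
	assert(atomic_load(&panic_cpu) == 0);		/* owner never changed from CPU 0 */

	assert(bitmask_would_panic(&panicked, 0));	/* CPU 0: first entry */
	assert(bitmask_would_panic(&panicked, 1));	/* CPU 1: first entry */
	assert(!bitmask_would_panic(&panicked, 1));	/* CPU 1 again: rejected */
}
```

Calling replay_scenario() from a test harness checks both behaviours. Note that the bitmask variant only stops a CPU from entering twice; it still lets a second CPU enter once, which is all the suggestion in the reply asks for.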