From: KAWAI, HIDEHIRO
To: KAWAI, HIDEHIRO, Michal Hocko
CC: Jonathan Corbet, Peter Zijlstra, Ingo Molnar, "Eric W. Biederman", "H. Peter Anvin", Andrew Morton, Thomas Gleixner, Vivek Goyal, "linux-doc@vger.kernel.org", "x86@kernel.org", "kexec@lists.infradead.org", "linux-kernel@vger.kernel.org", Ingo Molnar, HIRAMATU, MASAMI
Subject: RE: Re: [V2 PATCH 1/3] x86/panic: Fix re-entrance problem due to panic on NMI
Date: Wed, 29 Jul 2015 05:48:47 +0000
Message-ID: <04EAB7311EE43145B2D3536183D1A8445491D5E8@GSjpTKYDCembx31.service.hitachi.net>
References: <20150727015850.4928.87717.stgit@softrs> <20150727015850.4928.50289.stgit@softrs> <20150727143405.GF11317@dhcp22.suse.cz> <55B6E2A3.8070004@hitachi.com>
In-Reply-To: <55B6E2A3.8070004@hitachi.com>

Hi,

> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Hidehiro Kawai
> (2015/07/27 23:34), Michal Hocko wrote:
> > On Mon 27-07-15 10:58:50, Hidehiro Kawai wrote:
[...]
> > The check could be also relaxed a bit and nmi_panic would
> > return only if the ongoing panic is the current cpu when we really have
> > to return and allow the preempted panic to finish.
>
> It's reasonable.  I'll do that in the next version.

I noticed that atomic_read() is insufficient.  Please consider the
following scenario.

CPU 1: calls panic() in the normal context
CPU 0: calls nmi_panic(), checks the value of panic_cpu, then calls panic()
CPU 1: sets panic_cpu to 1
CPU 0: fails to set panic_cpu to 0, then loops forever
CPU 1: calls crash_kexec(), then calls kdump_nmi_shootdown_cpus()

At this point, since CPU 0 is looping in NMI context, it never executes
the NMI handler registered by kdump_nmi_shootdown_cpus().  This means
that no register states are saved and no cleanups for VMX/SVM are
performed.

So, we should still use atomic_cmpxchg() in nmi_panic() to prevent
other cpus from running panic routines.

> > +void nmi_panic(const char *fmt, ...)
> > +{
> > +	/*
> > +	 * We have to back off if the NMI has preempted an ongoing panic and
> > +	 * allow it to finish
> > +	 */
> > +	if (atomic_read(&panic_cpu) == raw_smp_processor_id())
> > +		return;
> > +
> > +	panic();
> > +}
> > +EXPORT_SYMBOL(nmi_panic);
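
To illustrate the atomic_cmpxchg() point, here is a rough user-space sketch of the check-and-claim step.  It is not kernel code and not the patch itself: it uses C11 atomics in place of the kernel's atomic_cmpxchg(), and the names PANIC_CPU_INVALID, enum panic_claim and claim_panic() are hypothetical, as is the assumption that panic_cpu starts out at an invalid value of -1.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical sentinel: no CPU has started panic handling yet. */
#define PANIC_CPU_INVALID (-1)

/* User-space stand-in for the kernel's panic_cpu variable. */
static atomic_int panic_cpu = PANIC_CPU_INVALID;

enum panic_claim {
	PANIC_CLAIMED,   /* this CPU won and may run the panic routines */
	PANIC_REENTERED, /* this CPU already owns the panic (NMI preempted it) */
	PANIC_LOST       /* another CPU already owns the panic */
};

/*
 * Check and claim panic_cpu in a single atomic step, so no other CPU
 * can write panic_cpu between the check and the claim.
 */
static enum panic_claim claim_panic(int cpu)
{
	int expected = PANIC_CPU_INVALID;

	assert(cpu >= 0);

	if (atomic_compare_exchange_strong(&panic_cpu, &expected, cpu))
		return PANIC_CLAIMED;

	/* On failure, 'expected' now holds the current owner. */
	return expected == cpu ? PANIC_REENTERED : PANIC_LOST;
}
```

With a plain read followed by a later write, CPU 1 can set panic_cpu in between, which is the window in the scenario above.  In this sketch, whichever CPU calls claim_panic() first gets PANIC_CLAIMED, and every later caller sees either PANIC_REENTERED or PANIC_LOST.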