Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754794AbbG3Bpp (ORCPT ); Wed, 29 Jul 2015 21:45:45 -0400 Received: from mail7.hitachi.co.jp ([133.145.228.42]:43171 "EHLO mail7.hitachi.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754685AbbG3Bpl (ORCPT ); Wed, 29 Jul 2015 21:45:41 -0400 From: =?utf-8?B?5rKz5ZCI6Iux5a6PIC8gS0FXQUnvvIxISURFSElSTw==?= To: "'Michal Hocko'" CC: Jonathan Corbet , Peter Zijlstra , Ingo Molnar , "Eric W. Biederman" , "H. Peter Anvin" , Andrew Morton , Thomas Gleixner , Vivek Goyal , "linux-doc@vger.kernel.org" , "x86@kernel.org" , "kexec@lists.infradead.org" , "linux-kernel@vger.kernel.org" , Ingo Molnar , =?utf-8?B?5bmz5p2+6ZuF5bezIC8gSElSQU1BVFXvvIxNQVNBTUk=?= Subject: RE: Re: [V2 PATCH 1/3] x86/panic: Fix re-entrance problem due to panic on NMI Thread-Topic: Re: [V2 PATCH 1/3] x86/panic: Fix re-entrance problem due to panic on NMI Thread-Index: AQHQydfduQy2RlZRNkO8HiurnXt/8J3yI0hA//9yMQCAAZxu0A== Date: Thu, 30 Jul 2015 01:45:35 +0000 Message-ID: <04EAB7311EE43145B2D3536183D1A8445491F23A@GSjpTKYDCembx31.service.hitachi.net> References: <20150727015850.4928.87717.stgit@softrs> <20150727015850.4928.50289.stgit@softrs> <20150727143405.GF11317@dhcp22.suse.cz> <55B6E2A3.8070004@hitachi.com> <04EAB7311EE43145B2D3536183D1A8445491D5E8@GSjpTKYDCembx31.service.hitachi.net> <20150729082329.GA15801@dhcp22.suse.cz> <04EAB7311EE43145B2D3536183D1A8445491DB5E@GSjpTKYDCembx31.service.hitachi.net> <20150729092157.GC15801@dhcp22.suse.cz> In-Reply-To: <20150729092157.GC15801@dhcp22.suse.cz> Accept-Language: ja-JP, en-US Content-Language: ja-JP X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.198.220.53] Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id t6U1joCD030162 Content-Length: 4295 Lines: 97 Hi, > From: Michal Hocko [mailto:mhocko@kernel.org] > > On Wed 29-07-15 09:09:18, 河合英宏 / KAWAI,HIDEHIRO wrote: > > > From: Michal Hocko [mailto:mhocko@kernel.org] > > > On Wed 29-07-15 05:48:47, 河合英宏 / KAWAI,HIDEHIRO wrote: > > > > Hi, > > > > > > > > > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Hidehiro Kawai > > > > > (2015/07/27 23:34), Michal Hocko wrote: > > > > > > On Mon 27-07-15 10:58:50, Hidehiro Kawai wrote: > > > > [...] > > > > > > The check could be also relaxed a bit and nmi_panic would > > > > > > return only if the ongoing panic is the current cpu when we really have > > > > > > to return and allow the preempted panic to finish. > > > > > > > > > > It's reasonable. I'll do that in the next version. > > > > > > > > I noticed atomic_read() is insufficient. Please consider the following > > > > scenario. > > > > > > > > CPU 1: call panic() in the normal context > > > > CPU 0: call nmi_panic(), check the value of panic_cpu, then call panic() > > > > CPU 1: set 1 to panic_cpu > > > > CPU 0: fail to set 0 to panic_cpu, then do an infinite loop > > > > CPU 1: call crash_kexec(), then call kdump_nmi_shootdown_cpus() > > > > > > > > At this point, since CPU 0 loops in NMI context, it never executes > > > > the NMI handler registered by kdump_nmi_shootdown_cpus(). This means > > > > that no register states are saved and no cleanups for VMX/SVM are > > > > performed. > > > > > > Yes this is true but it is no different from the current state, isn't > > > it? So if you want to handle that then it deserves a separate patch. > > > It is certainly not harmful wrt. panic behavior. > > > > > > > So, we should still use atomic_cmpxchg() in nmi_panic() to > > > > prevent other cpus from running panic routines. > > > > > > Not sure what you mean by that. > > > > I mean that we should use the same logic as my V2 patch like this: > > > > #define nmi_panic(fmt, ...) \ > > do { \ > > if (atomic_cmpxchg(&panic_cpu, -1, raw_smp_processor_id()) \ > > == -1) \ > > panic(fmt, ##__VA_ARGS__); \ > > } while (0) > > This would allow to return from NMI too eagerly. Yes, but what's the problem? The root cause of your case hasn't been clarified yet. I can't fix for an unclear issue because I don't know what's the right solution. > When I was testing my > previous approach (on 3.0 based kernel) I had basically the same thing > (one NMI to process panic) and others to return. This led to a strange > behavior when the NMI button triggered NMI on all (hundreds) CPUs. It's strange. Usually, NMI caused by NMI button is routed to only CPU 0 as an external NMI. External NMI for CPUs other than CPU 0 are masked at boot time. Does it really happen? Does the problem still happen on the latest kernel? What kind of NMI is deliverd to each CPU? Traditionally, we should have assumed that NMI for crash dumping is delivered to only one cpu. Otherwise, we should often fail to take a proper crash dump. It seems that your case is another problem to be solved separately. > The > crash kernel booted eventually but the log contained lockups when a > CPU waited for an IPI to the CPU which was handling the NMI panic. Could you explain more precisely? > Anyway, I do not thing this is really necessary to solve the panic > reentrancy issue. > If the missing saved state is a real problem then it > should be handled separately - maybe it can be achieved without an IPI > and directly from the panic context if we are in NMI. What I would like to do via this patchse is to solve race issues among NMI, panic() and crash_kexec(). So, I don't think we should fix that separately, although I would need to reword some descriptions and titles. Anyway, I'm going to sent out my revised version once in order to tidy up. I also would like to hear kexec/kdump guys' opinions. Regards, Kawai ????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?