Date: Thu, 3 Dec 2015 10:35:44 +0100
From: Borislav Petkov
To: 河合英宏 / KAWAI，HIDEHIRO
Cc: Jonathan Corbet, Peter Zijlstra, Ingo Molnar, "Eric W. Biederman",
    "H. Peter Anvin", Andrew Morton, Thomas Gleixner, Vivek Goyal,
    Baoquan He, "linux-doc@vger.kernel.org", "x86@kernel.org",
    "kexec@lists.infradead.org", "linux-kernel@vger.kernel.org",
    Michal Hocko, 平松雅巳 / HIRAMATU，MASAMI
Subject: Re: [V5 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly
Message-ID: <20151203093544.GC22271@pd.tnic>
References: <20151120093641.4285.97253.stgit@softrs>
    <20151120093648.4285.17715.stgit@softrs>
    <20151125095457.GB29499@pd.tnic>
    <04EAB7311EE43145B2D3536183D1A84454A3B032@GSjpTKYDCembx31.service.hitachi.net>
    <20151202154023.GH3783@pd.tnic>
    <04EAB7311EE43145B2D3536183D1A84454A3CD95@GSjpTKYDCembx31.service.hitachi.net>
In-Reply-To: <04EAB7311EE43145B2D3536183D1A84454A3CD95@GSjpTKYDCembx31.service.hitachi.net>

On Thu, Dec 03, 2015 at 02:01:38AM +0000, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > On Wed, Dec 02, 2015 at 11:57:38AM +0000, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > We can do so, but I think resetting panic_cpu always would be
> > > simpler and safer.
>
> I'll state in detail.
>
> When we call crash_kexec() without entering panic() and return from
> it, panic() should be called eventually.

Huh, the call chain is panic->crash_kexec

Or do you mean, when crash_kexec() is not called by panic() but by some
of its other callers?

> But the code paths are a bit complicated and there are many
> implementations for each architecture. So one day, this assumption may
> be broken; the CPU doesn't call panic(). Or the CPU may fail to call
> panic() because we are already in insane state. It would be nervous,
> but allowing another CPU to process panic routines by resetting
> panic_cpu is safer approach.

My suggestion was to do this only on the panic path - not necessarily on
the others.

> Since this code is executed only once due to panic_cpu,
> I think introducing this logic is not much valuable.
> Also, current implementation is already quite simple:
>
> panic()
> {
> 	...
> 	__crash_kexec(NULL) {
> 		if (mutex_trylock(&kexec_mutex)) {
> 			if (kexec_crash_image) {
> 				/* don't return */
> 			}

I don't mean the kexec_crash_image case - I mean the opposite one:
!kexec_crash_image.

And I think I know now what you're trying to tell me: the first CPU
which hits panic, will finish panic eventually and so it will take down
the machine. Every other CPU which happens to enter panic in between the
first CPU and the machine being taken down, doesn't matter because,
well, who cares, we're panicking already.

Am I close?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
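
For reference, the "first CPU wins, and a direct crash_kexec() caller
gives panic_cpu back if it returns" idea discussed above can be modelled
in plain userspace C. This is only a rough sketch, not the patch itself:
PANIC_CPU_INVALID, claim_panic_cpu(), release_panic_cpu() and
crash_kexec_model() are names assumed here for illustration, and C11
atomics stand in for the kernel's atomic ops.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed sentinel meaning "no CPU owns the panic path yet". */
#define PANIC_CPU_INVALID (-1)

/* Userspace model of panic_cpu: the CPU that owns the panic/crash path. */
static atomic_int panic_cpu = PANIC_CPU_INVALID;

/* Try to become the owner. Only the first caller succeeds. */
static bool claim_panic_cpu(int this_cpu)
{
	int expected = PANIC_CPU_INVALID;

	return atomic_compare_exchange_strong(&panic_cpu, &expected, this_cpu);
}

/* Give ownership back so that another CPU can still run the panic path. */
static void release_panic_cpu(int this_cpu)
{
	assert(atomic_load(&panic_cpu) == this_cpu);
	atomic_store(&panic_cpu, PANIC_CPU_INVALID);
}

/*
 * Hypothetical stand-in for crash_kexec() called directly, i.e. not from
 * panic(). Returns true where the real code would jump into the crash
 * kernel and never return.
 */
static bool crash_kexec_model(int this_cpu, bool crash_image_loaded)
{
	if (!claim_panic_cpu(this_cpu))
		return false;	/* another CPU is already panicking */

	if (crash_image_loaded)
		return true;	/* "don't return" in the real code */

	/* The !kexec_crash_image case: we return, so reset panic_cpu. */
	release_panic_cpu(this_cpu);
	return false;
}
```

With no crash image loaded, crash_kexec_model() leaves panic_cpu at
PANIC_CPU_INVALID again, so a later claim_panic_cpu() from any CPU still
succeeds. Once some CPU holds panic_cpu, every other caller simply backs
off.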