Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933044AbbLCL32 (ORCPT ); Thu, 3 Dec 2015 06:29:28 -0500 Received: from mail9.hitachi.co.jp ([133.145.228.44]:40726 "EHLO mail9.hitachi.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756465AbbLCL3Z (ORCPT ); Thu, 3 Dec 2015 06:29:25 -0500 From: =?utf-8?B?5rKz5ZCI6Iux5a6PIC8gS0FXQUnvvIxISURFSElSTw==?= To: "'Borislav Petkov'" CC: Jonathan Corbet , Peter Zijlstra , Ingo Molnar , "Eric W. Biederman" , "H. Peter Anvin" , Andrew Morton , Thomas Gleixner , Vivek Goyal , Baoquan He , "linux-doc@vger.kernel.org" , "x86@kernel.org" , "kexec@lists.infradead.org" , "linux-kernel@vger.kernel.org" , Michal Hocko , =?utf-8?B?5bmz5p2+6ZuF5bezIC8gSElSQU1BVFXvvIxNQVNBTUk=?= Subject: RE: [V5 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly Thread-Topic: [V5 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly Thread-Index: AQHRI3eQ3182xzsAkE+t80SD59JIOJ6r8N+AgAuxnQD//684gIABPFXA///wHgCAAKwiQA== Date: Thu, 3 Dec 2015 11:29:21 +0000 Message-ID: <04EAB7311EE43145B2D3536183D1A84454A3DF12@GSjpTKYDCembx31.service.hitachi.net> References: <20151120093641.4285.97253.stgit@softrs> <20151120093648.4285.17715.stgit@softrs> <20151125095457.GB29499@pd.tnic> <04EAB7311EE43145B2D3536183D1A84454A3B032@GSjpTKYDCembx31.service.hitachi.net> <20151202154023.GH3783@pd.tnic> <04EAB7311EE43145B2D3536183D1A84454A3CD95@GSjpTKYDCembx31.service.hitachi.net> <20151203093544.GC22271@pd.tnic> In-Reply-To: <20151203093544.GC22271@pd.tnic> Accept-Language: ja-JP, en-US Content-Language: ja-JP X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.198.219.54] Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id tB3BTW6U013931 Content-Length: 2828 Lines: 77 > On Thu, Dec 03, 2015 at 02:01:38AM +0000, 河合英宏 / KAWAI,HIDEHIRO wrote: > > > On Wed, Dec 02, 2015 at 11:57:38AM +0000, 河合英宏 / KAWAI,HIDEHIRO wrote: > > > > We can do so, but I think resetting panic_cpu always would be > > > > simpler and safer. > > > > I'll state in detail. > > > > When we call crash_kexec() without entering panic() and return from > > it, panic() should be called eventually. > > Huh, the call chain is > > panic->crash_kexec > > Or do you mean, when crash_kexec() is not called by panic() but by some > of its other callers? I was arguing about the case of oops_end --> crash_kexec --> return from crash_kexec because of !kexec_crash_image --> panic. In the case of panic --> __crash_kexec, __crash_kexec is called only once, so we don't need to check the return value of __crash_kexec as you suggested. So I thought you stated about crash_kexec --> panic case. > > But the code paths are a bit complicated and there are many > > implementations for each architecture. So one day, this assumption may > > be broken; the CPU doesn't call panic(). Or the CPU may fail to call > > panic() because we are already in insane state. It would be nervous, > > but allowing another CPU to process panic routines by resetting > > panic_cpu is safer approach. > > My suggestion was to do this only on the panic path - not necessarily on > the others. > > > Since this code is executed only once due to panic_cpu, > > I think introducing this logic is not much valuable. > > Also, current implementation is already quite simple: > > > > panic() > > { > > ... > > __crash_kexec(NULL) { > > if (mutex_trylock(&kexec_mutex)) { > > if (kexec_crash_image) { > > /* don't return */ > > } > > I don't mean the kexec_crash_image case - I mean the opposite one: > !kexec_crash_image. I also mentioned !kexec_crash_image case... > And I think I know now what you're trying to tell > me: the first CPU which hits panic, will finish panic eventually and so > it will take down the machine. No. The first CPU calls panic, and then it calls __crash_kexec. Because of !kexec_crash_image, it returns from __crash_kexec and continues to the panic procedure. At the same time, another CPU tries to call panic(), but it doesn't run the panic procedure; panic_cpu prevents the second CPU from running it. This means __crash_kexec is called only once even if we don't check the return value of __crash_kexec. (Please note that crash_kexec can be called multiple times in the case of oops_end() --> crash_kexec().) I'm sorry I couldn't tell my thought well. Regards, -- Hidehiro Kawai Hitachi, Ltd. Research & Development Group ????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?