From: 河合英宏 / KAWAI，HIDEHIRO
To: "'Michal Hocko'"
CC: Jonathan Corbet, Peter Zijlstra, Ingo Molnar, "Eric W. Biederman",
 "H. Peter Anvin", Andrew Morton, Thomas Gleixner, Vivek Goyal,
 "linux-doc@vger.kernel.org", "x86@kernel.org",
 "kexec@lists.infradead.org", "linux-kernel@vger.kernel.org",
 Ingo Molnar, 平松雅巳 / HIRAMATU，MASAMI
Subject: RE: Re: [V2 PATCH 1/3] x86/panic: Fix re-entrance problem due to panic on NMI
Date: Tue, 4 Aug 2015 11:53:28 +0000
Message-ID: <04EAB7311EE43145B2D3536183D1A844549269F3@GSjpTKYDCembx31.service.hitachi.net>
References: <55B6E2A3.8070004@hitachi.com>
 <04EAB7311EE43145B2D3536183D1A8445491D5E8@GSjpTKYDCembx31.service.hitachi.net>
 <20150729082329.GA15801@dhcp22.suse.cz>
 <04EAB7311EE43145B2D3536183D1A8445491DB5E@GSjpTKYDCembx31.service.hitachi.net>
 <20150729092157.GC15801@dhcp22.suse.cz>
 <04EAB7311EE43145B2D3536183D1A8445491F23A@GSjpTKYDCembx31.service.hitachi.net>
 <20150730074812.GA9387@dhcp22.suse.cz>
 <04EAB7311EE43145B2D3536183D1A8445491FC55@GSjpTKYDCembx31.service.hitachi.net>
 <20150730122747.GA3954@dhcp22.suse.cz>
 <04EAB7311EE43145B2D3536183D1A844549220E7@GSjpTKYDCembx31.service.hitachi.net>
 <20150804085651.GC18509@dhcp22.suse.cz>
In-Reply-To: <20150804085651.GC18509@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

> From: Michal Hocko [mailto:mhocko@kernel.org]
> On Fri 31-07-15 11:23:00, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > From: Michal Hocko [mailto:mhocko@kernel.org]
> > > > There is a timeout of 1000ms in nmi_shootdown_cpus(), so I don't know
> > > > why CPU 130 waits so long. I'll try to consider for a while.
> > >
> > > Yes, I do not understand the timing here either and the fact that the
> > > log is a complete mess in the important parts doesn't help a wee bit.
> >
> > I'm interested in where "kernel panic -not syncing: " is.
> > It may give us a clue.
>
> This one is lost in the mangled text:
> [ 167.843771] U<0>[ 167.843771] hhuh. NMI received for unkn<0><0>[ 167.843765] Uh[ 16NM843774I own rea reived for
> unknow<0 r 16n 2d 765] Uhhuh. CPU recei11. <0known reason 7. on770] Ker<[ - not rn NMI:nic - not contt sing
> > <0 >[ : Not con.inu437azed and confused, b] Dtryingaed annue
> > fu 167.8ut trying>[ to 7.<0377 167.843775] U<0>[ 167.843776] ]hhu.ived for u3nknown rMason 3 re oived for [nk167.843781]

Thanks for the information. I had suspected that lock contention while
printing messages (e.g. on locks used by the network/netconsole driver)
was delaying the panic procedure, but that seems to be unrelated, because
the panic message had already been printed early on. If I come up with
something, I will post a follow-up mail.

I think there may be potential issues.

> 1.
> <. N0>[ 167.843781] Uh. NMI recen 3d on CPU 0.i< >[ nowon 3d on] Chhuh.MI
> eceived[ or7.843nknoUhhuh.wn rMason e3d ceCPivUd 120.
> <0nk>no 167.wn843ason 3na s p120.
> o<0er savi d6 e843ab88] Do yeu have a
> [ er saving mode e nabl1d?7<4][ 167 84hu94]MIuh. NceIived for unknown reas vdfor 1no3was0>[ 2d 67.84380on CI
> rUe 12e.
> ive7d8u3800wn rveaseo f2d on CPo3.r< u>k[o 1 rea6s.o2d8 oo you hn aPve <0st>a e power 1s7.843816] Do yoauv ng moade
> enbslra?ng[ e 167.8438p41o]er shhuhavi.ngIroenived fbled?nknow
> < reaso0> 2d on [PU1626.41]0> Uh67.h. NM387I] receihed for .nknown reason 2Nn MC U ceived for .
> [son 2d on CPU 6.
> < 160>7.8467.84873] Uhhuh. 3MI received 908 o knstra
> [ n167.843908] Do ygo pave westrangesa pvnv mode enableng mode ed?
> n