Date: Tue, 14 Jul 2015 15:48:33 +0000
From: dwalker@fifo99.com
To: Vivek Goyal
Cc: "Eric W. Biederman", Hidehiro Kawai, Andrew Morton,
	linux-mips@linux-mips.org, Baoquan He, linux-sh@vger.kernel.org,
	linux-s390@vger.kernel.org, kexec@lists.infradead.org,
	linux-kernel@vger.kernel.org, Ingo Molnar, HATAYAMA Daisuke,
	Masami Hiramatsu, linuxppc-dev@lists.ozlabs.org,
	linux-metag@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 1/3] panic: Disable crash_kexec_post_notifiers if kdump is not available
Message-ID: <20150714154833.GA18883@fifo99.com>
In-Reply-To: <20150714154040.GA3912@redhat.com>

On Tue, Jul 14, 2015 at 11:40:40AM -0400, Vivek Goyal wrote:
> On Tue, Jul 14, 2015 at 03:34:30PM +0000, dwalker@fifo99.com wrote:
> > On Tue, Jul 14, 2015 at 11:02:08AM -0400, Vivek Goyal wrote:
> > > On Tue, Jul 14, 2015 at 01:59:19PM +0000, dwalker@fifo99.com wrote:
> > > > On Mon, Jul 13, 2015 at 08:19:45PM -0500, Eric W. Biederman wrote:
> > > > > dwalker@fifo99.com writes:
> > > > >
> > > > > > On Fri, Jul 10, 2015 at 08:41:28AM -0500, Eric W. Biederman wrote:
> > > > > >> Hidehiro Kawai writes:
> > > > > >>
> > > > > >> > You can call panic notifiers and kmsg dumpers before kdump by
> > > > > >> > specifying "crash_kexec_post_notifiers" as a boot parameter.
> > > > > >> > However, it doesn't make sense if kdump is not available. In that
> > > > > >> > case, disable "crash_kexec_post_notifiers" boot parameter so that
> > > > > >> > you can't change the value of the parameter.
> > > > > >>
> > > > > >> Nacked-by: "Eric W. Biederman"
> > > > > >
> > > > > > I think it would make sense if he just replaced "kdump" with "kexec".
> > > > >
> > > > > It would be less insane, however it still makes no sense as without
> > > > > kexec on panic support crash_kexec is a noop. So the value of the
> > > > > setting makes no difference.
> > > >
> > > > Can you explain more, I don't really understand what you mean. Are you suggesting
> > > > the whole "crash_kexec_post_notifiers" feature has no value?
> > >
> > > Daniel,
> > >
> > > BTW, why are you using crash_kexec_post_notifiers commandline? Why not
> > > without it?
> >
> > It was explained in the prior thread but to rehash, the notifiers are used to do a switch
> > over from the crashed machine to another redundant machine.
>
> So why not detect failure using polling or issue notifications from second
> kernel.
>
> IOW, expecting that a crashed machine will be able to deliver notification
> reliably is flawed to begin with, IMHO.

It's flawed to think you can kexec, but you still do it, right? I've not gotten
into the deep details of this switching process, but that's how this interface
is used.

> If a machine is failing, there is a high chance it can't deliver you the
> notification. Detecting that failure using some kind of polling mechanism
> might be more reliable. And it will make even kdump mechanism more
> reliable so that it does not have to run panic notifiers after the crash.

I think what you're suggesting is that my company should change how its hardware
works, and that's not really an option for me. This isn't a simple thing like
checking over the network whether the machine is down or not; this is a far more
complex hardware design.

Daniel