Date: Fri, 25 Jul 2014 13:25:11 +0200
From: Andrew Jones
To: Ulrich Obergfell
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, dzickus@redhat.com, pbonzini@redhat.com, akpm@linux-foundation.org, mingo@redhat.com
Subject: Re: [PATCH 2/3] watchdog: control hard lockup detection default
Message-ID: <20140725112510.GA3456@hawk.usersys.redhat.com>
In-Reply-To: <615371508.17867577.1406277175913.JavaMail.zimbra@redhat.com>

On Fri, Jul 25, 2014 at 04:32:55AM -0400, Ulrich Obergfell wrote:
>
> ----- Original Message -----
> > From: "Andrew Jones"
> > To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
> > Cc: uobergfe@redhat.com, dzickus@redhat.com, pbonzini@redhat.com, akpm@linux-foundation.org, mingo@redhat.com
> > Sent: Thursday, July 24, 2014 12:13:30 PM
> > Subject: [PATCH 2/3] watchdog: control hard lockup detection default
>
> [...]
>
> > The running kernel still has the ability to enable/disable at any
> > time with /proc/sys/kernel/nmi_watchdog as usual. However even
> > when the default has been overridden /proc/sys/kernel/nmi_watchdog
> > will initially show '1'. To truly turn it on one must disable/enable
> > it, i.e.
> >   echo 0 > /proc/sys/kernel/nmi_watchdog
> >   echo 1 > /proc/sys/kernel/nmi_watchdog
>
> [...]
>
> > @@ -626,15 +665,17 @@ int proc_dowatchdog(struct ctl_table *table, int write,
> >  		 * disabled. The 'watchdog_running' variable check in
> >  		 * watchdog_*_all_cpus() function takes care of this.
> >  		 */
> > -		if (watchdog_user_enabled && watchdog_thresh)
> > +		if (watchdog_user_enabled && watchdog_thresh) {
> > +			watchdog_enable_hardlockup_detector(true);
> >  			err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh);
> > -		else
> > +		} else
>
> [...]
>
> I just realized a possible issue in the above part of the patch:
>
> If we would want to give the user the option to override the effect of patch 3/3
> via /proc, I think proc_dowatchdog() should enable hard lockup detection _only_
> in case of a state transition from 'NOT watchdog_running' to 'watchdog_running'.
>                                                          |
>   if (watchdog_user_enabled && watchdog_thresh) {        | need to add this
>       if (!watchdog_running) <---------------------------'
>           watchdog_enable_hardlockup_detector(true);
>       err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh);
>   } else
>       ...
>
> The additional 'if (!watchdog_running)' would _require_ the user to perform the
> sequence of commands
>
>   echo 0 > /proc/sys/kernel/nmi_watchdog
>   echo 1 > /proc/sys/kernel/nmi_watchdog
>
> to enable hard lockup detection explicitly.
>
> I think changing the 'watchdog_thresh' while 'watchdog_running' is true should
> _not_ enable hard lockup detection as a side-effect, because a user may have a
> 'sysctl.conf' entry such as
>
>   kernel.watchdog_thresh = ...
>
> or may only want to change the 'watchdog_thresh' on the fly.
>
> I think the following flow of execution could cause such undesired side-effect.
>
>   proc_dowatchdog
>     if (watchdog_user_enabled && watchdog_thresh) {
>
>       watchdog_enable_hardlockup_detector
>         hardlockup_detector_enabled = true
>
>       watchdog_enable_all_cpus
>         if (!watchdog_running) {
>           ...
>         } else if (sample_period_changed)
>           update_timers_all_cpus
>             for_each_online_cpu
>               update_timers
>                 watchdog_nmi_disable
>                 ...
>                 watchdog_nmi_enable
>
>                   watchdog_hardlockup_detector_is_enabled
>                     return true
>
>                   enable perf counter for hard lockup detection
>
> Regards,
>
> Uli

Nice catch. Looks like this will need a v2. Paolo, do we have a
consensus on the proc echoing? Or should that be revisited in the v2 as
well?

drew
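
For reference, the guard Uli proposes can be modelled in user space. This is
only a rough sketch under stated assumptions, not the kernel code:
model_proc_dowatchdog() is a hypothetical stand-in for the relevant branch of
proc_dowatchdog(), the helper bodies are simplified stubs, and the else branch
(elided in the thread) is assumed to just stop the watchdog. With the guard in
place, changing only watchdog_thresh while the watchdog is running leaves
hard lockup detection as it was.

```c
#include <stdbool.h>

/* User-space stand-ins for the kernel state named in the thread. */
static bool watchdog_running;
static bool hardlockup_detector_enabled;
static int watchdog_user_enabled = 1;
static int watchdog_thresh = 10;

/* Stub: only records the requested state. */
static void watchdog_enable_hardlockup_detector(bool val)
{
	hardlockup_detector_enabled = val;
}

/* Stub: the real function also sets up or updates the per-CPU timers. */
static int watchdog_enable_all_cpus(bool sample_period_changed)
{
	(void)sample_period_changed;
	watchdog_running = true;
	return 0;
}

/* Hypothetical model of a sysctl write, with the proposed guard added. */
static int model_proc_dowatchdog(int new_enabled, int new_thresh)
{
	int old_thresh = watchdog_thresh;
	int err = 0;

	watchdog_user_enabled = new_enabled;
	watchdog_thresh = new_thresh;

	if (watchdog_user_enabled && watchdog_thresh) {
		/* Enable only on the 'not running' -> 'running' transition. */
		if (!watchdog_running)
			watchdog_enable_hardlockup_detector(true);
		err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh);
	} else {
		/* Assumption: the elided else branch stops the watchdog. */
		watchdog_running = false;
	}
	return err;
}
```

Starting from watchdog_running = true and hardlockup_detector_enabled = false
(the overridden default), model_proc_dowatchdog(1, 20) leaves the detector
off, while model_proc_dowatchdog(0, 20) followed by model_proc_dowatchdog(1, 20)
turns it on, which matches the echo 0 / echo 1 sequence discussed above.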