Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934463AbcCNOec (ORCPT ); Mon, 14 Mar 2016 10:34:32 -0400
Received: from mx1.redhat.com ([209.132.183.28]:35283 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932705AbcCNOe2 (ORCPT ); Mon, 14 Mar 2016 10:34:28 -0400
Date: Mon, 14 Mar 2016 10:34:26 -0400
From: Don Zickus
To: Joshua Hunt
Cc: akpm@linux-foundation.org, uobergfe@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] watchdog: don't run proc_watchdog_update if new value is same as old
Message-ID: <20160314143426.GK194535@redhat.com>
References: <1457826627-21727-1-git-send-email-johunt@akamai.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1457826627-21727-1-git-send-email-johunt@akamai.com>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2607
Lines: 70

On Sat, Mar 12, 2016 at 06:50:26PM -0500, Joshua Hunt wrote:
> While working on a script to restore all sysctl params before a series of
> tests I found that writing any value into the
> /proc/sys/kernel/{nmi_watchdog,soft_watchdog,watchdog,watchdog_thresh}
> causes them to call proc_watchdog_update(). Not only that, but when I
> wrote to these proc files in a loop I could easily trigger a soft lockup.
>
> [ 955.756196] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
> [ 955.765994] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
> [ 955.774619] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
> [ 955.783182] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
> [ 959.788319] NMI watchdog: BUG: soft lockup - CPU#4 stuck for 30s! [swapper/4:0]
> [ 959.788325] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 30s! [swapper/5:0]
>
> There doesn't appear to be a reason for doing this work other every time a
> write occurs, so only do the work when the values change.

Hi Josh,

Thanks for the patch. I have no objections to it, but Uli and myself were
interested in the reason for the softlockups.

Uli is going to provide a test patch to see if his theory is correct. That
way we fix the underlying issue and then apply your patch on top.

Make sense?

Cheers,
Don

>
> Signed-off-by: Josh Hunt
> ---
>  kernel/watchdog.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index b3ace6e..9acb29f 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -923,6 +923,9 @@ static int proc_watchdog_common(int which, struct ctl_table *table, int write,
>  		 * both lockup detectors are disabled if proc_watchdog_update()
>  		 * returns an error.
>  		 */
> +		if (old == new)
> +			goto out;
> +
>  		err = proc_watchdog_update();
>  	}
>  out:
> @@ -967,7 +970,7 @@ int proc_soft_watchdog(struct ctl_table *table, int write,
>  int proc_watchdog_thresh(struct ctl_table *table, int write,
>  			 void __user *buffer, size_t *lenp, loff_t *ppos)
>  {
> -	int err, old;
> +	int err, old, new;
>
>  	get_online_cpus();
>  	mutex_lock(&watchdog_proc_mutex);
> @@ -987,6 +990,10 @@ int proc_watchdog_thresh(struct ctl_table *table, int write,
>  	/*
>  	 * Update the sample period. Restore on failure.
>  	 */
> +	new = ACCESS_ONCE(watchdog_thresh);
> +	if (old == new)
> +		goto out;
> +
>  	set_sample_period();
>  	err = proc_watchdog_update();
>  	if (err) {
> --
> 1.7.9.5
>
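
For anyone following along, the idea in the quoted patch (skip the costly
update when the written value equals the old one) can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not the
kernel code: setting, update_calls, expensive_update() and write_setting()
are hypothetical names standing in for watchdog_thresh and
proc_watchdog_update(), and no locking or CPU hotplug handling is modelled.

```c
/* Hypothetical stand-ins; none of these are kernel symbols. */
static int setting = 10;	/* plays the role of watchdog_thresh */
static int update_calls;	/* counts how often the costly path runs */

/* Stand-in for proc_watchdog_update(): the expensive reconfiguration. */
static int expensive_update(void)
{
	update_calls++;
	return 0;
}

/* Store new_value, but only reconfigure when it differs from the old one. */
static int write_setting(int new_value)
{
	int old = setting;
	int err;

	setting = new_value;
	if (old == new_value)
		return 0;	/* nothing changed, skip the costly update */

	err = expensive_update();
	if (err)
		setting = old;	/* restore the previous value on failure */
	return err;
}
```

With this sketch, calling write_setting(10) on the initial state leaves
update_calls untouched, while write_setting(20) runs the update once;
repeating write_setting(20) does not run it again.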