Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932332AbbERK6o (ORCPT ); Mon, 18 May 2015 06:58:44 -0400 Received: from mx6-phx2.redhat.com ([209.132.183.39]:55413 "EHLO mx6-phx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932166AbbERK50 (ORCPT ); Mon, 18 May 2015 06:57:26 -0400 Date: Mon, 18 May 2015 06:56:46 -0400 (EDT) From: Ulrich Obergfell To: Peter Zijlstra Cc: Michal Hocko , Linus Torvalds , Stephane Eranian , Don Zickus , Ingo Molnar , Andrew Morton , "Rafael J. Wysocki" , Kevin Hilman , Ulf Hansson , linux-pm@vger.kernel.org, LKML Message-ID: <309071648.615900.1431946606778.JavaMail.zimbra@redhat.com> In-Reply-To: <20150518093150.GC21418@twins.programming.kicks-ass.net> References: <20150517185041.GA5897@dhcp22.suse.cz> <20150518073046.GO17717@twins.programming.kicks-ass.net> <20150518090336.GA6393@dhcp22.suse.cz> <20150518093150.GC21418@twins.programming.kicks-ass.net> Subject: Re: suspend regression in 4.1-rc1 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [10.36.4.69] X-Mailer: Zimbra 8.0.6_GA_5922 (ZimbraWebClient - FF22 (Linux)/8.0.6_GA_5922) Thread-Topic: suspend regression in 4.1-rc1 Thread-Index: FzCQcgK3VCZH9g6UBPAwdO+3zUgWPg== Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4274 Lines: 132 Peter, please see my comments in-line. Regards, Uli ----- Original Message ----- From: "Peter Zijlstra" To: "Michal Hocko" [...] > On Mon, May 18, 2015 at 11:03:37AM +0200, Michal Hocko wrote: >> This doesn't hang anymore. I've just had to move the mutex definition >> up to make it compile. So feel free to add my > > I've also fixed a lock leak, see goto unlock :-) > >> Reported-and-tested-by: Michal Hocko > > *blink* that actually fixed it.. > > That somewhat leaves me at a loss explaining how s2r was failing. > > --- > Subject: watchdog: Fix merge 'conflict' > > Two watchdog changes that came through different trees had a non > conflicting conflict, that is, one changed the semantics of a variable > but no actual code conflict happened. So the merge appeared fine, but > the resulting code did not behave as expected. > > Commit 195daf665a62 ("watchdog: enable the new user interface of the > watchdog mechanism") changes the semantics of watchdog_user_enabled, > which thereafter is only used by the functions introduced by > b3738d293233 ("watchdog: Add watchdog enable/disable all functions"). Don and I already posted a patch in April to address this: https://lkml.org/lkml/2015/4/22/306 http://ozlabs.org/~akpm/mmots/broken-out/watchdog-fix-watchdog_nmi_enable_all.patch > There further appears to be a distinct lack of serialization between > setting and using watchdog_enabled, so perhaps we should wrap the > {en,dis}able_all() things in watchdog_proc_mutex. As I understand it, the {en,dis}able_all() functions are only called early at kernel startup, so I do not see how they could be racing with watchdog code that is executed in the context of write() system calls to parameters in /proc/sys/kernel. Please see also my earlier reply to Michal for further details: http://marc.info/?l=linux-pm&m=143194387208250&w=2 Do we really need synchronization here? > This patch fixes a s2r failure reported by Michal; which I cannot > readily explain. But this does make the code internally consistent > again. > > Reported-and-tested-by: Michal Hocko > Signed-off-by: Peter Zijlstra (Intel) > --- > kernel/watchdog.c | 20 +++++++++++++++----- > 1 file changed, 15 insertions(+), 5 deletions(-) > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 2316f50..506edcc5 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -41,6 +41,8 @@ > #define NMI_WATCHDOG_ENABLED (1 << NMI_WATCHDOG_ENABLED_BIT) > #define SOFT_WATCHDOG_ENABLED (1 << SOFT_WATCHDOG_ENABLED_BIT) > > +static DEFINE_MUTEX(watchdog_proc_mutex); > + > #ifdef CONFIG_HARDLOCKUP_DETECTOR > static unsigned long __read_mostly watchdog_enabled = SOFT_WATCHDOG_ENABLED|NMI_WATCHDOG_ENABLED; > #else > @@ -608,26 +610,36 @@ void watchdog_nmi_enable_all(void) > { > int cpu; > > - if (!watchdog_user_enabled) > - return; > + mutex_lock(&watchdog_proc_mutex); > + > + if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED)) > + goto unlock; > > get_online_cpus(); > for_each_online_cpu(cpu) > watchdog_nmi_enable(cpu); > put_online_cpus(); > + > +unlock: > + mutex_lock(&watchdog_proc_mutex); > } > > void watchdog_nmi_disable_all(void) > { > int cpu; > > + mutex_lock(&watchdog_proc_mutex); > + > if (!watchdog_running) > - return; > + goto unlock; > > get_online_cpus(); > for_each_online_cpu(cpu) > watchdog_nmi_disable(cpu); > put_online_cpus(); > + > +unlock: > + mutex_unlock(&watchdog_proc_mutex); > } > #else > static int watchdog_nmi_enable(unsigned int cpu) { return 0; } > @@ -744,8 +756,6 @@ static int proc_watchdog_update(void) > > } > > -static DEFINE_MUTEX(watchdog_proc_mutex); > - > /* > * common function for watchdog, nmi_watchdog and soft_watchdog parameter > * -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/