Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935999AbcCQQIK (ORCPT ); Thu, 17 Mar 2016 12:08:10 -0400 Received: from prod-mail-xrelay07.akamai.com ([23.79.238.175]:37065 "EHLO prod-mail-xrelay07.akamai.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S935222AbcCQQIH (ORCPT ); Thu, 17 Mar 2016 12:08:07 -0400 Subject: Re: [PATCH] watchdog: don't run proc_watchdog_update if new value is same as old To: Ulrich Obergfell References: <1457826627-21727-1-git-send-email-johunt@akamai.com> <20160314143426.GK194535@redhat.com> <56E6CE86.5070606@akamai.com> <20160314162955.GQ194535@redhat.com> <56E78957.6020707@akamai.com> <1338048031.31356204.1458120082361.JavaMail.zimbra@redhat.com> Cc: Don Zickus , akpm@linux-foundation.org, linux-kernel@vger.kernel.org From: Josh Hunt Message-ID: <56EAD663.8080709@akamai.com> Date: Thu, 17 Mar 2016 11:08:03 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: <1338048031.31356204.1458120082361.JavaMail.zimbra@redhat.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4710 Lines: 113 On 03/16/2016 04:21 AM, Ulrich Obergfell wrote: > > Josh, > > I haven't tried to reproduce the soft lockups with kernel 4.1, but I > believe I found an explanation in terms of how your test case breaks > the watchdog mechanism in kernel 4.1: > > The soft lockup detector is implemented via two components (per-cpu). > > - The watchdog_timer_fn() function. > This function runs periodically (sample_period) in the context of a > high-resolution timer on CPU N. It wakes up the 'watchdog/N' thread > and checks watchdog_touch_ts[N] to determine whether the thread has > run within the soft lockup detection interval (watchdog_thresh * 2). > > - The watchdog() function. > This function is executed in the context of the 'watchdog/N' thread. > It updates the watchdog_touch_ts[N] time stamp to signal that it got > a chance to run. The thread has a high realtime priority. If it does > not get a chance to run, this is indicative that 'something else' is > hogging CPU N. > > > The while :; do echo 1 > /proc/sys/kernel/nmi_watchdog; sleep .1; done > loop triggers the following flow of execution in the watchdog code at > a frequency that is much higher than sample_period. > > proc_nmi_watchdog > proc_watchdog_common > proc_watchdog_update > watchdog_enable_all_cpus > update_watchdog_all_cpus > for_each_online_cpu > update_watchdog > smp_call_function_single(restart_watchdog_hrtimer) > > Calling the restart_watchdog_hrtimer() function at such a high rate has > the following effects. > > - The high-resolution timer gets canceled and restarted before it ever > gets a chance to expire. Hence, the watchdog_timer_fn() function does > not get a chance to wake up the 'watchdog/N' thread. So this prevents > the thread from running within the soft lockup detection interval. > > - Since watchdog_timer_fn() does not wake up the 'watchdog/N' thread, > the watchdog_touch_ts[N] time stamp is not being updated. > > The sleep 30 && kill %1 command terminates the above while loop after > 30 seconds, so the high-resolution timer no longer gets canceled. When > the timer expires, watchdog_timer_fn() detects that 'watchdog/N' has not > run within the soft lockup detection interval. > > Your patch 'masks' the above issue because proc_watchdog_update() is not > called if the parameter value is not changed. > > > The mechanism that picks up changes of watchdog parameters 'on the fly' > was rewritten in kernel 4.3 (see https://lkml.org/lkml/2015/8/1/64 and > http://marc.info/?l=linux-kernel&m=143894132129981&w=2). > > $ git log --pretty=oneline v4.2..v4.3 -- kernel/watchdog.c | head -5 > ec6a90661a0d6ce1461d05c7a58a0a151154e14a watchdog: rename watchdog_suspend() and watchdog_resume() > 999bbe49ea0118b70ddf3f5d679f51dc7a97ae55 watchdog: use suspend/resume interface in fixup_ht_bug() > d4bdd0b21c7652a8271f873cc755486b255c1bbd watchdog: use park/unpark functions in update_watchdog_all_cpus() > 8c073d27d7ad293bf734cc8475689413afadab81 watchdog: introduce watchdog_suspend() and watchdog_resume() > 81a4beef91ba4a9e8ad6054ca9933dff7e25ff28 watchdog: introduce watchdog_park_threads() and watchdog_unpark_threads() > > The new version of the update_watchdog_all_cpus() function now parks and > unparks all 'watchdog/N' threads. If any watchdog parameter in /proc was > changed, the new value gets picked up by the threads during unpark when > watchdog_enable() is executed. > > The following is particularly relevant to your test case. > > - watchdog_disable() cancels the high-resolution timer during park > > - watchdog_enable() starts the high-resolution timer during unpark > and updates watchdog_touch_ts[N] by calling __touch_watchdog() > > I believe this explains why you can only reproduce the soft lockup with > kernel v4.1 but not with v4.5. > > > Your patch "don't run proc_watchdog_update if new value is same as old" > looks OK to me. I have no objections. > > > Many Thanks, > > Uli Uli Thanks for the detailed explanation! As you mention my patch will mask this problem for 4.1 which is why I wanted to get it into stable. Do you think there is any way to mitigate this issue for the stable kernels (4.1 to 4.4) if the user changes the values doing something like: foo=1; while :; do echo $foo > /proc/sys/kernel/nmi_watchdog; foo=$(( ! $foo )); sleep .1; done & sleep 30 && kill %1 ? I realize this isn't a real-world use-case (I hope :)), but shows there is still a way to cause the box to soft lockup with this code path. I ask b/c these kernels are both longterm stable releases and so this issue will likely live for a while in the wild. Josh