Date: Thu, 15 Dec 2016 18:41:15 +0000
From: Aaron Tomlin
To: Don Zickus
Cc: LKML, Andrew Morton, Ulrich Obergfell
Subject: Re: [PATCH] kernel/watchdog: Prevent false hardlockup on overloaded system
Message-ID: <20161215184114.ut2ulhhflap5bfur@atomlin.usersys.redhat.com>
References: <1481041033-192236-1-git-send-email-dzickus@redhat.com>
In-Reply-To: <1481041033-192236-1-git-send-email-dzickus@redhat.com>

On Tue 2016-12-06 11:17 -0500, Don Zickus wrote:
> On an overloaded system, it is possible that a change in the watchdog threshold
> can be delayed long enough to trigger a false positive.
>
> This can easily be achieved by having a cpu spinning indefinitely on a task,
> while another cpu updates watchdog threshold.
>
> What happens is while trying to park the watchdog threads, the hrtimers on the
> other cpus trigger and reprogram themselves with the new slower watchdog
> threshold. Meanwhile, the nmi watchdog is still programmed with the old faster
> threshold.
>
> Because the one cpu is blocked, it prevents the thread parking on the other
> cpus from completing, which is needed to shutdown the nmi watchdog and
> reprogram it correctly. As a result, a false positive from the nmi watchdog is
> reported.
>
> Fix this by setting a park_in_progress flag to block all lockups
> until the parking is complete.
>
> Fix provided by Ulrich Obergfell.
>
> Cc: Ulrich Obergfell
> Signed-off-by: Don Zickus
> ---
> include/linux/nmi.h | 1 +
> kernel/watchdog.c | 9 +++++++++
> kernel/watchdog_hld.c | 3 +++
> 3 files changed, 13 insertions(+)

Looks fine to me.

Reviewed-by: Aaron Tomlin

-- 
Aaron Tomlin
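
For anyone following the thread without the diff at hand, the idea in the quoted description can be modeled in userspace roughly as below. This is only a sketch under stated assumptions and not the patch itself: apart from the park_in_progress flag named in the description, every identifier here (begin_park, end_park, hrtimer_tick, nmi_check and the two counters) is hypothetical, and the real code lives in kernel/watchdog.c and kernel/watchdog_hld.c. The model assumes the NMI-side check declares a lockup when the hrtimer counter has not advanced since the previous check, and shows how the flag suppresses that report while parking is under way.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the fix; all names except
 * park_in_progress are illustrative and not taken from the patch.
 */
static atomic_int park_in_progress;            /* nonzero while threads are being parked */
static unsigned long hrtimer_interrupts;       /* bumped by the simulated hrtimer */
static unsigned long hrtimer_interrupts_saved; /* value seen at the previous NMI check */

/* Bracket the (simulated) thread parking. */
static void begin_park(void) { atomic_store(&park_in_progress, 1); }
static void end_park(void)   { atomic_store(&park_in_progress, 0); }

/* Simulated hrtimer callback: proves the cpu still takes interrupts. */
static void hrtimer_tick(void) { hrtimer_interrupts++; }

/*
 * Simulated NMI watchdog check. Returns true if a hard lockup
 * would be reported.
 */
static bool nmi_check(void)
{
    /* While parking is in progress, block all lockup reports. */
    if (atomic_load(&park_in_progress) != 0)
        return false;

    /* No hrtimer progress since the last check: report a lockup. */
    if (hrtimer_interrupts == hrtimer_interrupts_saved)
        return true;

    hrtimer_interrupts_saved = hrtimer_interrupts;
    return false;
}
```

In this model a stalled counter normally yields a report, but the same stalled counter between begin_park() and end_park() does not, which is the window where the description says the old, faster NMI threshold can race with the new, slower hrtimer period.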