Date: Mon, 14 Dec 2015 20:30:45 -0700
From: Jeff Merkey
To: linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, uobergfe@redhat.com, dzickus@redhat.com, atomlin@redhat.com, cmetcalf@ezchip.com, fweisbec@gmail.com
Subject: [PATCH] Fix spurious hard lockup events while in debugger
Message-ID: <20151215033045.GA14158@localhost.localdomain>

The current touch_nmi_watchdog() function in kernel/watchdog.c does not
catch every case where a processor is spinning in the NMI handler inside
KGDB, KDB, or MDB. In particular, it misses the case where a processor is
being held by a debugger inside an int1 handler. The
hrtimer_interrupts_saved count can still end up matching the
hrtimer_interrupts value in some cases, so the hard lockup detector flags
processors that are inside a debugger and triggers a panic.

The patch below corrects this problem. I did not add this to
touch_nmi_watchdog() directly because of possible effects on timing,
since that function is widely used by drivers and modules. I have tested
this patch, and it stops the errant hard lockup events for kernel
debuggers when processors are spinning inside the debugger.

Signed-off-by: Jeff Merkey
---
 kernel/watchdog.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 18f34cf..b682aab 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -283,6 +283,13 @@ static bool is_hardlockup(void)
 	__this_cpu_write(hrtimer_interrupts_saved, hrint);
 	return false;
 }
+
+void touch_hardlockup_watchdog(void)
+{
+	__this_cpu_write(hrtimer_interrupts_saved, 0);
+}
+EXPORT_SYMBOL_GPL(touch_hardlockup_watchdog);
+
 #endif
 
 static int is_softlockup(unsigned long touch_ts)
-- 
1.8.3.1