Date: Wed, 21 Jun 2017 09:47:47 -0400
From: Don Zickus
To: "Liang, Kan"
Cc: Andrew Morton, "linux-kernel@vger.kernel.org", "mingo@kernel.org",
    "babu.moger@oracle.com", "atomlin@redhat.com", "prarit@redhat.com",
    "torvalds@linux-foundation.org", "peterz@infradead.org",
    "tglx@linutronix.de", "eranian@google.com", "acme@redhat.com",
    "ak@linux.intel.com", "stable@vger.kernel.org"
Subject: Re: [PATCH] kernel/watchdog: fix spurious hard lockups
Message-ID: <20170621134747.kd6w5rq4zforzaad@redhat.com>
In-Reply-To: <37D7C6CF3E00A74B8858931C1DB2F07753710034@SHSMSX103.ccr.corp.intel.com>

On Wed, Jun 21, 2017 at 12:40:28PM +0000, Liang, Kan wrote:
> > > The right fix for mainline can be found here.
> > > perf/x86/intel: enable CPU ref_cycles for GP counter
> > > perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86
> > > https://patchwork.kernel.org/patch/9779087/
> > > https://patchwork.kernel.org/patch/9779089/
> >
> > Presumably the "right fix" will later be altered to revert this one-line
> > workaround?
>
> The "right fix" itself will not touch the watchdog rate. I will modify the
> changelog to notify the people who want to do the backport.
>
> As my understanding, it's not harmful even if we don't revert the
> workaround. It can still detect the hardlockup, only takes
> a tiny bit longer.

It depends on your perspective of harmful. :-)

There are folks who would like that sampling rate to be more accurate, so
they can detect problems sooner rather than later. You just took an input
of 'watchdog_thresh' and blindly multiplied it by 3, which can confuse an
end user who thought they had set up a 5 second threshold but instead it
turned into a 15 second one. :-(

Cheers,
Don