Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751358AbdGQNNy (ORCPT ); Mon, 17 Jul 2017 09:13:54 -0400
Received: from Galois.linutronix.de ([146.0.238.70]:59555 "EHLO
        Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751281AbdGQNNx (ORCPT );
        Mon, 17 Jul 2017 09:13:53 -0400
Date: Mon, 17 Jul 2017 15:13:35 +0200 (CEST)
From: Thomas Gleixner
To: "Liang, Kan"
cc: Don Zickus, "linux-kernel@vger.kernel.org", "mingo@kernel.org",
        "akpm@linux-foundation.org", "babu.moger@oracle.com",
        "atomlin@redhat.com", "prarit@redhat.com",
        "torvalds@linux-foundation.org", "peterz@infradead.org",
        "eranian@google.com", "acme@redhat.com", "ak@linux.intel.com",
        "stable@vger.kernel.org"
Subject: RE: [PATCH V2] kernel/watchdog: fix spurious hard lockups
In-Reply-To: <37D7C6CF3E00A74B8858931C1DB2F0775371D8AA@SHSMSX103.ccr.corp.intel.com>
Message-ID:
References: <20170621144118.5939-1-kan.liang@intel.com>
        <20170622154450.2lua7fdmigcixldw@redhat.com>
        <20170623162907.l6inpxgztwwkeaoi@redhat.com>
        <20170626201927.3ak7fk3yvdzbb4ay@redhat.com>
        <20170627201249.ll34ecwhpme3vh2u@redhat.com>
        <37D7C6CF3E00A74B8858931C1DB2F0775371D43E@SHSMSX103.ccr.corp.intel.com>
        <37D7C6CF3E00A74B8858931C1DB2F0775371D8AA@SHSMSX103.ccr.corp.intel.com>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1252
Lines: 37

On Mon, 17 Jul 2017, Liang, Kan wrote:
> > That doesn't make sense. What's the exact test procedure?
>
> I don't know the exact test procedure. The test case is from our customer.
> I only know that the test case makes calls into the x11 libs.

Sigh. This starts to be silly. You test something and have no idea what it
does?

> > > According to our test, only patch 3 works well.
> > > The other two patches will hang the system eventually.

Hang the system eventually? Does that mean that the system stops working
and the watchdog does not catch the problem?

> > > BTW: We set 1 to watchdog_thresh when we did the test.
> > > It's believed that can speed up the failure.
> >
> > Believe is not really a technical measure....
> >
>
> 1 is a valid value for watchdog_thresh.
> It was set through the standard proc interface.
> /proc/sys/kernel/watchdog_thresh
> It should not impacts the final test result.

I know that 1 is a valid value and I know how that can be set. Still, it
does not help if you believe that setting the threshold to 1 can speed up
the failure. Either you know it for sure or not. You can believe in god or
whatever, but here we talk about facts.

Please start coming up with facts and proper explanations.

Thanks,

        tglx