Subject: Re: [PATCH] kernel/watchdog: fix spurious hard lockups
To: Prarit Bhargava, Andi Kleen
References: <20170620213309.30051-1-kan.liang@intel.com>
 <4718a252-9515-626e-a69f-565f1c2bc589@redhat.com>
 <20170620230002.GE23705@tassilo.jf.intel.com>
 <9320cd00-88f4-49c5-aaa5-4bb4a80c8813@redhat.com>
Cc: kan.liang@intel.com, linux-kernel@vger.kernel.org, dzickus@redhat.com,
 mingo@kernel.org, akpm@linux-foundation.org, babu.moger@oracle.com,
 atomlin@redhat.com, torvalds@linux-foundation.org, peterz@infradead.org,
 tglx@linutronix.de, eranian@google.com, acme@redhat.com,
 stable@vger.kernel.org
From: Marc Herbert
Message-ID: <6920d83c-19b9-9243-c6aa-0f21288a4006@intel.com>
Date: Wed, 21 Jun 2017 14:02:48 -0700
In-Reply-To: <9320cd00-88f4-49c5-aaa5-4bb4a80c8813@redhat.com>

On 20/06/17 17:12, Prarit Bhargava wrote:

>>> Hmm ... odd that I haven't seen this. We're running a pretty wide
>>> variety of systems here. Do you have a reproducer? I'd like to see
>>> this occur on production HW.

"Production" is where this patch was born and still lives right now:
https://chromium-review.googlesource.com/c/506327/

>> It only happens on a few specific CPU SKUs with a very wide Turbo range.
>
> Which ones?
>

The ones with turbo mode > 2.5 x TSC_MHz when you stress them hard and
long enough. Simple maths: just compare the soft and hard timers in the
code. The factor 3 moves the condition to: turbo > 7.5 x TSC_MHz.
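
To spell out that arithmetic, here is a rough sketch. It assumes (this is
my reading of the watchdog code of that era, not something stated above)
that the soft hrtimer fires every 2 * watchdog_thresh / 5 seconds, and
that the hard NMI counter is programmed with watchdog_thresh seconds'
worth of cycles at TSC frequency, so it expires sooner when the core
runs at turbo. The function names and the default of 10 seconds are
illustrative only.

```python
from fractions import Fraction

# Assumption: default watchdog_thresh of 10 seconds.
WATCHDOG_THRESH_S = 10


def soft_timer_period_s(thresh_s):
    """Assumed hrtimer period: (2 * watchdog_thresh) / 5 seconds."""
    return Fraction(2 * thresh_s, 5)


def hard_timer_period_s(thresh_s, turbo_ratio, factor=1):
    """Assumed NMI period in wall-clock seconds.

    The counter holds factor * thresh_s seconds of cycles at TSC
    frequency, but counts at turbo_ratio * TSC frequency.
    """
    return Fraction(factor * thresh_s) / turbo_ratio


def critical_turbo_ratio(thresh_s, factor=1):
    """Turbo/TSC ratio above which the NMI outpaces the hrtimer."""
    return Fraction(factor * thresh_s) / soft_timer_period_s(thresh_s)


def can_fire_spuriously(thresh_s, turbo_ratio, factor=1):
    """True if two NMIs can arrive with no hrtimer tick in between."""
    return (hard_timer_period_s(thresh_s, turbo_ratio, factor)
            < soft_timer_period_s(thresh_s))


if __name__ == "__main__":
    for factor in (1, 3):
        ratio = critical_turbo_ratio(WATCHDOG_THRESH_S, factor)
        print(f"factor {factor}: spurious if turbo > {float(ratio)} x TSC_MHz")
```

Under these assumptions the threshold does not depend on the value of
watchdog_thresh, only on the 2/5 ratio between the two timers and on the
factor applied to the hard timer.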