From: Shaohua Li
To: lkml
CC: Thomas Gleixner, John Stultz
Subject: [RFC 1/2] time: workaround crappy hpet
Date: Mon, 11 Apr 2016 17:57:56 -0700
Message-ID: <09c4f19409012995595db6fd0a12f326c292af1a.1460422356.git.shli@fb.com>

Calvin found that 'perf record -a --call-graph dwarf -- sleep 5' makes the
clocksource switch to hpet. We found a similar symptom on another machine.
Here is an example:

[8224517.520885] timekeeping watchdog: Marking clocksource 'tsc' as unstable, because the skew is too large:
[8224517.540032] 	'hpet' wd_now: ffffffff wd_last: b39c0bd mask: ffffffff
[8224517.553092] 	'tsc' cs_now: 48ceac7013714e cs_last: 48ceac25be34ac mask: ffffffffffffffff
[8224517.569849] Switched to clocksource hpet

On both machines, wd_now is 0xffffffff. The tsc time looks correct. The cpu
runs at 2.5GHz:

(0x48ceac7013714e - 0x48ceac25be34ac) / 2500000000 = 0.4988s

0.4988s matches WATCHDOG_INTERVAL. Since hpet reads 0xffffffff on both
machines, this does not look like a coincidence; the hpet is crappy.

This patch tries to work around the issue by retrying the read when hpet
returns 0xffffffff. On the affected machine, the hpet counter doesn't read
0xffffffff on the following reads. The chance that a working hpet counter
reads exactly 0xffffffff is very small, so this patch should have no impact
on a good hpet.
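As a cross-check of the arithmetic above, here is a minimal sketch in C that
converts the tsc delta from the log into microseconds. The helper name
tsc_delta_to_us is hypothetical (it is not a kernel function), and the
2.5GHz frequency is the value stated in this message, not something read
from the hardware.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: convert the difference
 * between two TSC readings into microseconds, given the TSC frequency
 * in Hz. Assumes cs_now >= cs_last (no wrap) and a delta small enough
 * that delta * 1000000 fits in 64 bits (about 1.8e13 cycles).
 */
uint64_t tsc_delta_to_us(uint64_t cs_now, uint64_t cs_last, uint64_t tsc_hz)
{
	uint64_t delta = cs_now - cs_last;

	return delta * 1000000ULL / tsc_hz;
}
```

With cs_now = 0x48ceac7013714e, cs_last = 0x48ceac25be34ac and
tsc_hz = 2500000000, the delta is 1247100066 cycles and the helper returns
498840, i.e. about 0.4988s, which is consistent with a watchdog interval of
roughly half a second.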
I'm open to a better solution if there is one.

Reported-by: Calvin Owens
Signed-off-by: Shaohua Li
---
 arch/x86/kernel/hpet.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index a1f0e4a..333b57c 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -763,7 +763,23 @@ static int hpet_cpuhp_notify(struct notifier_block *n,
  */
 static cycle_t read_hpet(struct clocksource *cs)
 {
-	return (cycle_t)hpet_readl(HPET_COUNTER);
+	unsigned int ret;
+	static bool checked;
+	ret = hpet_readl(HPET_COUNTER);
+
+	if (unlikely(ret == 0xffffffff && !checked)) {
+		int i;
+		for (i = 0; i < 20; i++) {
+			ret = hpet_readl(HPET_COUNTER);
+			if (ret != 0xffffffff)
+				break;
+		}
+		if (i == 20) {
+			WARN_ONCE(true, "HPET counter value is abnormal\n");
+			checked = true;
+		}
+	}
+	return (cycle_t)ret;
 }
 
 static struct clocksource clocksource_hpet = {
-- 
2.8.0.rc2
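
For anyone who wants to exercise the retry logic outside the kernel, the
following is a rough user-space sketch of the same idea. It is not the
patch itself: counter_read_fn, retry_state, read_counter_with_retry,
mock_counter and mock_read are all hypothetical names, the read callback
stands in for hpet_readl(HPET_COUNTER), and the WARN_ONCE is left out. The
"checked" flag from the patch becomes caller-owned state so the behaviour
can be tested.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOGUS_VALUE 0xffffffffu	/* suspicious all-ones reading */
#define MAX_RETRIES 20		/* same retry count as the patch */

/* Hypothetical stand-in for hpet_readl(HPET_COUNTER). */
typedef uint32_t (*counter_read_fn)(void *ctx);

/* Plays the role of the patch's "static bool checked". */
struct retry_state {
	bool gave_up;
};

/*
 * Read the counter; if it returns the bogus value, re-read up to
 * MAX_RETRIES times. If every retry is still bogus, remember that
 * so later calls return the raw value without retrying.
 */
uint32_t read_counter_with_retry(counter_read_fn read, void *ctx,
				 struct retry_state *st)
{
	uint32_t val = read(ctx);
	int i;

	if (val != BOGUS_VALUE || st->gave_up)
		return val;

	for (i = 0; i < MAX_RETRIES; i++) {
		val = read(ctx);
		if (val != BOGUS_VALUE)
			return val;
	}

	st->gave_up = true;
	return val;
}

/* Mock counter: returns BOGUS_VALUE a fixed number of times, then good_value. */
struct mock_counter {
	unsigned int bogus_reads_left;
	uint32_t good_value;
	unsigned int calls;
};

uint32_t mock_read(void *ctx)
{
	struct mock_counter *m = ctx;

	m->calls++;
	if (m->bogus_reads_left > 0) {
		m->bogus_reads_left--;
		return BOGUS_VALUE;
	}
	return m->good_value;
}
```

A mock that returns 0xffffffff once and then a sane value models the
behaviour described for the affected machines: the first retry succeeds
and the state is never marked as given up. A mock that always returns
0xffffffff costs 21 reads on the first call and one read on each call
after that.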