From: Ricardo Neri
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
Cc: Ashok Raj, Andi Kleen, Peter Zijlstra, "Ravi V. Shankar", x86@kernel.org, linux-kernel@vger.kernel.org, Ricardo Neri, Ricardo Neri, "H. Peter Anvin", Tony Luck, Clemens Ladisch, Arnd Bergmann, Philippe Ombredanne, Kate Stewart, "Rafael J.
Wysocki", Mimi Zohar, Jan Kiszka, Nick Desaulniers, Masahiro Yamada, Nayna Jain
Subject: [RFC PATCH v2 12/14] x86/watchdog/hardlockup/hpet: Determine if HPET timer caused NMI
Date: Wed, 27 Feb 2019 08:05:16 -0800
Message-Id: <1551283518-18922-13-git-send-email-ricardo.neri-calderon@linux.intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1551283518-18922-1-git-send-email-ricardo.neri-calderon@linux.intel.com>
References: <1551283518-18922-1-git-send-email-ricardo.neri-calderon@linux.intel.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The only direct method to determine whether an HPET timer caused an
interrupt is to read the Interrupt Status register. Unfortunately,
reading HPET registers is slow and, therefore, it is not recommended to
read them while in NMI context. Furthermore, status is not available if
the interrupt is generated via the Front Side Bus.

An indirect method is to compute the expected value of the time-stamp
counter at the time of the interrupt and verify that its actual value is
within a range of the expected value. Since the hardlockup detector
operates in seconds, high precision is not needed. This implementation
considers that the HPET caused the NMI if the time-stamp counter reads
the expected value -/+ 1.5%. This value is selected because it is
approximately equivalent to 1/64 (about 1.56%), so the division can be
performed using bit shifts. Experimentally, the error in the estimation
is consistently less than 1%.

Also, only read the time-stamp counter of the handling CPU (the one
targeted by the HPET timer). This helps to avoid variability of the
time stamp across CPUs.

Cc: "H. Peter Anvin"
Cc: Ashok Raj
Cc: Andi Kleen
Cc: Tony Luck
Cc: Peter Zijlstra
Cc: Clemens Ladisch
Cc: Arnd Bergmann
Cc: Philippe Ombredanne
Cc: Kate Stewart
Cc: "Rafael J. Wysocki"
Cc: Mimi Zohar
Cc: Jan Kiszka
Cc: Nick Desaulniers
Cc: Masahiro Yamada
Cc: Nayna Jain
Cc: "Ravi V.
Shankar"
Cc: x86@kernel.org
Suggested-by: Andi Kleen
Signed-off-by: Ricardo Neri
---
 arch/x86/include/asm/hpet.h         |  2 ++
 arch/x86/kernel/watchdog_hld_hpet.c | 28 +++++++++++++++++++++++++---
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 15dc3b576496..09763340c911 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -123,6 +123,8 @@ struct hpet_hld_data {
 	u32 num;
 	u32 flags;
 	u64 ticks_per_second;
+	u64 tsc_next;
+	u64 tsc_next_error;
 	u32 handling_cpu;
 	struct cpumask cpu_monitored_mask;
 	struct msi_msg msi_msg;
diff --git a/arch/x86/kernel/watchdog_hld_hpet.c b/arch/x86/kernel/watchdog_hld_hpet.c
index cfa284da4bf6..65b4699f249a 100644
--- a/arch/x86/kernel/watchdog_hld_hpet.c
+++ b/arch/x86/kernel/watchdog_hld_hpet.c
@@ -55,6 +55,11 @@ static inline void set_comparator(struct hpet_hld_data *hdata,
  *
  * Reprogram the timer to expire within watchdog_thresh seconds in the future.
  *
+ * Also compute the expected value of the time-stamp counter at the time of
+ * expiration as well as a deviation from the expected value. The maximum
+ * deviation is of ~1.5%. This deviation can be easily computed by shifting
+ * by 6 positions the delta between the current and expected time-stamp values.
+ *
  * Returns:
  *
  * None
@@ -62,7 +67,18 @@ static inline void set_comparator(struct hpet_hld_data *hdata,
 static void kick_timer(struct hpet_hld_data *hdata, bool force)
 {
 	bool kick_needed = force || !(hdata->flags & HPET_DEV_PERI_CAP);
-	unsigned long new_compare, count;
+	unsigned long tsc_curr, tsc_delta, new_compare, count;
+
+	/* Start obtaining the current TSC and HPET counts. */
+	tsc_curr = rdtsc();
+
+	if (kick_needed)
+		count = get_count();
+
+	tsc_delta = (unsigned long)watchdog_thresh * (unsigned long)tsc_khz
+		    * 1000L;
+	hdata->tsc_next = tsc_curr + tsc_delta;
+	hdata->tsc_next_error = tsc_delta >> 6;
 
 	/*
 	 * Update the comparator in increments of watch_thresh seconds relative
@@ -74,8 +90,6 @@ static void kick_timer(struct hpet_hld_data *hdata, bool force)
 	 */
 
 	if (kick_needed) {
-		count = get_count();
-
 		new_compare = count + watchdog_thresh * hdata->ticks_per_second;
 
 		set_comparator(hdata, new_compare);
@@ -147,6 +161,14 @@ static void set_periodic(struct hpet_hld_data *hdata)
  */
 static bool is_hpet_wdt_interrupt(struct hpet_hld_data *hdata)
 {
+	if (smp_processor_id() == hdata->handling_cpu) {
+		unsigned long tsc_curr;
+
+		tsc_curr = rdtsc();
+		if (abs(tsc_curr - hdata->tsc_next) < hdata->tsc_next_error)
+			return true;
+	}
+
 	return false;
 }
 
-- 
2.17.1
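
For readers who want to check the arithmetic outside the kernel, the
window computation described in the changelog can be sketched in plain
userspace C. This is only an illustration, not the patch code: struct
tsc_window, tsc_window_arm() and tsc_window_hit() are hypothetical names,
the TSC value is passed in as a parameter instead of being read with
rdtsc(), and the absolute difference is computed by hand because the
kernel's abs() macro is not available in userspace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the tsc_next/tsc_next_error fields in the patch. */
struct tsc_window {
	uint64_t next;		/* expected TSC value at timer expiration */
	uint64_t next_error;	/* accepted deviation around next */
};

/*
 * Compute the expected TSC value thresh_secs from now and the accepted
 * deviation. tsc_khz is the TSC frequency in kHz, so the delta in TSC
 * cycles is thresh_secs * tsc_khz * 1000.
 */
static void tsc_window_arm(struct tsc_window *w, uint64_t tsc_now,
			   uint64_t thresh_secs, uint64_t tsc_khz)
{
	uint64_t delta = thresh_secs * tsc_khz * 1000ULL;

	w->next = tsc_now + delta;
	w->next_error = delta >> 6;	/* delta / 64, roughly 1.56% */
}

/* Return true if tsc_now is strictly within next -/+ next_error. */
static bool tsc_window_hit(const struct tsc_window *w, uint64_t tsc_now)
{
	uint64_t diff = tsc_now > w->next ? tsc_now - w->next
					  : w->next - tsc_now;

	return diff < w->next_error;
}
```

As an example under assumed values, a 10 second threshold on a 2 GHz TSC
(tsc_khz = 2000000) gives a delta of 20,000,000,000 cycles and an accepted
deviation of 312,500,000 cycles, that is, about 156 ms on either side of
the expected expiration.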