From: Song Liu
To: linux-kernel@vger.kernel.org
Cc: kernel-team@meta.com, Song Liu, Andrew Morton, Peter Zijlstra
Subject: [PATCH v5 2/2] watchdog: Allow nmi watchdog to use raw perf event
Date: Mon, 29 Apr 2024 23:02:36 -0700
Message-ID: <20240430060236.1878002-2-song@kernel.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240430060236.1878002-1-song@kernel.org>
References: <20240430060236.1878002-1-song@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

The NMI watchdog permanently consumes one hardware counter per CPU on the
system. For systems that use many hardware counters, this causes more
aggressive time multiplexing of perf events.

OTOH, some CPUs (mostly Intel) support the "ref-cycles" event, which is
rarely used. Add kernel cmdline arg nmi_watchdog=rNNN to configure the
watchdog to use a raw event. For example, on Intel CPUs, we can use "r300"
to configure the watchdog to use the ref-cycles event.

If the raw event does not work, fall back to "cycles".

Cc: Andrew Morton
Cc: Peter Zijlstra
Signed-off-by: Song Liu
---
Changes in v5:
Change the design so that we can configure the watchdog with any raw
event. Add a fallback mechanism that uses "cycles" if the raw event
doesn't work.

v4: https://lore.kernel.org/lkml/20230518002555.1114189-1-song@kernel.org/

Changes in v4:
Fix compile error for !CONFIG_HARDLOCKUP_DETECTOR_PERF.
(kernel test bot)

Changes in v3:
Pivot the design to use kernel arg nmi_watchdog=ref-cycles (Peter)
---
 .../admin-guide/kernel-parameters.txt         |  5 ++-
 include/linux/nmi.h                           |  2 +
 kernel/watchdog.c                             |  2 +
 kernel/watchdog_perf.c                        | 44 +++++++++++++++++++
 4 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 213d0719e2b7..7445738f45b3 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3776,10 +3776,12 @@
 			Format: [state][,regs][,debounce][,die]
 
 	nmi_watchdog=	[KNL,BUGS=X86] Debugging features for SMP kernels
-			Format: [panic,][nopanic,][num]
+			Format: [panic,][nopanic,][rNNN,][num]
 			Valid num: 0 or 1
 			0 - turn hardlockup detector in nmi_watchdog off
 			1 - turn hardlockup detector in nmi_watchdog on
+			rNNN - configure the watchdog with raw perf event 0xNNN
+
 			When panic is specified, panic when an NMI watchdog
 			timeout occurs (or 'nopanic' to not panic on an NMI
 			watchdog, if CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is set)
@@ -7467,4 +7469,3 @@
 				memory, and other data can't be written using
 				xmon commands.
 			off	xmon is disabled.
-
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index f53438eae815..a8dfb38c9bb6 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -105,10 +105,12 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs);
 extern void hardlockup_detector_perf_stop(void);
 extern void hardlockup_detector_perf_restart(void);
 extern void hardlockup_detector_perf_cleanup(void);
+extern void hardlockup_config_perf_event(const char *str);
 #else
 static inline void hardlockup_detector_perf_stop(void) { }
 static inline void hardlockup_detector_perf_restart(void) { }
 static inline void hardlockup_detector_perf_cleanup(void) { }
+static inline void hardlockup_config_perf_event(const char *str) { }
 #endif
 
 void watchdog_hardlockup_stop(void);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 7f54484de16f..ab0129b15f25 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -80,6 +80,8 @@ static int __init hardlockup_panic_setup(char *str)
 		watchdog_hardlockup_user_enabled = 0;
 	else if (!strncmp(str, "1", 1))
 		watchdog_hardlockup_user_enabled = 1;
+	else if (!strncmp(str, "r", 1))
+		hardlockup_config_perf_event(str + 1);
 	while (*(str++)) {
 		if (*str == ',') {
 			str++;
diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c
index 8ea00c4a24b2..fff032b47c55 100644
--- a/kernel/watchdog_perf.c
+++ b/kernel/watchdog_perf.c
@@ -90,6 +90,14 @@ static struct perf_event_attr wd_hw_attr = {
 	.disabled	= 1,
 };
 
+static struct perf_event_attr fallback_wd_hw_attr = {
+	.type		= PERF_TYPE_HARDWARE,
+	.config		= PERF_COUNT_HW_CPU_CYCLES,
+	.size		= sizeof(struct perf_event_attr),
+	.pinned		= 1,
+	.disabled	= 1,
+};
+
 /* Callback function for perf event subsystem */
 static void watchdog_overflow_callback(struct perf_event *event,
 				       struct perf_sample_data *data,
@@ -122,6 +130,13 @@ static int hardlockup_detector_event_create(void)
 	/* Try to register using hardware perf events */
 	evt = perf_event_create_kernel_counter(wd_attr, cpu, NULL,
 					       watchdog_overflow_callback, NULL);
+	if (IS_ERR(evt)) {
+		wd_attr = &fallback_wd_hw_attr;
+		wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
+		evt = perf_event_create_kernel_counter(wd_attr, cpu, NULL,
+						       watchdog_overflow_callback, NULL);
+	}
+
 	if (IS_ERR(evt)) {
 		pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
 			 PTR_ERR(evt));
@@ -259,3 +274,32 @@ int __init watchdog_hardlockup_probe(void)
 	}
 	return ret;
 }
+
+/**
+ * hardlockup_config_perf_event - Overwrite config of wd_hw_attr.
+ */
+void __init hardlockup_config_perf_event(const char *str)
+{
+	u64 config;
+	char buf[24];
+	char *comma = strchr(str, ',');
+
+	if (!comma) {
+		if (kstrtoull(str, 16, &config))
+			return;
+	} else {
+		unsigned int len = comma - str;
+
+		if (len >= sizeof(buf))
+			return;
+
+		if (strscpy(buf, str, sizeof(buf)) < 0)
+			return;
+		buf[len] = 0;
+		if (kstrtoull(buf, 16, &config))
+			return;
+	}
+
+	wd_hw_attr.type = PERF_TYPE_RAW;
+	wd_hw_attr.config = config;
+}
-- 
2.43.0
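
For anyone who wants to experiment with the rNNN parsing outside the kernel, here is a rough user-space sketch of the same idea. It is not part of the patch. parse_raw_event() is a hypothetical name, and strtoull() stands in for kstrtoull(), so corner cases (a leading sign, a trailing newline, very long inputs without a comma) may not match what the kernel accepts. The input is the text that follows the leading 'r', up to the first comma or the end of the string.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical user-space approximation of the rNNN parsing.
 * Returns 0 and stores the parsed hex value in *config on success,
 * or -1 if the token is empty, too long, or not valid hex.
 */
static int parse_raw_event(const char *str, uint64_t *config)
{
	char buf[24];
	const char *comma = strchr(str, ',');
	/* Only the token before the first comma (if any) is parsed. */
	size_t len = comma ? (size_t)(comma - str) : strlen(str);
	unsigned long long val;
	char *end;

	if (len == 0 || len >= sizeof(buf))
		return -1;

	memcpy(buf, str, len);
	buf[len] = '\0';

	/* strtoull() would skip whitespace and accept a sign; reject both. */
	if (!isxdigit((unsigned char)buf[0]))
		return -1;

	errno = 0;
	val = strtoull(buf, &end, 16);
	if (errno || end == buf || *end != '\0')
		return -1;

	*config = val;
	return 0;
}
```

With this sketch, parse_raw_event("300,panic", &cfg) returns 0 and sets cfg to 0x300, while parse_raw_event("xyz", &cfg) returns -1 and leaves cfg untouched, which mirrors the patch leaving wd_hw_attr unchanged when the number does not parse.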