Received: by 2002:a05:6500:1b41:b0:1fb:d597:ff75 with SMTP id cz1csp407193lqb; Tue, 4 Jun 2024 15:25:27 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCVXLGTTvWdPmvAX1N5bD00U8xaGs05syhx9tvBu9bjNVZkMr1A5FSlYBGki8QnsWAfGaIENABvqiYnDsLYCEYW5Eh3XYhH/Q6Z8JGnLaA== X-Google-Smtp-Source: AGHT+IEjNBD6x21KO4V17KPOKVHX5IuB4Y66xc60Jiup026tnS2my5XjT+2AqszxRDNvw6yHeYyB X-Received: by 2002:a05:6358:590b:b0:19c:2c64:9606 with SMTP id e5c5f4694b2df-19c6c688713mr99247655d.13.1717539926851; Tue, 04 Jun 2024 15:25:26 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1717539926; cv=pass; d=google.com; s=arc-20160816; b=pj9WxLgK/h5ECuBy1WFTAwQddfRZEL61M335iTM1mdSPyW7ZQqo47h1BL44dtVsxuG 23UxbqfwvRdH1RswlYxe+m2fM3Xwp06qML90kZ0b/Tvwprwt+Y8/1ZzmB3v66G54vEwC 8TDR9G4l1SIyAecka382ZdrCWbRsaaQM2Lxr/A6CN/icufesBh7YSIpQ5EJ7PULl4zzC F+6RNMMGpyB22y1v4iRAMtzs0t5aQNvbEAHJyvIMa7UGZHzF5obXr2GcHj3H62N0clOm dR44VF3Jk+qc+sPcdsj/qNS7GAAnZ/MKwof8ZLTbwN7fx8tyxiW6cETck7IEsdgvqDy8 VP4g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=resENzOMi1irHR99SrDGMpjOj9BoIH/0QBNjUcsrRJs=; fh=7X8x8G5iKcevRm7Fw6HPzFN8bAHsR0JXqCFY/f0LEIA=; b=0RgNhdtSBli6PNZHVHpF7iq704fJaqLjUoBHihrTzCaYkTCILvvnkmR6pJ2rsjN850 sJQOtNO5ta5O5uZe7EqQbplv84tr+B0nsf+lpMv1Yxo3mxmKDQJOj8rHWE8b7tJQ5l+b 9jdEwiwjOXv3hSOZxkKyJd0D4SFX1VJq/L0H8cOAETrRpbeax6w/f6bJv86nyba2boN6 lXx2lgdzsUBrQq7lPetvPELZefy6LarL7CGNFW78AxgE2yWC9N/TUo5FwmQzIERgUIP1 Ny1zxdcQJxV2WR7N+1juNUh/iZn9/PmlDiNDg1QOpsUv8Vq7d083dKQdouGKgjtTxcKV +/Ug==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b="X7EL/eN1"; arc=pass (i=1 dkim=pass dkdomain=kernel.org); spf=pass (google.com: domain of linux-kernel+bounces-201466-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-201466-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [139.178.88.99]) by mx.google.com with ESMTPS id 41be03b00d2f7-6c35b222bddsi8878891a12.476.2024.06.04.15.25.26 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Jun 2024 15:25:26 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-201466-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) client-ip=139.178.88.99; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b="X7EL/eN1"; arc=pass (i=1 dkim=pass dkdomain=kernel.org); spf=pass (google.com: domain of linux-kernel+bounces-201466-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-201466-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id 3DD0628717E for ; Tue, 4 Jun 2024 22:25:26 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 2F79514EC6D; Tue, 4 Jun 2024 22:24:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="X7EL/eN1" Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5B46B14C5BA; Tue, 4 Jun 2024 22:23:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539837; cv=none; b=QscyFfmwZiliLB1ukdcFYq945D6b7xv6CSVm2ztytL+A7N8gb5kAXo3aZ6tF7UwPqBvE9P6SXdyL2yuaMPAT6ZuRmsURfEPO2FJ8QbUttGZNnLdB/fv3OJPbOKXg3/9qTrmWfbaHfi2hipruYQ4wC998/OL9DDASIUZ6eCDANE8= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717539837; c=relaxed/simple; bh=Mr0jgnXrW9nLakafqNopRYzxiUW6MmdPijZgfPRpq7Y=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=CmsjppLI5Mq0SQu52mleQH+AuFCpqmvKHJ0sFYewJezFrmYP+3KnW0Nw4p7lTEIMJvm3Sx2g3Xa5zokrp9TEwMaP4qEOhXBXq166hDmXfjS1YMmIMT+CNS01tWI0/1w0Qgm+Opxw/OvtQvzbM1PHb/BCk+Z3qTUkvssTc5J7o+g= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=X7EL/eN1; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 16C78C4AF0F; Tue, 4 Jun 2024 22:23:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1717539837; bh=Mr0jgnXrW9nLakafqNopRYzxiUW6MmdPijZgfPRpq7Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=X7EL/eN1SHfNEkXECubtVnDDCJtiDSOesBLMbda4eaJenNyPOBOZW+dZ/RsThju5i lmHgj0eZddck8K7R13yuNIfmjLrbD7+geOzHmyQdZFXpMD2k4zPkALiG3bPBsDUb5N zJYlTCELaZTimBbStA8G9cznf/a9jllb9gwDfkwIX4opsRsVuAxKBLMLHtRt9vQ4EB rrMPVBqT23skMVg+9fBlO+AfrBmdIh31tDvgCymnOE7VTXirr9cqLqOxFHQ5WURalQ XNukyjTTkXivfjLnV9+4UkxycIl9rBTD15FGv5AZIsN3rRNMP5IfMminSqSzUJHRld OqMZH0CWowTeg== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 67235CE3F30; Tue, 4 Jun 2024 15:23:56 -0700 (PDT) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" , Leonardo Bras , Sean Christopherson Subject: [PATCH rcu 6/9] rcu: Add rcutree.nocb_patience_delay to reduce nohz_full OS jitter Date: Tue, 4 Jun 2024 15:23:52 -0700 Message-Id: <20240604222355.2370768-6-paulmck@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> References: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit If a CPU is running either a userspace application or a guest OS in nohz_full mode, it is possible for a system call to occur just as an RCU grace period is starting. If that CPU also has the scheduling-clock tick enabled for any reason (such as a second runnable task), and if the system was booted with rcutree.use_softirq=0, then RCU can add insult to injury by awakening that CPU's rcuc kthread, resulting in yet another task and yet more OS jitter due to switching to that task, running it, and switching back. In addition, in the common case where that system call is not of excessively long duration, awakening the rcuc task is pointless. This pointlessness is due to the fact that the CPU will enter an extended quiescent state upon returning to the userspace application or guest OS. In this case, the rcuc kthread cannot do anything that the main RCU grace-period kthread cannot do on its behalf, at least if it is given a few additional milliseconds (for example, given the time duration specified by rcutree.jiffies_till_first_fqs, give or take scheduling delays). This commit therefore adds a rcutree.nocb_patience_delay kernel boot parameter that specifies the grace period age (in milliseconds) before which RCU will refrain from awakening the rcuc kthread. Preliminary experiementation suggests a value of 1000, that is, one second. Increasing rcutree.nocb_patience_delay will increase grace-period latency and in turn increase memory footprint, so systems with constrained memory might choose a smaller value. Systems with less-aggressive OS-jitter requirements might choose the default value of zero, which keeps the traditional immediate-wakeup behavior, thus avoiding increases in grace-period latency. [ paulmck: Apply Leonardo Bras feedback. ] Link: https://lore.kernel.org/all/20240328171949.743211-1-leobras@redhat.com/ Reported-by: Leonardo Bras Suggested-by: Leonardo Bras Suggested-by: Sean Christopherson Signed-off-by: Paul E. McKenney Reviewed-by: Leonardo Bras --- Documentation/admin-guide/kernel-parameters.txt | 8 ++++++++ kernel/rcu/tree.c | 10 ++++++++-- kernel/rcu/tree_plugin.h | 10 ++++++++++ 3 files changed, 26 insertions(+), 2 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 500cfa7762257..2d4a512cf1fc6 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5018,6 +5018,14 @@ the ->nocb_bypass queue. The definition of "too many" is supplied by this kernel boot parameter. + rcutree.nocb_patience_delay= [KNL] + On callback-offloaded (rcu_nocbs) CPUs, avoid + disturbing RCU unless the grace period has + reached the specified age in milliseconds. + Defaults to zero. Large values will be capped + at five seconds. All values will be rounded down + to the nearest value representable by jiffies. + rcutree.qhimark= [KNL] Set threshold of queued RCU callbacks beyond which batch limiting is disabled. diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 35bf4a3736765..408b020c9501f 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -176,6 +176,9 @@ static int gp_init_delay; module_param(gp_init_delay, int, 0444); static int gp_cleanup_delay; module_param(gp_cleanup_delay, int, 0444); +static int nocb_patience_delay; +module_param(nocb_patience_delay, int, 0444); +static int nocb_patience_delay_jiffies; // Add delay to rcu_read_unlock() for strict grace periods. static int rcu_unlock_delay; @@ -4344,11 +4347,14 @@ static int rcu_pending(int user) return 1; /* Is this a nohz_full CPU in userspace or idle? (Ignore RCU if so.) */ - if ((user || rcu_is_cpu_rrupt_from_idle()) && rcu_nohz_full_cpu()) + gp_in_progress = rcu_gp_in_progress(); + if ((user || rcu_is_cpu_rrupt_from_idle() || + (gp_in_progress && + time_before(jiffies, READ_ONCE(rcu_state.gp_start) + nocb_patience_delay_jiffies))) && + rcu_nohz_full_cpu()) return 0; /* Is the RCU core waiting for a quiescent state from this CPU? */ - gp_in_progress = rcu_gp_in_progress(); if (rdp->core_needs_qs && !rdp->cpu_no_qs.b.norm && gp_in_progress) return 1; diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 340bbefe5f652..31c539f09c150 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -93,6 +93,16 @@ static void __init rcu_bootup_announce_oddness(void) pr_info("\tRCU debug GP init slowdown %d jiffies.\n", gp_init_delay); if (gp_cleanup_delay) pr_info("\tRCU debug GP cleanup slowdown %d jiffies.\n", gp_cleanup_delay); + if (nocb_patience_delay < 0) { + pr_info("\tRCU NOCB CPU patience negative (%d), resetting to zero.\n", nocb_patience_delay); + nocb_patience_delay = 0; + } else if (nocb_patience_delay > 5 * MSEC_PER_SEC) { + pr_info("\tRCU NOCB CPU patience too large (%d), resetting to %ld.\n", nocb_patience_delay, 5 * MSEC_PER_SEC); + nocb_patience_delay = 5 * MSEC_PER_SEC; + } else if (nocb_patience_delay) { + pr_info("\tRCU NOCB CPU patience set to %d milliseconds.\n", nocb_patience_delay); + } + nocb_patience_delay_jiffies = msecs_to_jiffies(nocb_patience_delay); if (!use_softirq) pr_info("\tRCU_SOFTIRQ processing moved to rcuc kthreads.\n"); if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG)) -- 2.40.1