Date: Thu, 10 Dec 2020 10:37:37 -0800
From: "Paul E. McKenney" <paulmck@kernel.org>
McKenney" To: joel@joelfernandes.org Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Energy-efficiency options within RCU Message-ID: <20201210183737.GA12900@paulmck-ThinkPad-P72> Reply-To: paulmck@kernel.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.9.4 (2018-02-28) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, Joel, In case you are -seriously- interested... ;-) Thanx, Paul rcu_nocbs= Adding a CPU to this list offloads RCU callback invocation from that CPU's softirq handler to a kthread. In big.LITTLE systems, this kthread can be placed on a LITTLE CPU, which has been demonstrated to save significant energy in benchmarks. http://www.rdrop.com/users/paulmck/realtime/paper/AMPenergy.2013.04.19a.pdf nohz_full= Any CPU specified by this boot parameter is handled as if it was specified by rcu_nocbs=. rcutree.jiffies_till_first_fqs= Increasing this will decrease wakeup frequency to the grace-period kthread for the first FQS scan. And increase grace-period latency. rcutree.jiffies_till_next_fqs= Ditto, but for the second and subsequent FQS scans. My guess is that neither of these makes much difference. But if they do, maybe some sort of backoff scheme for FQS scans? rcutree.jiffies_till_sched_qs= Increasing this will delay RCU's getting excited about CPUs and tasks not responding with quiescent states. This excitement can cause extra overhead. No idea whether adjusting this would help. But if you increase rcutree.jiffies_till_first_fqs or rcutree.jiffies_till_next_fqs, you might need to increase this one accordingly. rcutree.qovld= Increasing this will increase the grace-period duration at which RCU starts sending IPIs, thus perhaps reducing the total number of IPIs that RCU sends. The destination CPUs are unlikely to be idle, so it is not clear to me that this would help much. But perhaps I am wrong about them being mostly non-idle, who knows? rcupdate.rcu_cpu_stall_timeout= If you get overly zealous about the earlier kernel boot parameters, you might need to increase this one as well. Or instead use the rcupdate.rcu_cpu_stall_suppress= kernel boot parameter to suppress RCU CPU stall warnings entirely. rcutree.rcu_nocb_gp_stride= Increasing this might reduce grace-period work somewhat. I don't see why a (say) 16-CPU system really needs to have more than one rcuog kthread, so if this does help it might be worthwhile setting a lower limit to this kernel parameter. rcutree.rcu_idle_gp_delay= (Only CONFIG_RCU_FAST_NO_HZ=y kernels.) This defaults to four jiffies on the theory that grace periods tend to last about that long. If grace periods tend to take longer, then it makes a lot of sense to increase this. And maybe battery-powered devices would rather have it be about 2x or 3x the expected grace-period duration, who knows? I would keep it to a power of two, but the code should work with other numbers. Except that I don't know that this has ever been tested. ;-) srcutree.exp_holdoff= Increasing this decreases the number of SRCU grace periods that are treated as expedited. But you have to have closely-spaced SRCU grace periods for this to matter. (These do happen at least sometimes because I added this only because someone complained about the performance regression from the earlier non-tree SRCU.) rcupdate.rcu_task_ipi_delay= This kernel parameter delays sending IPIs for RCU Tasks Trace, which is used by sleepable BPF programs. 
CONFIG_RCU_FAST_NO_HZ=y

	This only has effect on CPUs not specified by rcu_nocbs=, and
	thus might be useful on systems that offload RCU callbacks on
	only some of the CPUs.  For example, a big.LITTLE system might
	offload only the big CPUs.  This Kconfig option reduces the
	frequency of timer interrupts (and thus of RCU-related softirq
	processing) on idle CPUs, which has been shown to save
	significant energy in benchmarks:

	http://www.rdrop.com/users/paulmck/realtime/paper/AMPenergy.2013.04.19a.pdf

CONFIG_RCU_STRICT_GRACE_PERIOD=y

	This works hard (as in burns CPU) to sharply reduce grace-period
	latency.  The effect is probably to greatly increase power
	consumption, but there might well be workloads where the shorter
	grace periods more than make up for the extra CPU time.  Or not.

CONFIG_HZ=

	Reducing the scheduler-clock interrupt frequency has the
	opposite effect, namely increasing RCU grace-period latency
	while also reducing RCU's CPU utilization.

CONFIG_TASKS_TRACE_RCU_READ_MB=y

	This reduces the need to IPI RCU Tasks Trace holdout tasks, but
	at the expense of increased to/from-idle overhead.  This Kconfig
	option also slows down the rate at which RCU Tasks Trace polls
	for holdout tasks.  This polling rate cannot be separately
	specified, but if changing the initial source-code values of
	either rcu_tasks_trace.gp_sleep or rcu_tasks_trace.init_fract
	proves useful, kernel boot parameters could be created.  That
	said, automatic initialization heuristics are more convenient.
	When they work, anyway.
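On the Kconfig side, a hypothetical .config fragment for a
battery-powered big.LITTLE device, following the trade-offs sketched
above, might read as follows.  Kconfig dependencies (NO_HZ_COMMON and
friends) are omitted, and whether each choice actually saves energy on
a given workload is again a matter for measurement:

	# Reduce timer interrupts on idle non-offloaded (LITTLE) CPUs.
	CONFIG_RCU_FAST_NO_HZ=y
	# Lower scheduler-clock frequency: longer grace periods, but
	# less RCU CPU utilization.
	CONFIG_HZ_250=y
	CONFIG_HZ=250
	# Burns CPU to shorten grace periods, so leave it off here.
	# CONFIG_RCU_STRICT_GRACE_PERIOD is not set
	# Fewer IPIs but costlier idle transitions; measure first.
	# CONFIG_TASKS_TRACE_RCU_READ_MB is not set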