2020-07-06 14:31:20

by Qais Yousef

Subject: [PATCH v6 0/2] sched/uclamp: new sysctl for default RT boost value

This series introduces a new sysctl_sched_uclamp_util_min_rt_default to control
at runtime the default boost value of RT tasks.

The full rationale is in the commit message of patch 1.

v6 moves away from the lazy update approach taken in v5 [1] and earlier, which
touched the fast path, to a synchronous update that is performed when the
procfs entry is written.

for_each_process_thread() is now used to update all existing RT tasks. To
handle the race with a concurrent fork(), we introduce sched_post_fork() in
_do_fork() to ensure a concurrently forked RT task gets the right update.

To ensure the race condition is handled correctly, I wrote this small (simple!)
test program:

https://github.com/qais-yousef/uclamp_test.git

I ran it on a 4-core x86 system and an 8-core big.LITTLE juno-r2 system.

From the juno-r2 runs, 10 iterations per run:

Without sched_post_fork()

# ./run.sh
pid 3105 has 336 but default should be 337
pid 13162 has 336 but default should be 337
pid 23256 has 338 but default should be 339
All forked RT tasks had the correct uclamp.min
pid 10638 has 334 but default should be 335
All forked RT tasks had the correct uclamp.min
pid 30683 has 335 but default should be 336
pid 8247 has 336 but default should be 337
pid 18170 has 1024 but default should be 334
pid 28274 has 336 but default should be 337

With sched_post_fork()

# ./run.sh
All forked RT tasks had the correct uclamp.min
All forked RT tasks had the correct uclamp.min
All forked RT tasks had the correct uclamp.min
All forked RT tasks had the correct uclamp.min
All forked RT tasks had the correct uclamp.min
All forked RT tasks had the correct uclamp.min
All forked RT tasks had the correct uclamp.min
All forked RT tasks had the correct uclamp.min
All forked RT tasks had the correct uclamp.min
All forked RT tasks had the correct uclamp.min

Thanks

--
Qais Yousef

[1] https://lore.kernel.org/lkml/[email protected]/

CC: Jonathan Corbet <[email protected]>
CC: Juri Lelli <[email protected]>
CC: Vincent Guittot <[email protected]>
CC: Dietmar Eggemann <[email protected]>
CC: Steven Rostedt <[email protected]>
CC: Ben Segall <[email protected]>
CC: Mel Gorman <[email protected]>
CC: Luis Chamberlain <[email protected]>
CC: Kees Cook <[email protected]>
CC: Iurii Zaikin <[email protected]>
CC: Quentin Perret <[email protected]>
CC: Valentin Schneider <[email protected]>
CC: Patrick Bellasi <[email protected]>
CC: Pavan Kondeti <[email protected]>
CC: [email protected]
CC: [email protected]
CC: [email protected]

Qais Yousef (2):
  sched/uclamp: Add a new sysctl to control RT default boost value
  Documentation/sysctl: Document uclamp sysctl knobs

 Documentation/admin-guide/sysctl/kernel.rst |  54 ++++++++
 include/linux/sched/sysctl.h                |   1 +
 include/linux/sched/task.h                  |   1 +
 kernel/fork.c                               |   1 +
 kernel/sched/core.c                         | 132 ++++++++++++++++++--
 kernel/sysctl.c                             |   7 ++
 6 files changed, 189 insertions(+), 7 deletions(-)

--
2.17.1


2020-07-06 14:31:35

by Qais Yousef

Subject: [PATCH v6 2/2] Documentation/sysctl: Document uclamp sysctl knobs

Uclamp exposes 3 sysctl knobs:

* sched_util_clamp_min
* sched_util_clamp_max
* sched_util_clamp_min_rt_default

Document them in sysctl/kernel.rst.

Signed-off-by: Qais Yousef <[email protected]>
CC: Jonathan Corbet <[email protected]>
CC: Juri Lelli <[email protected]>
CC: Vincent Guittot <[email protected]>
CC: Dietmar Eggemann <[email protected]>
CC: Steven Rostedt <[email protected]>
CC: Ben Segall <[email protected]>
CC: Mel Gorman <[email protected]>
CC: Luis Chamberlain <[email protected]>
CC: Kees Cook <[email protected]>
CC: Iurii Zaikin <[email protected]>
CC: Quentin Perret <[email protected]>
CC: Valentin Schneider <[email protected]>
CC: Patrick Bellasi <[email protected]>
CC: Pavan Kondeti <[email protected]>
CC: [email protected]
CC: [email protected]
CC: [email protected]
---
Documentation/admin-guide/sysctl/kernel.rst | 54 +++++++++++++++++++++
1 file changed, 54 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 83acf5025488..55bf6b4de4ec 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -1062,6 +1062,60 @@ Enables/disables scheduler statistics. Enabling this feature
incurs a small amount of overhead in the scheduler but is
useful for debugging and performance tuning.

+sched_util_clamp_min:
+=====================
+
+Max allowed *minimum* utilization.
+
+Default value is 1024, which is the maximum possible value.
+
+It means that any requested uclamp.min value cannot be greater than
+sched_util_clamp_min, i.e., it is restricted to the range
+[0:sched_util_clamp_min].
+
+sched_util_clamp_max:
+=====================
+
+Max allowed *maximum* utilization.
+
+Default value is 1024, which is the maximum possible value.
+
+It means that any requested uclamp.max value cannot be greater than
+sched_util_clamp_max, i.e., it is restricted to the range
+[0:sched_util_clamp_max].
+
+sched_util_clamp_min_rt_default:
+================================
+
+By default Linux is tuned for performance, which means that RT tasks always run
+at the highest frequency and on the most capable (highest capacity) CPU (in
+heterogeneous systems).
+
+Uclamp achieves this by setting the requested uclamp.min of all RT tasks to
+1024 by default, which effectively boosts the tasks to run at the highest
+frequency and biases them to run on the biggest CPU.
+
+This knob allows admins to change the default behavior when uclamp is being
+used. In battery powered devices particularly, running at the maximum
+capacity and frequency will increase energy consumption and shorten the battery
+life.
+
+This knob is only effective for RT tasks whose requested uclamp.min value the
+user hasn't modified via the sched_setattr() syscall.
+
+This knob will not escape the range constraint imposed by sched_util_clamp_min
+defined above.
+
+For example if
+
+ sched_util_clamp_min_rt_default = 800
+ sched_util_clamp_min = 600
+
+Then the boost will be clamped to 600 because 800 is outside of the permissible
+range of [0:600]. This could happen, for instance, if a powersave mode
+temporarily restricts all boosts by modifying sched_util_clamp_min. As soon as
+this restriction is lifted, the requested sched_util_clamp_min_rt_default
+will take effect.

seccomp
=======
--
2.17.1