2023-10-02 14:23:12

by Cyril Hrubis

[permalink] [raw]
Subject: [PATCH v3 0/3] Fix sched-rt sysctl files & update docs

New in v3:
- Spelling fixes as requested by Ingo
- added third patch that replaces all cases of realtime with real-time
in the docs

Cyril Hrubis (3):
sched/rt: Disallow writing invalid values to sched_rt_period_us
docs: scheduler-rt: Clarify & fix sched_rt_* sysctl docs
docs: scheduler-rt: Use real-time instead of realtime

Documentation/scheduler/sched-rt-group.rst | 40 ++++++++++++----------
kernel/sched/rt.c | 9 ++---
2 files changed, 26 insertions(+), 23 deletions(-)

--
2.41.0


2023-10-02 14:26:46

by Cyril Hrubis

[permalink] [raw]
Subject: [PATCH v3 2/3] docs: scheduler-rt: Clarify & fix sched_rt_* sysctl docs

- Describe explicitly that sched_rt_runtime_us is allocated from
sched_rt_period_us and hence always less or equal to that value.

- The limit for sched_rt_runtime_us is not INT_MAX-1, but rather it's
limited by the value of sched_rt_period_us. If sched_rt_period_us is
INT_MAX then sched_rt_runtime_us can be set to INT_MAX as well.

Signed-off-by: Cyril Hrubis <[email protected]>
---
Documentation/scheduler/sched-rt-group.rst | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/Documentation/scheduler/sched-rt-group.rst b/Documentation/scheduler/sched-rt-group.rst
index 655a096ec8fb..a16bee8f74c2 100644
--- a/Documentation/scheduler/sched-rt-group.rst
+++ b/Documentation/scheduler/sched-rt-group.rst
@@ -87,18 +87,20 @@ lack an EDF scheduler to make non-uniform periods usable.
The system wide settings are configured under the /proc virtual file system:

/proc/sys/kernel/sched_rt_period_us:
- The scheduling period that is equivalent to 100% CPU bandwidth
+ The scheduling period that is equivalent to 100% CPU bandwidth.

/proc/sys/kernel/sched_rt_runtime_us:
- A global limit on how much time realtime scheduling may use. Even without
- CONFIG_RT_GROUP_SCHED enabled, this will limit time reserved to realtime
- processes. With CONFIG_RT_GROUP_SCHED it signifies the total bandwidth
- available to all realtime groups.
+ A global limit on how much time realtime scheduling may use. This is always
+ less or equal to the period_us, as it denotes the time allocated from the
+ period_us for the realtime tasks. Even without CONFIG_RT_GROUP_SCHED enabled,
+ this will limit time reserved to realtime processes. With
+ CONFIG_RT_GROUP_SCHED=y it signifies the total bandwidth available to all
+ realtime groups.

* Time is specified in us because the interface is s32. This gives an
operating range from 1us to about 35 minutes.
* sched_rt_period_us takes values from 1 to INT_MAX.
- * sched_rt_runtime_us takes values from -1 to (INT_MAX - 1).
+ * sched_rt_runtime_us takes values from -1 to sched_rt_period_us.
* A run time of -1 specifies runtime == period, ie. no limit.


--
2.41.0

2023-10-02 16:06:25

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v3 0/3] Fix sched-rt sysctl files & update docs


* Cyril Hrubis <[email protected]> wrote:

> New in v3:
> - Spelling fixes as requested by Ingo
> - added third patch that replaces all cases of realtime with real-time
> in the docs
>
> Cyril Hrubis (3):
> sched/rt: Disallow writing invalid values to sched_rt_period_us
> docs: scheduler-rt: Clarify & fix sched_rt_* sysctl docs
> docs: scheduler-rt: Use real-time instead of realtime
>
> Documentation/scheduler/sched-rt-group.rst | 40 ++++++++++++----------
> kernel/sched/rt.c | 9 ++---
> 2 files changed, 26 insertions(+), 23 deletions(-)

Applied to tip:sched/core for a v6.7 merge, thanks!

Ingo

2023-10-02 16:15:20

by Cyril Hrubis

[permalink] [raw]
Subject: [PATCH v3 3/3] docs: scheduler-rt: Use real-time instead of realtime

Signed-off-by: Cyril Hrubis <[email protected]>
---
Documentation/scheduler/sched-rt-group.rst | 34 +++++++++++-----------
1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/Documentation/scheduler/sched-rt-group.rst b/Documentation/scheduler/sched-rt-group.rst
index a16bee8f74c2..d685609ed3d7 100644
--- a/Documentation/scheduler/sched-rt-group.rst
+++ b/Documentation/scheduler/sched-rt-group.rst
@@ -39,10 +39,10 @@ Most notable:
1.1 The problem
---------------

-Realtime scheduling is all about determinism, a group has to be able to rely on
+Real-time scheduling is all about determinism, a group has to be able to rely on
the amount of bandwidth (eg. CPU time) being constant. In order to schedule
-multiple groups of realtime tasks, each group must be assigned a fixed portion
-of the CPU time available. Without a minimum guarantee a realtime group can
+multiple groups of real-time tasks, each group must be assigned a fixed portion
+of the CPU time available. Without a minimum guarantee a real-time group can
obviously fall short. A fuzzy upper limit is of no use since it cannot be
relied upon. Which leaves us with just the single fixed portion.

@@ -50,14 +50,14 @@ relied upon. Which leaves us with just the single fixed portion.
----------------

CPU time is divided by means of specifying how much time can be spent running
-in a given period. We allocate this "run time" for each realtime group which
-the other realtime groups will not be permitted to use.
+in a given period. We allocate this "run time" for each real-time group which
+the other real-time groups will not be permitted to use.

-Any time not allocated to a realtime group will be used to run normal priority
+Any time not allocated to a real-time group will be used to run normal priority
tasks (SCHED_OTHER). Any allocated run time not used will also be picked up by
SCHED_OTHER.

-Let's consider an example: a frame fixed realtime renderer must deliver 25
+Let's consider an example: a frame fixed real-time renderer must deliver 25
frames a second, which yields a period of 0.04s per frame. Now say it will also
have to play some music and respond to input, leaving it with around 80% CPU
time dedicated for the graphics. We can then give this group a run time of 0.8
@@ -70,7 +70,7 @@ needs only about 3% CPU time to do so, it can do with a 0.03 * 0.005s =
of 0.00015s.

The remaining CPU time will be used for user input and other tasks. Because
-realtime tasks have explicitly allocated the CPU time they need to perform
+real-time tasks have explicitly allocated the CPU time they need to perform
their tasks, buffer underruns in the graphics or audio can be eliminated.

NOTE: the above example is not fully implemented yet. We still
@@ -90,12 +90,12 @@ The system wide settings are configured under the /proc virtual file system:
The scheduling period that is equivalent to 100% CPU bandwidth.

/proc/sys/kernel/sched_rt_runtime_us:
- A global limit on how much time realtime scheduling may use. This is always
+ A global limit on how much time real-time scheduling may use. This is always
less or equal to the period_us, as it denotes the time allocated from the
- period_us for the realtime tasks. Even without CONFIG_RT_GROUP_SCHED enabled,
- this will limit time reserved to realtime processes. With
+ period_us for the real-time tasks. Even without CONFIG_RT_GROUP_SCHED enabled,
+ this will limit time reserved to real-time processes. With
CONFIG_RT_GROUP_SCHED=y it signifies the total bandwidth available to all
- realtime groups.
+ real-time groups.

* Time is specified in us because the interface is s32. This gives an
operating range from 1us to about 35 minutes.
@@ -110,7 +110,7 @@ The system wide settings are configured under the /proc virtual file system:
The default values for sched_rt_period_us (1000000 or 1s) and
sched_rt_runtime_us (950000 or 0.95s). This gives 0.05s to be used by
SCHED_OTHER (non-RT tasks). These defaults were chosen so that a run-away
-realtime tasks will not lock up the machine but leave a little time to recover
+real-time tasks will not lock up the machine but leave a little time to recover
it. By setting runtime to -1 you'd get the old behaviour back.

By default all bandwidth is assigned to the root group and new groups get the
@@ -118,10 +118,10 @@ period from /proc/sys/kernel/sched_rt_period_us and a run time of 0. If you
want to assign bandwidth to another group, reduce the root group's bandwidth
and assign some or all of the difference to another group.

-Realtime group scheduling means you have to assign a portion of total CPU
-bandwidth to the group before it will accept realtime tasks. Therefore you will
-not be able to run realtime tasks as any user other than root until you have
-done that, even if the user has the rights to run processes with realtime
+Real-time group scheduling means you have to assign a portion of total CPU
+bandwidth to the group before it will accept real-time tasks. Therefore you will
+not be able to run real-time tasks as any user other than root until you have
+done that, even if the user has the rights to run processes with real-time
priority!


--
2.41.0

2023-10-02 19:13:38

by tip-bot2 for Jacob Pan

[permalink] [raw]
Subject: [tip: sched/core] sched/rt/docs: Use 'real-time' instead of 'realtime'

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 83494dc51033506eb60c5e11a335461b2dc42111
Gitweb: https://git.kernel.org/tip/83494dc51033506eb60c5e11a335461b2dc42111
Author: Cyril Hrubis <[email protected]>
AuthorDate: Mon, 02 Oct 2023 13:55:53 +02:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 02 Oct 2023 15:17:14 +02:00

sched/rt/docs: Use 'real-time' instead of 'realtime'

Standardize on a single variant.

Signed-off-by: Cyril Hrubis <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Documentation/scheduler/sched-rt-group.rst | 34 ++++++++++-----------
1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/Documentation/scheduler/sched-rt-group.rst b/Documentation/scheduler/sched-rt-group.rst
index a16bee8..d685609 100644
--- a/Documentation/scheduler/sched-rt-group.rst
+++ b/Documentation/scheduler/sched-rt-group.rst
@@ -39,10 +39,10 @@ Most notable:
1.1 The problem
---------------

-Realtime scheduling is all about determinism, a group has to be able to rely on
+Real-time scheduling is all about determinism, a group has to be able to rely on
the amount of bandwidth (eg. CPU time) being constant. In order to schedule
-multiple groups of realtime tasks, each group must be assigned a fixed portion
-of the CPU time available. Without a minimum guarantee a realtime group can
+multiple groups of real-time tasks, each group must be assigned a fixed portion
+of the CPU time available. Without a minimum guarantee a real-time group can
obviously fall short. A fuzzy upper limit is of no use since it cannot be
relied upon. Which leaves us with just the single fixed portion.

@@ -50,14 +50,14 @@ relied upon. Which leaves us with just the single fixed portion.
----------------

CPU time is divided by means of specifying how much time can be spent running
-in a given period. We allocate this "run time" for each realtime group which
-the other realtime groups will not be permitted to use.
+in a given period. We allocate this "run time" for each real-time group which
+the other real-time groups will not be permitted to use.

-Any time not allocated to a realtime group will be used to run normal priority
+Any time not allocated to a real-time group will be used to run normal priority
tasks (SCHED_OTHER). Any allocated run time not used will also be picked up by
SCHED_OTHER.

-Let's consider an example: a frame fixed realtime renderer must deliver 25
+Let's consider an example: a frame fixed real-time renderer must deliver 25
frames a second, which yields a period of 0.04s per frame. Now say it will also
have to play some music and respond to input, leaving it with around 80% CPU
time dedicated for the graphics. We can then give this group a run time of 0.8
@@ -70,7 +70,7 @@ needs only about 3% CPU time to do so, it can do with a 0.03 * 0.005s =
of 0.00015s.

The remaining CPU time will be used for user input and other tasks. Because
-realtime tasks have explicitly allocated the CPU time they need to perform
+real-time tasks have explicitly allocated the CPU time they need to perform
their tasks, buffer underruns in the graphics or audio can be eliminated.

NOTE: the above example is not fully implemented yet. We still
@@ -90,12 +90,12 @@ The system wide settings are configured under the /proc virtual file system:
The scheduling period that is equivalent to 100% CPU bandwidth.

/proc/sys/kernel/sched_rt_runtime_us:
- A global limit on how much time realtime scheduling may use. This is always
+ A global limit on how much time real-time scheduling may use. This is always
less or equal to the period_us, as it denotes the time allocated from the
- period_us for the realtime tasks. Even without CONFIG_RT_GROUP_SCHED enabled,
- this will limit time reserved to realtime processes. With
+ period_us for the real-time tasks. Even without CONFIG_RT_GROUP_SCHED enabled,
+ this will limit time reserved to real-time processes. With
CONFIG_RT_GROUP_SCHED=y it signifies the total bandwidth available to all
- realtime groups.
+ real-time groups.

* Time is specified in us because the interface is s32. This gives an
operating range from 1us to about 35 minutes.
@@ -110,7 +110,7 @@ The system wide settings are configured under the /proc virtual file system:
The default values for sched_rt_period_us (1000000 or 1s) and
sched_rt_runtime_us (950000 or 0.95s). This gives 0.05s to be used by
SCHED_OTHER (non-RT tasks). These defaults were chosen so that a run-away
-realtime tasks will not lock up the machine but leave a little time to recover
+real-time tasks will not lock up the machine but leave a little time to recover
it. By setting runtime to -1 you'd get the old behaviour back.

By default all bandwidth is assigned to the root group and new groups get the
@@ -118,10 +118,10 @@ period from /proc/sys/kernel/sched_rt_period_us and a run time of 0. If you
want to assign bandwidth to another group, reduce the root group's bandwidth
and assign some or all of the difference to another group.

-Realtime group scheduling means you have to assign a portion of total CPU
-bandwidth to the group before it will accept realtime tasks. Therefore you will
-not be able to run realtime tasks as any user other than root until you have
-done that, even if the user has the rights to run processes with realtime
+Real-time group scheduling means you have to assign a portion of total CPU
+bandwidth to the group before it will accept real-time tasks. Therefore you will
+not be able to run real-time tasks as any user other than root until you have
+done that, even if the user has the rights to run processes with real-time
priority!


2023-10-02 19:57:37

by Cyril Hrubis

[permalink] [raw]
Subject: [PATCH v3 1/3] sched/rt: Disallow writing invalid values to sched_rt_period_us

The validation of the value written to sched_rt_period_us was broken
because:

- the sysclt_sched_rt_period is declared as unsigned int
- parsed by proc_do_intvec()
- the range is asserted after the value parsed by proc_do_intvec()

Because of this negative values written to the file were written into a
unsigned integer that were later on interpreted as large positive
integers which did passed the check:

if (sysclt_sched_rt_period <= 0)
return EINVAL;

This commit fixes the parsing by setting explicit range for both
perid_us and runtime_us into the sched_rt_sysctls table and processes
the values with proc_dointvec_minmax() instead.

Alternatively if we wanted to use full range of unsigned int for the
period value we would have to split the proc_handler and use
proc_douintvec() for it however even the
Documentation/scheduller/sched-rt-group.rst describes the range as 1 to
INT_MAX.

As far as I can tell the only problem this causes is that the sysctl
file allows writing negative values which when read back may confuse
userspace.

There is also a LTP test being submitted for these sysctl files at:

http://patchwork.ozlabs.org/project/ltp/patch/[email protected]/

Signed-off-by: Cyril Hrubis <[email protected]>
---
kernel/sched/rt.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0597ba0f85ff..aed3d55de2dd 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -37,6 +37,8 @@ static struct ctl_table sched_rt_sysctls[] = {
.maxlen = sizeof(unsigned int),
.mode = 0644,
.proc_handler = sched_rt_handler,
+ .extra1 = SYSCTL_ONE,
+ .extra2 = SYSCTL_INT_MAX,
},
{
.procname = "sched_rt_runtime_us",
@@ -44,6 +46,8 @@ static struct ctl_table sched_rt_sysctls[] = {
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = sched_rt_handler,
+ .extra1 = SYSCTL_NEG_ONE,
+ .extra2 = SYSCTL_INT_MAX,
},
{
.procname = "sched_rr_timeslice_ms",
@@ -2985,9 +2989,6 @@ static int sched_rt_global_constraints(void)
#ifdef CONFIG_SYSCTL
static int sched_rt_global_validate(void)
{
- if (sysctl_sched_rt_period <= 0)
- return -EINVAL;
-
if ((sysctl_sched_rt_runtime != RUNTIME_INF) &&
((sysctl_sched_rt_runtime > sysctl_sched_rt_period) ||
((u64)sysctl_sched_rt_runtime *
@@ -3018,7 +3019,7 @@ static int sched_rt_handler(struct ctl_table *table, int write, void *buffer,
old_period = sysctl_sched_rt_period;
old_runtime = sysctl_sched_rt_runtime;

- ret = proc_dointvec(table, write, buffer, lenp, ppos);
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);

if (!ret && write) {
ret = sched_rt_global_validate();
--
2.41.0

2023-10-02 21:11:46

by tip-bot2 for Jacob Pan

[permalink] [raw]
Subject: [tip: sched/core] sched/rt/docs: Clarify & fix sched_rt_* sysctl docs

The following commit has been merged into the sched/core branch of tip:

Commit-ID: e6dbdd8fb75526b01787050087b65d12c76b3666
Gitweb: https://git.kernel.org/tip/e6dbdd8fb75526b01787050087b65d12c76b3666
Author: Cyril Hrubis <[email protected]>
AuthorDate: Mon, 02 Oct 2023 13:55:52 +02:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 02 Oct 2023 15:17:13 +02:00

sched/rt/docs: Clarify & fix sched_rt_* sysctl docs

- Describe explicitly that sched_rt_runtime_us is allocated from
sched_rt_period_us and hence always less or equal to that value.

- The limit for sched_rt_runtime_us is not INT_MAX-1, but rather it's
limited by the value of sched_rt_period_us. If sched_rt_period_us is
INT_MAX then sched_rt_runtime_us can be set to INT_MAX as well.

Signed-off-by: Cyril Hrubis <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Documentation/scheduler/sched-rt-group.rst | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/Documentation/scheduler/sched-rt-group.rst b/Documentation/scheduler/sched-rt-group.rst
index 655a096..a16bee8 100644
--- a/Documentation/scheduler/sched-rt-group.rst
+++ b/Documentation/scheduler/sched-rt-group.rst
@@ -87,18 +87,20 @@ lack an EDF scheduler to make non-uniform periods usable.
The system wide settings are configured under the /proc virtual file system:

/proc/sys/kernel/sched_rt_period_us:
- The scheduling period that is equivalent to 100% CPU bandwidth
+ The scheduling period that is equivalent to 100% CPU bandwidth.

/proc/sys/kernel/sched_rt_runtime_us:
- A global limit on how much time realtime scheduling may use. Even without
- CONFIG_RT_GROUP_SCHED enabled, this will limit time reserved to realtime
- processes. With CONFIG_RT_GROUP_SCHED it signifies the total bandwidth
- available to all realtime groups.
+ A global limit on how much time realtime scheduling may use. This is always
+ less or equal to the period_us, as it denotes the time allocated from the
+ period_us for the realtime tasks. Even without CONFIG_RT_GROUP_SCHED enabled,
+ this will limit time reserved to realtime processes. With
+ CONFIG_RT_GROUP_SCHED=y it signifies the total bandwidth available to all
+ realtime groups.

* Time is specified in us because the interface is s32. This gives an
operating range from 1us to about 35 minutes.
* sched_rt_period_us takes values from 1 to INT_MAX.
- * sched_rt_runtime_us takes values from -1 to (INT_MAX - 1).
+ * sched_rt_runtime_us takes values from -1 to sched_rt_period_us.
* A run time of -1 specifies runtime == period, ie. no limit.