2020-09-22 08:41:01

by Dietmar Eggemann

[permalink] [raw]
Subject: [PATCH 0/2] sched/cpupri: Cleanup cpu priority vector handling

Two of the 102 elements of the cpu priority vector, among them the one
for MAX_PRIO (140) representing the IDLE task, are never used.

Remove them and adapt the cpupri implementation accordingly.

Dietmar Eggemann (2):
sched/cpupri: Remove pri_to_cpu[CPUPRI_IDLE]
sched/cpupri: Remove pri_to_cpu[1]

kernel/sched/cpupri.c | 10 ++++------
kernel/sched/cpupri.h | 7 +++----
2 files changed, 7 insertions(+), 10 deletions(-)

--
2.17.1


2020-09-22 08:41:16

by Dietmar Eggemann

[permalink] [raw]
Subject: [PATCH 1/2] sched/cpupri: Remove pri_to_cpu[CPUPRI_IDLE]

pri_to_cpu[CPUPRI_IDLE=0] isn't used since cpupri_set(..., newpri) is
never called with newpri = MAX_PRIO (140).

Current mapping:

p->rt_priority p->prio newpri cpupri

-1 -1 (CPUPRI_INVALID)

140 0 (CPUPRI_IDLE)

100 1 (CPUPRI_NORMAL)

1 98 98 3
...
49 50 50 51
50 49 49 52
...
99 0 0 101

Even when cpupri was introduced with commit 6e0534f27819 ("sched: use a
2-d bitmap for searching lowest-pri CPU") in v2.6.27, only

(1) CPUPRI_INVALID (-1),
(2) MAX_RT_PRIO (100),
(3) an RT prio (RT1..RT99)

were used as newprio in cpupri_set(..., newpri) -> convert_prio(newpri).

MAX_RT_PRIO is used only in dec_rt_tasks() -> dec_rt_prio() ->
dec_rt_prio_smp() -> cpupri_set() in case of !rt_rq->rt_nr_running.
I.e. it stands for a non-rt task, including the IDLE task.

Commit 57785df5ac53 ("sched: Fix task priority bug") removed code in
v2.6.33 which did set the priority of the IDLE task to MAX_PRIO.
Although this happened after the introduction of cpupri, it didn't have
an effect on the values used for cpupri_set(..., newpri).

Remove CPUPRI_IDLE and adapt the cpupri implementation accordingly.
This will save a useless for loop with an atomic_read in
cpupri_find_fitness() calling __cpupri_find().

New mapping:

p->rt_priority p->prio newpri cpupri

-1 -1 (CPUPRI_INVALID)

100 0 (CPUPRI_NORMAL)

1 98 98 2
...
49 50 50 50
50 49 49 51
...
99 0 0 100

Signed-off-by: Dietmar Eggemann <[email protected]>
---
kernel/sched/cpupri.c | 10 ++++------
kernel/sched/cpupri.h | 7 +++----
2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c
index 0033731a0797..a5d14ed485f4 100644
--- a/kernel/sched/cpupri.c
+++ b/kernel/sched/cpupri.c
@@ -11,7 +11,7 @@
* This code tracks the priority of each CPU so that global migration
* decisions are easy to calculate. Each CPU can be in a state as follows:
*
- * (INVALID), IDLE, NORMAL, RT1, ... RT99
+ * (INVALID), NORMAL, RT1, ... RT99
*
* going from the lowest priority to the highest. CPUs in the INVALID state
* are not eligible for routing. The system maintains this state with
@@ -19,24 +19,22 @@
* in that class). Therefore a typical application without affinity
* restrictions can find a suitable CPU with O(1) complexity (e.g. two bit
* searches). For tasks with affinity restrictions, the algorithm has a
- * worst case complexity of O(min(102, nr_domcpus)), though the scenario that
+ * worst case complexity of O(min(101, nr_domcpus)), though the scenario that
* yields the worst case search is fairly contrived.
*/
#include "sched.h"

-/* Convert between a 140 based task->prio, and our 102 based cpupri */
+/* Convert between a 140 based task->prio, and our 101 based cpupri */
static int convert_prio(int prio)
{
int cpupri;

if (prio == CPUPRI_INVALID)
cpupri = CPUPRI_INVALID;
- else if (prio == MAX_PRIO)
- cpupri = CPUPRI_IDLE;
else if (prio >= MAX_RT_PRIO)
cpupri = CPUPRI_NORMAL;
else
- cpupri = MAX_RT_PRIO - prio + 1;
+ cpupri = MAX_RT_PRIO - prio;

return cpupri;
}
diff --git a/kernel/sched/cpupri.h b/kernel/sched/cpupri.h
index efbb492bb94c..1a162369b8d4 100644
--- a/kernel/sched/cpupri.h
+++ b/kernel/sched/cpupri.h
@@ -1,11 +1,10 @@
/* SPDX-License-Identifier: GPL-2.0 */

-#define CPUPRI_NR_PRIORITIES (MAX_RT_PRIO + 2)
+#define CPUPRI_NR_PRIORITIES (MAX_RT_PRIO + 1)

#define CPUPRI_INVALID -1
-#define CPUPRI_IDLE 0
-#define CPUPRI_NORMAL 1
-/* values 2-101 are RT priorities 0-99 */
+#define CPUPRI_NORMAL 0
+/* values 2-100 are RT priorities 0-99 */

struct cpupri_vec {
atomic_t count;
--
2.17.1

2020-09-22 08:43:40

by Dietmar Eggemann

[permalink] [raw]
Subject: [PATCH 2/2] sched/cpupri: Remove pri_to_cpu[1]

pri_to_cpu[1] isn't used since cpupri_set(..., newpri) is
never called with newpri = 99.

The valid RT priorities RT1..RT99 (p->rt_priority = [1..99]) map into
cpupri (idx of pri_to_cpu[]) = [2..100]

Current mapping:

p->rt_priority p->prio newpri cpupri

-1 -1 (CPUPRI_INVALID)

100 0 (CPUPRI_NORMAL)

1 98 98 2
...
49 50 50 50
50 49 49 51
...
99 0 0 100

So cpupri = 1 isn't used.

Reduce the size of pri_to_cpu[] by 1 and adapt the cpupri
implementation accordingly. This will save a useless for loop with an
atomic_read in cpupri_find_fitness() calling __cpupri_find().

New mapping:

p->rt_priority p->prio newpri cpupri

-1 -1 (CPUPRI_INVALID)

100 0 (CPUPRI_NORMAL)

1 98 98 1
...
49 50 50 49
50 49 49 50
...
99 0 0 99

Signed-off-by: Dietmar Eggemann <[email protected]>
---
kernel/sched/cpupri.c | 6 +++---
kernel/sched/cpupri.h | 4 ++--
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c
index a5d14ed485f4..8d9952a51664 100644
--- a/kernel/sched/cpupri.c
+++ b/kernel/sched/cpupri.c
@@ -19,12 +19,12 @@
* in that class). Therefore a typical application without affinity
* restrictions can find a suitable CPU with O(1) complexity (e.g. two bit
* searches). For tasks with affinity restrictions, the algorithm has a
- * worst case complexity of O(min(101, nr_domcpus)), though the scenario that
+ * worst case complexity of O(min(100, nr_domcpus)), though the scenario that
* yields the worst case search is fairly contrived.
*/
#include "sched.h"

-/* Convert between a 140 based task->prio, and our 101 based cpupri */
+/* Convert between a 140 based task->prio, and our 100 based cpupri */
static int convert_prio(int prio)
{
int cpupri;
@@ -34,7 +34,7 @@ static int convert_prio(int prio)
else if (prio >= MAX_RT_PRIO)
cpupri = CPUPRI_NORMAL;
else
- cpupri = MAX_RT_PRIO - prio;
+ cpupri = MAX_RT_PRIO - prio - 1;

return cpupri;
}
diff --git a/kernel/sched/cpupri.h b/kernel/sched/cpupri.h
index 1a162369b8d4..e28e1ed12e3d 100644
--- a/kernel/sched/cpupri.h
+++ b/kernel/sched/cpupri.h
@@ -1,10 +1,10 @@
/* SPDX-License-Identifier: GPL-2.0 */

-#define CPUPRI_NR_PRIORITIES (MAX_RT_PRIO + 1)
+#define CPUPRI_NR_PRIORITIES MAX_RT_PRIO

#define CPUPRI_INVALID -1
#define CPUPRI_NORMAL 0
-/* values 2-100 are RT priorities 0-99 */
+/* values 1-99 are for RT1-RT99 priorities */

struct cpupri_vec {
atomic_t count;
--
2.17.1

2020-10-15 02:01:22

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/2] sched/cpupri: Cleanup cpu priority vector handling

On Tue, Sep 22, 2020 at 10:39:32AM +0200, Dietmar Eggemann wrote:
> Two of the 102 elements of the cpu priority vector, among them the one
> for MAX_PRIO (140) representing the IDLE task, are never used.
>
> Remove them and adapt the cpupri implementation accordingly.
>
> Dietmar Eggemann (2):
> sched/cpupri: Remove pri_to_cpu[CPUPRI_IDLE]
> sched/cpupri: Remove pri_to_cpu[1]

Thanks!, I've also got a few more patches on top, I'll post them
separately.

2020-10-15 02:02:41

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH 3/2] sched/cpupri: Remap CPUPRI_NORMAL to MAX_RT_PRIO-1


This makes the mapping continuous and frees up 100 for other usage.

Prev mapping:

p->rt_priority p->prio newpri cpupri

-1 -1 (CPUPRI_INVALID)

100 0 (CPUPRI_NORMAL)

1 98 98 1
...
49 50 50 49
50 49 49 50
...
99 0 0 99

New mapping:

p->rt_priority p->prio newpri cpupri

-1 -1 (CPUPRI_INVALID)

99 0 (CPUPRI_NORMAL)

1 98 98 1
...
49 50 50 49
50 49 49 50
...
99 0 0 99

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
kernel/sched/cpupri.c | 34 +++++++++++++++++++++++++++-------
kernel/sched/rt.c | 16 +++++++++-------
2 files changed, 36 insertions(+), 14 deletions(-)

--- a/kernel/sched/cpupri.c
+++ b/kernel/sched/cpupri.c
@@ -24,17 +24,37 @@
*/
#include "sched.h"

-/* Convert between a 140 based task->prio, and our 100 based cpupri */
+/*
+ * p->rt_priority p->prio newpri cpupri
+ *
+ * -1 -1 (CPUPRI_INVALID)
+ *
+ * 99 0 (CPUPRI_NORMAL)
+ *
+ * 1 98 98 1
+ * ...
+ * 49 50 50 49
+ * 50 49 49 50
+ * ...
+ * 99 0 0 99
+ */
static int convert_prio(int prio)
{
int cpupri;

- if (prio == CPUPRI_INVALID)
- cpupri = CPUPRI_INVALID;
- else if (prio >= MAX_RT_PRIO)
- cpupri = CPUPRI_NORMAL;
- else
- cpupri = MAX_RT_PRIO - prio - 1;
+ switch (prio) {
+ case CPUPRI_INVALID:
+ cpupri = CPUPRI_INVALID; /* -1 */
+ break;
+
+ case 0...98:
+ cpupri = MAX_RT_PRIO-1 - prio; /* 1 ... 99 */
+ break;
+
+ case MAX_RT_PRIO-1:
+ cpupri = CPUPRI_NORMAL; /* 0 */
+ break;
+ }

return cpupri;
}
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -89,8 +89,8 @@ void init_rt_rq(struct rt_rq *rt_rq)
__set_bit(MAX_RT_PRIO, array->bitmap);

#if defined CONFIG_SMP
- rt_rq->highest_prio.curr = MAX_RT_PRIO;
- rt_rq->highest_prio.next = MAX_RT_PRIO;
+ rt_rq->highest_prio.curr = MAX_RT_PRIO-1;
+ rt_rq->highest_prio.next = MAX_RT_PRIO-1;
rt_rq->rt_nr_migratory = 0;
rt_rq->overloaded = 0;
plist_head_init(&rt_rq->pushable_tasks);
@@ -161,7 +161,7 @@ void init_tg_rt_entry(struct task_group
{
struct rq *rq = cpu_rq(cpu);

- rt_rq->highest_prio.curr = MAX_RT_PRIO;
+ rt_rq->highest_prio.curr = MAX_RT_PRIO-1;
rt_rq->rt_nr_boosted = 0;
rt_rq->rq = rq;
rt_rq->tg = tg;
@@ -393,8 +393,9 @@ static void dequeue_pushable_task(struct
p = plist_first_entry(&rq->rt.pushable_tasks,
struct task_struct, pushable_tasks);
rq->rt.highest_prio.next = p->prio;
- } else
- rq->rt.highest_prio.next = MAX_RT_PRIO;
+ } else {
+ rq->rt.highest_prio.next = MAX_RT_PRIO-1;
+ }
}

#else
@@ -1147,8 +1148,9 @@ dec_rt_prio(struct rt_rq *rt_rq, int pri
sched_find_first_bit(array->bitmap);
}

- } else
- rt_rq->highest_prio.curr = MAX_RT_PRIO;
+ } else {
+ rt_rq->highest_prio.curr = MAX_RT_PRIO-1;
+ }

dec_rt_prio_smp(rt_rq, prio, prev_prio);
}

2020-10-15 02:04:35

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH 4/2] sched/cpupri: Add CPUPRI_HIGHER


Add CPUPRI_HIGHER above the RT99 priority to denote the CPU is in use
by higher priority tasks (specifically deadline).

XXX: we should probably drive PUSH-PULL from cpupri, that would
automagically result in an RT-PUSH when DL sets cpupri to CPUPRI_HIGHER.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
kernel/sched/cpupri.c | 10 ++++++++--
kernel/sched/cpupri.h | 3 ++-
kernel/sched/deadline.c | 3 +++
3 files changed, 13 insertions(+), 3 deletions(-)

--- a/kernel/sched/cpupri.c
+++ b/kernel/sched/cpupri.c
@@ -11,7 +11,7 @@
* This code tracks the priority of each CPU so that global migration
* decisions are easy to calculate. Each CPU can be in a state as follows:
*
- * (INVALID), NORMAL, RT1, ... RT99
+ * (INVALID), NORMAL, RT1, ... RT99, HIGHER
*
* going from the lowest priority to the highest. CPUs in the INVALID state
* are not eligible for routing. The system maintains this state with
@@ -19,7 +19,7 @@
* in that class). Therefore a typical application without affinity
* restrictions can find a suitable CPU with O(1) complexity (e.g. two bit
* searches). For tasks with affinity restrictions, the algorithm has a
- * worst case complexity of O(min(100, nr_domcpus)), though the scenario that
+ * worst case complexity of O(min(101, nr_domcpus)), though the scenario that
* yields the worst case search is fairly contrived.
*/
#include "sched.h"
@@ -37,6 +37,8 @@
* 50 49 49 50
* ...
* 99 0 0 99
+ *
+ * 100 100 (CPUPRI_HIGHER)
*/
static int convert_prio(int prio)
{
@@ -54,6 +56,10 @@ static int convert_prio(int prio)
case MAX_RT_PRIO-1:
cpupri = CPUPRI_NORMAL; /* 0 */
break;
+
+ case MAX_RT_PRIO:
+ cpupri = CPUPRI_HIGHER; /* 100 */
+ break;
}

return cpupri;
--- a/kernel/sched/cpupri.h
+++ b/kernel/sched/cpupri.h
@@ -1,10 +1,11 @@
/* SPDX-License-Identifier: GPL-2.0 */

-#define CPUPRI_NR_PRIORITIES MAX_RT_PRIO
+#define CPUPRI_NR_PRIORITIES (MAX_RT_PRIO+1)

#define CPUPRI_INVALID -1
#define CPUPRI_NORMAL 0
/* values 1-99 are for RT1-RT99 priorities */
+#define CPUPRI_HIGHER 100

struct cpupri_vec {
atomic_t count;
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1364,6 +1364,8 @@ static void inc_dl_deadline(struct dl_rq

if (dl_rq->earliest_dl.curr == 0 ||
dl_time_before(deadline, dl_rq->earliest_dl.curr)) {
+ if (dl_rq->earliest_dl.curr == 0)
+ cpupri_set(&rq->rd->cpupri, rq->cpu, CPUPRI_HIGHER);
dl_rq->earliest_dl.curr = deadline;
cpudl_set(&rq->rd->cpudl, rq->cpu, deadline);
}
@@ -1381,6 +1383,7 @@ static void dec_dl_deadline(struct dl_rq
dl_rq->earliest_dl.curr = 0;
dl_rq->earliest_dl.next = 0;
cpudl_clear(&rq->rd->cpudl, rq->cpu);
+ cpupri_set(&rq->rd->cpupri, rq->cpu, rq->rt.highest_prio.curr);
} else {
struct rb_node *leftmost = dl_rq->root.rb_leftmost;
struct sched_dl_entity *entry;

2020-10-19 14:17:34

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH 3/2] sched/cpupri: Remap CPUPRI_NORMAL to MAX_RT_PRIO-1

On 14/10/2020 21:48, Peter Zijlstra wrote:

[...]

> + switch (prio) {
> + case CPUPRI_INVALID:
> + cpupri = CPUPRI_INVALID; /* -1 */
> + break;
> +
> + case 0...98:

kernel/sched/cpupri.c:54:7: error: too many decimal points in number
54 | case 0...98:
| ^~~~~~

There need to be spaces around the ellipses.

Otherwise LGTM.

2020-10-19 14:17:39

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH 4/2] sched/cpupri: Add CPUPRI_HIGHER

On 14/10/2020 21:54, Peter Zijlstra wrote:
>
> Add CPUPRI_HIGHER above the RT99 priority to denote the CPU is in use
> by higher priority tasks (specifically deadline).

sugov:X already triggers this now on our !fast-switching devices running
schedutil.

> XXX: we should probably drive PUSH-PULL from cpupri, that would
> automagically result in an RT-PUSH when DL sets cpupri to CPUPRI_HIGHER.
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>

[...]

> @@ -54,6 +56,10 @@ static int convert_prio(int prio)

The BUG_ON could be tightened:

- BUG_ON(prio >= MAX_PRIO);
+ BUG_ON(prio > MAX_RT_PRIO);

> case MAX_RT_PRIO-1:
> cpupri = CPUPRI_NORMAL; /* 0 */
> break;
> +
> + case MAX_RT_PRIO:
> + cpupri = CPUPRI_HIGHER; /* 100 */
> + break;
> }
>
> return cpupri;

Just saw that the comment for cpupri_set() needs changing:

@@ -205,7 +208,7 @@ int cpupri_find_fitness(struct cpupri *cp, struct
task_struct *p,
* cpupri_set - update the CPU priority setting
* @cp: The cpupri context
* @cpu: The target CPU
- * @newpri: The priority (INVALID-RT99) to assign to this CPU
+ * @newpri: The priority (INVALID-RT1-RT99-NORMAL-HIGHER) to assign to
this CPU

Reviewed-by: Dietmar Eggemann <[email protected]>

2020-10-20 09:52:11

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 3/2] sched/cpupri: Remap CPUPRI_NORMAL to MAX_RT_PRIO-1

On Mon, Oct 19, 2020 at 04:14:16PM +0200, Dietmar Eggemann wrote:
> On 14/10/2020 21:48, Peter Zijlstra wrote:
>
> [...]
>
> > + switch (prio) {
> > + case CPUPRI_INVALID:
> > + cpupri = CPUPRI_INVALID; /* -1 */
> > + break;
> > +
> > + case 0...98:
>
> kernel/sched/cpupri.c:54:7: error: too many decimal points in number
> 54 | case 0...98:
> | ^~~~~~
>
> There need to be spaces around the ellipses.
>
> Otherwise LGTM.

Yah, some robot already told me. Fix this already in the queue.git
version. Thanks for looking at it though.

2020-10-20 20:42:07

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 4/2] sched/cpupri: Add CPUPRI_HIGHER

On Mon, Oct 19, 2020 at 04:15:01PM +0200, Dietmar Eggemann wrote:
> On 14/10/2020 21:54, Peter Zijlstra wrote:
> >
> > Add CPUPRI_HIGHER above the RT99 priority to denote the CPU is in use
> > by higher priority tasks (specifically deadline).
>
> sugov:X already triggers this now on our !fast-switching devices running
> schedutil.

Right, that would also be a nice test-case for:

> > XXX: we should probably drive PUSH-PULL from cpupri, that would
> > automagically result in an RT-PUSH when DL sets cpupri to CPUPRI_HIGHER.

This, once we get there..

> > @@ -54,6 +56,10 @@ static int convert_prio(int prio)
>
> The BUG_ON could be tightened:
>
> - BUG_ON(prio >= MAX_PRIO);
> + BUG_ON(prio > MAX_RT_PRIO);
>

Maybe I've not had enough wake-up juice, but I can't seem to locate
this.

> > case MAX_RT_PRIO-1:
> > cpupri = CPUPRI_NORMAL; /* 0 */
> > break;
> > +
> > + case MAX_RT_PRIO:
> > + cpupri = CPUPRI_HIGHER; /* 100 */
> > + break;
> > }
> >
> > return cpupri;
>
> Just saw that the comment for cpupri_set() needs changing:
>
> @@ -205,7 +208,7 @@ int cpupri_find_fitness(struct cpupri *cp, struct
> task_struct *p,
> * cpupri_set - update the CPU priority setting
> * @cp: The cpupri context
> * @cpu: The target CPU
> - * @newpri: The priority (INVALID-RT99) to assign to this CPU
> + * @newpri: The priority (INVALID-RT1-RT99-NORMAL-HIGHER) to assign to
> this CPU

I made that:

+ * @newpri: The priority (INVALID,NORMAL,RT1-RT99,HIGHER) to assign to this CPU

>
> Reviewed-by: Dietmar Eggemann <[email protected]>

Thanks!

2020-10-21 06:28:34

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH 4/2] sched/cpupri: Add CPUPRI_HIGHER

On 20/10/2020 09:37, Peter Zijlstra wrote:
> On Mon, Oct 19, 2020 at 04:15:01PM +0200, Dietmar Eggemann wrote:
>> On 14/10/2020 21:54, Peter Zijlstra wrote:

[...]

> Maybe I've not had enough wake-up juice, but I can't seem to locate
> this.

Sorry, I was commenting on my own debug code ;-)

Subject: [tip: sched/core] sched/cpupri: Remove pri_to_cpu[CPUPRI_IDLE]

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 5e054bca44fe92323de5e9b71478d1904b8bb1b7
Gitweb: https://git.kernel.org/tip/5e054bca44fe92323de5e9b71478d1904b8bb1b7
Author: Dietmar Eggemann <[email protected]>
AuthorDate: Tue, 22 Sep 2020 10:39:33 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Thu, 29 Oct 2020 11:00:29 +01:00

sched/cpupri: Remove pri_to_cpu[CPUPRI_IDLE]

pri_to_cpu[CPUPRI_IDLE=0] isn't used since cpupri_set(..., newpri) is
never called with newpri = MAX_PRIO (140).

Current mapping:

p->rt_priority p->prio newpri cpupri

-1 -1 (CPUPRI_INVALID)

140 0 (CPUPRI_IDLE)

100 1 (CPUPRI_NORMAL)

1 98 98 3
...
49 50 50 51
50 49 49 52
...
99 0 0 101

Even when cpupri was introduced with commit 6e0534f27819 ("sched: use a
2-d bitmap for searching lowest-pri CPU") in v2.6.27, only

(1) CPUPRI_INVALID (-1),
(2) MAX_RT_PRIO (100),
(3) an RT prio (RT1..RT99)

were used as newprio in cpupri_set(..., newpri) -> convert_prio(newpri).

MAX_RT_PRIO is used only in dec_rt_tasks() -> dec_rt_prio() ->
dec_rt_prio_smp() -> cpupri_set() in case of !rt_rq->rt_nr_running.
I.e. it stands for a non-rt task, including the IDLE task.

Commit 57785df5ac53 ("sched: Fix task priority bug") removed code in
v2.6.33 which did set the priority of the IDLE task to MAX_PRIO.
Although this happened after the introduction of cpupri, it didn't have
an effect on the values used for cpupri_set(..., newpri).

Remove CPUPRI_IDLE and adapt the cpupri implementation accordingly.
This will save a useless for loop with an atomic_read in
cpupri_find_fitness() calling __cpupri_find().

New mapping:

p->rt_priority p->prio newpri cpupri

-1 -1 (CPUPRI_INVALID)

100 0 (CPUPRI_NORMAL)

1 98 98 2
...
49 50 50 50
50 49 49 51
...
99 0 0 100

Signed-off-by: Dietmar Eggemann <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/sched/cpupri.c | 10 ++++------
kernel/sched/cpupri.h | 7 +++----
2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c
index 0033731..a5d14ed 100644
--- a/kernel/sched/cpupri.c
+++ b/kernel/sched/cpupri.c
@@ -11,7 +11,7 @@
* This code tracks the priority of each CPU so that global migration
* decisions are easy to calculate. Each CPU can be in a state as follows:
*
- * (INVALID), IDLE, NORMAL, RT1, ... RT99
+ * (INVALID), NORMAL, RT1, ... RT99
*
* going from the lowest priority to the highest. CPUs in the INVALID state
* are not eligible for routing. The system maintains this state with
@@ -19,24 +19,22 @@
* in that class). Therefore a typical application without affinity
* restrictions can find a suitable CPU with O(1) complexity (e.g. two bit
* searches). For tasks with affinity restrictions, the algorithm has a
- * worst case complexity of O(min(102, nr_domcpus)), though the scenario that
+ * worst case complexity of O(min(101, nr_domcpus)), though the scenario that
* yields the worst case search is fairly contrived.
*/
#include "sched.h"

-/* Convert between a 140 based task->prio, and our 102 based cpupri */
+/* Convert between a 140 based task->prio, and our 101 based cpupri */
static int convert_prio(int prio)
{
int cpupri;

if (prio == CPUPRI_INVALID)
cpupri = CPUPRI_INVALID;
- else if (prio == MAX_PRIO)
- cpupri = CPUPRI_IDLE;
else if (prio >= MAX_RT_PRIO)
cpupri = CPUPRI_NORMAL;
else
- cpupri = MAX_RT_PRIO - prio + 1;
+ cpupri = MAX_RT_PRIO - prio;

return cpupri;
}
diff --git a/kernel/sched/cpupri.h b/kernel/sched/cpupri.h
index efbb492..1a16236 100644
--- a/kernel/sched/cpupri.h
+++ b/kernel/sched/cpupri.h
@@ -1,11 +1,10 @@
/* SPDX-License-Identifier: GPL-2.0 */

-#define CPUPRI_NR_PRIORITIES (MAX_RT_PRIO + 2)
+#define CPUPRI_NR_PRIORITIES (MAX_RT_PRIO + 1)

#define CPUPRI_INVALID -1
-#define CPUPRI_IDLE 0
-#define CPUPRI_NORMAL 1
-/* values 2-101 are RT priorities 0-99 */
+#define CPUPRI_NORMAL 0
+/* values 2-100 are RT priorities 0-99 */

struct cpupri_vec {
atomic_t count;

Subject: [tip: sched/core] sched/cpupri: Remove pri_to_cpu[1]

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 1b08782ce31f612d98e11ccccf3e3df9a147a67d
Gitweb: https://git.kernel.org/tip/1b08782ce31f612d98e11ccccf3e3df9a147a67d
Author: Dietmar Eggemann <[email protected]>
AuthorDate: Tue, 22 Sep 2020 10:39:34 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Thu, 29 Oct 2020 11:00:29 +01:00

sched/cpupri: Remove pri_to_cpu[1]

pri_to_cpu[1] isn't used since cpupri_set(..., newpri) is
never called with newpri = 99.

The valid RT priorities RT1..RT99 (p->rt_priority = [1..99]) map into
cpupri (idx of pri_to_cpu[]) = [2..100]

Current mapping:

p->rt_priority p->prio newpri cpupri

-1 -1 (CPUPRI_INVALID)

100 0 (CPUPRI_NORMAL)

1 98 98 2
...
49 50 50 50
50 49 49 51
...
99 0 0 100

So cpupri = 1 isn't used.

Reduce the size of pri_to_cpu[] by 1 and adapt the cpupri
implementation accordingly. This will save a useless for loop with an
atomic_read in cpupri_find_fitness() calling __cpupri_find().

New mapping:

p->rt_priority p->prio newpri cpupri

-1 -1 (CPUPRI_INVALID)

100 0 (CPUPRI_NORMAL)

1 98 98 1
...
49 50 50 49
50 49 49 50
...
99 0 0 99

Signed-off-by: Dietmar Eggemann <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/sched/cpupri.c | 6 +++---
kernel/sched/cpupri.h | 4 ++--
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c
index a5d14ed..8d9952a 100644
--- a/kernel/sched/cpupri.c
+++ b/kernel/sched/cpupri.c
@@ -19,12 +19,12 @@
* in that class). Therefore a typical application without affinity
* restrictions can find a suitable CPU with O(1) complexity (e.g. two bit
* searches). For tasks with affinity restrictions, the algorithm has a
- * worst case complexity of O(min(101, nr_domcpus)), though the scenario that
+ * worst case complexity of O(min(100, nr_domcpus)), though the scenario that
* yields the worst case search is fairly contrived.
*/
#include "sched.h"

-/* Convert between a 140 based task->prio, and our 101 based cpupri */
+/* Convert between a 140 based task->prio, and our 100 based cpupri */
static int convert_prio(int prio)
{
int cpupri;
@@ -34,7 +34,7 @@ static int convert_prio(int prio)
else if (prio >= MAX_RT_PRIO)
cpupri = CPUPRI_NORMAL;
else
- cpupri = MAX_RT_PRIO - prio;
+ cpupri = MAX_RT_PRIO - prio - 1;

return cpupri;
}
diff --git a/kernel/sched/cpupri.h b/kernel/sched/cpupri.h
index 1a16236..e28e1ed 100644
--- a/kernel/sched/cpupri.h
+++ b/kernel/sched/cpupri.h
@@ -1,10 +1,10 @@
/* SPDX-License-Identifier: GPL-2.0 */

-#define CPUPRI_NR_PRIORITIES (MAX_RT_PRIO + 1)
+#define CPUPRI_NR_PRIORITIES MAX_RT_PRIO

#define CPUPRI_INVALID -1
#define CPUPRI_NORMAL 0
-/* values 2-100 are RT priorities 0-99 */
+/* values 1-99 are for RT1-RT99 priorities */

struct cpupri_vec {
atomic_t count;