2017-03-23 10:33:32

by Byungchul Park

Subject: [PATCH 0/8] sched/deadline: Return the best satisfying affinity and dl in cpudl_find

cpudl_find() is used to find the cpu having the latest dl. The function
should return the latest-dl cpu among those satisfying the task's
affinity and dl constraint, but the current code gives up immediately
and returns failure as soon as the test fails *only with* the maximum cpu.

For example:

cpu 0 is running a task (dl: 10).
cpu 1 is running a task (dl: 9).
cpu 2 is running a task (dl: 8).
cpu 3 is running a task (dl: 2).

where cpu 3 wants to push a task (affinity is 1 2 3 and dl is 1).

In this case, the task should be migrated from cpu 3 to cpu 1 and
preempt cpu 1's task. However, the current code just returns failure
because the affinity test fails with the maximum cpu, that is, cpu 0.

This patch set keeps searching for the best cpu among those satisfying
the task's affinity and dl constraint, until it succeeds or there are
no more candidates to examine.
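
The intended selection can be sketched as a standalone C model (not
kernel code; find_best_cpu(), the bitmask affinity encoding, and the
numbers are illustrative only):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Standalone model of the desired cpudl_find() behaviour: scan all
 * cpus and return the one with the latest deadline that is both in
 * the task's affinity mask and running a task with a later dl than
 * the task being pushed.  Returns -1 if no cpu qualifies.
 */
static int find_best_cpu(const uint64_t *dl, int ncpus,
                         unsigned int affinity, uint64_t task_dl)
{
	int best = -1;

	for (int cpu = 0; cpu < ncpus; cpu++) {
		if (!(affinity & (1u << cpu)))
			continue;		/* affinity test */
		if (task_dl >= dl[cpu])
			continue;		/* dl constraint test */
		if (best == -1 || dl[cpu] > dl[best])
			best = cpu;		/* keep the latest-dl candidate */
	}
	return best;
}
```

With the deadlines from the example above (10, 9, 8, 2), affinity
{1, 2, 3} and a pushed dl of 1, this returns cpu 1 instead of failing
on cpu 0.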

Byungchul Park (8):
sched/deadline: Make find_later_rq() choose a closer cpu in topology
sched/deadline: Re-define parameters of cpudl heapify functions
sched/deadline: Add cpudl_maximum_dl() for clean-up
sched/deadline: Factor out the selecting of suitable max cpu
sched/deadline: Protect read of cpudl heap with a lock
sched/deadline: Don't return meaningless cpu in cpudl_maximum_cpu()
sched/deadline: Factor out the modifying of cpudl's heap tree
sched/deadline: Return the best satisfying affinity and dl in
cpudl_find

kernel/sched/cpudeadline.c | 228 +++++++++++++++++++++++++++++++--------------
kernel/sched/cpudeadline.h | 9 ++
kernel/sched/deadline.c | 29 +++---
3 files changed, 182 insertions(+), 84 deletions(-)

--
1.9.1


2017-03-23 10:33:42

by Byungchul Park

Subject: [PATCH 2/8] sched/deadline: Re-define parameters of cpudl heapify functions

If you don't like this clean-up patch, please let me know, and I will
exclude it from the next spin.

Thank you,
Byungchul

-----8<-----
From b303ccf6b64064ce22aa17d16a012dcc7904ad32 Mon Sep 17 00:00:00 2001
From: Byungchul Park <[email protected]>
Date: Wed, 22 Mar 2017 14:07:22 +0900
Subject: [PATCH 2/8] sched/deadline: Re-define parameters of cpudl heapify
functions

Currently, the whole 'struct cpudl' is passed as a parameter when
heapifying. However, only the elements array and its size are needed
for that work, so change the parameters accordingly.

Signed-off-by: Byungchul Park <[email protected]>
---
kernel/sched/cpudeadline.c | 63 ++++++++++++++++++++++------------------------
1 file changed, 30 insertions(+), 33 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index fba235c..7dd6d9d 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -31,14 +31,14 @@ static inline int right_child(int i)
return (i << 1) + 2;
}

-static void cpudl_heapify_down(struct cpudl *cp, int idx)
+static void cpudl_heapify_down(struct cpudl_item *e, int size, int idx)
{
int l, r, largest;

- int orig_cpu = cp->elements[idx].cpu;
- u64 orig_dl = cp->elements[idx].dl;
+ int orig_cpu = e[idx].cpu;
+ u64 orig_dl = e[idx].dl;

- if (left_child(idx) >= cp->size)
+ if (left_child(idx) >= size)
return;

/* adapted from lib/prio_heap.c */
@@ -49,63 +49,60 @@ static void cpudl_heapify_down(struct cpudl *cp, int idx)
largest = idx;
largest_dl = orig_dl;

- if ((l < cp->size) && dl_time_before(orig_dl,
- cp->elements[l].dl)) {
+ if ((l < size) && dl_time_before(orig_dl, e[l].dl)) {
largest = l;
- largest_dl = cp->elements[l].dl;
+ largest_dl = e[l].dl;
}
- if ((r < cp->size) && dl_time_before(largest_dl,
- cp->elements[r].dl))
+ if ((r < size) && dl_time_before(largest_dl, e[r].dl))
largest = r;

if (largest == idx)
break;

/* pull largest child onto idx */
- cp->elements[idx].cpu = cp->elements[largest].cpu;
- cp->elements[idx].dl = cp->elements[largest].dl;
- cp->elements[cp->elements[idx].cpu].idx = idx;
+ e[idx].cpu = e[largest].cpu;
+ e[idx].dl = e[largest].dl;
+ e[e[idx].cpu].idx = idx;
idx = largest;
}
/* actual push down of saved original values orig_* */
- cp->elements[idx].cpu = orig_cpu;
- cp->elements[idx].dl = orig_dl;
- cp->elements[cp->elements[idx].cpu].idx = idx;
+ e[idx].cpu = orig_cpu;
+ e[idx].dl = orig_dl;
+ e[e[idx].cpu].idx = idx;
}

-static void cpudl_heapify_up(struct cpudl *cp, int idx)
+static void cpudl_heapify_up(struct cpudl_item *e, int size, int idx)
{
int p;

- int orig_cpu = cp->elements[idx].cpu;
- u64 orig_dl = cp->elements[idx].dl;
+ int orig_cpu = e[idx].cpu;
+ u64 orig_dl = e[idx].dl;

if (idx == 0)
return;

do {
p = parent(idx);
- if (dl_time_before(orig_dl, cp->elements[p].dl))
+ if (dl_time_before(orig_dl, e[p].dl))
break;
/* pull parent onto idx */
- cp->elements[idx].cpu = cp->elements[p].cpu;
- cp->elements[idx].dl = cp->elements[p].dl;
- cp->elements[cp->elements[idx].cpu].idx = idx;
+ e[idx].cpu = e[p].cpu;
+ e[idx].dl = e[p].dl;
+ e[e[idx].cpu].idx = idx;
idx = p;
} while (idx != 0);
/* actual push up of saved original values orig_* */
- cp->elements[idx].cpu = orig_cpu;
- cp->elements[idx].dl = orig_dl;
- cp->elements[cp->elements[idx].cpu].idx = idx;
+ e[idx].cpu = orig_cpu;
+ e[idx].dl = orig_dl;
+ e[e[idx].cpu].idx = idx;
}

-static void cpudl_heapify(struct cpudl *cp, int idx)
+static void cpudl_heapify(struct cpudl_item *e, int size, int idx)
{
- if (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
- cp->elements[idx].dl))
- cpudl_heapify_up(cp, idx);
+ if (idx > 0 && dl_time_before(e[parent(idx)].dl, e[idx].dl))
+ cpudl_heapify_up(e, size, idx);
else
- cpudl_heapify_down(cp, idx);
+ cpudl_heapify_down(e, size, idx);
}

static inline int cpudl_maximum(struct cpudl *cp)
@@ -176,7 +173,7 @@ void cpudl_clear(struct cpudl *cp, int cpu)
cp->size--;
cp->elements[new_cpu].idx = old_idx;
cp->elements[cpu].idx = IDX_INVALID;
- cpudl_heapify(cp, old_idx);
+ cpudl_heapify(cp->elements, cp->size, old_idx);

cpumask_set_cpu(cpu, cp->free_cpus);
}
@@ -208,11 +205,11 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl)
cp->elements[new_idx].dl = dl;
cp->elements[new_idx].cpu = cpu;
cp->elements[cpu].idx = new_idx;
- cpudl_heapify_up(cp, new_idx);
+ cpudl_heapify_up(cp->elements, cp->size, new_idx);
cpumask_clear_cpu(cpu, cp->free_cpus);
} else {
cp->elements[old_idx].dl = dl;
- cpudl_heapify(cp, old_idx);
+ cpudl_heapify(cp->elements, cp->size, old_idx);
}

raw_spin_unlock_irqrestore(&cp->lock, flags);
--
1.9.1

2017-03-23 10:33:30

by Byungchul Park

Subject: [PATCH 1/8] sched/deadline: Make find_later_rq() choose a closer cpu in topology

When cpudl_find() returns an arbitrary cpu among free_cpus, that cpu
might not be closer than others in terms of the sched domain topology.
For example:

this_cpu: 15
free_cpus: 0, 1,..., 14 (== later_mask)
best_cpu: 0

topology:

0 --+
+--+
1 --+ |
+-- ... --+
2 --+ | |
+--+ |
3 --+ |

... ...

12 --+ |
+--+ |
13 --+ | |
+-- ... -+
14 --+ |
+--+
15 --+

In this case, it would be best to select cpu 14 since it is a free cpu
and the closest to cpu 15 (this_cpu). However, the current code selects
cpu 0 (best_cpu), which is just an arbitrary pick among free_cpus. Fix
it.
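
The intended walk can be modelled in plain C (a standalone sketch;
pick_closest(), the bitmask spans, and the 16-cpu layout are invented
to mirror the example above):

```c
#include <assert.h>

/*
 * Standalone model of the fixed find_later_rq() walk.  Domain spans
 * around this_cpu grow from the smallest (closest cpus) outward, so
 * the first level whose span intersects later_mask yields the
 * closest suitable cpu.  Masks are bitmasks over cpu numbers.
 */
static int pick_closest(unsigned int later_mask,
			const unsigned int *span, int nlevels)
{
	for (int lvl = 0; lvl < nlevels; lvl++) {
		unsigned int both = later_mask & span[lvl];

		if (both)			/* like cpumask_first_and() */
			return __builtin_ctz(both);
	}
	return -1;
}
```

For the 16-cpu example, with spans {14, 15}, {12..15}, {0..15} around
cpu 15 and later_mask covering cpus 0..14, the walk stops at the first
level and returns 14.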

Signed-off-by: Byungchul Park <[email protected]>
---
kernel/sched/deadline.c | 29 +++++++++++++++--------------
1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index a2ce590..49c93b9 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1324,7 +1324,7 @@ static int find_later_rq(struct task_struct *task)
struct sched_domain *sd;
struct cpumask *later_mask = this_cpu_cpumask_var_ptr(local_cpu_mask_dl);
int this_cpu = smp_processor_id();
- int best_cpu, cpu = task_cpu(task);
+ int cpu = task_cpu(task);

/* Make sure the mask is initialized first */
if (unlikely(!later_mask))
@@ -1337,17 +1337,14 @@ static int find_later_rq(struct task_struct *task)
* We have to consider system topology and task affinity
* first, then we can look for a suitable cpu.
*/
- best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
- task, later_mask);
- if (best_cpu == -1)
+ if (cpudl_find(&task_rq(task)->rd->cpudl, task, later_mask) == -1)
return -1;

/*
- * If we are here, some target has been found,
- * the most suitable of which is cached in best_cpu.
- * This is, among the runqueues where the current tasks
- * have later deadlines than the task's one, the rq
- * with the latest possible one.
+ * If we are here, some targets have been found, including
+ * the most suitable which is, among the runqueues where the
+ * current tasks have later deadlines than the task's one, the
+ * rq with the latest possible one.
*
* Now we check how well this matches with task's
* affinity and system topology.
@@ -1367,6 +1364,7 @@ static int find_later_rq(struct task_struct *task)
rcu_read_lock();
for_each_domain(cpu, sd) {
if (sd->flags & SD_WAKE_AFFINE) {
+ int closest_cpu;

/*
* If possible, preempting this_cpu is
@@ -1378,14 +1376,17 @@ static int find_later_rq(struct task_struct *task)
return this_cpu;
}

+ closest_cpu = cpumask_first_and(later_mask,
+ sched_domain_span(sd));
/*
- * Last chance: if best_cpu is valid and is
- * in the mask, that becomes our choice.
+ * Last chance: if a cpu being in both later_mask
+ * and current sd span is valid, that becomes our
+ * choice. Of course, the latest possible cpu is
+ * already under consideration through later_mask.
*/
- if (best_cpu < nr_cpu_ids &&
- cpumask_test_cpu(best_cpu, sched_domain_span(sd))) {
+ if (closest_cpu < nr_cpu_ids) {
rcu_read_unlock();
- return best_cpu;
+ return closest_cpu;
}
}
}
--
1.9.1

2017-03-23 10:33:28

by Byungchul Park

Subject: [PATCH 3/8] sched/deadline: Add cpudl_maximum_dl() for clean-up

The current code uses cpudl_maximum() to get the root node's cpu, while
it accesses the root node directly via 'cp->elements[0].dl' to get the
root node's dl. It would be better and more readable if a function to
get the dl were added as well. So add it.

Signed-off-by: Byungchul Park <[email protected]>
---
kernel/sched/cpudeadline.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 7dd6d9d..f1a6ce4 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -105,11 +105,16 @@ static void cpudl_heapify(struct cpudl_item *e, int size, int idx)
cpudl_heapify_down(e, size, idx);
}

-static inline int cpudl_maximum(struct cpudl *cp)
+static inline int cpudl_maximum_cpu(struct cpudl *cp)
{
return cp->elements[0].cpu;
}

+static inline u64 cpudl_maximum_dl(struct cpudl *cp)
+{
+ return cp->elements[0].dl;
+}
+
/*
* cpudl_find - find the best (later-dl) CPU in the system
* @cp: the cpudl max-heap context
@@ -128,9 +133,9 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p,
cpumask_and(later_mask, cp->free_cpus, &p->cpus_allowed)) {
best_cpu = cpumask_any(later_mask);
goto out;
- } else if (cpumask_test_cpu(cpudl_maximum(cp), &p->cpus_allowed) &&
- dl_time_before(dl_se->deadline, cp->elements[0].dl)) {
- best_cpu = cpudl_maximum(cp);
+ } else if (cpumask_test_cpu(cpudl_maximum_cpu(cp), &p->cpus_allowed) &&
+ dl_time_before(dl_se->deadline, cpudl_maximum_dl(cp))) {
+ best_cpu = cpudl_maximum_cpu(cp);
if (later_mask)
cpumask_set_cpu(best_cpu, later_mask);
}
--
1.9.1

2017-03-23 10:33:25

by Byungchul Park

Subject: [PATCH 4/8] sched/deadline: Factor out the selecting of suitable max cpu

Currently, the dl scheduler selects the cpu having the maximum dl when
pushing a task. On success this serves as a fast path, but it might
fail because of the task's affinity or dl value, so we need a slow path
for the failure case. As a first step toward adding that slow path,
factor out the selection into a helper function, cpudl_fast_find().

Signed-off-by: Byungchul Park <[email protected]>
---
kernel/sched/cpudeadline.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index f1a6ce4..f03479c 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -115,6 +115,19 @@ static inline u64 cpudl_maximum_dl(struct cpudl *cp)
return cp->elements[0].dl;
}

+static int cpudl_fast_find(struct cpudl *cp, struct task_struct *p)
+{
+ const struct sched_dl_entity *dl_se = &p->dl;
+ int max_cpu = cpudl_maximum_cpu(cp);
+ u64 max_dl = cpudl_maximum_dl(cp);
+
+ if (cpumask_test_cpu(max_cpu, &p->cpus_allowed) &&
+ dl_time_before(dl_se->deadline, max_dl))
+ return max_cpu;
+
+ return -1;
+}
+
/*
* cpudl_find - find the best (later-dl) CPU in the system
* @cp: the cpudl max-heap context
@@ -127,16 +140,14 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p,
struct cpumask *later_mask)
{
int best_cpu = -1;
- const struct sched_dl_entity *dl_se = &p->dl;

if (later_mask &&
cpumask_and(later_mask, cp->free_cpus, &p->cpus_allowed)) {
best_cpu = cpumask_any(later_mask);
goto out;
- } else if (cpumask_test_cpu(cpudl_maximum_cpu(cp), &p->cpus_allowed) &&
- dl_time_before(dl_se->deadline, cpudl_maximum_dl(cp))) {
- best_cpu = cpudl_maximum_cpu(cp);
- if (later_mask)
+ } else {
+ best_cpu = cpudl_fast_find(cp, p);
+ if (best_cpu != -1 && later_mask)
cpumask_set_cpu(best_cpu, later_mask);
}

--
1.9.1

2017-03-23 10:35:39

by Byungchul Park

Subject: [PATCH 6/8] sched/deadline: Don't return meaningless cpu in cpudl_maximum_cpu()

When the heap tree is empty, cp->elements[0].cpu holds a meaningless
value. We need to handle that case.

Signed-off-by: Byungchul Park <[email protected]>
---
kernel/sched/cpudeadline.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 37bbb66..21404b8 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -107,7 +107,8 @@ static void cpudl_heapify(struct cpudl_item *e, int size, int idx)

static inline int cpudl_maximum_cpu(struct cpudl *cp)
{
- return cp->elements[0].cpu;
+ int cpu = cp->elements[0].cpu;
+ return cp->elements[cpu].idx == IDX_INVALID ? -1 : cpu;
}

static inline u64 cpudl_maximum_dl(struct cpudl *cp)
@@ -127,7 +128,8 @@ static int cpudl_fast_find(struct cpudl *cp, struct task_struct *p)
max_dl = cpudl_maximum_dl(cp);
raw_spin_unlock_irqrestore(&cp->lock, flags);

- if (cpumask_test_cpu(max_cpu, &p->cpus_allowed) &&
+ if (max_cpu != -1 &&
+ cpumask_test_cpu(max_cpu, &p->cpus_allowed) &&
dl_time_before(dl_se->deadline, max_dl))
return max_cpu;

--
1.9.1

2017-03-23 10:36:01

by Byungchul Park

Subject: [PATCH 5/8] sched/deadline: Protect read of cpudl heap with a lock

The current code reads cp->elements[0].cpu and cp->elements[0].dl
without acquiring cpudl's lock. There are two problems with this:

1. When we read elements[0].dl, the value can be torn on 32-bit
machines because elements[0].dl is 64-bit data. We should guarantee
that the read is done atomically.

2. Stale data can be read unless we synchronize with updaters:

updater1 updater2 reader
-------- -------- ------
lock A
set maxcpu = cpu1
unlock A
lock A
set maxcpu = cpu2
unlock A
read maxcpu, it might be -1

where maxcpu was -1 initially.

When reading maxcpu, the value might be -1, although we expect it to be
cpu2 at that point. So force readers to be protected by the lock as
well.
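
The torn/mixed-read hazard and the fix can be demonstrated with a small
userspace model (a sketch, not kernel code; the pthread mutex stands in
for cp->lock and the invariant dl == cpu * 1000 stands in for "the
cpu/dl pair is consistent"):

```c
#include <pthread.h>
#include <stdint.h>

/*
 * Userspace model of the locked read in cpudl_fast_find(): the
 * maximum cpu and its dl are read under one lock, so a reader can
 * never observe a torn 64-bit dl or a cpu/dl pair mixed from two
 * different updates.
 */
static struct {
	pthread_mutex_t lock;
	int max_cpu;
	uint64_t max_dl;
} cp_model = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };

static int violations;	/* written by the reader only, read after join */

static void *updater(void *unused)
{
	for (int i = 0; i < 100000; i++) {
		pthread_mutex_lock(&cp_model.lock);
		cp_model.max_cpu = i % 4;
		cp_model.max_dl = (uint64_t)(i % 4) * 1000;
		pthread_mutex_unlock(&cp_model.lock);
	}
	return NULL;
}

static void *reader(void *unused)
{
	for (int i = 0; i < 100000; i++) {
		pthread_mutex_lock(&cp_model.lock);
		int cpu = cp_model.max_cpu;
		uint64_t dl = cp_model.max_dl;
		pthread_mutex_unlock(&cp_model.lock);

		if (dl != (uint64_t)cpu * 1000)
			violations++;	/* a torn or mixed read happened */
	}
	return NULL;
}
```

With both accesses under the lock, the reader never counts a violation;
dropping the reader's lock acquisitions reintroduces the race this
patch closes.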

Signed-off-by: Byungchul Park <[email protected]>
---
kernel/sched/cpudeadline.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index f03479c..37bbb66 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -118,8 +118,14 @@ static inline u64 cpudl_maximum_dl(struct cpudl *cp)
static int cpudl_fast_find(struct cpudl *cp, struct task_struct *p)
{
const struct sched_dl_entity *dl_se = &p->dl;
- int max_cpu = cpudl_maximum_cpu(cp);
- u64 max_dl = cpudl_maximum_dl(cp);
+ unsigned long flags;
+ int max_cpu;
+ u64 max_dl;
+
+ raw_spin_lock_irqsave(&cp->lock, flags);
+ max_cpu = cpudl_maximum_cpu(cp);
+ max_dl = cpudl_maximum_dl(cp);
+ raw_spin_unlock_irqrestore(&cp->lock, flags);

if (cpumask_test_cpu(max_cpu, &p->cpus_allowed) &&
dl_time_before(dl_se->deadline, max_dl))
--
1.9.1

2017-03-23 10:35:59

by Byungchul Park

Subject: [PATCH 7/8] sched/deadline: Factor out the modifying of cpudl's heap tree

Currently, cpudl_{set,clear}() are responsible for manipulating both
cpudl's heap tree and the free_cpus mask under lock protection.
However, the operations manipulating the heap tree itself are reusable
on their own.

In particular, they are useful when picking the second maximum node
from the tree, which requires manipulating only the tree, not the
free_cpus mask.

Signed-off-by: Byungchul Park <[email protected]>
---
kernel/sched/cpudeadline.c | 95 ++++++++++++++++++++++++++++++----------------
1 file changed, 62 insertions(+), 33 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 21404b8..453159a 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -116,6 +116,64 @@ static inline u64 cpudl_maximum_dl(struct cpudl *cp)
return cp->elements[0].dl;
}

+/*
+ * __cpudl_clear - remove a cpu from the cpudl max-heap
+ * @cp: the cpudl max-heap context
+ * @cpu: the target cpu
+ *
+ * Notes: assumes cpu_rq(cpu)->lock and cpudl->lock are locked
+ *
+ * Returns: (void)
+ */
+static void __cpudl_clear(struct cpudl *cp, int cpu)
+{
+ int old_idx, new_cpu;
+
+ old_idx = cp->elements[cpu].idx;
+ if (old_idx == IDX_INVALID) {
+ /*
+ * Nothing to remove if old_idx was invalid.
+ * This could happen if a rq_offline_dl is
+ * called for a CPU without -dl tasks running.
+ */
+ } else {
+ new_cpu = cp->elements[cp->size - 1].cpu;
+ cp->elements[old_idx].dl = cp->elements[cp->size - 1].dl;
+ cp->elements[old_idx].cpu = new_cpu;
+ cp->size--;
+ cp->elements[new_cpu].idx = old_idx;
+ cp->elements[cpu].idx = IDX_INVALID;
+ cpudl_heapify(cp->elements, cp->size, old_idx);
+ }
+}
+
+/*
+ * __cpudl_set - update the cpudl max-heap
+ * @cp: the cpudl max-heap context
+ * @cpu: the target cpu
+ * @dl: the new earliest deadline for this cpu
+ *
+ * Notes: assumes cpu_rq(cpu)->lock and cpudl->lock are locked
+ *
+ * Returns: (void)
+ */
+static void __cpudl_set(struct cpudl *cp, int cpu, u64 dl)
+{
+ int old_idx;
+
+ old_idx = cp->elements[cpu].idx;
+ if (old_idx == IDX_INVALID) {
+ int new_idx = cp->size++;
+ cp->elements[new_idx].dl = dl;
+ cp->elements[new_idx].cpu = cpu;
+ cp->elements[cpu].idx = new_idx;
+ cpudl_heapify_up(cp->elements, cp->size, new_idx);
+ } else {
+ cp->elements[old_idx].dl = dl;
+ cpudl_heapify(cp->elements, cp->size, old_idx);
+ }
+}
+
static int cpudl_fast_find(struct cpudl *cp, struct task_struct *p)
{
const struct sched_dl_entity *dl_se = &p->dl;
@@ -176,31 +234,14 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p,
*/
void cpudl_clear(struct cpudl *cp, int cpu)
{
- int old_idx, new_cpu;
unsigned long flags;

WARN_ON(!cpu_present(cpu));

raw_spin_lock_irqsave(&cp->lock, flags);
-
- old_idx = cp->elements[cpu].idx;
- if (old_idx == IDX_INVALID) {
- /*
- * Nothing to remove if old_idx was invalid.
- * This could happen if a rq_offline_dl is
- * called for a CPU without -dl tasks running.
- */
- } else {
- new_cpu = cp->elements[cp->size - 1].cpu;
- cp->elements[old_idx].dl = cp->elements[cp->size - 1].dl;
- cp->elements[old_idx].cpu = new_cpu;
- cp->size--;
- cp->elements[new_cpu].idx = old_idx;
- cp->elements[cpu].idx = IDX_INVALID;
- cpudl_heapify(cp->elements, cp->size, old_idx);
-
+ __cpudl_clear(cp, cpu);
+ if (cp->elements[cpu].idx != IDX_INVALID)
cpumask_set_cpu(cpu, cp->free_cpus);
- }
raw_spin_unlock_irqrestore(&cp->lock, flags);
}

@@ -216,26 +257,14 @@ void cpudl_clear(struct cpudl *cp, int cpu)
*/
void cpudl_set(struct cpudl *cp, int cpu, u64 dl)
{
- int old_idx;
unsigned long flags;

WARN_ON(!cpu_present(cpu));

raw_spin_lock_irqsave(&cp->lock, flags);
-
- old_idx = cp->elements[cpu].idx;
- if (old_idx == IDX_INVALID) {
- int new_idx = cp->size++;
- cp->elements[new_idx].dl = dl;
- cp->elements[new_idx].cpu = cpu;
- cp->elements[cpu].idx = new_idx;
- cpudl_heapify_up(cp->elements, cp->size, new_idx);
+ __cpudl_set(cp, cpu, dl);
+ if (cp->elements[cpu].idx == IDX_INVALID)
cpumask_clear_cpu(cpu, cp->free_cpus);
- } else {
- cp->elements[old_idx].dl = dl;
- cpudl_heapify(cp->elements, cp->size, old_idx);
- }
-
raw_spin_unlock_irqrestore(&cp->lock, flags);
}

--
1.9.1

2017-03-23 10:35:57

by Byungchul Park

Subject: [PATCH 8/8] sched/deadline: Return the best satisfying affinity and dl in cpudl_find

cpudl_find() is used to find the cpu having the latest dl. The function
should return the latest-dl cpu among those satisfying the task's
affinity and dl constraint, but the current code gives up immediately
and returns failure as soon as the test fails *only with* the maximum cpu.

For example:

cpu 0 is running a task (dl: 10).
cpu 1 is running a task (dl: 9).
cpu 2 is running a task (dl: 8).
cpu 3 is running a task (dl: 2).

where cpu 3 wants to push a task (affinity is 1 2 3 and dl is 1).

In this case, the task should be migrated from cpu 3 to cpu 1 and
preempt cpu 1's task. However, the current code just returns failure
because the affinity test fails with the maximum cpu, that is, cpu 0.

This patch keeps searching for the best cpu among those satisfying the
task's affinity and dl constraint, until it succeeds or there are no
more candidates to examine.
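
The pop-and-restore idea can be sketched with a toy userspace max-heap
(a standalone model mirroring the cover-letter scenario, not the kernel
implementation; the kernel patch chains popped nodes through
restore_cpu/restore_dl instead of a side array):

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS 4

/* Toy max-heap of (cpu, dl) pairs keyed on dl, mirroring cpudl's heap. */
struct item { int cpu; uint64_t dl; };

static void sift_down(struct item *h, int size, int i)
{
	for (;;) {
		int l = 2 * i + 1, r = 2 * i + 2, largest = i;

		if (l < size && h[l].dl > h[largest].dl) largest = l;
		if (r < size && h[r].dl > h[largest].dl) largest = r;
		if (largest == i) break;
		struct item tmp = h[i]; h[i] = h[largest]; h[largest] = tmp;
		i = largest;
	}
}

static void heap_push(struct item *h, int *size, struct item it)
{
	int i = (*size)++;

	h[i] = it;
	while (i > 0 && h[(i - 1) / 2].dl < h[i].dl) {
		struct item tmp = h[i]; h[i] = h[(i - 1) / 2];
		h[(i - 1) / 2] = tmp;
		i = (i - 1) / 2;
	}
}

/*
 * Model of cpudl_slow_find(): pop maxima that fail the affinity or
 * dl test, remember them, then push them back so the heap ends up
 * with its original contents.
 */
static int slow_find(struct item *h, int size,
		     unsigned int affinity, uint64_t task_dl)
{
	struct item popped[NCPUS];
	int npopped = 0, found = -1;

	while (size > 0) {
		struct item max = h[0];

		if ((affinity & (1u << max.cpu)) && task_dl < max.dl) {
			found = max.cpu;	/* satisfies both tests */
			break;
		}
		popped[npopped++] = max;	/* keep for the restore step */
		h[0] = h[--size];
		sift_down(h, size, 0);
	}

	while (npopped > 0)			/* restore the heap */
		heap_push(h, &size, popped[--npopped]);

	return found;
}
```

With the cover-letter deadlines (cpu 0: 10, cpu 1: 9, cpu 2: 8,
cpu 3: 2), affinity {1, 2, 3} and a pushed dl of 1, cpu 0 is popped on
the affinity failure, cpu 1 is returned, and the restore step puts
cpu 0 (dl 10) back at the root.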

Signed-off-by: Byungchul Park <[email protected]>
---
kernel/sched/cpudeadline.c | 38 ++++++++++++++++++++++++++++++++++++++
kernel/sched/cpudeadline.h | 9 +++++++++
2 files changed, 47 insertions(+)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 453159a..9172646 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -174,6 +174,42 @@ static void __cpudl_set(struct cpudl *cp, int cpu, u64 dl)
}
}

+static int cpudl_slow_find(struct cpudl *cp, struct task_struct *p)
+{
+ const struct sched_dl_entity *dl_se = &p->dl;
+ unsigned long flags;
+ int prev_cpu = -1;
+ int max_cpu;
+ u64 max_dl;
+
+ raw_spin_lock_irqsave(&cp->lock, flags);
+ max_cpu = cpudl_maximum_cpu(cp);
+ max_dl = cpudl_maximum_dl(cp);
+
+ while (max_cpu != -1) {
+ if (cpumask_test_cpu(max_cpu, &p->cpus_allowed) &&
+ dl_time_before(dl_se->deadline, max_dl))
+ break;
+
+ /* Pick up the next. */
+ cp->elements[max_cpu].restore_cpu = prev_cpu;
+ cp->elements[max_cpu].restore_dl = max_dl;
+ prev_cpu = max_cpu;
+ __cpudl_clear(cp, max_cpu);
+ max_cpu = cpudl_maximum_cpu(cp);
+ max_dl = cpudl_maximum_dl(cp);
+ }
+
+ /* Restore the heap tree */
+ while (prev_cpu != -1) {
+ __cpudl_set(cp, prev_cpu, cp->elements[prev_cpu].restore_dl);
+ prev_cpu = cp->elements[prev_cpu].restore_cpu;
+ }
+
+ raw_spin_unlock_irqrestore(&cp->lock, flags);
+ return max_cpu;
+}
+
static int cpudl_fast_find(struct cpudl *cp, struct task_struct *p)
{
const struct sched_dl_entity *dl_se = &p->dl;
@@ -213,6 +249,8 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p,
goto out;
} else {
best_cpu = cpudl_fast_find(cp, p);
+ if (best_cpu == -1)
+ best_cpu = cpudl_slow_find(cp, p);
if (best_cpu != -1 && later_mask)
cpumask_set_cpu(best_cpu, later_mask);
}
diff --git a/kernel/sched/cpudeadline.h b/kernel/sched/cpudeadline.h
index f7da8c5..736ff89 100644
--- a/kernel/sched/cpudeadline.h
+++ b/kernel/sched/cpudeadline.h
@@ -10,6 +10,15 @@ struct cpudl_item {
u64 dl;
int cpu;
int idx;
+ /*
+ * cpudl_slow_find() needs to pop one by one from
+ * the heap tree until it eventually finds suitable
+ * cpu considering task's affinity. After that,
+ * we need to restore the tree to original state,
+ * using the following restore_cpu and restore_dl.
+ */
+ int restore_cpu;
+ u64 restore_dl;
};

struct cpudl {
--
1.9.1

2017-03-23 22:35:42

by Byungchul Park

Subject: Re: [PATCH 0/8] sched/deadline: Return the best satisfying affinity and dl in cpudl_find

On Thu, Mar 23, 2017 at 07:32:35PM +0900, Byungchul Park wrote:
> cpudl_find() is used to find a cpu having the latest dl. The function
> should return the latest cpu among ones satisfying task's affinity and
> dl constraint, but current code gives up immediately and just return
> fail when it fails at the test *only with* the maximum cpu.
>
> For example:
>
> cpu 0 is running a task (dl: 10).
> cpu 1 is running a task (dl: 9).
> cpu 2 is running a task (dl: 8).
> cpu 3 is running a task (dl: 2).
(the "dl: 2" above should be "dl: 1")
>
> where cpu 3 want to push a task (affinity is 1 2 3 and dl is 1).
(the "dl is 1" above should be "dl is 2")
>
> In this case, the task should be migrated from cpu 3 to cpu 1, and
> preempt cpu 1's task. However, current code just returns fail because
> it fails at the affinity test with the maximum cpu, that is, cpu 0.
>
> This patch set tries to find the best among ones satisfying task's
> affinity and dl constraint until success or no more to see.
>
> Byungchul Park (8):
> sched/deadline: Make find_later_rq() choose a closer cpu in topology
> sched/deadline: Re-define parameters of cpudl heapify functions
> sched/deadline: Add cpudl_maximum_dl() for clean-up
> sched/deadline: Factor out the selecting of suitable max cpu
> sched/deadline: Protect read of cpudl heap with a lock
> sched/deadline: Don't return meaningless cpu in cpudl_maximum_cpu()
> sched/deadline: Factor out the modifying of cpudl's heap tree
> sched/deadline: Return the best satisfying affinity and dl in
> cpudl_find
>
> kernel/sched/cpudeadline.c | 228 +++++++++++++++++++++++++++++++--------------
> kernel/sched/cpudeadline.h | 9 ++
> kernel/sched/deadline.c | 29 +++---
> 3 files changed, 182 insertions(+), 84 deletions(-)
>
> --
> 1.9.1

2017-03-23 22:37:02

by Byungchul Park

Subject: Re: [PATCH 8/8] sched/deadline: Return the best satisfying affinity and dl in cpudl_find

On Thu, Mar 23, 2017 at 07:32:43PM +0900, Byungchul Park wrote:
> cpudl_find() is used to find a cpu having the latest dl. The function
> should return the latest cpu among ones satisfying task's affinity and
> dl constraint, but current code gives up immediately and just return
> fail when it fails at the test *only with* the maximum cpu.
>
> For example:
>
> cpu 0 is running a task (dl: 10).
> cpu 1 is running a task (dl: 9).
> cpu 2 is running a task (dl: 8).
> cpu 3 is running a task (dl: 2).
(the "dl: 2" above should be "dl: 1")
>
> where cpu 3 want to push a task (affinity is 1 2 3 and dl is 1).
(the "dl is 1" above should be "dl is 2")
>
> In this case, the task should be migrated from cpu 3 to cpu 1, and
> preempt cpu 1's task. However, current code just returns fail because
> it fails at the affinity test with the maximum cpu, that is, cpu 0.
>
> This patch tries to find the best among ones satisfying task's affinity
> and dl constraint until success or no more to see.

2017-03-27 14:07:13

by Juri Lelli

Subject: Re: [PATCH 0/8] sched/deadline: Return the best satisfying affinity and dl in cpudl_find

Hi,

On 23/03/17 19:32, Byungchul Park wrote:
> cpudl_find() is used to find a cpu having the latest dl. The function
> should return the latest cpu among ones satisfying task's affinity and
> dl constraint, but current code gives up immediately and just return
> fail when it fails at the test *only with* the maximum cpu.
>
> For example:
>
> cpu 0 is running a task (dl: 10).
> cpu 1 is running a task (dl: 9).
> cpu 2 is running a task (dl: 8).
> cpu 3 is running a task (dl: 2).
>
> where cpu 3 want to push a task (affinity is 1 2 3 and dl is 1).

Hummm, but this should only happen if you disable admission control,
right? Otherwise the task's affinity can't be smaller than 0-3.

>
> In this case, the task should be migrated from cpu 3 to cpu 1, and
> preempt cpu 1's task. However, current code just returns fail because
> it fails at the affinity test with the maximum cpu, that is, cpu 0.
>
> This patch set tries to find the best among ones satisfying task's
> affinity and dl constraint until success or no more to see.
>

Anyway, do you have numbers showing how common your failure scenario
is? It would be interesting to understand how much the slow path would
actually be used, IMHO.

Thanks,

- Juri

2017-03-28 00:43:58

by Byungchul Park

Subject: Re: [PATCH 0/8] sched/deadline: Return the best satisfying affinity and dl in cpudl_find

On Mon, Mar 27, 2017 at 03:05:07PM +0100, Juri Lelli wrote:
> Hi,
>
> On 23/03/17 19:32, Byungchul Park wrote:
> > cpudl_find() is used to find a cpu having the latest dl. The function
> > should return the latest cpu among ones satisfying task's affinity and
> > dl constraint, but current code gives up immediately and just return
> > fail when it fails at the test *only with* the maximum cpu.
> >
> > For example:
> >
> > cpu 0 is running a task (dl: 10).
> > cpu 1 is running a task (dl: 9).
> > cpu 2 is running a task (dl: 8).
> > cpu 3 is running a task (dl: 2).
> >
> > where cpu 3 want to push a task (affinity is 1 2 3 and dl is 1).
>
> Hummm, but this should only happen if you disable admission control,
> right? Otherwise task's affinity can't be smaller that 0-3.

Hi Juri,

Can I ask you what admission control is? Do you mean affinity setting?
And do you mean s/disable/enable? Or am I misunderstanding?

> >
> > In this case, the task should be migrated from cpu 3 to cpu 1, and
> > preempt cpu 1's task. However, current code just returns fail because
> > it fails at the affinity test with the maximum cpu, that is, cpu 0.
> >
> > This patch set tries to find the best among ones satisfying task's
> > affinity and dl constraint until success or no more to see.
> >
>
> Anyway, do you have numbers showing how common is you fail scenario?

Actually, it depends very much on how the test environment is set up. I
can provide you with setups that generate many failures. IMHO, it's not
a matter of frequency but a matter of whether it works correctly. As
you know, the rt policy already works correctly regarding this problem.

In other words, if there are dl tasks in a system like:

task a (dl: 1) -+ -+
task b (dl: 2) -| -|
task c (dl: 3) -| -|
task d (dl: 4) -| -+- should be run on 4 cpus machine
task e (dl: 5) -|
task f (dl: 6) -|
task g (dl: 7) -|
task h (dl: 8) -+- should be run on 8 cpus machine
task i (dl: 9)
task j (dl: 10)

IMHO, the deadline scheduler should ensure that the most urgent tasks,
as many as the number of cpus in the system, get to run, as long as
their affinities are satisfied. What do you think about this?

Thanks,
Byungchul

> It would be interesting to understand how much the slow path is actually
> used, IMHO.
>
> Thanks,
>
> - Juri

2017-03-28 07:11:52

by Juri Lelli

Subject: Re: [PATCH 0/8] sched/deadline: Return the best satisfying affinity and dl in cpudl_find

On 28/03/17 09:42, Byungchul Park wrote:
> On Mon, Mar 27, 2017 at 03:05:07PM +0100, Juri Lelli wrote:
> > Hi,
> >
> > On 23/03/17 19:32, Byungchul Park wrote:
> > > cpudl_find() is used to find a cpu having the latest dl. The function
> > > should return the latest cpu among ones satisfying task's affinity and
> > > dl constraint, but current code gives up immediately and just return
> > > fail when it fails at the test *only with* the maximum cpu.
> > >
> > > For example:
> > >
> > > cpu 0 is running a task (dl: 10).
> > > cpu 1 is running a task (dl: 9).
> > > cpu 2 is running a task (dl: 8).
> > > cpu 3 is running a task (dl: 2).
> > >
> > > where cpu 3 want to push a task (affinity is 1 2 3 and dl is 1).
> >
> > Hummm, but this should only happen if you disable admission control,
> > right? Otherwise task's affinity can't be smaller that 0-3.
>
> Hi Juri,
>
> Can I ask you what is addmission control? Do you mean affinity setting?

sched_setattr() for DEADLINE tasks performs a set of checks before
admitting the task to the system. Please have a look at
Documentation/scheduler/sched-deadline.txt, Section 5, for what
concerns affinity.

> And do you mean s/disable/enable? Or am I misunderstanding?
>

No, I meant disable. If you disable admission control, the problem you
are pointing out can happen; if admission control is enabled, it can't,
as we enforce that tasks have an affinity equal to the span of the
root_domain to which they belong. E.g., in your case the task will have
its affinity set to 0-3 (or it won't be admitted to the system), so the
problem would go away.

> > >
> > > In this case, the task should be migrated from cpu 3 to cpu 1, and
> > > preempt cpu 1's task. However, current code just returns fail because
> > > it fails at the affinity test with the maximum cpu, that is, cpu 0.
> > >
> > > This patch set tries to find the best among the ones satisfying the
> > > task's affinity and dl constraint, until it succeeds or there are no
> > > more to check.
> > >
> >
> > Anyway, do you have numbers showing how common your fail scenario is?
>
> Actually, it depends very much on how the test environment is set up. I can
> provide you ones which generate many fails. IMHO, it's not a matter of
> frequency but a matter of whether it works correctly. As you know, the rt
> policy already works correctly regarding this problem.
>

Right. But, my point is that if what you are highlighting turns out to
be a pretty frequent situation, maybe we need to find a better data
structure to speed up push operations or we will end up using the slow
path most of the time, making the heap useless.

> In other words, if there are dl tasks in a system like:
>
> task a (dl: 1) -+ -+
> task b (dl: 2) -| -|
> task c (dl: 3) -| -|
> task d (dl: 4) -| -+- should be run on 4 cpus machine
> task e (dl: 5) -|
> task f (dl: 6) -|
> task g (dl: 7) -|
> task h (dl: 8) -+- should be run on 8 cpus machine
> task i (dl: 9)
> task j (dl: 10)
>
> IMHO, the deadline scheduler should ensure that the most urgent tasks, up
> to the number of cpus in the system, are run, as long as their affinities
> are satisfied. What do you think about this?
>

Correct. But please read above for what regards affinities.

2017-03-28 07:30:00

by Byungchul Park

[permalink] [raw]
Subject: Re: [PATCH 0/8] sched/deadline: Return the best satisfying affinity and dl in cpudl_find

On Tue, Mar 28, 2017 at 08:11:53AM +0100, Juri Lelli wrote:
> > > > For example:
> > > >
> > > > cpu 0 is running a task (dl: 10).
> > > > cpu 1 is running a task (dl: 9).
> > > > cpu 2 is running a task (dl: 8).
> > > > cpu 3 is running a task (dl: 2).
> > > >
> > > > where cpu 3 wants to push a task (affinity is 1 2 3 and dl is 1).
> > >
> > > Hummm, but this should only happen if you disable admission control,
> > > right? Otherwise the task's affinity can't be smaller than 0-3.
> >
> > Hi Juri,
> >
> > Can I ask you what admission control is? Do you mean affinity setting?
>
> sched_setattr() for DEADLINE tasks performs a set of checks before
> admitting the task to the system. Please have a look at Documentation/
> scheduler/sched-deadline.txt::Section5 for what concerns affinity.

I see.

> > And do you mean s/disable/enable? Or am I misunderstanding?
> >
>
> No, I meant disable. The problem you are pointing out can happen if you
> disable admission control; if admission control is enabled it can't, as
> we enforce that tasks have affinity equal to the root_domain span to
> which they belong. E.g., in your case the task will have affinity set to
> 0-3 (or it won't be able to enter the system), so that would make the
> problem go away.

I see.

> > > > In this case, the task should be migrated from cpu 3 to cpu 1, and
> > > > preempt cpu 1's task. However, current code just returns fail because
> > > > it fails at the affinity test with the maximum cpu, that is, cpu 0.
> > > >
> > > > This patch set tries to find the best among the ones satisfying the
> > > > task's affinity and dl constraint, until it succeeds or there are no
> > > > more to check.
> > > >
> > >
> > > Anyway, do you have numbers showing how common your fail scenario is?
> >
> > Actually, it depends very much on how the test environment is set up. I can
> > provide you ones which generate many fails. IMHO, it's not a matter of
> > frequency but a matter of whether it works correctly. As you know, the rt
> > policy already works correctly regarding this problem.
> >
>
> Right. But, my point is that if what you are highlighting turns out to
> be a pretty frequent situation, maybe we need to find a better data
> structure to speed up push operations or we will end up using the slow
> path most of the time, making the heap useless.

I totally agree with you. I will check it and let you know.

Thank you,
Byungchul