Hi all,
I noticed some discrepancies between the schedule produced
by SCHED_DEADLINE and the expectations from real-time
scheduling theory. After some investigations, it turned out
that such discrepancies are due to two bugs in deadline.c,
which are particularly visible when using global scheduling
on multiple CPUs (see the two patches for more details).
I think the first bug (fixed in patch 0001) is particularly
critical, because it causes a violation of the SCHED_DEADLINE
guarantee (if the total load is smaller than the number of
CPUs, there is an upper bound for the response times. This is
a well known property for global EDF, but is not respected by
SCHED_DEADLINE - see patch 0001 for more details).
The second patch is IMHO also important, but less critical.
Luca Abeni (2):
Fix migration of SCHED_DEADLINE tasks
Avoid double-accounting in case of missed deadlines
kernel/sched/deadline.c | 25 ++++---------------------
1 file changed, 4 insertions(+), 21 deletions(-)
--
1.9.1
The dl_runtime_exceeded() function is supposed to ckeck if
a SCHED_DEADLINE task must be throttled, by checking if its
current runtime is <= 0. However, it also checks if the
scheduling deadline has been missed (the current time is
larger than the current scheduling deadline), further
decreasing the runtime if this happens.
This "double accounting" is wrong:
- In case of partitioned scheduling (or single CPU), this
happens if task_tick_dl() has been called later than expected
(due to small HZ values). In this case, the current runtime is
also negative, and replenish_dl_entity() can take care of the
deadline miss by recharging the current runtime to a value smaller
than dl_runtime
- In case of global scheduling on multiple CPUs, scheduling
deadlines can be missed even if the task did not consume more
runtime than expected, hence penalizing the task is wrong
This patch fix this problem by throttling a SCHED_DEADLINE task
only when its runtime becomes negative, and not modifying the runtime
Signed-off-by: Luca Abeni <[email protected]>
---
kernel/sched/deadline.c | 19 +------------------
1 file changed, 1 insertion(+), 18 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 55af498..b52092f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -570,24 +570,7 @@ void init_dl_task_timer(struct sched_dl_entity *dl_se)
static
int dl_runtime_exceeded(struct rq *rq, struct sched_dl_entity *dl_se)
{
- int dmiss = dl_time_before(dl_se->deadline, rq_clock(rq));
- int rorun = dl_se->runtime <= 0;
-
- if (!rorun && !dmiss)
- return 0;
-
- /*
- * If we are beyond our current deadline and we are still
- * executing, then we have already used some of the runtime of
- * the next instance. Thus, if we do not account that, we are
- * stealing bandwidth from the system at each deadline miss!
- */
- if (dmiss) {
- dl_se->runtime = rorun ? dl_se->runtime : 0;
- dl_se->runtime -= rq_clock(rq) - dl_se->deadline;
- }
-
- return 1;
+ return (dl_se->runtime <= 0);
}
extern bool sched_rt_bandwidth_account(struct rt_rq *rt_rq);
--
1.9.1
According to global EDF, tasks should be migrated between runqueues
without checking if their scheduling deadlines and runtimes are valid.
However, SCHED_DEADLINE currently performs such a check:
a migration happens doing
deactivate_task(rq, next_task, 0);
set_task_cpu(next_task, later_rq->cpu);
activate_task(later_rq, next_task, 0);
which ends up calling dequeue_task_dl(), setting the new CPU, and then
calling enqueue_task_dl().
enqueue_task_dl() then calls enqueue_dl_entity(), which calls
update_dl_entity(), which can modify scheduling deadline and runtime,
breaking global EDF scheduling.
As a result, some of the properties of global EDF are not respected:
for example, a taskset {(30, 80), (40, 80), (120, 170)} scheduled on
two cores can have unbounded response times for the third task even
if 30/80+40/80+120/170 = 1.5809 < 2
This can be fixed by invoking update_dl_entity() only in case of
wakeup, or if this is a new SCHED_DEADLINE task.
Signed-off-by: Luca Abeni <[email protected]>
---
kernel/sched/deadline.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index e5db8c6..55af498 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -826,10 +826,10 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se,
* parameters of the task might need updating. Otherwise,
* we want a replenishment of its runtime.
*/
- if (!dl_se->dl_new && flags & ENQUEUE_REPLENISH)
- replenish_dl_entity(dl_se, pi_se);
- else
+ if (dl_se->dl_new || flags & ENQUEUE_WAKEUP)
update_dl_entity(dl_se, pi_se);
+ else if (flags & ENQUEUE_REPLENISH)
+ replenish_dl_entity(dl_se, pi_se);
__enqueue_dl_entity(dl_se);
}
--
1.9.1
Hi Luca,
On 17/12/14 10:50, Luca Abeni wrote:
> Hi all,
>
> I noticed some discrepancies between the schedule produced
> by SCHED_DEADLINE and the expectations from real-time
> scheduling theory. After some investigations, it turned out
> that such discrepancies are due to two bugs in deadline.c,
> which are particularly visible when using global scheduling
> on multiple CPUs (see the two patches for more details).
>
> I think the first bug (fixed in patch 0001) is particularly
> critical, because it causes a violation of the SCHED_DEADLINE
> guarantee (if the total load is smaller than the number of
> CPUs, there is an upper bound for the response times. This is
> a well known property for global EDF, but is not respected by
> SCHED_DEADLINE - see patch 0001 for more details).
> The second patch is IMHO also important, but less critical.
>
I already reviewed and tested them. They looks ok and are important
fixes. ACK for both. :)
Thanks a lot!
- Juri
> Luca Abeni (2):
> Fix migration of SCHED_DEADLINE tasks
> Avoid double-accounting in case of missed deadlines
>
> kernel/sched/deadline.c | 25 ++++---------------------
> 1 file changed, 4 insertions(+), 21 deletions(-)
>