Subject: [PATCH V4 0/3] sched/deadline: Fixes for constrained deadline tasks

While reading sched deadline code, I find out that a constrained
deadline task could be replenished before the next period if
activated after the deadline, opening the window to run for more
than Q/P. The patch [2] explains and fixes this problem.

Furthermore, while fixing this issue, I found that the replenishment
timer was being fired at the deadline of the task. This works fine
for implicit deadline tasks (deadline == period) because the deadline
is at the same point in time of the next period. But that is not true
for constrained deadline tasks (deadline < period). This problem is
not as visible as the first because the runtime leakage takes
place only in the second activation. Next activations receive the
correct bandwidth. However, after the 2nd activation, tasks are
activated in the (period - dl_deadline) instant, which is before
the expected activation. This problem is explained in the fix
description as well.

While testing these fixes, Rostedt tweaked the test case a little.
Instead of having the runtime equal to the deadline, he increased
the deadline ten fold. Then, the task started using much more than
.1% of the CPU. More like 20%. Looking into this he found that it
was due to the dl_entity_overflow() constantly returning true. That's
because it uses the relative period against relative runtime vs the
absolute deadline against absolute runtime. As we care about if the
runtime can make its deadline, not its period, we need to use the
task's density in the check, not the task's utilization. After
correcting this, now when the task gets enqueued, it can throttle
correctly.

Changes from V3:
- Fixes grammar errors in the patch 2/3. (Steven Rostedt)
- I was checking if the pi_se was constrained, not the task being
awakened.
This was not causing problems in the test case because
pi_se = &p->dl, but this would be a problem if we were activating
the task in a PI case:
It would check the pi-waiter, not the task being awakened (p).
Changes from V2:
- Fixes dl_entity_overflow(): (Steven Rostedt)
Patch 3/3 fixes the dl_entity_overflow() for constrained deadline
tasks by using the density, not the utilization.
(as deadline <= period, deadline is always == min(deadline, period))
Changes from V1:
- Fix a broken comment style. (Peter Zijlstra)
- Fixes dl_is_constrained(). (Steven Rostedt)
A constrained deadline task has dl_deadline < dl_period; so
"dl_runtime < dl_period"; s/runtime/deadline/

Daniel Bristot de Oliveira (2):
sched/deadline: Replenishment timer should fire in the next period
sched/deadline: Throttle a constrained deadline task activated after
the deadline

Steven Rostedt (VMware) (1):
sched/deadline: Use deadline instead of period when calculating
overflow

kernel/sched/deadline.c | 62 ++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 56 insertions(+), 6 deletions(-)

--
2.9.3


Subject: [PATCH V4 1/3] sched/deadline: Replenishment timer should fire in the next period

Currently, the replenishment timer is set to fire at the deadline
of a task. Although that works for implicit deadline tasks because the
deadline is equals to the begin of the next period, that is not correct
for constrained deadline tasks (deadline < period).

For instance:

f.c:
--------------- %< ---------------
int main (void)
{
for(;;);
}
--------------- >% ---------------

# gcc -o f f.c

# trace-cmd record -e sched:sched_switch \
-e syscalls:sys_exit_sched_setattr \
chrt -d --sched-runtime 490000000 \
--sched-deadline 500000000 \
--sched-period 1000000000 0 ./f

# trace-cmd report | grep "{pid of ./f}"

After setting parameters, the task is replenished and continue running
until being throttled.
f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0

The task is throttled after running 492318 ms, as expected.
f-11295 [003] 13322.606094: sched_switch: f:11295 [-1] R ==> \
watchdog/3:32 [0]

But then, the task is replenished 500719 ms after the first
replenishment:
<idle>-0 [003] 13322.614495: sched_switch: swapper/3:0 [120] R \
==> f:11295 [-1]

Running for 490277 ms:
f-11295 [003] 13323.104772: sched_switch: f:11295 [-1] R ==> \
swapper/3:0 [120]

Hence, in the first period, the task runs 2 * runtime, and that is a bug.

During the first replenishment, the next deadline is set one period away.
So the runtime / period starts to be respected. However, as the second
replenishment took place in the wrong instant, the next replenishment
will also be held in a wrong instant of time. Rather than occurring in
the nth period away from the first activation, it is taking place
in the (nth period - relative deadline).

Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Reviewed-by: Luca Abeni <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Reviewed-by: Juri Lelli <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Tommaso Cucinotta <[email protected]>
Cc: Luca Abeni <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Romulo Silva de Oliveira <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: [email protected]
---
kernel/sched/deadline.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 27737f3..3e3caae 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -505,10 +505,15 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
}
}

+static inline u64 dl_next_period(struct sched_dl_entity *dl_se)
+{
+ return dl_se->deadline - dl_se->dl_deadline + dl_se->dl_period;
+}
+
/*
* If the entity depleted all its runtime, and if we want it to sleep
* while waiting for some new execution time to become available, we
- * set the bandwidth enforcement timer to the replenishment instant
+ * set the bandwidth replenishment timer to the replenishment instant
* and try to activate it.
*
* Notice that it is important for the caller to know if the timer
@@ -530,7 +535,7 @@ static int start_dl_timer(struct task_struct *p)
* that it is actually coming from rq->clock and not from
* hrtimer's time base reading.
*/
- act = ns_to_ktime(dl_se->deadline);
+ act = ns_to_ktime(dl_next_period(dl_se));
now = hrtimer_cb_get_time(timer);
delta = ktime_to_ns(now) - rq_clock(rq);
act = ktime_add_ns(act, delta);
--
2.9.3

Subject: [PATCH V4 2/3] sched/deadline: Throttle a constrained deadline task activated after the deadline

During the activation, CBS checks if it can reuse the current task's
runtime and period. If the deadline of the task is in the past, CBS
cannot use the runtime, and so it replenishes the task. This rule
works fine for implicit deadline tasks (deadline == period), and the
CBS was designed for implicit deadline tasks. However, a task with
constrained deadline (deadine < period) might be awakened after the
deadline, but before the next period. In this case, replenishing the
task would allow it to run for runtime / deadline. As in this case
deadline < period, CBS enables a task to run for more than the
runtime / period. In a very loaded system, this can cause a domino
effect, making other tasks miss their deadlines.

To avoid this problem, in the activation of a constrained deadline
task after the deadline but before the next period, throttle the
task and set the replenishing timer to the begin of the next period,
unless it is boosted.

Reproducer:

--------------- %< ---------------
int main (int argc, char **argv)
{
int ret;
int flags = 0;
unsigned long l = 0;
struct timespec ts;
struct sched_attr attr;

memset(&attr, 0, sizeof(attr));
attr.size = sizeof(attr);

attr.sched_policy = SCHED_DEADLINE;
attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */
attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */
attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */

ts.tv_sec = 0;
ts.tv_nsec = 2000 * 1000; /* 2 ms */

ret = sched_setattr(0, &attr, flags);

if (ret < 0) {
perror("sched_setattr");
exit(-1);
}

for(;;) {
/* XXX: you may need to adjust the loop */
for (l = 0; l < 150000; l++);
/*
* The ideia is to go to sleep right before the deadline
* and then wake up before the next period to receive
* a new replenishment.
*/
nanosleep(&ts, NULL);
}

exit(0);
}
--------------- >% ---------------

On my box, this reproducer uses almost 50% of the CPU time, which is
obviously wrong for a task with 2/2000 reservation.

Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Tommaso Cucinotta <[email protected]>
Cc: Luca Abeni <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Romulo Silva de Oliveira <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: [email protected]
---
kernel/sched/deadline.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 3e3caae..b669f7f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -694,6 +694,37 @@ void init_dl_task_timer(struct sched_dl_entity *dl_se)
timer->function = dl_task_timer;
}

+/*
+ * During the activation, CBS checks if it can reuse the current task's
+ * runtime and period. If the deadline of the task is in the past, CBS
+ * cannot use the runtime, and so it replenishes the task. This rule
+ * works fine for implicit deadline tasks (deadline == period), and the
+ * CBS was designed for implicit deadline tasks. However, a task with
+ * constrained deadline (deadine < period) might be awakened after the
+ * deadline, but before the next period. In this case, replenishing the
+ * task would allow it to run for runtime / deadline. As in this case
+ * deadline < period, CBS enables a task to run for more than the
+ * runtime / period. In a very loaded system, this can cause a domino
+ * effect, making other tasks miss their deadlines.
+ *
+ * To avoid this problem, in the activation of a constrained deadline
+ * task after the deadline but before the next period, throttle the
+ * task and set the replenishing timer to the begin of the next period,
+ * unless it is boosted.
+ */
+static inline void dl_check_constrained_dl(struct sched_dl_entity *dl_se)
+{
+ struct task_struct *p = dl_task_of(dl_se);
+ struct rq *rq = rq_of_dl_rq(dl_rq_of_se(dl_se));
+
+ if (dl_time_before(dl_se->deadline, rq_clock(rq)) &&
+ dl_time_before(rq_clock(rq), dl_next_period(dl_se))) {
+ if (unlikely(dl_se->dl_boosted || !start_dl_timer(p)))
+ return;
+ dl_se->dl_throttled = 1;
+ }
+}
+
static
int dl_runtime_exceeded(struct sched_dl_entity *dl_se)
{
@@ -927,6 +958,11 @@ static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
__dequeue_dl_entity(dl_se);
}

+static inline bool dl_is_constrained(struct sched_dl_entity *dl_se)
+{
+ return dl_se->dl_deadline < dl_se->dl_period;
+}
+
static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
{
struct task_struct *pi_task = rt_mutex_get_top_task(p);
@@ -953,6 +989,15 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
}

/*
+ * Check if a constrained deadline task was activated
+ * after the deadline but before the next period.
+ * If that is the case, the task will be throttled and
+ * the replenishment timer will be set to the next period.
+ */
+ if (!p->dl.dl_throttled && dl_is_constrained(&p->dl))
+ dl_check_constrained_dl(&p->dl);
+
+ /*
* If p is throttled, we do nothing. In fact, if it exhausted
* its budget it needs a replenishment and, since it now is on
* its rq, the bandwidth timer callback (which clearly has not
--
2.9.3

Subject: [PATCH V4 3/3] sched/deadline: Use deadline instead of period when calculating overflow

From: "Steven Rostedt (VMware)" <[email protected]>

I was testing Daniel's changes with his test case, and tweaked it a
little. Instead of having the runtime equal to the deadline, I
increased the deadline ten fold.

Daniel's test case had:

attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */
attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */
attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */

To make it more interesting, I changed it to:

attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */
attr.sched_deadline = 20 * 1000 * 1000; /* 20 ms */
attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */

The results were rather surprising. The behavior that Daniel's patch
was fixing came back. The task started using much more than .1% of the
CPU. More like 20%.

Looking into this I found that it was due to the dl_entity_overflow()
constantly returning true. That's because it uses the relative period
against relative runtime vs the absolute deadline against absolute
runtime.

runtime / (deadline - t) > dl_runtime / dl_period

There's even a comment mentioning this, and saying that when relative
deadline equals relative period, that the equation is the same as using
deadline instead of period. That comment is backwards! What we really
want is:

runtime / (deadline - t) > dl_runtime / dl_deadline

We care about if the runtime can make its deadline, not its period. And
then we can say "when the deadline equals the period, the equation is
the same as using dl_period instead of dl_deadline".

After correcting this, now when the task gets enqueued, it can throttle
correctly, and Daniel's fix to the throttling of sleeping deadline
tasks works even when the runtime and deadline are not the same.

Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Tommaso Cucinotta <[email protected]>
Cc: Luca Abeni <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Romulo Silva de Oliveira <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: [email protected]
---
kernel/sched/deadline.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index b669f7f..f7403e5 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -445,13 +445,13 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
*
* This function returns true if:
*
- * runtime / (deadline - t) > dl_runtime / dl_period ,
+ * runtime / (deadline - t) > dl_runtime / dl_deadline ,
*
* IOW we can't recycle current parameters.
*
- * Notice that the bandwidth check is done against the period. For
+ * Notice that the bandwidth check is done against the deadline. For
* task with deadline equal to period this is the same of using
- * dl_deadline instead of dl_period in the equation above.
+ * dl_period instead of dl_deadline in the equation above.
*/
static bool dl_entity_overflow(struct sched_dl_entity *dl_se,
struct sched_dl_entity *pi_se, u64 t)
@@ -476,7 +476,7 @@ static bool dl_entity_overflow(struct sched_dl_entity *dl_se,
* of anything below microseconds resolution is actually fiction
* (but still we want to give the user that illusion >;).
*/
- left = (pi_se->dl_period >> DL_SCALE) * (dl_se->runtime >> DL_SCALE);
+ left = (pi_se->dl_deadline >> DL_SCALE) * (dl_se->runtime >> DL_SCALE);
right = ((dl_se->deadline - t) >> DL_SCALE) *
(pi_se->dl_runtime >> DL_SCALE);

--
2.9.3

2017-03-06 15:52:03

by luca abeni

[permalink] [raw]
Subject: Re: [PATCH V4 2/3] sched/deadline: Throttle a constrained deadline task activated after the deadline

On Thu, 2 Mar 2017 15:10:58 +0100
Daniel Bristot de Oliveira <[email protected]> wrote:

> During the activation, CBS checks if it can reuse the current task's
> runtime and period. If the deadline of the task is in the past, CBS
> cannot use the runtime, and so it replenishes the task. This rule
> works fine for implicit deadline tasks (deadline == period), and the
> CBS was designed for implicit deadline tasks. However, a task with
> constrained deadline (deadine < period) might be awakened after the
> deadline, but before the next period. In this case, replenishing the
> task would allow it to run for runtime / deadline. As in this case
> deadline < period, CBS enables a task to run for more than the
> runtime / period. In a very loaded system, this can cause a domino
> effect, making other tasks miss their deadlines.
>
> To avoid this problem, in the activation of a constrained deadline
> task after the deadline but before the next period, throttle the
> task and set the replenishing timer to the begin of the next period,
> unless it is boosted.
[...]

I agree with Daniel that the current code is broken here... And I think
this patch is a reasonable solution (maybe we can improve it later, but
I think a fix for this issue should go in soon).

So,
Reviewed-by: Luca Abeni <[email protected]>


Thanks,
Luca


>
> Reproducer:
>
> --------------- %< ---------------
> int main (int argc, char **argv)
> {
> int ret;
> int flags = 0;
> unsigned long l = 0;
> struct timespec ts;
> struct sched_attr attr;
>
> memset(&attr, 0, sizeof(attr));
> attr.size = sizeof(attr);
>
> attr.sched_policy = SCHED_DEADLINE;
> attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms
> */ attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */
> attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */
>
> ts.tv_sec = 0;
> ts.tv_nsec = 2000 * 1000; /* 2 ms */
>
> ret = sched_setattr(0, &attr, flags);
>
> if (ret < 0) {
> perror("sched_setattr");
> exit(-1);
> }
>
> for(;;) {
> /* XXX: you may need to adjust the loop */
> for (l = 0; l < 150000; l++);
> /*
> * The ideia is to go to sleep right before the
> deadline
> * and then wake up before the next period to receive
> * a new replenishment.
> */
> nanosleep(&ts, NULL);
> }
>
> exit(0);
> }
> --------------- >% ---------------
>
> On my box, this reproducer uses almost 50% of the CPU time, which is
> obviously wrong for a task with 2/2000 reservation.
>
> Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Juri Lelli <[email protected]>
> Cc: Tommaso Cucinotta <[email protected]>
> Cc: Luca Abeni <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Mike Galbraith <[email protected]>
> Cc: Romulo Silva de Oliveira <[email protected]>
> Cc: Daniel Bristot de Oliveira <[email protected]>
> Cc: [email protected]
> ---
> kernel/sched/deadline.c | 45
> +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45
> insertions(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 3e3caae..b669f7f 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -694,6 +694,37 @@ void init_dl_task_timer(struct sched_dl_entity
> *dl_se) timer->function = dl_task_timer;
> }
>
> +/*
> + * During the activation, CBS checks if it can reuse the current
> task's
> + * runtime and period. If the deadline of the task is in the past,
> CBS
> + * cannot use the runtime, and so it replenishes the task. This rule
> + * works fine for implicit deadline tasks (deadline == period), and
> the
> + * CBS was designed for implicit deadline tasks. However, a task with
> + * constrained deadline (deadine < period) might be awakened after
> the
> + * deadline, but before the next period. In this case, replenishing
> the
> + * task would allow it to run for runtime / deadline. As in this case
> + * deadline < period, CBS enables a task to run for more than the
> + * runtime / period. In a very loaded system, this can cause a domino
> + * effect, making other tasks miss their deadlines.
> + *
> + * To avoid this problem, in the activation of a constrained deadline
> + * task after the deadline but before the next period, throttle the
> + * task and set the replenishing timer to the begin of the next
> period,
> + * unless it is boosted.
> + */
> +static inline void dl_check_constrained_dl(struct sched_dl_entity
> *dl_se) +{
> + struct task_struct *p = dl_task_of(dl_se);
> + struct rq *rq = rq_of_dl_rq(dl_rq_of_se(dl_se));
> +
> + if (dl_time_before(dl_se->deadline, rq_clock(rq)) &&
> + dl_time_before(rq_clock(rq), dl_next_period(dl_se))) {
> + if (unlikely(dl_se->dl_boosted
> || !start_dl_timer(p)))
> + return;
> + dl_se->dl_throttled = 1;
> + }
> +}
> +
> static
> int dl_runtime_exceeded(struct sched_dl_entity *dl_se)
> {
> @@ -927,6 +958,11 @@ static void dequeue_dl_entity(struct
> sched_dl_entity *dl_se) __dequeue_dl_entity(dl_se);
> }
>
> +static inline bool dl_is_constrained(struct sched_dl_entity *dl_se)
> +{
> + return dl_se->dl_deadline < dl_se->dl_period;
> +}
> +
> static void enqueue_task_dl(struct rq *rq, struct task_struct *p,
> int flags) {
> struct task_struct *pi_task = rt_mutex_get_top_task(p);
> @@ -953,6 +989,15 @@ static void enqueue_task_dl(struct rq *rq,
> struct task_struct *p, int flags) }
>
> /*
> + * Check if a constrained deadline task was activated
> + * after the deadline but before the next period.
> + * If that is the case, the task will be throttled and
> + * the replenishment timer will be set to the next period.
> + */
> + if (!p->dl.dl_throttled && dl_is_constrained(&p->dl))
> + dl_check_constrained_dl(&p->dl);
> +
> + /*
> * If p is throttled, we do nothing. In fact, if it exhausted
> * its budget it needs a replenishment and, since it now is
> on
> * its rq, the bandwidth timer callback (which clearly has
> not

2017-03-07 08:02:29

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH V4 1/3] sched/deadline: Replenishment timer should fire in the next period

2017-03-02 22:10 GMT+08:00 Daniel Bristot de Oliveira <[email protected]>:
> Currently, the replenishment timer is set to fire at the deadline
> of a task. Although that works for implicit deadline tasks because the
> deadline is equals to the begin of the next period, that is not correct
> for constrained deadline tasks (deadline < period).
>
> For instance:
>
> f.c:
> --------------- %< ---------------
> int main (void)
> {
> for(;;);
> }
> --------------- >% ---------------
>
> # gcc -o f f.c
>
> # trace-cmd record -e sched:sched_switch \
> -e syscalls:sys_exit_sched_setattr \
> chrt -d --sched-runtime 490000000 \
> --sched-deadline 500000000 \
> --sched-period 1000000000 0 ./f
>
> # trace-cmd report | grep "{pid of ./f}"
>
> After setting parameters, the task is replenished and continue running
> until being throttled.
> f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0
>
> The task is throttled after running 492318 ms, as expected.

This should be us.

Otherwise,

Reviewed-by: Wanpeng Li <[email protected]>

> f-11295 [003] 13322.606094: sched_switch: f:11295 [-1] R ==> \
> watchdog/3:32 [0]
>
> But then, the task is replenished 500719 ms after the first
> replenishment:
> <idle>-0 [003] 13322.614495: sched_switch: swapper/3:0 [120] R \
> ==> f:11295 [-1]
>
> Running for 490277 ms:
> f-11295 [003] 13323.104772: sched_switch: f:11295 [-1] R ==> \
> swapper/3:0 [120]
>
> Hence, in the first period, the task runs 2 * runtime, and that is a bug.
>
> During the first replenishment, the next deadline is set one period away.
> So the runtime / period starts to be respected. However, as the second
> replenishment took place in the wrong instant, the next replenishment
> will also be held in a wrong instant of time. Rather than occurring in
> the nth period away from the first activation, it is taking place
> in the (nth period - relative deadline).
>
> Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
> Reviewed-by: Luca Abeni <[email protected]>
> Reviewed-by: Steven Rostedt (VMware) <[email protected]>
> Reviewed-by: Juri Lelli <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Juri Lelli <[email protected]>
> Cc: Tommaso Cucinotta <[email protected]>
> Cc: Luca Abeni <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Mike Galbraith <[email protected]>
> Cc: Romulo Silva de Oliveira <[email protected]>
> Cc: Daniel Bristot de Oliveira <[email protected]>
> Cc: [email protected]
> ---
> kernel/sched/deadline.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 27737f3..3e3caae 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -505,10 +505,15 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
> }
> }
>
> +static inline u64 dl_next_period(struct sched_dl_entity *dl_se)
> +{
> + return dl_se->deadline - dl_se->dl_deadline + dl_se->dl_period;
> +}
> +
> /*
> * If the entity depleted all its runtime, and if we want it to sleep
> * while waiting for some new execution time to become available, we
> - * set the bandwidth enforcement timer to the replenishment instant
> + * set the bandwidth replenishment timer to the replenishment instant
> * and try to activate it.
> *
> * Notice that it is important for the caller to know if the timer
> @@ -530,7 +535,7 @@ static int start_dl_timer(struct task_struct *p)
> * that it is actually coming from rq->clock and not from
> * hrtimer's time base reading.
> */
> - act = ns_to_ktime(dl_se->deadline);
> + act = ns_to_ktime(dl_next_period(dl_se));
> now = hrtimer_cb_get_time(timer);
> delta = ktime_to_ns(now) - rq_clock(rq);
> act = ktime_add_ns(act, delta);
> --
> 2.9.3
>

2017-03-07 08:02:39

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH V4 2/3] sched/deadline: Throttle a constrained deadline task activated after the deadline

2017-03-02 22:10 GMT+08:00 Daniel Bristot de Oliveira <[email protected]>:
> During the activation, CBS checks if it can reuse the current task's
> runtime and period. If the deadline of the task is in the past, CBS
> cannot use the runtime, and so it replenishes the task. This rule
> works fine for implicit deadline tasks (deadline == period), and the
> CBS was designed for implicit deadline tasks. However, a task with
> constrained deadline (deadine < period) might be awakened after the
> deadline, but before the next period. In this case, replenishing the
> task would allow it to run for runtime / deadline. As in this case
> deadline < period, CBS enables a task to run for more than the
> runtime / period. In a very loaded system, this can cause a domino
> effect, making other tasks miss their deadlines.
>
> To avoid this problem, in the activation of a constrained deadline
> task after the deadline but before the next period, throttle the
> task and set the replenishing timer to the begin of the next period,
> unless it is boosted.
>
> Reproducer:
>
> --------------- %< ---------------
> int main (int argc, char **argv)
> {
> int ret;
> int flags = 0;
> unsigned long l = 0;
> struct timespec ts;
> struct sched_attr attr;
>
> memset(&attr, 0, sizeof(attr));
> attr.size = sizeof(attr);
>
> attr.sched_policy = SCHED_DEADLINE;
> attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */
> attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */
> attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */
>
> ts.tv_sec = 0;
> ts.tv_nsec = 2000 * 1000; /* 2 ms */
>
> ret = sched_setattr(0, &attr, flags);
>
> if (ret < 0) {
> perror("sched_setattr");
> exit(-1);
> }
>
> for(;;) {
> /* XXX: you may need to adjust the loop */
> for (l = 0; l < 150000; l++);
> /*
> * The ideia is to go to sleep right before the deadline
> * and then wake up before the next period to receive
> * a new replenishment.
> */
> nanosleep(&ts, NULL);
> }
>
> exit(0);
> }
> --------------- >% ---------------
>
> On my box, this reproducer uses almost 50% of the CPU time, which is
> obviously wrong for a task with 2/2000 reservation.
>
> Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Juri Lelli <[email protected]>
> Cc: Tommaso Cucinotta <[email protected]>
> Cc: Luca Abeni <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Mike Galbraith <[email protected]>
> Cc: Romulo Silva de Oliveira <[email protected]>
> Cc: Daniel Bristot de Oliveira <[email protected]>
> Cc: [email protected]

Reviewed-by: Wanpeng Li <[email protected]>

> ---
> kernel/sched/deadline.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 45 insertions(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 3e3caae..b669f7f 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -694,6 +694,37 @@ void init_dl_task_timer(struct sched_dl_entity *dl_se)
> timer->function = dl_task_timer;
> }
>
> +/*
> + * During the activation, CBS checks if it can reuse the current task's
> + * runtime and period. If the deadline of the task is in the past, CBS
> + * cannot use the runtime, and so it replenishes the task. This rule
> + * works fine for implicit deadline tasks (deadline == period), and the
> + * CBS was designed for implicit deadline tasks. However, a task with
> + * constrained deadline (deadine < period) might be awakened after the
> + * deadline, but before the next period. In this case, replenishing the
> + * task would allow it to run for runtime / deadline. As in this case
> + * deadline < period, CBS enables a task to run for more than the
> + * runtime / period. In a very loaded system, this can cause a domino
> + * effect, making other tasks miss their deadlines.
> + *
> + * To avoid this problem, in the activation of a constrained deadline
> + * task after the deadline but before the next period, throttle the
> + * task and set the replenishing timer to the begin of the next period,
> + * unless it is boosted.
> + */
> +static inline void dl_check_constrained_dl(struct sched_dl_entity *dl_se)
> +{
> + struct task_struct *p = dl_task_of(dl_se);
> + struct rq *rq = rq_of_dl_rq(dl_rq_of_se(dl_se));
> +
> + if (dl_time_before(dl_se->deadline, rq_clock(rq)) &&
> + dl_time_before(rq_clock(rq), dl_next_period(dl_se))) {
> + if (unlikely(dl_se->dl_boosted || !start_dl_timer(p)))
> + return;
> + dl_se->dl_throttled = 1;
> + }
> +}
> +
> static
> int dl_runtime_exceeded(struct sched_dl_entity *dl_se)
> {
> @@ -927,6 +958,11 @@ static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
> __dequeue_dl_entity(dl_se);
> }
>
> +static inline bool dl_is_constrained(struct sched_dl_entity *dl_se)
> +{
> + return dl_se->dl_deadline < dl_se->dl_period;
> +}
> +
> static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
> {
> struct task_struct *pi_task = rt_mutex_get_top_task(p);
> @@ -953,6 +989,15 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
> }
>
> /*
> + * Check if a constrained deadline task was activated
> + * after the deadline but before the next period.
> + * If that is the case, the task will be throttled and
> + * the replenishment timer will be set to the next period.
> + */
> + if (!p->dl.dl_throttled && dl_is_constrained(&p->dl))
> + dl_check_constrained_dl(&p->dl);
> +
> + /*
> * If p is throttled, we do nothing. In fact, if it exhausted
> * its budget it needs a replenishment and, since it now is on
> * its rq, the bandwidth timer callback (which clearly has not
> --
> 2.9.3
>

2017-03-07 08:22:48

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH V4 3/3] sched/deadline: Use deadline instead of period when calculating overflow

2017-03-02 22:10 GMT+08:00 Daniel Bristot de Oliveira <[email protected]>:
> From: "Steven Rostedt (VMware)" <[email protected]>
>
> I was testing Daniel's changes with his test case, and tweaked it a
> little. Instead of having the runtime equal to the deadline, I
> increased the deadline ten fold.
>
> Daniel's test case had:
>
> attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */
> attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */
> attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */
>
> To make it more interesting, I changed it to:
>
> attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */
> attr.sched_deadline = 20 * 1000 * 1000; /* 20 ms */
> attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */
>
> The results were rather surprising. The behavior that Daniel's patch
> was fixing came back. The task started using much more than .1% of the
> CPU. More like 20%.
>
> Looking into this I found that it was due to the dl_entity_overflow()
> constantly returning true. That's because it uses the relative period
> against relative runtime vs the absolute deadline against absolute
> runtime.
>
> runtime / (deadline - t) > dl_runtime / dl_period
>
> There's even a comment mentioning this, and saying that when relative
> deadline equals relative period, that the equation is the same as using
> deadline instead of period. That comment is backwards! What we really
> want is:
>
> runtime / (deadline - t) > dl_runtime / dl_deadline
>
> We care about if the runtime can make its deadline, not its period. And
> then we can say "when the deadline equals the period, the equation is
> the same as using dl_period instead of dl_deadline".
>
> After correcting this, now when the task gets enqueued, it can throttle
> correctly, and Daniel's fix to the throttling of sleeping deadline
> tasks works even when the runtime and deadline are not the same.
>
> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
> Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Juri Lelli <[email protected]>
> Cc: Tommaso Cucinotta <[email protected]>
> Cc: Luca Abeni <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Mike Galbraith <[email protected]>
> Cc: Romulo Silva de Oliveira <[email protected]>
> Cc: Daniel Bristot de Oliveira <[email protected]>
> Cc: [email protected]
> ---

Reviewed-by: Wanpeng Li <[email protected]>

> kernel/sched/deadline.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index b669f7f..f7403e5 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -445,13 +445,13 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
> *
> * This function returns true if:
> *
> - * runtime / (deadline - t) > dl_runtime / dl_period ,
> + * runtime / (deadline - t) > dl_runtime / dl_deadline ,
> *
> * IOW we can't recycle current parameters.
> *
> - * Notice that the bandwidth check is done against the period. For
> + * Notice that the bandwidth check is done against the deadline. For
> * task with deadline equal to period this is the same of using
> - * dl_deadline instead of dl_period in the equation above.
> + * dl_period instead of dl_deadline in the equation above.
> */
> static bool dl_entity_overflow(struct sched_dl_entity *dl_se,
> struct sched_dl_entity *pi_se, u64 t)
> @@ -476,7 +476,7 @@ static bool dl_entity_overflow(struct sched_dl_entity *dl_se,
> * of anything below microseconds resolution is actually fiction
> * (but still we want to give the user that illusion >;).
> */
> - left = (pi_se->dl_period >> DL_SCALE) * (dl_se->runtime >> DL_SCALE);
> + left = (pi_se->dl_deadline >> DL_SCALE) * (dl_se->runtime >> DL_SCALE);
> right = ((dl_se->deadline - t) >> DL_SCALE) *
> (pi_se->dl_runtime >> DL_SCALE);
>
> --
> 2.9.3
>

Subject: Re: [PATCH V4 0/3] sched/deadline: Fixes for constrained deadline tasks

Hi,

This is a gentle ping.

On 03/02/2017 03:10 PM, Daniel Bristot de Oliveira wrote:
> While reading sched deadline code, I find out that a constrained
> deadline task could be replenished before the next period if
> activated after the deadline, opening the window to run for more
> than Q/P. The patch [2] explains and fixes this problem.
>
> Furthermore, while fixing this issue, I found that the replenishment
> timer was being fired at the deadline of the task. This works fine
> for implicit deadline tasks (deadline == period) because the deadline
> is at the same point in time of the next period. But that is not true
> for constrained deadline tasks (deadline < period). This problem is
> not as visible as the first because the runtime leakage takes
> place only in the second activation. Next activations receive the
> correct bandwidth. However, after the 2nd activation, tasks are
> activated in the (period - dl_deadline) instant, which is before
> the expected activation. This problem is explained in the fix
> description as well.
>
> While testing these fixes, Rostedt tweaked the test case a little.
> Instead of having the runtime equal to the deadline, he increased
> the deadline ten fold. Then, the task started using much more than
> .1% of the CPU. More like 20%. Looking into this he found that it
> was due to the dl_entity_overflow() constantly returning true. That's
> because it uses the relative period against relative runtime vs the
> absolute deadline against absolute runtime. As we care about if the
> runtime can make its deadline, not its period, we need to use the
> task's density in the check, not the task's utilization. After
> correcting this, now when the task gets enqueued, it can throttle
> correctly.
>
> Changes from V3:
> - Fixes grammar errors in the patch 2/3. (Steven Rostedt)
> - I was checking if the pi_se was constrained, not the task being
> awakened.
> This was not causing problems in the test case because
> pi_se = &p->dl, but this would be a problem if we were activating
> the task in a PI case:
> It would check the pi-waiter, not the task being awakened (p).
> Changes from V2:
> - Fixes dl_entity_overflow(): (Steven Rostedt)
> Patch 3/3 fixes the dl_entity_overflow() for constrained deadline
> tasks by using the density, not the utilization.
> (as deadline <= period, deadline is always == min(deadline, period))
> Changes from V1:
> - Fix a broken comment style. (Peter Zijlstra)
> - Fixes dl_is_constrained(). (Steven Rostedt)
> A constrained deadline task has dl_deadline < dl_period; so
> "dl_runtime < dl_period"; s/runtime/deadline/
>
> Daniel Bristot de Oliveira (2):
> sched/deadline: Replenishment timer should fire in the next period
> sched/deadline: Throttle a constrained deadline task activated after
> the deadline
>
> Steven Rostedt (VMware) (1):
> sched/deadline: Use deadline instead of period when calculating
> overflow
>
> kernel/sched/deadline.c | 62 ++++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 56 insertions(+), 6 deletions(-)
>

Subject: [tip:sched/core] sched/deadline: Make sure the replenishment timer fires in the next period

Commit-ID: 5ac69d37784b237707a7b15d199cdb6c6fdb6780
Gitweb: http://git.kernel.org/tip/5ac69d37784b237707a7b15d199cdb6c6fdb6780
Author: Daniel Bristot de Oliveira <[email protected]>
AuthorDate: Thu, 2 Mar 2017 15:10:57 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 16 Mar 2017 09:37:37 +0100

sched/deadline: Make sure the replenishment timer fires in the next period

Currently, the replenishment timer is set to fire at the deadline
of a task. Although that works for implicit deadline tasks because the
deadline is equals to the begin of the next period, that is not correct
for constrained deadline tasks (deadline < period).

For instance:

f.c:
--------------- %< ---------------
int main (void)
{
for(;;);
}
--------------- >% ---------------

# gcc -o f f.c

# trace-cmd record -e sched:sched_switch \
-e syscalls:sys_exit_sched_setattr \
chrt -d --sched-runtime 490000000 \
--sched-deadline 500000000 \
--sched-period 1000000000 0 ./f

# trace-cmd report | grep "{pid of ./f}"

After setting parameters, the task is replenished and continue running
until being throttled:

f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0

The task is throttled after running 492318 ms, as expected:

f-11295 [003] 13322.606094: sched_switch: f:11295 [-1] R ==> watchdog/3:32 [0]

But then, the task is replenished 500719 ms after the first
replenishment:

<idle>-0 [003] 13322.614495: sched_switch: swapper/3:0 [120] R ==> f:11295 [-1]

Running for 490277 ms:

f-11295 [003] 13323.104772: sched_switch: f:11295 [-1] R ==> swapper/3:0 [120]

Hence, in the first period, the task runs 2 * runtime, and that is a bug.

During the first replenishment, the next deadline is set one period away.
So the runtime / period starts to be respected. However, as the second
replenishment took place in the wrong instant, the next replenishment
will also be held in a wrong instant of time. Rather than occurring in
the nth period away from the first activation, it is taking place
in the (nth period - relative deadline).

Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Luca Abeni <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Reviewed-by: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Romulo Silva de Oliveira <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tommaso Cucinotta <[email protected]>
Link: http://lkml.kernel.org/r/ac50d89887c25285b47465638354b63362f8adff.1488392936.git.bristot@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched/deadline.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index c6db3fd..445e278 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -505,10 +505,15 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
}
}

+static inline u64 dl_next_period(struct sched_dl_entity *dl_se)
+{
+ return dl_se->deadline - dl_se->dl_deadline + dl_se->dl_period;
+}
+
/*
* If the entity depleted all its runtime, and if we want it to sleep
* while waiting for some new execution time to become available, we
- * set the bandwidth enforcement timer to the replenishment instant
+ * set the bandwidth replenishment timer to the replenishment instant
* and try to activate it.
*
* Notice that it is important for the caller to know if the timer
@@ -530,7 +535,7 @@ static int start_dl_timer(struct task_struct *p)
* that it is actually coming from rq->clock and not from
* hrtimer's time base reading.
*/
- act = ns_to_ktime(dl_se->deadline);
+ act = ns_to_ktime(dl_next_period(dl_se));
now = hrtimer_cb_get_time(timer);
delta = ktime_to_ns(now) - rq_clock(rq);
act = ktime_add_ns(act, delta);

Subject: [tip:sched/core] sched/deadline: Use deadline instead of period when calculating overflow

Commit-ID: 2317d5f1c34913bac5971d93d69fb6c31bb74670
Gitweb: http://git.kernel.org/tip/2317d5f1c34913bac5971d93d69fb6c31bb74670
Author: Steven Rostedt (VMware) <[email protected]>
AuthorDate: Thu, 2 Mar 2017 15:10:59 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 16 Mar 2017 09:37:38 +0100

sched/deadline: Use deadline instead of period when calculating overflow

I was testing Daniel's changes with his test case, and tweaked it a
little. Instead of having the runtime equal to the deadline, I
increased the deadline ten fold.

Daniel's test case had:

attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */
attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */
attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */

To make it more interesting, I changed it to:

attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */
attr.sched_deadline = 20 * 1000 * 1000; /* 20 ms */
attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */

The results were rather surprising. The behavior that Daniel's patch
was fixing came back. The task started using much more than .1% of the
CPU. More like 20%.

Looking into this I found that it was due to the dl_entity_overflow()
constantly returning true. That's because it uses the relative period
against relative runtime vs the absolute deadline against absolute
runtime.

runtime / (deadline - t) > dl_runtime / dl_period

There's even a comment mentioning this, and saying that when relative
deadline equals relative period, that the equation is the same as using
deadline instead of period. That comment is backwards! What we really
want is:

runtime / (deadline - t) > dl_runtime / dl_deadline

We care about if the runtime can make its deadline, not its period. And
then we can say "when the deadline equals the period, the equation is
the same as using dl_period instead of dl_deadline".

After correcting this, now when the task gets enqueued, it can throttle
correctly, and Daniel's fix to the throttling of sleeping deadline
tasks works even when the runtime and deadline are not the same.

Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Luca Abeni <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Romulo Silva de Oliveira <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tommaso Cucinotta <[email protected]>
Link: http://lkml.kernel.org/r/02135a27f1ae3fe5fd032568a5a2f370e190e8d7.1488392936.git.bristot@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched/deadline.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 736d8b9..a2ce590 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -445,13 +445,13 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
*
* This function returns true if:
*
- * runtime / (deadline - t) > dl_runtime / dl_period ,
+ * runtime / (deadline - t) > dl_runtime / dl_deadline ,
*
* IOW we can't recycle current parameters.
*
- * Notice that the bandwidth check is done against the period. For
+ * Notice that the bandwidth check is done against the deadline. For
* task with deadline equal to period this is the same of using
- * dl_deadline instead of dl_period in the equation above.
+ * dl_period instead of dl_deadline in the equation above.
*/
static bool dl_entity_overflow(struct sched_dl_entity *dl_se,
struct sched_dl_entity *pi_se, u64 t)
@@ -476,7 +476,7 @@ static bool dl_entity_overflow(struct sched_dl_entity *dl_se,
* of anything below microseconds resolution is actually fiction
* (but still we want to give the user that illusion >;).
*/
- left = (pi_se->dl_period >> DL_SCALE) * (dl_se->runtime >> DL_SCALE);
+ left = (pi_se->dl_deadline >> DL_SCALE) * (dl_se->runtime >> DL_SCALE);
right = ((dl_se->deadline - t) >> DL_SCALE) *
(pi_se->dl_runtime >> DL_SCALE);


Subject: [tip:sched/core] sched/deadline: Throttle a constrained deadline task activated after the deadline

Commit-ID: df8eac8cafce7d086be3bd5cf5a838fa37594dfb
Gitweb: http://git.kernel.org/tip/df8eac8cafce7d086be3bd5cf5a838fa37594dfb
Author: Daniel Bristot de Oliveira <[email protected]>
AuthorDate: Thu, 2 Mar 2017 15:10:58 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 16 Mar 2017 09:37:38 +0100

sched/deadline: Throttle a constrained deadline task activated after the deadline

During the activation, CBS checks if it can reuse the current task's
runtime and period. If the deadline of the task is in the past, CBS
cannot use the runtime, and so it replenishes the task. This rule
works fine for implicit deadline tasks (deadline == period), and the
CBS was designed for implicit deadline tasks. However, a task with
constrained deadline (deadine < period) might be awakened after the
deadline, but before the next period. In this case, replenishing the
task would allow it to run for runtime / deadline. As in this case
deadline < period, CBS enables a task to run for more than the
runtime / period. In a very loaded system, this can cause a domino
effect, making other tasks miss their deadlines.

To avoid this problem, in the activation of a constrained deadline
task after the deadline but before the next period, throttle the
task and set the replenishing timer to the begin of the next period,
unless it is boosted.

Reproducer:

--------------- %< ---------------
int main (int argc, char **argv)
{
int ret;
int flags = 0;
unsigned long l = 0;
struct timespec ts;
struct sched_attr attr;

memset(&attr, 0, sizeof(attr));
attr.size = sizeof(attr);

attr.sched_policy = SCHED_DEADLINE;
attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */
attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */
attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */

ts.tv_sec = 0;
ts.tv_nsec = 2000 * 1000; /* 2 ms */

ret = sched_setattr(0, &attr, flags);

if (ret < 0) {
perror("sched_setattr");
exit(-1);
}

for(;;) {
/* XXX: you may need to adjust the loop */
for (l = 0; l < 150000; l++);
/*
* The ideia is to go to sleep right before the deadline
* and then wake up before the next period to receive
* a new replenishment.
*/
nanosleep(&ts, NULL);
}

exit(0);
}
--------------- >% ---------------

On my box, this reproducer uses almost 50% of the CPU time, which is
obviously wrong for a task with 2/2000 reservation.

Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Luca Abeni <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Romulo Silva de Oliveira <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tommaso Cucinotta <[email protected]>
Link: http://lkml.kernel.org/r/edf58354e01db46bf42df8d2dd32418833f68c89.1488392936.git.bristot@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched/deadline.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 445e278..736d8b9 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -695,6 +695,37 @@ void init_dl_task_timer(struct sched_dl_entity *dl_se)
timer->function = dl_task_timer;
}

+/*
+ * During the activation, CBS checks if it can reuse the current task's
+ * runtime and period. If the deadline of the task is in the past, CBS
+ * cannot use the runtime, and so it replenishes the task. This rule
+ * works fine for implicit deadline tasks (deadline == period), and the
+ * CBS was designed for implicit deadline tasks. However, a task with
+ * constrained deadline (deadine < period) might be awakened after the
+ * deadline, but before the next period. In this case, replenishing the
+ * task would allow it to run for runtime / deadline. As in this case
+ * deadline < period, CBS enables a task to run for more than the
+ * runtime / period. In a very loaded system, this can cause a domino
+ * effect, making other tasks miss their deadlines.
+ *
+ * To avoid this problem, in the activation of a constrained deadline
+ * task after the deadline but before the next period, throttle the
+ * task and set the replenishing timer to the begin of the next period,
+ * unless it is boosted.
+ */
+static inline void dl_check_constrained_dl(struct sched_dl_entity *dl_se)
+{
+ struct task_struct *p = dl_task_of(dl_se);
+ struct rq *rq = rq_of_dl_rq(dl_rq_of_se(dl_se));
+
+ if (dl_time_before(dl_se->deadline, rq_clock(rq)) &&
+ dl_time_before(rq_clock(rq), dl_next_period(dl_se))) {
+ if (unlikely(dl_se->dl_boosted || !start_dl_timer(p)))
+ return;
+ dl_se->dl_throttled = 1;
+ }
+}
+
static
int dl_runtime_exceeded(struct sched_dl_entity *dl_se)
{
@@ -928,6 +959,11 @@ static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
__dequeue_dl_entity(dl_se);
}

+static inline bool dl_is_constrained(struct sched_dl_entity *dl_se)
+{
+ return dl_se->dl_deadline < dl_se->dl_period;
+}
+
static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
{
struct task_struct *pi_task = rt_mutex_get_top_task(p);
@@ -954,6 +990,15 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
}

/*
+ * Check if a constrained deadline task was activated
+ * after the deadline but before the next period.
+ * If that is the case, the task will be throttled and
+ * the replenishment timer will be set to the next period.
+ */
+ if (!p->dl.dl_throttled && dl_is_constrained(&p->dl))
+ dl_check_constrained_dl(&p->dl);
+
+ /*
* If p is throttled, we do nothing. In fact, if it exhausted
* its budget it needs a replenishment and, since it now is on
* its rq, the bandwidth timer callback (which clearly has not