Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752468AbdDLF0o (ORCPT ); Wed, 12 Apr 2017 01:26:44 -0400 Received: from mx1.redhat.com ([209.132.183.28]:39416 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751027AbdDLF0m (ORCPT ); Wed, 12 Apr 2017 01:26:42 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 6F49931F3ED Authentication-Results: ext-mx05.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx05.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=xpang@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 6F49931F3ED Reply-To: xlpang@redhat.com Subject: Re: [PATCH] sched/deadline: Throttle a constrained task activated if overflow References: <1491816131-20268-1-git-send-email-xlpang@redhat.com> <4d24b94c-1e57-b82c-ff0f-099157ba5526@redhat.com> To: Daniel Bristot de Oliveira , Xunlei Pang , linux-kernel@vger.kernel.org Cc: Peter Zijlstra , Juri Lelli , Ingo Molnar , Luca Abeni , Steven Rostedt , Tommaso Cucinotta , =?UTF-8?Q?R=c3=b4mulo_Silva_de_Oliveira?= , Mathieu Poirier From: Xunlei Pang Message-ID: <58EDBAC4.7020902@redhat.com> Date: Wed, 12 Apr 2017 13:27:32 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <4d24b94c-1e57-b82c-ff0f-099157ba5526@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Wed, 12 Apr 2017 05:26:41 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9644 Lines: 228 On 04/11/2017 at 04:47 AM, Daniel Bristot de Oliveira wrote: > On 04/10/2017 11:22 AM, Xunlei Pang wrote: >> I was testing Daniel's changes with his test case in the commit >> df8eac8cafce ("sched/deadline: Throttle a constrained deadline >> task activated after the deadline"), and tweaked it a little. >> >> Instead of having the runtime equal to the deadline, I tweaked >> runtime, deadline and sleep value to ensure every time it calls >> dl_check_constrained_dl() with "dl_se->deadline > rq_clock(rq)" >> as well as true dl_entity_overflow(), so it does replenishing >> every wake up in update_dl_entity(), and break its bandwidth. >> >> Daniel's test case had: >> attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */ >> attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms*/ >> attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */ >> ts.tv_sec = 0; >> ts.tv_nsec = 2000 * 1000; /* 2 ms */ >> >> I changed it to: >> attr.sched_runtime = 5 * 1000 * 1000; /* 5 ms */ >> attr.sched_deadline = 7 * 1000 * 1000; /* 7 ms */ >> attr.sched_period = 1 * 1000 * 1000 * 1000; /* 1 s */ >> ts.tv_sec = 0; >> ts.tv_nsec = 1000 * 1000; /* 1 ms */ >> >> The change above can result in over 25% of the CPU on my machine. >> >> In order to avoid the beakage, we improve dl_check_constrained_dl() >> to prevent dl tasks from being activated until the next period if it >> runs out of bandwidth of the current period. > The problem now is that, with your patch, we will throttle the task > with some possible runtime. Moreover, the task did not brake any > rule, like being awakened after the deadline - the user-space is not > misbehaving. > > That is +- what the reproducer is doing when using your patch, > (I put some trace_printk when noticing the overflow in the wakeup). > > -0 [007] d.h. 1505.066439: enqueue_task_dl: my current runtime is 3657361 and the deadline is 4613027 from now > -0 [007] d.h. 1505.066439: enqueue_task_dl: my dl_runtime is 5000000 > > and so the task will be throttled with 3657361 ns runtime available. > > As we can see, it is really breaking the density: > > 5ms / 7ms (.714285) < 3657361 / 4613027 (.792833) > > Well, it is not breaking that much. Trying to be less pessimist, we can > compute a new runtime with following equation: > > runtime = (dl_runtime / dl_deadline) * (deadline - now) > > That is, a runtime which fits in the task's density. > > In code: > > dl_se->runtime = (div64_u64(dl_se->dl_runtime << 20, dl_se->dl_deadline) * (dl_se->deadline - rq_clock(rq))) >> 20; > > For example (a trace_printk) showing the adjusted runtime for the > previous task: > -0 [007] d.h. 1505.066440: enqueue_task_dl: I can still run for 3294208 (it is not that bad) > > With the adjusted runtime, we have the following density: > > 3294208 / 4613027 = .714109 > > as .714109 < .714285 > > VoilĂ . We can still use 3.2 ms of runtime! Not bad at all. > > However, even this result is, somehow, controversial. Because we are > reducing the task's reservation (Q/P). But that is very close to the > implicit deadline behavior: when it "overflows", the runtime is truncated > (a replenishment...) and the deadline is postponed. In this case, we do > not need to postpone the deadline, just to "truncate" the runtime to fit > in the density... it is not that bad. > > Another possibility is not to replenish a constrained deadline > task twice in a period. In this case, the task would run further > the deadline, potentially causing problems for implicit deadline tasks. > But! even implicit deadline would do it in a overloaded system. The > problem is that, by using the density the system easily becomes > overloaded (too pessimistic). > > Resuming (so far): > > 1) We can be pessimistic to the constrained deadline task, > with Xunlei's patch; > 2) Compute a new runtime to fit in the density - somehow pessimistic. > 3) Avoid replenishing a constrained deadline before the next period. > but the system will become overload very easily (density). > > I think the option 2 is the mid-term. > I am showing a _proof of concept_ patch bellow. I is based in the > Xunlei's changes in the verification. I need to polish it... but... > you guys can read the idea in C. > > I have the 3) too, but I am not sure if it is as good as 2. > > We still need to think more about the solution.... test more... I will > continue working on this tomorrow, discussing with luca and tommaso > as well. > > Thoughts? Am I missing something (probably, I am tired :-) )? > > --- > kernel/sched/deadline.c | 53 ++++++++++++++++++++++++++++--------------------- > 1 file changed, 30 insertions(+), 23 deletions(-) > > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > index 7508129..6fa4887 100644 > --- a/kernel/sched/deadline.c > +++ b/kernel/sched/deadline.c > @@ -696,34 +696,38 @@ void init_dl_task_timer(struct sched_dl_entity *dl_se) > } > > /* > - * During the activation, CBS checks if it can reuse the current task's > - * runtime and period. If the deadline of the task is in the past, CBS > - * cannot use the runtime, and so it replenishes the task. This rule > - * works fine for implicit deadline tasks (deadline == period), and the > - * CBS was designed for implicit deadline tasks. However, a task with > - * constrained deadline (deadine < period) might be awakened after the > - * deadline, but before the next period. In this case, replenishing the > - * task would allow it to run for runtime / deadline. As in this case > - * deadline < period, CBS enables a task to run for more than the > - * runtime / period. In a very loaded system, this can cause a domino > - * effect, making other tasks miss their deadlines. > - * > - * To avoid this problem, in the activation of a constrained deadline > - * task after the deadline but before the next period, throttle the > - * task and set the replenishing timer to the begin of the next period, > - * unless it is boosted. > + * XXX: Daniel will document it better in a clean patch, this is > + * a proof of concept. > */ > static inline void dl_check_constrained_dl(struct sched_dl_entity *dl_se) > { > struct task_struct *p = dl_task_of(dl_se); > struct rq *rq = rq_of_dl_rq(dl_rq_of_se(dl_se)); > + u64 clock = rq_clock(rq); > > - if (dl_time_before(dl_se->deadline, rq_clock(rq)) && > - dl_time_before(rq_clock(rq), dl_next_period(dl_se))) { > - if (unlikely(dl_se->dl_boosted || !start_dl_timer(p))) > - return; > - dl_se->dl_throttled = 1; > + /* we are in a new period */ > + if (dl_time_before(dl_next_period(dl_se), rq_clock(rq))) > + return; > + > + /* are we ok with the deadline and density? */ > + if (dl_time_before(rq_clock(rq), dl_se->deadline) && > + !dl_entity_overflow(dl_se, dl_se, rq_clock(rq))) > + return; > + > + /* is the density the problem? */ > + if (dl_entity_overflow(dl_se, dl_se, clock)) { > + /* reduces the runtime so it fits in the density */ > + dl_se->runtime = > + (div64_u64(dl_se->dl_runtime << 20, dl_se->dl_deadline) * > + (dl_se->deadline - clock)) >> 20; The more I read the code, the more I am confused why dl_entity_overflow() is needed, if the task is before its deadline, just let it run. E.g. For (runtime 2ms, deadline 4ms, period 8ms), for some reason was preempted after running a very short time 0.1ms, after 0.9ms it was scheduled back and immediately sleep 1ms, when it is awakened, remaining runtime is 1.9ms, remaining deadline(deadline-now) within this period is 2ms, but dl_entity_overflow() is true. However, clearly it can be correctly run 1.9ms before deadline comes wthin this period. We can add a condition in dl_runtime_exceeded(), if its deadline is passed, then zero its runtime if positive, and a new period begins. I did some tests with the following patch, seems it works well, please correct me if I am wrong. --- kernel/sched/deadline.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index a2ce590..600c530 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -498,8 +498,7 @@ static void update_dl_entity(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq = dl_rq_of_se(dl_se); struct rq *rq = rq_of_dl_rq(dl_rq); - if (dl_time_before(dl_se->deadline, rq_clock(rq)) || - dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) { + if (!dl_time_before(rq_clock(rq), dl_se->deadline)) { dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline; dl_se->runtime = pi_se->dl_runtime; } @@ -722,13 +721,22 @@ static inline void dl_check_constrained_dl(struct sched_dl_entity *dl_se) dl_time_before(rq_clock(rq), dl_next_period(dl_se))) { if (unlikely(dl_se->dl_boosted || !start_dl_timer(p))) return; + + if (dl_se->runtime > 0) + dl_se->runtime = 0; + dl_se->dl_throttled = 1; } } static -int dl_runtime_exceeded(struct sched_dl_entity *dl_se) +int dl_runtime_exceeded(struct rq *rq, struct sched_dl_entity *dl_se) { + if (!dl_time_before(rq_clock(rq), dl_se->deadline)) { + if (dl_se->runtime > 0) + dl_se->runtime = 0; + } + return (dl_se->runtime <= 0); } @@ -779,7 +787,7 @@ static void update_curr_dl(struct rq *rq) dl_se->runtime -= delta_exec; throttle: - if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) { + if (dl_runtime_exceeded(rq, dl_se) || dl_se->dl_yielded) { dl_se->dl_throttled = 1; __dequeue_task_dl(rq, curr, 0); if (unlikely(dl_se->dl_boosted || !start_dl_timer(curr))) -- 1.8.3.1