Subject: Re: [PATCH] sched/rt: RT_RUNTIME_GREED sched feature
To: Daniel Bristot de Oliveira, Ingo Molnar, Peter Zijlstra
Cc: Steven Rostedt, Christoph Lameter, linux-rt-users, LKML
From: Tommaso Cucinotta
Message-ID: <04fe756b-27f6-b9d0-f0a3-ee66a403cd96@sssup.it>
Date: Mon, 7 Nov 2016 11:31:47 +0100

as already mentioned to Daniel in person:

-) +1 on the general concept; we'd need something similar for
   SCHED_DEADLINE as well
-) the only issue might be that, if a non-RT task wakes up after the
   unthrottle, it will have to wait; but worst case it will get its
   chance in the next throttling window
-) an alternative to unthrottling might be a temporary class downgrade
   to SCHED_OTHER, but that would likely be much more complex, whereas
   Daniel's approach looks quite simple
-) when considering DEADLINE tasks as well, it would be good to think
   about how the throttling of DEADLINE and RT tasks should
   inter-relate, e.g.:
   a) does DEADLINE unthrottle if there are neither RT nor OTHER
      tasks? what if there is an unthrottled RT task?
   b) does DEADLINE throttle by downgrading to OTHER?
   c) does DEADLINE throttle by downgrading to RT (RR/FIFO, and at
      what priority?)

My 2c, thanks!

	T.

On 07/11/2016 09:17, Daniel Bristot de Oliveira wrote:
> The rt throttling mechanism prevents the starvation of non-real-time
> tasks by CPU-intensive real-time tasks. In terms of percentage, the
> default behavior allows real-time tasks to run for up to 95% of a
> given period, leaving the remaining 5% of the period for
> non-real-time tasks. In the absence of non-rt tasks, the system goes
> idle for 5% of the period.
>
> Although this behavior works fine for the purpose of containing
> misbehaving real-time tasks that could otherwise hang the system,
> some greedy users want to allow a real-time task to continue running
> when no non-real-time task is being starved. In other words, they do
> not want to see the system going idle.
>
> This patch implements the RT_RUNTIME_GREED scheduler feature for
> greedy users (TM). When enabled, this feature will check whether
> non-rt tasks are starving before throttling the real-time task. If
> the real-time task becomes throttled, it will be unthrottled as soon
> as the system goes idle, or when the next period starts, whichever
> comes first.
>
> This feature is enabled with the following command:
> # echo RT_RUNTIME_GREED > /sys/kernel/debug/sched_features
>
> The user might also want to disable the RT_RUNTIME_SHARE logic, to
> keep all CPUs with the same rt_runtime:
> # echo NO_RT_RUNTIME_SHARE > /sys/kernel/debug/sched_features
>
> With these two options set, the user will guarantee some runtime for
> non-rt tasks on all CPUs, while keeping real-time tasks running as
> much as possible.
>
> The feature is disabled by default, keeping the current behavior.
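
[ For illustration only, not part of the patch: a minimal userspace
busy-loop that makes the behavior described above observable. The
file name and build line are made up; run it as root, pinned to one
CPU (e.g. taskset -c 1). With default throttling the reported rate on
an otherwise-idle CPU should be roughly 95% of what it is with
RT_RUNTIME_GREED enabled. ]

/* rt-spin.c (illustrative name): busy-loop as SCHED_FIFO and print
 * the loop rate once per second, so the ~5% throttling gap (or its
 * absence with RT_RUNTIME_GREED) becomes visible.
 * Build: gcc -O2 rt-spin.c -o rt-spin
 * Run:   sudo taskset -c 1 ./rt-spin
 */
#include <sched.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
	struct sched_param sp = { .sched_priority = 1 };
	struct timespec now;
	long long iters = 0;
	time_t last;

	if (sched_setscheduler(0, SCHED_FIFO, &sp)) {
		perror("sched_setscheduler"); /* needs root / CAP_SYS_NICE */
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &now);
	last = now.tv_sec;

	for (;;) {
		iters++;
		clock_gettime(CLOCK_MONOTONIC, &now);
		if (now.tv_sec != last) {
			printf("%lld iters/s\n", iters);
			fflush(stdout);
			iters = 0;
			last = now.tv_sec;
		}
	}
}
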
>
> Signed-off-by: Daniel Bristot de Oliveira
> Reviewed-by: Steven Rostedt
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Cc: Steven Rostedt
> Cc: Christoph Lameter
> Cc: linux-rt-users
> Cc: LKML
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 42d4027..c4c62ee 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3275,7 +3275,8 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct pin_cookie cookie
>  		if (unlikely(!p))
>  			p = idle_sched_class.pick_next_task(rq, prev, cookie);
>
> -		return p;
> +		if (likely(p != RETRY_TASK))
> +			return p;
>  	}
>
>  again:
> diff --git a/kernel/sched/features.h b/kernel/sched/features.h
> index 69631fa..3bd7a6d 100644
> --- a/kernel/sched/features.h
> +++ b/kernel/sched/features.h
> @@ -66,6 +66,7 @@ SCHED_FEAT(RT_PUSH_IPI, true)
>
>  SCHED_FEAT(FORCE_SD_OVERLAP, false)
>  SCHED_FEAT(RT_RUNTIME_SHARE, true)
> +SCHED_FEAT(RT_RUNTIME_GREED, false)
>  SCHED_FEAT(LB_MIN, false)
>  SCHED_FEAT(ATTACH_AGE_LOAD, true)
>
> diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
> index 5405d3f..0f23e06 100644
> --- a/kernel/sched/idle_task.c
> +++ b/kernel/sched/idle_task.c
> @@ -26,6 +26,10 @@ static void check_preempt_curr_idle(struct rq *rq, struct task_struct *p, int fl
>  static struct task_struct *
>  pick_next_task_idle(struct rq *rq, struct task_struct *prev, struct pin_cookie cookie)
>  {
> +	if (sched_feat(RT_RUNTIME_GREED))
> +		if (try_to_unthrottle_rt_rq(&rq->rt))
> +			return RETRY_TASK;
> +
>  	put_prev_task(rq, prev);
>  	update_idle_core(rq);
>  	schedstat_inc(rq->sched_goidle);
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 2516b8d..a6961a5 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -631,6 +631,22 @@ static inline struct rt_bandwidth *sched_rt_bandwidth(struct rt_rq *rt_rq)
>
>  #endif /* CONFIG_RT_GROUP_SCHED */
>
> +static inline void unthrottle_rt_rq(struct rt_rq *rt_rq)
> +{
> +	rt_rq->rt_time = 0;
> +	rt_rq->rt_throttled = 0;
> +	sched_rt_rq_enqueue(rt_rq);
> +}
> +
> +int try_to_unthrottle_rt_rq(struct rt_rq *rt_rq)
> +{
> +	if (rt_rq_throttled(rt_rq)) {
> +		unthrottle_rt_rq(rt_rq);
> +		return 1;
> +	}
> +	return 0;
> +}
> +
>  bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
>  {
>  	struct rt_bandwidth *rt_b = sched_rt_bandwidth(rt_rq);
> @@ -920,6 +936,18 @@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
>  	 * but accrue some time due to boosting.
>  	 */
>  	if (likely(rt_b->rt_runtime)) {
> +		if (sched_feat(RT_RUNTIME_GREED)) {
> +			struct rq *rq = rq_of_rt_rq(rt_rq);
> +			/*
> +			 * If there are no other tasks able to run
> +			 * on this rq, let's be greedy and reset our
> +			 * rt_time.
> +			 */
> +			if (rq->nr_running == rt_rq->rt_nr_running) {
> +				rt_rq->rt_time = 0;
> +				return 0;
> +			}
> +		}
>  		rt_rq->rt_throttled = 1;
>  		printk_deferred_once("sched: RT throttling activated\n");
>  	} else {
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 055f935..450ca34 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -502,6 +502,8 @@ struct rt_rq {
>  #endif
>  };
>
> +int try_to_unthrottle_rt_rq(struct rt_rq *rt_rq);
> +
>  /* Deadline class' related fields in a runqueue */
>  struct dl_rq {
>  	/* runqueue is an rbtree, ordered by deadline */
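
Regarding point a) above: a rough, hypothetical sketch of what a
DEADLINE counterpart of try_to_unthrottle_rt_rq() could look like,
modeled on what the dl_task_timer() replenishment handler already does
in 4.9-era kernel/sched/deadline.c. The wrapper name is invented, the
rq lock is assumed held, and PI/SMP details are ignored:

/*
 * Hypothetical sketch only, not part of the patch. DEADLINE
 * throttling is per-entity (dl_se->dl_throttled), not per-rq, so
 * this operates on a single throttled entity.
 */
static int try_to_unthrottle_dl_se(struct rq *rq, struct sched_dl_entity *dl_se)
{
	struct task_struct *p = dl_task_of(dl_se);

	if (!dl_se->dl_throttled)
		return 0;

	/* A replenishment timer is armed while throttled; disarm it. */
	hrtimer_try_to_cancel(&dl_se->dl_timer);

	/*
	 * ENQUEUE_REPLENISH makes the enqueue path grant fresh runtime
	 * and push the deadline forward, as the timer handler would
	 * have done at the start of the next period.
	 */
	dl_se->dl_throttled = 0;
	enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);

	return 1;
}

Whether handing out a fresh runtime and deadline early like this is
compatible with the admission-control guarantees, and how it should
interact with an unthrottled greedy RT task, is exactly the kind of
inter-relation question raised in points a)-c) above.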