Subject: Re: [PATCH] sched/rt: RT_RUNTIME_GREED sched feature
From: Daniel Bristot de Oliveira
To: Tommaso Cucinotta, Ingo Molnar, Peter Zijlstra
Cc: Steven Rostedt, Christoph Lameter, linux-rt-users, LKML
Date: Mon, 7 Nov 2016 14:51:37 +0100

Hi Tommaso,

On 11/07/2016 11:31 AM, Tommaso Cucinotta wrote:
> as anticipated live to Daniel:
> -) +1 for the general concept, we'd need something similar also for
> SCHED_DEADLINE

In short: the sum of the runtime of deadline tasks will not be greater
than "to_ratio(global_rt_period(), global_rt_runtime())" - see
init_dl_bw(). Therefore, the DL rq will not be throttled by the RT
throttling mechanism.

In more detail: RT tasks' throttling aims to bound the amount of time
that RT tasks can run continuously, either for all CPUs of a domain
(when RT_RUNTIME_SHARING is enabled) or per-rq (when RT_RUNTIME_SHARING
is disabled), so as to leave some CPU time for non-real-time tasks to
run. RT tasks need this global/local throttling mechanism to avoid
starving non-rt tasks, because RT tasks do not have a limited runtime -
an RT task (or taskset) can run for an unbounded amount of time.

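To put a number on the bound named above, here is a rough Python model of the ratio and of a DL-style admission check. It is only a sketch, not the kernel code: dl_admit() and its parameters are hypothetical names, the 20-bit fixed-point shift and the 950 ms / 1 s defaults are assumptions about the usual configuration, and the real check in the scheduler works per root domain with more bookkeeping.

```python
BW_SHIFT = 20                 # assumed fixed-point shift for bandwidth values
BW_UNIT = 1 << BW_SHIFT       # represents a utilization of 1.0
RUNTIME_INF = (1 << 64) - 1   # "no limit" marker (assumed encoding)


def to_ratio(period_ns, runtime_ns):
    """Return runtime/period as a fixed-point fraction (model of to_ratio())."""
    if runtime_ns == RUNTIME_INF:
        return BW_UNIT
    if period_ns == 0:
        return 0
    return (runtime_ns << BW_SHIFT) // period_ns


def dl_admit(tasks, cpus, rt_period_ns=1_000_000_000, rt_runtime_ns=950_000_000):
    """Hypothetical admission test: accept the task set only if the summed
    DL bandwidth stays within the RT throttling ratio times the CPU count.

    tasks is a list of (runtime_ns, period_ns) pairs.
    """
    limit = to_ratio(rt_period_ns, rt_runtime_ns) * cpus
    total = sum(to_ratio(period, runtime) for runtime, period in tasks)
    return total <= limit


if __name__ == "__main__":
    task = (10_000_000, 100_000_000)   # 10 ms of runtime every 100 ms
    print(dl_admit([task] * 9, cpus=1))
    print(dl_admit([task] * 10, cpus=1))
```

Under these assumptions, nine such tasks (about 90% of one CPU) fit below the 95% limit, while a tenth would push the sum over it and be rejected up front. That is the sense in which DL tasks never reach the RT throttling limit at run time.
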
DL tasks' throttling has another meaning. DL tasks' throttling aims to
prevent *a* DL task from running for more than *its own* pre-allocated
runtime.

The sum of allocated runtime for all DL tasks will not be greater than
the RT throttling enforcement runtime. The DL scheduler admission
control already ensures this by limiting the amount of CPU time all DL
tasks can consume (see init_dl_bw()). So, DL tasks avoid the "global"
throttling beforehand - in the admission control.

GRUB might implement something <> for the DEADLINE scheduler. With
GRUB, a deadline task will have more runtime than previously
set/granted... But I am quite sure it will still be bounded by the sum
of the already allocated DL runtime, which will continue to be smaller
than "to_ratio(global_rt_period(), global_rt_runtime())".

Am I missing something?

> -) only issue might be that, if a non-RT task wakes up after the
> unthrottle, it will have to wait, but worst-case it will have a chance
> in the next throttling window

In the current default behavior (RT_RUNTIME_SHARING), in a domain with
more than two CPUs, the worst case easily becomes "infinity", because a
CPU can borrow runtime from another CPU. There is no guarantee of a
minimum latency for non-rt tasks. Anyway, if the user wants to provide
such a guarantee, they just need to leave this feature disabled and
also disable RT_RUNTIME_SHARING (or run the non-rt task as a deadline
task ;-))

> -) an alternative to unthrottling might be temporary class downgrade to
> sched_other, but that might be much more complex, instead this Daniel's
> one looks quite simple

Yeah, decreasing the priority of the task would be way more complicated
and error-prone. An RT task would need to reduce its priority to a
level higher than the IDLE task, but lower than SCHED_IDLE...

> -) when considering also DEADLINE tasks, it might be good to think about
> how we'd like the throttling of DEADLINE and RT tasks to inter-relate,
> e.g.:

Currently, DL tasks are limited (in the bw control) to the global RT
throttling limit...

I think that this might be an extension to GRUB, that is, an extension
of the current behavior. So it is something for the future and, IMHO,
another topic - a way more challenging one.

Comments are welcome :-)

-- Daniel