From: Claudio Scordino
Date: Mon, 27 Mar 2017 10:54:02 +0200
Subject: Re: [RFC v5 2/9] sched/deadline: improve the tracking of active utilization
To: Luca Abeni
Cc: Steven Rostedt, Peter Zijlstra, Linux Kernel, Ingo Molnar, Juri Lelli,
    Tommaso Cucinotta, Daniel Bristot de Oliveira, Joel Fernandes,
    Mathieu Poirier

Hi guys,

2017-03-27 10:20 GMT+02:00 Luca Abeni <luca.abeni@santannapisa.it>:
> On Fri, 24 Mar 2017 22:31:46 -0400
> Steven Rostedt wrote:
>
>> On Fri, 24 Mar 2017 22:47:15 +0100
>> luca abeni wrote:
>>
>> > Ok... Since I am not good at ascii art, would it be ok to add a
>> > textual description? If yes, I'll add a comment like:
>> > "
>> > The utilization of a task is added to the runqueue's active
>> > utilization when the task becomes active (is enqueued in the
>> > runqueue), and is removed when the task becomes inactive. A task
>> > does not become immediately inactive when it blocks, but becomes
>> > inactive at the so called "0 lag time"; so, we setup the "inactive
>> > timer" to fire at the "0 lag time". When the "inactive timer"
>> > fires, the task utilization is removed from the runqueue's active
>> > utilization. If the task wakes up again on the same runqueue before
>> > the "0 lag time", the active utilization must not be changed and
>> > the "inactive timer" must be cancelled. If the task wakes up again
>> > on a different runqueue before the "0 lag time", then the task's
>> > utilization must be removed from the previous runqueue's active
>> > utilization and must be added to the new runqueue's active
>> > utilization. In order to avoid races between a task waking up on a
>> > runqueue while the "inactive timer" is running on a different CPU,
>> > the "dl_non_contending" flag is used to indicate that a task is not
>> > on a runqueue but is active (so, the flag is set when the task
>> > blocks and is cleared when the "inactive timer" fires or when the
>> > task wakes up).
>>
>> Sure, the above is great if you never want anyone to read it ;)
>>
>> Can you please break it up a little. My head starts to spin by the
>> third line down.
>
> Ok... Maybe finding a clean and understandable way to explain the
> above sentence is something that can be done at the OSPM summit?

What about adding the following text to Documentation/ ? Does it make
things clearer ?
Cheers,

Claudio


The following figure illustrates the state names of the GRUB algorithm:

                   ------------
       (d)        |   Active   |
    ------------->|            |
    |             | Contending |
    |              ------------
    |                A      |
 ----------          |      |
|          |         |      |
| Inactive |         |(b)   | (a)
|          |         |      |
 ----------          |      |
    A                |      V
    |              ------------
    |             |   Active   |
    --------------|     Non    |
        (c)       | Contending |
                   ------------

A task can be in one of the following states:

 - ActiveContending: if it is ready for execution (or executing);

 - ActiveNonContending: if it just blocked and has not yet surpassed the
   0-lag time;

 - Inactive: if it is blocked and has surpassed the 0-lag time.

For each runqueue, the GRUB algorithm keeps track of two different
bandwidths:

 - Active bandwidth (running_bw): this is the sum of the bandwidths of all
   tasks in active state (i.e., ActiveContending or ActiveNonContending);

 - Total bandwidth (this_bw): this is the sum of the bandwidths of all tasks
   "belonging" to the runqueue, including the tasks in Inactive state.

State transitions:

 (a) When a task blocks, it does not become immediately inactive since its
     bandwidth cannot be immediately reclaimed without breaking the
     real-time guarantees. It therefore enters a transitional state called
     ActiveNonContending. The scheduler arms the "inactive timer" to fire at
     the 0-lag time, when the task's bandwidth can be reclaimed without
     breaking the real-time guarantees.

 (b) If the task wakes up before the inactive timer fires, the task re-enters
     the ActiveContending state and the "inactive timer" is canceled. In
     addition, if the task wakes up on a different runqueue, then the task's
     utilization must be removed from the previous runqueue's active
     utilization and must be added to the new runqueue's active utilization.
     In order to avoid races between a task waking up on a runqueue while the
     "inactive timer" is running on a different CPU, the "dl_non_contending"
     flag is used to indicate that a task is not on a runqueue but is active
     (so, the flag is set when the task blocks and is cleared when the
     "inactive timer" fires or when the task wakes up).

 (c) When the "inactive timer" fires, the task enters the Inactive state and
     its utilization is removed from the runqueue's active utilization.

 (d) When an inactive task wakes up, it enters the ActiveContending state and
     its utilization is added to the active utilization of the runqueue where
     it has been enqueued.

The algorithm reclaims the bandwidth of the tasks in Inactive state. It does
so by decrementing the runtime of the executing task Ti at a pace equal to
(1 - Uinact)*dt, where Uinact is the inactive utilization, computed as
(this_bw - running_bw).

The GRUB algorithm is described in the following papers:

[1] G. Lipari, S. Baruah, Greedy reclamation of unused bandwidth in
    constant-bandwidth servers, 12th IEEE Euromicro Conference on Real-Time
    Systems, 2000.

[2] L. Abeni, J. Lelli, C. Scordino, L. Palopoli, Greedy CPU reclaiming for
    SCHED_DEADLINE. In Proceedings of the Real-Time Linux Workshop (RTLWS),
    Dusseldorf, Germany, 2014.

[3] L. Abeni, G. Lipari, A. Parri, Y. Sun, Multicore CPU reclaiming:
    parallel or sequential? In Proceedings of the 31st Annual ACM Symposium
    on Applied Computing, 2016.
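
For completeness, below is a small self-contained sketch of the bookkeeping
and of the reclaiming decrement described above. It is plain C with made-up
names (fake_rq, task_contending(), grub_runtime_decrement(), ...) and uses
doubles for readability; it is not the actual scheduler code, which works
with fixed-point integer utilizations, and migrations between runqueues are
not modeled.

#include <stdbool.h>
#include <stdio.h>

struct fake_rq {
        double running_bw;      /* sum of bandwidths of Active tasks          */
        double this_bw;         /* sum of bandwidths of all tasks on this rq  */
};

struct fake_task {
        double bw;              /* task bandwidth: runtime / period           */
        bool dl_non_contending; /* set between (a) and (b)/(c)                */
};

/* Task wakes up: transition (d), or (b) on the same runqueue. */
static void task_contending(struct fake_rq *rq, struct fake_task *t)
{
        if (t->dl_non_contending) {
                /* (b): woke up before the "inactive timer" fired; cancel the
                 * timer, running_bw already accounts for this task. */
                t->dl_non_contending = false;
                return;
        }
        /* (d): Inactive -> ActiveContending */
        rq->running_bw += t->bw;
}

/* (a): the task blocks but stays Active (NonContending) until the 0-lag time. */
static void task_blocks(struct fake_task *t)
{
        t->dl_non_contending = true;    /* "inactive timer" armed at 0-lag */
}

/* (c): the "inactive timer" fires, the bandwidth can finally be reclaimed. */
static void inactive_timer_fires(struct fake_rq *rq, struct fake_task *t)
{
        if (t->dl_non_contending) {
                rq->running_bw -= t->bw;
                t->dl_non_contending = false;
        }
}

/* Reclaiming: charge only (1 - Uinact) * dt of runtime for dt of real time. */
static double grub_runtime_decrement(const struct fake_rq *rq, double dt)
{
        double u_inact = rq->this_bw - rq->running_bw;

        return (1.0 - u_inact) * dt;
}

int main(void)
{
        struct fake_rq rq = { 0 };
        struct fake_task t = { .bw = 0.3 };     /* e.g. 30ms runtime / 100ms period */

        rq.this_bw = t.bw;              /* the task "belongs" to this rq */

        task_contending(&rq, &t);       /* (d): running_bw becomes 0.3 */
        task_blocks(&t);                /* (a): still counted in running_bw */

        /* While ActiveNonContending, nothing can be reclaimed from it: */
        printf("charged: %f\n", grub_runtime_decrement(&rq, 1.0));     /* 1.0 */

        inactive_timer_fires(&rq, &t);  /* (c): 0-lag time reached */

        /* Its 0.3 of bandwidth is now reclaimable: only 0.7*dt is charged. */
        printf("charged: %f\n", grub_runtime_decrement(&rq, 1.0));     /* 0.7 */

        return 0;
}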