Date: Sun, 9 Oct 2016 21:39:38 +0200
From: Luca Abeni
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, Tommaso Cucinotta, Juri Lelli, Thomas Gleixner,
    Andrea Parri
Subject: About group scheduling for SCHED_DEADLINE
Message-ID: <20161009213938.3cec05ea@utopia>

Hi all,

after the SCHED_DEADLINE TODO page
(https://github.com/jlelli/sched-deadline/wiki/TODOs) was published,
there was a private exchange of emails about the "group scheduling
(cgroups)" / "hierarchical DEADLINE server for FIFO/RR" item.
I'd like to start a discussion about this topic, so that the TODO item
can be implemented in a way that everyone agrees on.
I have cc'd all the people involved in the previous email exchange
about this topic, plus Andrea, who originally developed a patch
implementing hierarchical SCHED_DEADLINE (see
http://retis.sssup.it/~nino/publication/rtlws14bdm.pdf and cited
papers); I do not know who else to cc, so feel free to forward this
email to the relevant people or to tell me who to add in future emails.

So, I started to think about this, and here are some ideas to start a
discussion:

1) First of all, we need to decide the software interface.
If I understand correctly (please correct me if I am wrong), cgroups
let you specify a runtime and a period, and this means that the cgroup
is reserved the specified runtime every period on all of the cgroup's
CPUs. In other words, it is not possible to reserve different
runtimes/periods on different CPUs. Is this correct?
Is this what we want for hierarchical SCHED_DEADLINE? Or do we want to
allow a cgroup to be scheduled with multiple "deadline servers" having
different runtime/period parameters? (The first solution is easier to
implement; the second one offers more degrees of freedom that might be
used to improve real-time schedulability.)

2) Is it OK to have only two levels in the scheduling hierarchy (at
least in the first implementation)?

3) If this "hierarchical SCHED_DEADLINE" is implemented using multiple
"deadline servers" (one per cgroup CPU) to schedule the cgroup's tasks,
should these servers be bound to CPUs, or should they be free to
migrate between the cgroup's CPUs?
In the first case, each of these deadline servers can be implemented as
a sched_dl_entity structure that can be scheduled only on a specific
runqueue. The second case is (in my understanding) more complex to
implement, because the dl push/pull code uses task structures, so a dl
scheduling entity per server is not enough (unless we modify the
migration code). At least, this is what I understood when looking at
the code.

4) From a more theoretical point of view, it would be good to define
the scheduling model that needs to be implemented (either based on
something previously described in a paper, or defining a new model from
scratch).

Well, I hope this can be a good starting point for a discussion :)


			Luca
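
To make the two interface options in question 1 concrete, here is a
minimal sketch of the bandwidth each one reserves. It is illustrative
only: the function names and the numbers are assumptions made up for
this example, not an existing kernel or cgroup interface.

```python
from fractions import Fraction


def uniform_reservation(runtime_us, period_us, cpus):
    """Option A (hypothetical helper): one (runtime, period) pair is
    applied to every CPU of the cgroup."""
    return {cpu: Fraction(runtime_us, period_us) for cpu in cpus}


def per_cpu_reservation(servers):
    """Option B (hypothetical helper): one deadline server per CPU,
    each with its own (runtime, period) pair."""
    return {cpu: Fraction(runtime_us, period_us)
            for cpu, (runtime_us, period_us) in servers.items()}


def total_bandwidth(reservation):
    """Sum of the per-CPU utilizations (runtime / period)."""
    return sum(reservation.values())


# Option A: 10 ms every 100 ms on both CPUs of the cgroup.
uniform = uniform_reservation(10_000, 100_000, cpus=[0, 1])

# Option B: same total bandwidth, but distributed unevenly and with
# different periods on the two CPUs.
mixed = per_cpu_reservation({0: (15_000, 100_000), 1: (2_500, 50_000)})

for name, res in (("uniform", uniform), ("per-CPU", mixed)):
    per_cpu = ", ".join(f"CPU{c}: {u}" for c, u in sorted(res.items()))
    print(f"{name}: {per_cpu} (total {total_bandwidth(res)})")
```

Both reservations add up to the same total bandwidth (1/5 of a CPU),
but only the second form can express different runtimes and periods
on different CPUs.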