Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933138AbcKJM4z (ORCPT ); Thu, 10 Nov 2016 07:56:55 -0500 Received: from merlin.infradead.org ([205.233.59.134]:53464 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932697AbcKJM4x (ORCPT ); Thu, 10 Nov 2016 07:56:53 -0500 Date: Thu, 10 Nov 2016 13:56:46 +0100 From: Peter Zijlstra To: Henrik Austad Cc: Ingo Molnar , Thomas Gleixner , Juri Lelli , Luca Abeni , Steven Rostedt , Claudio Scordino , Tommaso Cucinotta , Daniel Bistrot de Oliveira , linux-kernel@vger.kernel.org Subject: Re: [RFD] sched/deadline: Support single CPU affinity Message-ID: <20161110125646.GS3117@twins.programming.kicks-ass.net> References: <20161110080807.GD11311@worktop.programming.kicks-ass.net> <20161110122100.GA30974@sisyphus.home.austad.us> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20161110122100.GA30974@sisyphus.home.austad.us> User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3979 Lines: 89 On Thu, Nov 10, 2016 at 01:21:00PM +0100, Henrik Austad wrote: > On Thu, Nov 10, 2016 at 09:08:07AM +0100, Peter Zijlstra wrote: > > We define the time to fail as: > > > > ttf(t) := t_d - t_b; where > > > > t_d is t's absolute deadline > > t_b is t's remaining budget > > > > This is the last possible moment we must schedule this task such that it can > > complete its work and not miss its deadline. > > To elaborate a bit on this (this is a modified LLF approach if my memory > serves): > > You have the dynamic time-to-failure (TtF), i.e. as the task progresses > (scheduled to run), the relative time-to-failure will remain constant. This > can be used to compare thasks to a running task and should minimize the > number of calculations required. > > Then you have the static Time-of-failure (ToF, which is the absoulte time > when a task will no longer be able to meet its deadline. This is what you > use for keeping a sorted list of tasks in the runqueue. As this is a fixed > point in time, you do not have to dynamically update or do crazy > calculation when inserting/removing threads from the rq. > > > If we then augment the regular EDF rules by, for local tasks, considering the > > time to fail and let this measure override the regular EDF pick when the > > time to fail can be overran by the EDF pick. > > Then, if you do this - do you need to constrict this to a local CPU? I > *think* you could do this in a global scheduler if you use ToF/TtF for all > deadline-tasks, I think you should be able to meet deadlines. > > I had a rant about this way back [1,2 Sec 11.4], I need to sit down and > re-read most of it, it has been a few too many years, but the idea was to > minimize the number of context-switches (which LLF is prone to get a lot > of) as well as minimize the computational overhead by avoiding > re-calculating time-of-failre/time-to-failre a lot. > > > That is, when: > > > > now + left_b > min(ttf) > > Why not just use ttf/tof for all deadline-tasks? We have all the > information available anyway and it would probably make the internal logic > easier? So the point is that the total task set isn't able to meet its deadlines per-se. Therefore, no matter which scheduling function you pick, be it EDF, LLF, TTF or EDZL, you'll miss deadlines. Consider the trivial example of 2 CPU and 3 tasks with u=2/3. That's just not going to work (with static job priority schedulers, so lets not do Pfair etc. ;-). The point here is to allow the local tasks to operate UP-like and therefore get UP-like guarantees. We know all these scheduling functions (EDF, LLF, TTF, EDZL) are optimal on UP. At the same time, we'd like the global task set to not unduly suffer, that is, if the total task set is schedulable under G-EDF, then we'd like to actually achieve this. So we have to distinguish between the local and global tasks from principle, as we have different criticality requirements for them. We want to always meet the local deadlines, but the global tasks are subject to tardiness depending on the AC function. Now, mixed criticality in general is 'proven' NP hard. But since the sporadic task model has at least 2 variables and we need to distinguish between 2 levels only, I feel this should be tractable. And this is where and why I introduced TTF, its a second measure, next to the exiting earlier deadline, to differentiate the criticality levels. This then also means we should only use TTF for local tasks, otherwise it cannot differentiate. Furthermore, as you mentioned, we don't immediately use the 0-laxity point as per EDZL (I've only read the link Tommaso provided, didn't get the article per paywall), since 0-laxity is a very fragile point, _any_ system disturbance after that will mean a fail, also, getting it requires a timer. So I slightly relaxed that to the last natural scheduling point before that, which is where the: now + leftmost_budget > min(ttf) constraint comes from.