Date: Thu, 10 Nov 2016 15:23:21 +0100
From: luca abeni
To: Henrik Austad
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Juri Lelli,
 Steven Rostedt, Claudio Scordino, Tommaso Cucinotta,
 Daniel Bistrot de Oliveira, linux-kernel@vger.kernel.org
Subject: Re: [RFD] sched/deadline: Support single CPU affinity
Message-ID: <20161110152321.5372a582@sweethome>
In-Reply-To: <20161110125635.GB30974@sisyphus.home.austad.us>
References: <20161110080807.GD11311@worktop.programming.kicks-ass.net>
 <20161110122100.GA30974@sisyphus.home.austad.us>
 <20161110133840.7a6faa55@sweethome>
 <20161110125635.GB30974@sisyphus.home.austad.us>
Organization: university of trento

Hi Henrik,

On Thu, 10 Nov 2016 13:56:35 +0100
Henrik Austad wrote:
> On Thu, Nov 10, 2016 at 01:38:40PM +0100, luca abeni wrote:
> > Hi Henrik,
>
> Hi Luca,
>
> > On Thu, 10 Nov 2016 13:21:00 +0100
> > Henrik Austad wrote:
> > > On Thu, Nov 10, 2016 at 09:08:07AM +0100, Peter Zijlstra wrote:
> > [...]
> > > > We define the time to fail as:
> > > >
> > > >   ttf(t) := t_d - t_b; where
> > > >
> > > >     t_d is t's absolute deadline
> > > >     t_b is t's remaining budget
> > > >
> > > > This is the last possible moment we must schedule this task such
> > > > that it can complete its work and not miss its deadline.
> > >
> > > To elaborate a bit on this (this is a modified LLF approach if my
> > > memory serves):
> > >
> > > You have the dynamic time-to-failure (TtF), i.e. as the task
> > > progresses (scheduled to run), the relative time-to-failure will
> > > remain constant. This can be used to compare tasks to a running
> > > task and should minimize the number of calculations required.
> > >
> > > Then you have the static Time-of-failure (ToF), which is the
> > > absolute time when a task will no longer be able to meet its
> > > deadline. This is what you use for keeping a sorted list of tasks
> > > in the runqueue. As this is a fixed point in time, you do not
> > > have to dynamically update or do crazy calculations when
> > > inserting/removing threads from the rq.
> >
> > Sorry, I am missing something here: if ttf is defined as
> >   ttf_i = d_i - q_i
>
> So I picked the naming somewhat independently of Peter; his approach
> is the _absolute_ time of failure, the actual time X, irrespective of
> the task running or not.
>
> I added 2 different measures for the same thing:
>
> * ToF:
> The absolute time of failure is the point in time when the task will
> no longer be able to meet its deadline. If a task is scheduled and is
> running on a CPU, this value will move forward at the speed of
> execution, i.e. when the task is running, this value is changing.
> When the task is waiting in the runqueue, this value is constant.

Ah, ok... So, if I understand well, your ToF is Peter's ttf... Right?
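
Just to fix the idea (for myself) in -dl terms, I guess this could be
computed from the fields the deadline entity already carries; a minimal
sketch, with a made-up helper name (this is not from Peter's patch):

	/*
	 * Sketch only: Peter's ttf / Henrik's ToF.  dl_se->deadline is the
	 * absolute deadline and dl_se->runtime the remaining budget, so
	 * this is the last instant at which the task can still be put on a
	 * CPU and consume its whole budget before the deadline (runtime
	 * overruns, i.e. negative runtime, are ignored here).
	 */
	static inline u64 dl_time_of_failure(struct sched_dl_entity *dl_se)
	{
		return dl_se->deadline - dl_se->runtime;
	}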
> * TtF:
> The relative time to failure is the value that is tied to the local
> CPU, so to speak. When a task is running, this value is constant, as
> it is the remaining time until the task is no longer able to meet its
> deadline. When the task is enqueued, this value will steadily
> decrease as it draws closer to the time when it will fail.

So, if I understand well, TtF = ToF - current time... Right?
I think this is often called "laxity" or "slack time". No?

[...]
> > > > If we then augment the regular EDF rules by, for local tasks,
> > > > considering the time to fail and let this measure override the
> > > > regular EDF pick when the time to fail can be overrun by the EDF
> > > > pick.
> > >
> > > Then, if you do this - do you need to restrict this to a local
> > > CPU? I *think* you could do this in a global scheduler: if you use
> > > ToF/TtF for all deadline-tasks, you should be able to meet
> > > deadlines.
> > I think the ToF/TtF scheduler will be equivalent to LLF (see
> > previous emails)... Or am I misunderstanding something? (see above)
> > And LLF is not optimal on multiple CPUs, so I do not think it will
> > be able to meet deadlines if you use it as a global scheduler.
>
> I think I called it Earliest Failure First (I really wanted to call
> it failure-driven scheduling, but that implied a crappy scheduler ;)

Ok, but... How is it different from LLF? In my understanding (but,
again, maybe I am missing something) ToF and TtF are just a way to
implement LLF more efficiently (because, as you notice, ToF does not
change when a task is waiting in the runqueue and TtF does not change
when the task is executing).


			Luca

> LLF is prone to a high task-switch count when multiple threads get
> close to 0 laxity. But as I said, it's been a while since I last
> worked through the theory, so I have some homework to do before
> arguing too hard about this.
>
> > > I had a rant about this way back [1,2 Sec 11.4], I need to sit
> > > down and re-read most of it, it has been a few too many years,
> > > but the idea was to minimize the number of context-switches
> > > (which LLF is prone to get a lot of) as well as minimize the
> > > computational overhead by avoiding re-calculating
> > > time-of-failure/time-to-failure a lot.
> >
> > > > That is, when:
> > > >
> > > >   now + left_b > min(ttf)
> > >
> > > Why not just use ttf/tof for all deadline-tasks? We have all the
> > > information available anyway and it would probably make the
> > > internal logic easier?
> > I think LLF causes more preemptions and migrations than EDF.
>
> yes, it does, which is why you need to adjust LLF to minimize the
> number of task-switches.
>
> -Henrik
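
P.S.: for my own notes, the "override" rule quoted above boils down to
a check like the one below (helper and parameter names are made up,
just to make the condition explicit; this is not code from the RFD):

	/*
	 * Sketch of the augmented EDF pick: keep the normal EDF choice only
	 * if running it for its remaining budget (left_b) does not push us
	 * past the earliest time-to-fail among the tasks pinned to this
	 * CPU; otherwise the local task with min(ttf) has to run first.
	 */
	static inline bool edf_pick_breaks_local_ttf(u64 now, u64 left_b,
						     u64 min_local_ttf)
	{
		return now + left_b > min_local_ttf;
	}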