Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933172AbcKJM5G (ORCPT ); Thu, 10 Nov 2016 07:57:06 -0500 Received: from mail-lf0-f66.google.com ([209.85.215.66]:35401 "EHLO mail-lf0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932985AbcKJM4y (ORCPT ); Thu, 10 Nov 2016 07:56:54 -0500 Date: Thu, 10 Nov 2016 13:56:35 +0100 From: Henrik Austad To: luca abeni Cc: Peter Zijlstra , Ingo Molnar , Thomas Gleixner , Juri Lelli , Steven Rostedt , Claudio Scordino , Tommaso Cucinotta , Daniel Bistrot de Oliveira , linux-kernel@vger.kernel.org Subject: Re: [RFD] sched/deadline: Support single CPU affinity Message-ID: <20161110125635.GB30974@sisyphus.home.austad.us> References: <20161110080807.GD11311@worktop.programming.kicks-ass.net> <20161110122100.GA30974@sisyphus.home.austad.us> <20161110133840.7a6faa55@sweethome> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Yylu36WmvOXNoKYn" Content-Disposition: inline In-Reply-To: <20161110133840.7a6faa55@sweethome> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5213 Lines: 139 --Yylu36WmvOXNoKYn Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Nov 10, 2016 at 01:38:40PM +0100, luca abeni wrote: > Hi Henrik, Hi Luca, > On Thu, 10 Nov 2016 13:21:00 +0100 > Henrik Austad wrote: > > On Thu, Nov 10, 2016 at 09:08:07AM +0100, Peter Zijlstra wrote: > [...] > > > We define the time to fail as: > > >=20 > > > ttf(t) :=3D t_d - t_b; where > > >=20 > > > t_d is t's absolute deadline > > > t_b is t's remaining budget > > >=20 > > > This is the last possible moment we must schedule this task such > > > that it can complete its work and not miss its deadline. =20 > >=20 > > To elaborate a bit on this (this is a modified LLF approach if my > > memory serves): > >=20 > > You have the dynamic time-to-failure (TtF), i.e. as the task > > progresses (scheduled to run), the relative time-to-failure will > > remain constant. This can be used to compare thasks to a running task > > and should minimize the number of calculations required. > >=20 > > Then you have the static Time-of-failure (ToF, which is the absoulte > > time when a task will no longer be able to meet its deadline. This is > > what you use for keeping a sorted list of tasks in the runqueue. As > > this is a fixed point in time, you do not have to dynamically update > > or do crazy calculation when inserting/removing threads from the rq. >=20 > Sorry, I am missing something here: if ttf is defined as > ttf_i =3D d_i - q_i So I picked the naming somewhat independently of Peter, his approach is=20 the _absolute_ time of failure, the actual time X, irrespective of the task= =20 running or not. I added 2 different measures for the same thing: * ToF:=20 The absolute time of failure is the point in time when the task will no=20 longer be able to meet its deadline. If a task is scheduled and is running= =20 on a CPU, this value will move forward at the speed of execution. i.e. when= =20 the task is running, this value is changing. When the task is waiting in=20 the runqueue, this value is constant. TtF: The relative time to failure is the value that is tied to the local CPU so= =20 to speak. When a task is running, this value is constant as it is the=20 remaining time until the task is no longer able to meet its deadline. When= =20 the task is enqueued, this value will steadily decrease as it draws closer= =20 to the time when it will fail. So when a task is running on a CPU, you use TtF, when it is in the runqueue= =20 you compare ToF > (where d_i is the deadline of thask i and q_i is its remaining budget), > then it also is the time before which you have to schedule task i if > you do not want to miss the deadline... No? So, I do not understand the > difference with tof. So you can calculate one form the other given absolute deadline and=20 remaining budget (or consumed CPU-time). But it is handy to use both as it= =20 removes a lot of duplicity and once you get the hang of the terms, makes it= =20 a bit easier to reason about the system. > > > If we then augment the regular EDF rules by, for local tasks, > > > considering the time to fail and let this measure override the > > > regular EDF pick when the time to fail can be overran by the EDF > > > pick. =20 > >=20 > > Then, if you do this - do you need to constrict this to a local CPU? > > I *think* you could do this in a global scheduler if you use ToF/TtF > > for all deadline-tasks, I think you should be able to meet deadlines. > I think the ToF/TtF scheduler will be equivalent to LLF (see previous > emails)... Or am I misunderstanding something? (see above) > And LLF is not optimal on multiple CPUs, so I do not think it will be > able to meet deadlines if you use it as a global scheduler. I think I called it Earliest Failure First (I really wanted to call it=20 failure-driven scheduling but that implied a crappy scheduler ;) LLF is prone to high task-switch count when multiple threads gets close to= =20 0 laxity. But as I said, it's been a while since I last worked through the= =20 theory, so I have some homework to do before arguing too hard about this. > > I had a rant about this way back [1,2 Sec 11.4], I need to sit down > > and re-read most of it, it has been a few too many years, but the > > idea was to minimize the number of context-switches (which LLF is > > prone to get a lot of) as well as minimize the computational overhead > > by avoiding re-calculating time-of-failre/time-to-failre a lot. > >=20 > > > That is, when: > > >=20 > > > now + left_b > min(ttf) =20 > >=20 > > Why not just use ttf/tof for all deadline-tasks? We have all the=20 > > information available anyway and it would probably make the internal > > logic easier? > I think LLF causes more preemptions and migrations than EDF. yes, it does, which is why you need to adjust LLF to minimize the number of= =20 task-switches. -Henrik --Yylu36WmvOXNoKYn Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEARECAAYFAlgkboMACgkQ6k5VT6v45lkQrgCdHf3vgBuxKby8p6J0e+zBNCZ3 kIUAn1fgPk013C1zj6vAq1zXW0ESA+1h =OUpS -----END PGP SIGNATURE----- --Yylu36WmvOXNoKYn--