Subject: Re: Task group cleanups and optimizations (was: Re: [RFC 00/60] Coscheduling for Linux)
From: Rik van Riel
To: "Jan H. Schönherr", Peter Zijlstra
Cc: Ingo Molnar, linux-kernel@vger.kernel.org, Paul Turner, Vincent Guittot, Morten Rasmussen, Tim Chen
Date: Tue, 18 Sep 2018 10:35:39 -0400
In-Reply-To: <08b930d9-7ffe-7df3-ab35-e7b58073e489@amazon.de>
References: <20180907214047.26914-1-jschoenh@amazon.de> <20180914111251.GC24106@hirez.programming.kicks-ass.net> <1d86f497-9fef-0b19-50d6-d46ef1c0bffa@amazon.de> <282230fe-b8de-01f9-c19b-6070717ba5f8@amazon.de> <20180917094844.GR24124@hirez.programming.kicks-ass.net> <08b930d9-7ffe-7df3-ab35-e7b58073e489@amazon.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2018-09-18 at 15:22 +0200, Jan H. Schönherr wrote:
> On 09/17/2018 11:48 AM, Peter Zijlstra wrote:
> > On Sat, Sep 15, 2018 at 10:48:20AM +0200, Jan H. Schönherr wrote:
> > >
> > > CFS bandwidth control would also need to change significantly as we would now
> > > have to dequeue/enqueue nested cgroups below a throttled/unthrottled hierarchy.
> > > Unless *those* task groups don't participate in this flattening.
> >
> > Right, so the whole bandwidth thing becomes a pain; the simplest
> > solution is to detect the throttle at task-pick time, dequeue and try
> > again. But that is indeed quite horrible.
> >
> > I'm not quite sure how this will play out.
> >
> > Anyway, if we pull off this flattening feat, then you can no longer use
> > the hierarchy for this co-scheduling stuff.
>
> Yeah. I might be a bit biased towards keeping or at least not fully throwing away
> the nesting of CFS runqueues. ;)

I do not have a strong bias either way.
However, I would like the overhead of the cpu controller to be so low
that we can actually use it :)

Task priorities in a flat runqueue are relatively straightforward, with
vruntime scaling just like done for nice levels, but I have to admit
that throttled groups provide a challenge.

Dequeueing throttled tasks is pretty straightforward, but requeueing
them afterwards when they are no longer throttled could present a real
challenge in some situations.

> However, the only efficient way that I can currently think of, is a hybrid model
> between the "full nesting" that is currently there, and the "no nesting" you were
> describing above.
>
> It would flatten all task groups that do not actively contribute some function,
> which would be all task groups purely for accounting purposes and those for
> *unthrottled* CFS hierarchies (and those for coscheduling that contain exactly
> one SE in a runqueue). The nesting would still be kept for *throttled* hierarchies
> (and the coscheduling stuff). (And if you wouldn't have mentioned a way to get
> rid of nesting completely, I would have kept a single level of nesting for
> accounting purposes as well.)
>
> This would allow us to lazily dequeue SEs that have run out of bandwidth when
> we encounter them, and already enqueue them in the nested task group (whose SE
> is not enqueued at the moment). That way, it's still a O(1) operation to re-enable
> all tasks, once runtime is available again. And O(1) to throttle a repeat offender.

I suspect most systems will have a number of runnable tasks no larger
than the number of CPUs most of the time.

That makes "re-enable all the tasks" often equivalent to "re-enable one
task".

Can we handle the re-enabling (or waking up!) of one task almost as fast
as we can without the cpu controller?

-- 
All Rights Reversed.
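
For readers following the thread, the hybrid model quoted above (lazily
dequeue throttled entities when they are encountered, park them below
their task group, and re-enable them with a single enqueue of the group's
SE) can be illustrated with a small toy model. This is only a sketch in
Python under stated assumptions, not kernel code and not anyone's proposed
patch: the names HybridRunqueue, Group, Task, parked and nested are
hypothetical, a binary heap stands in for the CFS rbtree, and a group SE
simply borrows the smallest vruntime of its parked tasks.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional

_seq = itertools.count()  # tie-breaker so heap entries never compare entities


@dataclass
class Group:                      # hypothetical stand-in for a task group
    name: str
    throttled: bool = False
    queued: bool = False          # is the group's own SE on the top-level queue?
    nested: list = field(default_factory=list)  # heap of tasks parked below the group


@dataclass
class Task:                       # hypothetical stand-in for a task's SE
    name: str
    group: Optional[Group] = None
    vruntime: float = 0.0
    parked: bool = False          # True once the task was moved into group.nested


class HybridRunqueue:
    def __init__(self):
        self.top = []             # heap of (vruntime, seq, Task or Group)

    def _push(self, heap, vruntime, entity):
        heapq.heappush(heap, (vruntime, next(_seq), entity))

    def enqueue(self, task):
        if not task.parked:       # flattened case: the task sits directly on the queue
            self._push(self.top, task.vruntime, task)
            return
        group = task.group        # nested case: requeue below the group
        self._push(group.nested, task.vruntime, task)
        if not group.throttled and not group.queued:
            self._push(self.top, task.vruntime, group)
            group.queued = True

    def throttle(self, group):
        # Only flip a flag; queued entities are dropped lazily in pick_next().
        group.throttled = True

    def unthrottle(self, group):
        # A single enqueue of the group SE re-enables every parked task.
        group.throttled = False
        if group.nested and not group.queued:
            self._push(self.top, group.nested[0][0], group)
            group.queued = True

    def pick_next(self):
        while self.top:
            _, _, entity = heapq.heappop(self.top)
            if isinstance(entity, Group):
                entity.queued = False
                if entity.throttled or not entity.nested:
                    continue      # throttled or empty group SE: drop it
                _, _, task = heapq.heappop(entity.nested)
                if entity.nested:  # keep the group SE queued for its other tasks
                    self._push(self.top, entity.nested[0][0], entity)
                    entity.queued = True
                return task
            if entity.group is not None and entity.group.throttled:
                entity.parked = True  # lazy dequeue into the nested group
                self._push(entity.group.nested, entity.vruntime, entity)
                continue
            return entity
        return None
```

In this toy, throttle() and unthrottle() each touch one flag and at most
one queue entry regardless of how many tasks the group holds, which is the
property the quoted text calls O(1). The cost moves into pick_next(),
which pays once per throttled task it encounters. The sketch ignores
weights, per-CPU queues and runtime accounting, and it does not answer the
question of how cheap a single-task wakeup can be made.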