Subject: Re: RFC: documentation of the autogroup feature [v2]
From: "Michael Kerrisk (man-pages)"
To: Peter Zijlstra
Cc: mtk.manpages@gmail.com, Mike Galbraith, Ingo Molnar, linux-man, lkml, Thomas Gleixner
Date: Fri, 25 Nov 2016 17:33:23 +0100
In-Reply-To: <20161125160456.GP3092@twins.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Peter,

On 11/25/2016 05:04 PM, Peter Zijlstra wrote:
> On Fri, Nov 25, 2016 at 04:04:25PM +0100, Michael Kerrisk (man-pages) wrote:
>>>> ┌─────────────────────────────────────────────────────┐
>>>> │FIXME                                                │
>>>> ├─────────────────────────────────────────────────────┤
>>>> │How do the nice value of a process and the nice      │
>>>> │value of an autogroup interact? Which has priority?  │
>>>> │                                                     │
>>>> │It *appears* that the autogroup nice value is used   │
>>>> │for CPU distribution between task groups, and that   │
>>>> │the process nice value has no effect there. (I.e.,   │
>>>> │suppose two autogroups each contain a CPU-bound      │
>>>> │process, with one process having nice==0 and the     │
>>>> │other having nice==19. It appears that they each     │
>>>> │get 50% of the CPU.) It appears that the process     │
>>>> │nice value has effect only with respect to           │
>>>> │scheduling relative to other processes in the *same* │
>>>> │autogroup. Is this correct?                          │
>>>> └─────────────────────────────────────────────────────┘
>>>
>>> Yup, entity nice level affects distribution among peer entities.
>>
>> Huh! I only just learned about this via my experiments while
>> investigating autogroups.
>>
>> How long have things been like this? Always? (I don't think
>> so.) Since the arrival of CFS? Since the arrival of
>> autogrouping? (I'm guessing not.) Since some other point?
>> (When?)
>
> Ever since cfs-cgroup,

Okay. That begs the question still though.

> this is a fundamental design point of cgroups,
> and has therefore always been the case for autogroups (as that is
> nothing more than an application of the cgroup code).

Understood.

>> It seems to me that this renders the traditional process
>> nice pretty much useless. (I bet I'm not the only one who'd
>> be surprised by the current behavior.)
>
> It's really rather fundamental to how the whole hierarchical thing
> works.
>
> CFS is a weighted fair queueing scheduler; this means each entity
> receives:
>
>                  w_i
>   dt_i = dt --------
>             \Sum w_j
>
>
>                 CPU
>          ______/ \______
>         /    |     |    \
>        A     B     C     D
>
>
> So if each entity {A,B,C,D} has equal weight, then they will receive
> equal time. Explicitly, for C you get:
>
>
>                        w_C
>   dt_C = dt -----------------------
>             (w_A + w_B + w_C + w_D)
>
>
> Extending this to a hierarchy, we get:
>
>
>                 CPU
>          ______/ \______
>         /    |     |    \
>        A     B     C     D
>                   / \
>                  E   F
>
> Where C becomes a 'server' for entities {E,F}. The weight of C does not
> depend on its child entities. This way the time of {E,F} becomes a
> straight product of their ratio with C. That is; the whole thing
> becomes, where l denotes the level in the hierarchy and i an
> entity on that level:
>
>                     l     w_g,i
>   dt_l,i = dt \Prod   ----------
>                  g=0  \Sum w_g,j
>
>
> Or more concretely, for E:
>
>                       w_E
>   dt_1,E = dt_0,C -----------
>                   (w_E + w_F)
>
>                        w_C               w_E
>          = dt ----------------------- -----------
>               (w_A + w_B + w_C + w_D) (w_E + w_F)
>
>
> And this 'trivially' extends to SMP, with the tricky bit being that the
> sums over all entities end up being machine wide, instead of per CPU,
> which is a real and royal pain for performance.

Okay -- you're really quite the ASCII artist. And somehow,
I think you needed to compose the mail in LaTeX. But thanks
for the detail. It's helpful, for me at least.

> Note that this property, where the weight of the server entity is
> independent from its child entities, is a desired feature. Without that
> it would be impossible to control the relative weights of groups, and
> that is the sole parameter of the WFQ model.
>
> It is also why Linus so likes autogroups, each session competes equally
> amongst one another.

I get it. But, the behavior changes for the process nice value are
undocumented, and they should be documented. I understand
what the behavior change was. But not yet when.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
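
For readers who want to check the arithmetic in the hierarchical formula quoted above, here is a minimal Python sketch. It is only an illustration of the product-of-ratios rule, not kernel code: the `Entity` class and `cpu_share` function are hypothetical names, every entity is assumed to be continuously runnable on a single CPU, and the weights 1024 (nice 0) and 15 (nice 19) are assumed values taken to correspond to the kernel's nice-to-weight mapping.

```python
from fractions import Fraction


class Entity:
    """Hypothetical model of a schedulable entity (task or group) with a fixed weight."""

    def __init__(self, name, weight, parent=None):
        self.name = name
        self.weight = weight
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def cpu_share(entity):
    """Return the entity's share of total CPU time.

    Walks from the entity up to the root, multiplying at each level the
    ratio w_i / sum_j(w_j) taken over the entity's peers on that level.
    """
    share = Fraction(1)
    node = entity
    while node.parent is not None:
        peers_weight = sum(child.weight for child in node.parent.children)
        share *= Fraction(node.weight, peers_weight)
        node = node.parent
    return share


# The hierarchy from the message: A, B, C, D under the CPU; E and F under C.
cpu = Entity("CPU", 0)  # the root's own weight is never used
a, b, c, d = (Entity(name, 1024, cpu) for name in "ABCD")

# Assumed weights: 1024 for a nice-0 task, 15 for a nice-19 task.
e = Entity("E", 1024, c)
f = Entity("F", 15, c)

for ent in (a, c, e, f):
    print(ent.name, cpu_share(ent))
```

In this sketch C's share stays at 1/4 whatever weights E and F carry, and the shares of E and F always sum to C's share. That is the behavior discussed in the thread: a task's nice value only shifts time between peers inside the same group.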