Received: by 10.223.176.5 with SMTP id f5csp552907wra; Wed, 7 Feb 2018 03:49:20 -0800 (PST) X-Google-Smtp-Source: AH8x225JwZSNHAiZaB9nwB5pTCaTgPxTOmxkDD+MBtjjGZEdiIwAc2del8G1HkhX0kuiLK91YqPy X-Received: by 10.98.236.65 with SMTP id k62mr5725542pfh.223.1518004160556; Wed, 07 Feb 2018 03:49:20 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518004160; cv=none; d=google.com; s=arc-20160816; b=Rq2yVYTAyGyRMiqH9feB6xWXjMVT4qNACrBday5KctLItQvGFjxRcP+EoSVz/VnNQ4 hRk3EgNr0RflBumvOg7+gSQG+YnKrDiSdcND3I/IBEo320Y6+v8+0gOGHIsJ8ooGX12g 5s50IOQaABnJEtZqQR+H2OIc7lk0GhZgTkRXyUDnfGjLQ7Hqy1lUboGiMEG8nMO6esT9 WlmiAp+A8LRNQO4Pcw2fBOfeM6IKlaCWAtcrSkyf7qvx6QOiIcZka/T5KiLGmBWZ1+dU jj5zyhDGt/ThAZvmP1gqMZKMba6eOYd6pF795egVTWP6ld2zgp7tGs8fFl7FQ3M+keOO WhIQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=gnkraAM+bR10DKNb1qi/SsJCsbR9+YceAZ7/uvPDmP0=; b=ax7OoMM6eExa7zXEttJiGs3pkYFPnN4wV1aZIBwAG+tLKfVt5O0Gdi1+w7OulVxt3A pmlw7yXSN1bRYWO0tbX9GsM+80de34k8+yPcpwd3TpvenDecXpSEgKGPLH3KejzrwJ6i kTcun5dQbHm+0WHYKubvQjq9V5BKIjmCvgvkF3NW2U1Ut/RAspe+HmpapkoWv5WovcsP w/ZFJ3w/8KukwAW/sbv6KIFp1C5FAG2EZ9M/nyCAxrj/WPTxJibk+bioGO5DcGhCrVGB l7CHxsH289Ff4RxJR+giIhg2OoFZ6wDCZVxCkHwtBbpwjBX6eGmMIxu7xdAt3B4UNb9H TDQA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id be8-v6si965442plb.792.2018.02.07.03.49.06; Wed, 07 Feb 2018 03:49:20 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753876AbeBGLs1 (ORCPT + 99 others); Wed, 7 Feb 2018 06:48:27 -0500 Received: from foss.arm.com ([217.140.101.70]:49794 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753478AbeBGLsZ (ORCPT ); Wed, 7 Feb 2018 06:48:25 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6DFFC80D; Wed, 7 Feb 2018 03:48:25 -0800 (PST) Received: from e110439-lin (e110439-lin.cambridge.arm.com [10.1.210.68]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 132743F53D; Wed, 7 Feb 2018 03:48:22 -0800 (PST) Date: Wed, 7 Feb 2018 11:48:20 +0000 From: Patrick Bellasi To: Peter Zijlstra Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ingo Molnar , "Rafael J . Wysocki" , Viresh Kumar , Vincent Guittot , Paul Turner , Dietmar Eggemann , Morten Rasmussen , Juri Lelli , Todd Kjos , Joel Fernandes , Steve Muckle Subject: Re: [PATCH v4 1/3] sched/fair: add util_est on top of PELT Message-ID: <20180207114820.GI5739@e110439-lin> References: <20180206144131.31233-1-patrick.bellasi@arm.com> <20180206144131.31233-2-patrick.bellasi@arm.com> <20180206155056.GF2269@hirez.programming.kicks-ass.net> <20180206183315.GG5739@e110439-lin> <20180206190900.GN2249@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20180206190900.GN2249@hirez.programming.kicks-ass.net> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06-Feb 20:09, Peter Zijlstra wrote: > On Tue, Feb 06, 2018 at 06:33:15PM +0000, Patrick Bellasi wrote: > > > Good point, I was actually expecting this question and I should have > > added it to the cover letter, sorry. > > > > The reasoning was: the task's estimated utilization is defined as the > > max between PELT and the "estimation". Where "estimation" is > > the max between EWMA and the last ENQUEUED utilization. > > > > Thus I was envisioning these two calls: > > > > _task_util_est := max(EWMA, ENQUEUED) > > task_util_est := max(util_avg, _task_util_est) > > > > but since now we have clients only for the first API, I've not added > > the second one. Still I would prefer to keep the "_" to make it clear > > that's and util_est's internal signal, not the actual task's estimated > > utilization. > > > > Does it make sense? > > > > Do you prefer I just remove the "_" and we will refactor it once we > > should add a customer for the proper task's util_est? > > Hurm... I was thinking: > > task_util_est := max(util_avg, EWMA) > > But the above mixes ENQUEUED into it.. *confused*. > > Also, I think I'm confused by the 'enqueued' name... it makes sense for > the cfs use-case, where we track the sum of all 'enqueued' tasks, but it > doesn't make sense for the task use-case where it tracks task_util() at > dequeue time. Can be confusing at first, I've changed name in this version while trying to consolidate concepts in a reasonable way. Let see if I can cast some light... First of all: a) task_util_est := max(util_avg, EWMA, enqueued) b) enqueued := util_avg@dequeue time a) why we mix "enqueued" into? I think we need all the three components to properly describe these scenarios: - a task becoming smaller: the "EWMA" amortize the task's utilization reduction, usually converging to the new lower utilization in ~6 activations, which should be reasonable at least for some mobile use-cases. - a task becoming bigger: the "enqueued" value better represents the task utilization while the "EWMA" catches up, but... that's true only from the second longer running activation. For the first bigger activation, when the task starts to run longer then before, at a certain point the util_avg will be bigger both "enqueued" and "EWMA", thus task_util_est() should just track the PELT's "util_avg". Here is a picture showing all these behaviors using a "ramp" task generated with rt-app: https://photos.app.goo.gl/kRYgvpS6PTgvu2AM2 b) why enqueued for tasks? Since v3 it was: task : util_avg { util_est { last, ewma } } cpu : cfs_rq { util_est_enqueued } and Joel noticed that, since now util_est{} is part of util_avg{}, which is included also by cfs_rq{}, then we can use the same util_avg{} for cfs_rq{} too. Problem was that using cfs_rq::util_est::last was not meaningful to me, hence I've try to find a concept which was working well for both cpus and tasks. Enqueued IMO seems to work for both, provided you read the task's enqueued util_est as: "the util_est of a task which is currently enqueued in its cfs_rq" which (only) happens to be the util_avg@dequeue time. IMO, also for tasks, that's not worst than util_est.last... Maybe it's just matter to add a bit of documentation in the util_est struct definition? Does that makes a bit more sense now? > > > > +/* > > > > + * Check if the specified (signed) value is within a specified margin, > > > > + * based on the observation that: > > > > + * abs(x) < y := (unsigned)(x + y - 1) < (2 * y - 1) > > > > + */ > > > > +static inline bool within_margin(long value, unsigned int margin) > > > > > > This mixing of long and int is dodgy, do we want to consistently use int > > > here? > > > > Right, perhaps better "unsigned int" for both params, isn't? > > Can't, must be signed, since @value is supposed to be able to be > negative remember? ;-) Right... :) > > > > + WRITE_ONCE(cfs_rq->avg.util_est.enqueued, util_est); > > > > > + WRITE_ONCE(p->se.avg.util_est.enqueued, util_last); > > > > > > Two stores to that word... can we fix that nicely? > > > > Good point, the single word comes from the goal to fit into the same > > cache line of sched_avg. > > Its the above two stores I confuzzled to be the same. So n/m all that. Ok, fine... however, isn't a single WRITE_ONCE on "struct util_est" still better? Did not checked at generated code... > > > > +SCHED_FEAT(UTIL_EST, false) > > > > > > Since you couldn't measure it, do we wants it true? > > > > I'm just a single tester so far, I would be more confident once > > someone volunteer to turn this on to give a better coverage. > > Lets just enable it by default, that way its far more likely someone > will complain about it :-), disabling it again is a trivial matter when > needed. Ok, at the end it's your call... but you right, this could help on testing it better. Moreover, on low-utilization (desktop like) workloads we do see measurable benefits for interactive tasks. > Also, your patch 2/3 have insufficient READ_ONCE()s. Damn, I'll look at it... Cheers Patrick -- #include Patrick Bellasi