Date: Fri, 28 Jun 2019 15:10:11 +0100
From: Patrick Bellasi
To: Vincent Guittot
Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, "Rafael J. Wysocki",
    Viresh Kumar, Douglas Raillard, Quentin Perret, Dietmar Eggemann,
    Morten Rasmussen, Juri Lelli
Subject: Re: [PATCH] sched/fair: util_est: fast ramp-up EWMA on utilization increases
Message-ID: <20190628141011.d4oo5ezp4kxgrfnn@e110439-lin>
References: <20190620150555.15717-1-patrick.bellasi@arm.com>
 <20190628100751.lpcwsouacsi2swkm@e110439-lin>
 <20190628123800.GS3419@hirez.programming.kicks-ass.net>

On 28-Jun 15:51, Vincent Guittot wrote:
> On Fri, 28 Jun 2019 at 14:38, Peter Zijlstra wrote:
> >
> > On Fri, Jun 28, 2019 at 11:08:14AM +0100, Patrick Bellasi wrote:
> > > On 26-Jun 13:40, Vincent Guittot wrote:
> > > > Hi Patrick,
> > > >
> > > > On Thu, 20 Jun 2019 at 17:06, Patrick Bellasi wrote:
> > > > >
> > > > > The estimated utilization for a task is currently defined based on:
> > > > >  - enqueued: the utilization value at the end of the last activation
> > > > >  - ewma:     an exponential moving average whose samples are the enqueued values
> > > > >
> > > > > According to this definition, when a task suddenly changes its bandwidth
> > > > > requirements from small to big, the EWMA will need to collect multiple
> > > > > samples before converging up to track the new big utilization.
> > > > >
> > > > > Moreover, after the PELT scale invariance update [1], in the above scenario we
> > > > > can see that the utilization of the task has a significant drop from the first
> > > > > big activation to the following one. That's implied by the new "time-scaling"
> > > >
> > > > Could you give us more details about this? I'm not sure I understand
> > > > what changes between the 1st big activation and the following one.
> > >
> > > We are after a solution for the problem Douglas Raillard discussed at
> > > OSPM, specifically the "Task util drop after 1st idle" highlighted in
> > > slide 6 of his presentation:
> > >
> > >    http://retis.sssup.it/ospm-summit/Downloads/02_05-Douglas_Raillard-How_can_we_make_schedutil_even_more_effective.pdf
> >
> > So I see the problem, and I don't hate the patch, but I'm still
> > struggling to understand how exactly it relates to the time-scaling
> > stuff. AFAICT the fundamental problem here is layering two averages. The
>
> AFAICT, it's not related to the time-scaling.
>
> In fact the big 1st activation happens because the task runs at a low OPP
> and does not have enough time to finish its running phase before the next
> one is due to start. This means the task runs several computation phases
> in one go, so it is no longer a 75% task.

But in that case, running multiple activations back to back, should we
not expect the util_avg to exceed the 75% mark?

> From a PELT PoV, the task is far larger than a 75% task, and so is its
> utilization, because it runs far longer (even after scaling time with
> frequency).

Which thus should match my expectation above, no?

> Once the CPU reaches an OPP high enough to allow a sleep phase between
> running phases, the task's load tracking comes back to the normal slope
> increase (the one we would have had if the task had jumped from 5% to
> 75% while already running at the max OPP).

Indeed, I can see from the plots a change in slope. But there is also
that big drop after the first big activation: 375 units in 1.1ms.

Is that expected? I guess yes, since we fix the clock_pelt with the
lost_idle_time.

> > second (EWMA in our case) will always lag/delay the input of the first
> > (PELT).
> >
> > The time-scaling thing might make matters worse, because that helps PELT
> > ramp up faster, but that is not the primary issue.
> >
> > Or am I missing something?
--
#include <best/regards.h>

Patrick Bellasi