From: Vincent Guittot
Date: Fri, 11 Jan 2019 15:29:48 +0100
Subject: Re: [PATCH v7 2/2] sched/fair: update scale invariance of PELT
To: Patrick Bellasi
Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, "Rafael J. Wysocki",
    Dietmar Eggemann, Morten Rasmussen, Paul Turner, Ben Segall,
    Thara Gopinath, pkondeti@codeaurora.org, Quentin Perret,
    Srinivas Pandruvada
References: <20181128115336.GB23094@e110439-lin>
    <20181128144039.GC23094@e110439-lin>
    <20181128152133.GD23094@e110439-lin>
    <20181128163545.GE23094@e110439-lin>
    <20181129150020.GF23094@e110439-lin>
    <20190110153031.4rh64xz2muctkffe@e110439-lin>
In-Reply-To: <20190110153031.4rh64xz2muctkffe@e110439-lin>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 10 Jan 2019 at 16:30, Patrick Bellasi wrote:
>
> On 29-Nov 17:19, Vincent Guittot wrote:
> > On Thu, 29 Nov 2018 at 16:00, Patrick Bellasi wrote:
> > > On 29-Nov 11:43, Vincent Guittot wrote:
>
> [...]
>
> > > Seems we agree that, when there is no idle time:
> > > - the two 15% tasks will be overestimated
> > > - their utilization will reach 50% after a while
> > >
> > > If I'm not wrong, we will have:
> > > - 30% CPU util in ~16ms @1024 capacity
> > >                   ~64ms @256 capacity
> > >
> > > Thus, the tasks will be certainly over-estimated after ~64ms.
> > > Is that correct?
> >
> > From a pure util_avg pov it's correct
> > But I'd like to weigh that a bit with the example below
> >
> > > Now, we can argue that 64ms is a pretty long time and thus it's quite
> > > unlikely we will have no idle for such a long time.
> > >
> > > Still, I'm wondering if we should keep collecting those samples or
> > > better find a way to detect that and skip the sampling.
> >
> > The problem is that you can have util_avg above capacity even with idle time
> > In the 1st example of this thread, the 39ms/80ms task will reach 709
> > which is the value saved by util_est on a big core
> > But on a core with half capacity, there is still idle time so 709 is a
> > correct value although above 512
>
> Right, I see your point and (in principle) I like the idea of
> collecting samples for tasks which happen to run at a lower capacity
> than required and the utilization value makes sense...
>
> > In fact, max will be always above the linear ratio because it's based
> > on geometric series
> >
> > And this is true even with 15.6ms/32ms (same ratio as above) task
> > although the impact is smaller (max value, which should be saved by
> > util_est, becomes 587 in this case).
>
> However that's not always the case... as per my example above.
>
> Moreover, we should also consider that util_est is mainly meant to be
> a lower-bound for tasks utilization.
> That's why task_util_est() already returns the actual util_avg when
> it's higher than the estimated utilization.

I can imagine that using max(util_avg, util_est) helps the scheduler
keep a correct utilization when util_avg goes above the CPU capacity
while there is still idle time.

>
> With your new signal and without any special check on samples
> collection, if a task is limited because of thermal capping for
> example, we could end up overestimating its utilization and thus
> perhaps generating an unwanted frequency spike when the capping is
> relaxed...
> and (even worse) it will take some more activations for the
> estimated utilization to converge back to the actual utilization.
>
> Since we cannot easily know if there is idle time in a CPU when a task
> completes an activation with a utilization higher than the CPU
> capacity, I would prefer to just skip the sampling with
> something like:
>
> ---8<---
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 9332863d122a..485053026533 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3639,6 +3639,7 @@ util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep)
>  {
>  	long last_ewma_diff;
>  	struct util_est ue;
> +	int cpu;
>
>  	if (!sched_feat(UTIL_EST))
>  		return;
> @@ -3672,6 +3673,14 @@ util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep)
>  	if (within_margin(last_ewma_diff, (SCHED_CAPACITY_SCALE / 100)))
>  		return;
>
> +	/*
> +	 * To avoid overestimation of actual task utilization, skip updates if
> +	 * we cannot grant there is idle time in this CPU.
> +	 */
> +	cpu = cpu_of(rq_of(cfs_rq));
> +	if (task_util(p) > cpu_capacity(cpu))
> +		return;
> +
>  	/*
>  	 * Update Task's estimated utilization
>  	 *
> ---8<---
>
> At least this will ensure that util_est always provides an actual
> measured lower bound for a task utilization.
>
> If you think this makes sense, feel free to add such a patch on
> top of your series.

OK, I'm going to add it when rebasing the series.

Thanks
Vincent

>
> Cheers Patrick
>
> --
> #include
>
> Patrick Bellasi
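
The figures quoted in this thread (709 for the 39ms/80ms task, 587 for the 15.6ms/32ms task, ~16ms and ~64ms to reach 30%) can be reproduced with a small user-space model of the PELT geometric series. This is only a sketch under stated assumptions: it treats the signal as continuous with a 32 ms half-life, it assumes that PELT time advances at capacity/1024 of wall-clock time on a smaller CPU, and every name in it (pelt_decay, pelt_periodic_peak, pelt_ramp_ms, util_est_may_sample) is hypothetical and not a kernel function. The last helper only mirrors the comparison in the quoted diff.

```c
#include <stdbool.h>

#define PELT_HALF_LIFE_MS 32.0   /* a contribution halves every 32 ms */
#define CAPACITY_SCALE    1024.0 /* same scale as SCHED_CAPACITY_SCALE */
#define LN2               0.6931471805599453

/* y^t with y^32 = 0.5, computed without libm. */
static double pelt_decay(double t_ms)
{
	double halvings = 1.0, x, term = 1.0, sum = 1.0;
	int n;

	/* Peel off whole half-lives so the series below sees x < ln(2). */
	while (t_ms >= PELT_HALF_LIFE_MS) {
		halvings *= 0.5;
		t_ms -= PELT_HALF_LIFE_MS;
	}

	/* Taylor series of e^(-x) for the remaining fraction of a half-life. */
	x = LN2 * t_ms / PELT_HALF_LIFE_MS;
	for (n = 1; n < 30; n++) {
		term *= -x / n;
		sum += term;
	}
	return halvings * sum;
}

/*
 * Steady-state peak of a task running run_ms out of every period_ms,
 * taken at the end of the running phase:
 *   peak = 1024 * (1 - y^run) / (1 - y^period)
 */
static double pelt_periodic_peak(double run_ms, double period_ms)
{
	return CAPACITY_SCALE * (1.0 - pelt_decay(run_ms)) /
	       (1.0 - pelt_decay(period_ms));
}

/*
 * Wall-clock time for an always-running task to ramp from 0 to util
 * (0 <= util < 1024). Assumption: on a CPU of the given capacity, PELT
 * time advances at capacity / 1024 of wall-clock time.
 */
static double pelt_ramp_ms(double util, double capacity)
{
	double lo = 0.0, hi = 64.0 * PELT_HALF_LIFE_MS, mid;
	int i;

	/* Bisect on 1024 * (1 - y^t) == util; the left side grows with t. */
	for (i = 0; i < 60; i++) {
		mid = (lo + hi) / 2.0;
		if (CAPACITY_SCALE * (1.0 - pelt_decay(mid)) < util)
			lo = mid;
		else
			hi = mid;
	}
	return hi * CAPACITY_SCALE / capacity;
}

/* Hypothetical stand-in for the comparison added by the quoted diff. */
static bool util_est_may_sample(double task_util, double cpu_capacity)
{
	return task_util <= cpu_capacity;
}
```

With this model, pelt_periodic_peak(39.0, 80.0) comes out at about 709 and pelt_periodic_peak(15.6, 32.0) at about 587, both above the linear ratio of roughly 499 (0.4875 * 1024). pelt_ramp_ms(307.2, 1024.0) gives about 16.5 ms and pelt_ramp_ms(307.2, 256.0) about 66 ms. util_est_may_sample(709.0, 512.0) is false, so the proposed check would skip the 709 sample on the half-capacity CPU even though that CPU still has idle time; that is the trade-off accepted above in exchange for keeping util_est a measured lower bound.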