References: <20201218103258.GA3040@hirez.programming.kicks-ass.net>
User-agent: mu4e 0.9.17; emacs 26.3
From: Valentin Schneider
To: Peter Zijlstra
Cc: "Rafael J.
 Wysocki", Ingo Molnar, Thomas Gleixner, Vincent Guittot, Morten Rasmussen,
 dietmar.eggemann@arm.com, patrick.bellasi@matbug.net, lenb@kernel.org,
 linux-kernel@vger.kernel.org, ionela.voinescu@arm.com, qperret@google.com,
 viresh.kumar@linaro.org
Subject: Re: [PATCH] sched: Add schedutil overview
In-reply-to: <20201218103258.GA3040@hirez.programming.kicks-ass.net>
Date: Fri, 18 Dec 2020 11:33:09 +0000
MIME-Version: 1.0
Content-Type: text/plain
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

Have some more nits below

On 18/12/20 10:32, Peter Zijlstra wrote:
> Signed-off-by: Peter Zijlstra (Intel)
> ---
>  Documentation/scheduler/schedutil.txt |  168 ++++++++++++++++++++++++++++++++++
>  1 file changed, 168 insertions(+)
>
> --- /dev/null
> +++ b/Documentation/scheduler/schedutil.txt

[...]

> +Frequency- / CPU Invariance
> +---------------------------
> +
> +Because consuming the CPU for 50% at 1GHz is not the same as consuming the CPU
> +for 50% at 2GHz, nor is running 50% on a LITTLE CPU the same as running 50% on
> +a big CPU, we allow architectures to scale the time delta with two ratios, one
> +Dynamic Voltage and Frequency Scaling (DVFS) ratio and one microarch ratio.
> +
> +For simple DVFS architectures (where software is in full control) we trivially
> +compute the ratio as:
> +
> +            f_cur
> +  r_dvfs := -----
> +            f_max
> +
> +For more dynamic systems where the hardware is in control of DVFS (Intel,
> +ARMv8.4-AMU) we use hardware counters to provide us this ratio. For Intel

Nit: To me this reads as if the presence of AMUs entails 'hardware is in
control of DVFS', which doesn't seem right. How about:

  For more dynamic systems where the hardware is in control of DVFS we use
  hardware counters (Intel APERF/MPERF, ARMv8.4-AMU) to provide us this
  ratio.
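
For illustration only, the two-ratio scaling of the time delta described in
the quoted text could be sketched in plain userspace C as below. The helper
names, the kHz units and the 1024-based fixed point are assumptions made for
this sketch; none of them are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed-point unit for this sketch: 1024 represents a ratio of 1.0. */
#define RATIO_SHIFT	10
#define RATIO_ONE	(1u << RATIO_SHIFT)

/* r_dvfs := f_cur / f_max, in fixed point (hypothetical helper). */
static inline uint32_t dvfs_ratio(uint32_t f_cur_khz, uint32_t f_max_khz)
{
	assert(f_max_khz > 0 && f_cur_khz <= f_max_khz);
	return (uint32_t)(((uint64_t)f_cur_khz << RATIO_SHIFT) / f_max_khz);
}

/*
 * Scale a time delta by the DVFS ratio and then by the microarch ratio.
 * Both ratios are fixed point with RATIO_ONE == 1.0. Very large deltas
 * could overflow the intermediate product; this sketch ignores that.
 */
static inline uint64_t scale_delta(uint64_t delta_ns, uint32_t r_dvfs,
				   uint32_t r_cpu)
{
	delta_ns = (delta_ns * r_dvfs) >> RATIO_SHIFT;
	delta_ns = (delta_ns * r_cpu) >> RATIO_SHIFT;
	return delta_ns;
}
```

With these helpers, 1000ns spent at 1GHz on a CPU whose f_max is 2GHz counts
as 500ns on a big CPU (r_cpu of 1024), and as 250ns on a CPU with half the
capacity (r_cpu of 512).
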
> +Schedutil / DVFS
> +----------------
> +
> +Every time the scheduler load tracking is updated (task wakeup, task
> +migration, time progression) we call out to schedutil to update the hardware
> +DVFS state.
> +
> +The basis is the CPU runqueue's 'running' metric, which per the above it is
> +the frequency invariant utilization estimate of the CPU. From this we compute
> +a desired frequency like:
> +
> +             max( running, util_est );  if UTIL_EST
> +  u_cfs := { running;                   otherwise
> +
> +  u_clamp := clamp( u_cfs, u_min, u_max )
> +
> +  u := u_cfs + u_rt + u_irq + u_dl;    [approx. see source for more detail]
> +
> +  f_des := min( f_max, 1.25 u * f_max )
> +

In schedutil_cpu_util(), uclamp clamps both u_cfs and u_rt. I'm afraid the
below might just bring more confusion; what do you think?

               clamp( u_cfs + u_rt, u_min, u_max );  if UCLAMP_TASK
  u_clamp := { u_cfs + u_rt;                         otherwise

  u := u_clamp + u_irq + u_dl;    [approx. see source for more detail]

(also, does this need a word about runnable rt tasks => goto max?)

> +XXX IO-wait; when the update is due to a task wakeup from IO-completion we
> +boost 'u' above.
> +
> +This frequency is then used to select a P-state/OPP or directly munged into a
> +CPPC style request to the hardware.
> +
> +XXX: deadline tasks (Sporadic Task Model) allows us to calculate a hard f_min
> +required to satisfy the workload.
> +
> +Because these callbacks are directly from the scheduler, the DVFS hardware
> +interaction should be 'fast' and non-blocking. Schedutil supports
> +rate-limiting DVFS requests for when hardware interaction is slow and
> +expensive, this reduces effectiveness.
> +
> +For more information see: kernel/sched/cpufreq_schedutil.c
> +
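
As a rough userspace C sketch of the frequency selection above, following the
clamp( u_cfs + u_rt ) variant with UCLAMP_TASK enabled: the helper names and
the 1024-based utilization scale are assumptions made for illustration, not
the kernel's actual code. It does not model the IO-wait boost, rate limiting,
or the 'runnable rt tasks => goto max' case.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed scale for this sketch: a utilization of 1024 means 100% of a CPU. */
#define CAPACITY_ONE	1024u

static inline uint32_t clamp_u32(uint32_t v, uint32_t lo, uint32_t hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* u := clamp( u_cfs + u_rt, u_min, u_max ) + u_irq + u_dl  [approx.] */
static inline uint32_t total_util(uint32_t u_cfs, uint32_t u_rt,
				  uint32_t u_irq, uint32_t u_dl,
				  uint32_t u_min, uint32_t u_max)
{
	assert(u_min <= u_max);
	return clamp_u32(u_cfs + u_rt, u_min, u_max) + u_irq + u_dl;
}

/* f_des := min( f_max, 1.25 u * f_max ), with u in units of CAPACITY_ONE. */
static inline uint32_t desired_freq(uint32_t u, uint32_t f_max_khz)
{
	/* f_max + f_max / 4 gives the 1.25 headroom factor. */
	uint64_t f = ((uint64_t)f_max_khz + (f_max_khz >> 2)) * u / CAPACITY_ONE;

	return f > f_max_khz ? f_max_khz : (uint32_t)f;
}
```

For example, u_cfs = 300, u_rt = 100, u_irq = 50 and u_dl = 62 with no
effective clamp give u = 512, which on a 2GHz f_max yields a desired
frequency of 1.25GHz. A u_min of 400 lifts a lone u_cfs of 100 up to 400
before the irq and deadline contributions are added.
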