Subject: Re: [PATCH 4/4] sched/fair: Add document for burstable
 CFS bandwidth control
To: Huaixin Chang
Cc: bsegall@google.com, dietmar.eggemann@arm.com, juri.lelli@redhat.com,
 khlebnikov@yandex-team.ru, linux-kernel@vger.kernel.org, mgorman@suse.de,
 mingo@redhat.com, pauld@redhead.com, peterz@infradead.org, pjt@google.com,
 rostedt@goodmis.org, shanpeic@linux.alibaba.com,
 vincent.guittot@linaro.org, xiyou.wangcong@gmail.com
References: <20201217074620.58338-1-changhuaixin@linux.alibaba.com>
 <20210120122715.29493-1-changhuaixin@linux.alibaba.com>
 <20210120122715.29493-5-changhuaixin@linux.alibaba.com>
From: Randy Dunlap
Message-ID: <508f96e6-f1b8-0ac8-d9d5-ad83ddfc4be0@infradead.org>
Date: Wed, 20 Jan 2021 11:48:42 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0
MIME-Version: 1.0
In-Reply-To: <20210120122715.29493-5-changhuaixin@linux.alibaba.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi--

Some comments below:

On 1/20/21 4:27 AM, Huaixin Chang wrote:
> Basic description of usage and effect for CFS Bandwidth Control Burst.
>
> Signed-off-by: Huaixin Chang
> Signed-off-by: Shanpei Chen
> ---
>  Documentation/scheduler/sched-bwc.rst | 70 +++++++++++++++++++++++++++++++++--
>  1 file changed, 66 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/scheduler/sched-bwc.rst b/Documentation/scheduler/sched-bwc.rst
> index 9801d6b284b1..2214ecaad393 100644
> --- a/Documentation/scheduler/sched-bwc.rst
> +++ b/Documentation/scheduler/sched-bwc.rst
> @@ -21,18 +21,46 @@ cfs_quota units at each period boundary. As threads consume this bandwidth it
>  is transferred to cpu-local "silos" on a demand basis. The amount transferred
>  within each of these updates is tunable and described as the "slice".
>
> +By default, CPU bandwidth consumption is strictly limited to quota within each
> +given period. For the sequence of CPU usage u_i served under CFS bandwidth
> +control, if for any j <= k N(j,k) is the number of periods from u_j to u_k:
> +
> +  u_j+...+u_k <= quota * N(j,k)
> +
> +For a bursty sequence among which interval u_j...u_k are at the peak, CPU
> +requests might have to wait for more periods to replenish enough quota.
> +Otherwise, larger quota is required.
> +
> +With "burst" buffer, CPU requests might be served as long as:
> +
> +  u_j+...+u_k <= B_j + quota * N(j,k)
> +
> +if for any j <= k N(j,k) is the number of periods from u_j to u_k and B_j is
> +the accumulated quota from previous periods in burst buffer serving u_j.
> +Burst buffer helps in that serving whole bursty CPU requests without throttling
> +them can be done with moderate quota setting and accumulated quota in burst
> +buffer, if:
> +
> +  u_0+...+u_n <= B_0 + quota * N(0,n)
> +
> +where B_0 is the initial state of burst buffer. The maximum accumulated quota in
> +the burst buffer is capped by burst. With proper burst setting, the available
> +bandwidth is still determined by quota and period on the long run.
> +
>  Management
>  ----------
> -Quota and period are managed within the cpu subsystem via cgroupfs.
> +Quota, period and burst are managed within the cpu subsystem via cgroupfs.
>
> -cpu.cfs_quota_us: the total available run-time within a period (in microseconds)
> +cpu.cfs_quota_us: run-time replenished within a period (in microseconds)
>  cpu.cfs_period_us: the length of a period (in microseconds)
> +cpu.cfs_burst_us: the maximum accumulated run-time (in microseconds)
>  cpu.stat: exports throttling statistics [explained further below]
>
>  The default values are::
>
>          cpu.cfs_period_us=100ms
> -        cpu.cfs_quota=-1
> +        cpu.cfs_quota_us=-1
> +        cpu.cfs_burst_us=0
>
>  A value of -1 for cpu.cfs_quota_us indicates that the group does not have any
>  bandwidth restriction in place, such a group is described as an unconstrained
> @@ -48,6 +76,11 @@ more detail below.
>  Writing any negative value to cpu.cfs_quota_us will remove the bandwidth limit
>  and return the group to an unconstrained state once more.
>
> +A value of 0 for cpu.cfs_burst_us indicates that the group can not accumulate
> +any unused bandwidth. It makes the traditional bandwidth control behavior for
> +CFS unchanged. Writing any (valid) positive value(s) into cpu.cfs_burst_us
> +will enact the cap on unused bandwidth accumulation.
> +
>  Any updates to a group's bandwidth specification will result in it becoming
>  unthrottled if it is in a constrained state.
>
> @@ -65,9 +98,21 @@ This is tunable via procfs::
>  Larger slice values will reduce transfer overheads, while smaller values allow
>  for more fine-grained consumption.
>
> +There is also a global switch to turn off burst for all groups::
> +        /proc/sys/kernel/sched_cfs_bw_burst_enabled (default=1)
> +
> +By default it is enabled. Write 0 values means no accumulated CPU time can be

                             Writing a 0 value means

> +used for any group, even if cpu.cfs_burst_us is configured.
> +
> +Sometimes users might want a group to burst without accumulation. This is
> +tunable via::
> +        /proc/sys/kernel/sched_cfs_bw_burst_onset_percent (default=0)
> +
> +Up to 100% runtime of cpu.cfs_burst_us might be given on setting bandwidth.
> +
>  Statistics
>  ----------
> -A group's bandwidth statistics are exported via 3 fields in cpu.stat.
> +A group's bandwidth statistics are exported via 6 fields in cpu.stat.
>
>  cpu.stat:
>
> @@ -75,6 +120,11 @@ cpu.stat:
>  - nr_throttled: Number of times the group has been throttled/limited.
>  - throttled_time: The total time duration (in nanoseconds) for which entities
>    of the group have been throttled.
> +- current_bw: Current runtime in global pool.
> +- nr_burst: Number of periods burst occurs.
> +- burst_time: Cumulative wall-time that any cpus has used above quota in

                                               CPUs have used

> +  respective periods
> +
>
>  This interface is read-only.
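
For anyone scripting against these counters, here is a rough sketch (not
part of the patch) of reading the six cpu.stat fields as "name value"
lines. The helper names parse_cpu_stat() and burst_ratio() are made up for
illustration, the sample numbers are invented input rather than real
output, and current_bw, nr_burst and burst_time exist only if this series
is applied. nr_periods is the existing first field of cpu.stat.

```python
def parse_cpu_stat(text):
    """Parse 'name value' lines (the cpu.stat layout) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:          # skip blank or malformed lines
            stats[parts[0]] = int(parts[1])
    return stats


def burst_ratio(stats):
    """Fraction of periods in which a burst occurred (0.0 if no periods yet)."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_burst", 0) / periods if periods else 0.0


# Invented sample input, only to exercise the parser.
SAMPLE = """\
nr_periods 100
nr_throttled 4
throttled_time 120000000
current_bw 5000
nr_burst 10
burst_time 30000000
"""

if __name__ == "__main__":
    stats = parse_cpu_stat(SAMPLE)
    print(stats["nr_burst"], burst_ratio(stats))
```

In practice the text would come from reading the group's cpu.stat file
instead of SAMPLE.
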
>
> @@ -172,3 +222,15 @@ Examples
>
>          By using a small period here we are ensuring a consistent latency
>          response at the expense of burst capacity.
> +
> +4. Limit a group to 20% of 1 CPU, and allow accumulate up to 60% of 1 CPU
> +   addtionally, in case accumulation has been done.

      additionally,

> +
> +        With 50ms period, 10ms quota will be equivalent to 20% of 1 CPU.
> +        And 30ms burst will be equivalent to 60% of 1 CPU.
> +
> +        # echo 10000 > cpu.cfs_quota_us /* quota = 10ms */
> +        # echo 50000 > cpu.cfs_period_us /* period = 50ms */
> +        # echo 30000 > cpu.cfs_burst_us /* burst = 30ms */
> +
> +        Larger buffer setting allows greater burst capacity.
>

HTH.
--
~Randy
"He closes his eyes and drops the goggles.  You can't get hurt
by looking at a bitmap.  Or can you?"
(Neal Stephenson: Snow Crash)
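
P.S. A rough way to sanity-check the numbers in example 4 is to model the
quoted condition u_j+...+u_k <= B_j + quota * N(j,k) as a simple token
bucket. This is only my reading of the description, not the kernel
implementation. It assumes each period adds quota to whatever runtime is
left over, that the leftover carried into the next period is capped at
burst, and that demand which cannot be served in a period is dropped. The
function name simulate() and the usage sequence are made up for
illustration.

```python
def simulate(usage_us, quota_us, burst_us, initial_buffer_us=0):
    """Per-period (served, throttled) lists under a token-bucket reading.

    usage_us: CPU time requested in each period (u_0, u_1, ...), microseconds.
    Assumption: unused runtime carries over, capped at burst_us.
    """
    buffer_us = min(initial_buffer_us, burst_us)    # B_0, capped by burst
    served, throttled = [], []
    for demand in usage_us:
        available = buffer_us + quota_us            # carried-over runtime + refill
        run = min(demand, available)
        served.append(run)
        throttled.append(demand > available)
        buffer_us = min(available - run, burst_us)  # leftover for the next period
    return served, throttled


if __name__ == "__main__":
    # Example 4 settings: quota = 10ms, burst = 30ms (period = 50ms).
    usage = [0, 0, 0, 40000, 10000, 15000]
    print(simulate(usage, quota_us=10000, burst_us=30000))
    # Same workload with the default burst of 0 for comparison.
    print(simulate(usage, quota_us=10000, burst_us=0))
```

With burst set to 30000, the three idle periods accumulate enough runtime
for the 40ms spike to run unthrottled. With burst at 0, the same spike is
cut to 10ms.
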