Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752628AbdIANhj (ORCPT ); Fri, 1 Sep 2017 09:37:39 -0400 Received: from merlin.infradead.org ([205.233.59.134]:54956 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752530AbdIANfX (ORCPT ); Fri, 1 Sep 2017 09:35:23 -0400 Message-Id: <20170901132748.141477819@infradead.org> User-Agent: quilt/0.63-1 Date: Fri, 01 Sep 2017 15:21:02 +0200 From: Peter Zijlstra To: mingo@kernel.org, linux-kernel@vger.kernel.org, tj@kernel.org, josef@toxicpanda.com Cc: torvalds@linux-foundation.org, vincent.guittot@linaro.org, efault@gmx.de, pjt@google.com, clm@fb.com, dietmar.eggemann@arm.com, morten.rasmussen@arm.com, bsegall@google.com, yuyang.du@intel.com, peterz@infradead.org Subject: [PATCH -v2 03/18] sched/fair: Cure calc_cfs_shares() vs reweight_entity() References: <20170901132059.342024223@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline; filename=peterz-sched-fair-calc_cfs_shares-fixup.patch Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1298 Lines: 35 Vincent reported that when running in a cgroup, his root cfs_rq->avg.load_avg dropped to 0 on task idle. This is because reweight_entity() will now immediately propagate the weight change of the group entity to its cfs_rq, and as it happens, our approxmation (5) for calc_cfs_shares() results in 0 when the group is idle. Avoid this by using the correct (3) as a lower bound on (5). This way the empty cgroup will slowly decay instead of instantly drop to 0. Reported-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) --- kernel/sched/fair.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -2703,11 +2703,10 @@ static long calc_cfs_shares(struct cfs_r tg_shares = READ_ONCE(tg->shares); /* - * This really should be: cfs_rq->avg.load_avg, but instead we use - * cfs_rq->load.weight, which is its upper bound. This helps ramp up - * the shares for small weight interactive tasks. + * Because (5) drops to 0 when the cfs_rq is idle, we need to use (3) + * as a lower bound. */ - load = scale_load_down(cfs_rq->load.weight); + load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg); tg_weight = atomic_long_read(&tg->load_avg);