Subject: Re: [PATCH] sched/fair: Rate limit calls to update_blocked_averages() for NOHZ
To: Vincent Guittot, Joel Fernandes
Cc: linux-kernel, Paul McKenney, Frederic Weisbecker, Qais Yousef, Ben Segall,
    Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli, Mel Gorman,
    Peter Zijlstra, Steven Rostedt, "Uladzislau Rezki (Sony)", Neeraj Upadhyay
References: <20210122154600.1722680-1-joel@joelfernandes.org>
    <20210129172727.GA30719@vingu-book>
From: Dietmar Eggemann
Message-ID: <09367fac-5184-56d1-3c86-998b5f2af838@arm.com>
Date: Wed, 3 Feb 2021 12:54:20 +0100
In-Reply-To: <20210129172727.GA30719@vingu-book>

On 29/01/2021 18:27, Vincent Guittot wrote:
> On Friday 29 Jan 2021 at 11:33:00 (+0100), Vincent Guittot wrote:
>> On Thu, 28 Jan 2021 at 16:09, Joel Fernandes wrote:
>>>
>>> Hi Vincent,
>>>
>>> On Thu, Jan 28, 2021 at 8:57 AM Vincent Guittot wrote:
>>>>> On Mon, Jan 25, 2021 at 03:42:41PM +0100, Vincent Guittot wrote:
>>>>>> On Fri, 22 Jan 2021 at 20:10, Joel Fernandes wrote:
>>>>>>> On Fri, Jan 22, 2021 at 05:56:22PM +0100, Vincent Guittot wrote:
>>>>>>>> On Fri, 22 Jan 2021 at 16:46, Joel Fernandes (Google) wrote:

[...]

>> The only point I agree with is that running update_blocked_averages()
>> with preempt and irqs off is not a good thing, because we don't bound
>> the number of cfs_rqs to update. I'm going to provide a patchset for
>> this.
>
> The patch below moves the update of the blocked load of CPUs outside
> newidle_balance(). Instead, the update is done with the usual idle
> load balance update. I'm working on an additional patch that will
> select the CPU that is about to become idle as the ilb CPU, instead of
> a random idle CPU, but this first step fixes the problem of the many
> updates in the newly-idle path.

I'm trying to understand the need for this extra patch.

The patch below moves away from doing update_blocked_averages() (1) for
all CPUs in the sched groups of the sched domain:

  newidle_balance()->load_balance()->find_busiest_group()->
      update_sd_lb_stats()->update_sg_lb_stats()

to calling (1) for the CPUs in nohz.idle_cpus_mask in
_nohz_idle_balance() via update_nohz_stats(), and for the ilb CPU.

newidle_balance() already calls (1) for the newidle CPU.

What would be the benefit of choosing the newidle CPU as the ilb CPU?
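To check that I read the new flow correctly, here is a toy user-space
model of the before/after shape of the update (plain C, not kernel
code; the function names are only borrowed from kernel/sched/fair.c and
everything is deliberately simplified):

#include <stdio.h>

#define NR_CPUS 8

static unsigned int nohz_idle_cpus_mask;	/* bit i: CPU i is nohz-idle */

static void update_blocked_averages(int cpu)
{
	printf("    update_blocked_averages(cpu%d)\n", cpu);
}

/* Old path: cost paid on every newly idle CPU, for all CPUs of the
 * sched domain, with preempt/irqs off. */
static void old_newidle_balance(int this_cpu)
{
	printf("cpu%d newidle (old path):\n", this_cpu);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		update_blocked_averages(cpu);
}

/* New path: the newidle CPU only kicks the ilb. */
static void new_newidle_balance(int this_cpu)
{
	printf("cpu%d newidle (new path): kick_ilb()\n", this_cpu);
}

/* The ilb CPU later walks only the nohz-idle CPUs. */
static void nohz_idle_balance(int ilb_cpu)
{
	printf("cpu%d _nohz_idle_balance():\n", ilb_cpu);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (nohz_idle_cpus_mask & (1u << cpu))
			update_blocked_averages(cpu);
}

int main(void)
{
	nohz_idle_cpus_mask = 0x0e;	/* CPUs 1-3 are nohz-idle */

	old_newidle_balance(0);
	new_newidle_balance(0);
	nohz_idle_balance(1);
	return 0;
}

If that model is right, the per-newidle cost drops from O(domain size)
to O(1), with the walk deferred to a single ilb CPU.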
> Signed-off-by: Vincent Guittot
> ---
>  kernel/sched/fair.c | 32 +++-----------------------------
>  1 file changed, 3 insertions(+), 29 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 197a51473e0c..8200b1d4df3d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7421,8 +7421,6 @@ enum migration_type {
>  #define LBF_NEED_BREAK	0x02
>  #define LBF_DST_PINNED	0x04
>  #define LBF_SOME_PINNED	0x08
> -#define LBF_NOHZ_STATS	0x10
> -#define LBF_NOHZ_AGAIN	0x20
>
>  struct lb_env {
>  	struct sched_domain	*sd;
> @@ -8426,9 +8424,6 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
>  		struct rq *rq = cpu_rq(i);
>
> -		if ((env->flags & LBF_NOHZ_STATS) && update_nohz_stats(rq, false))
> -			env->flags |= LBF_NOHZ_AGAIN;
> -
>  		sgs->group_load += cpu_load(rq);
>  		sgs->group_util += cpu_util(i);
>  		sgs->group_runnable += cpu_runnable(rq);
> @@ -8969,11 +8964,6 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
>  	struct sg_lb_stats tmp_sgs;
>  	int sg_status = 0;
>
> -#ifdef CONFIG_NO_HZ_COMMON
> -	if (env->idle == CPU_NEWLY_IDLE && READ_ONCE(nohz.has_blocked))
> -		env->flags |= LBF_NOHZ_STATS;
> -#endif
> -
>  	do {
>  		struct sg_lb_stats *sgs = &tmp_sgs;
>  		int local_group;
> @@ -9010,15 +9000,6 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
>  	/* Tag domain that child domain prefers tasks go to siblings first */
>  	sds->prefer_sibling = child && child->flags & SD_PREFER_SIBLING;
>
> -#ifdef CONFIG_NO_HZ_COMMON
> -	if ((env->flags & LBF_NOHZ_AGAIN) &&
> -	    cpumask_subset(nohz.idle_cpus_mask, sched_domain_span(env->sd))) {
> -
> -		WRITE_ONCE(nohz.next_blocked,
> -			   jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD));
> -	}
> -#endif
> -
>  	if (env->sd->flags & SD_NUMA)
>  		env->fbq_type = fbq_classify_group(&sds->busiest_stat);
>
> @@ -10547,14 +10528,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>  		return;
>
>  	raw_spin_unlock(&this_rq->lock);
> -	/*
> -	 * This CPU is going to be idle and blocked load of idle CPUs
> -	 * need to be updated. Run the ilb locally as it is a good
> -	 * candidate for ilb instead of waking up another idle CPU.
> -	 * Kick an normal ilb if we failed to do the update.
> -	 */
> -	if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
> -		kick_ilb(NOHZ_STATS_KICK);
> +	kick_ilb(NOHZ_STATS_KICK);
>  	raw_spin_lock(&this_rq->lock);
>  }
>
> @@ -10616,8 +10590,6 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
>  		update_next_balance(sd, &next_balance);
>  		rcu_read_unlock();
>
> -		nohz_newidle_balance(this_rq);
> -
>  		goto out;
>  	}
>
> @@ -10683,6 +10655,8 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
>
>  	if (pulled_task)
>  		this_rq->idle_stamp = 0;
> +	else
> +		nohz_newidle_balance(this_rq);
>
>  	rq_repin_lock(this_rq, rf);
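On the rate limiting itself: as far as I can see, nohz.next_blocked
(written in the hunk removed above) just gates how often a blocked-load
pass may run at all. A standalone sketch of that gating, under my
reading (plain C, simplified; blocked_load_update_due() is a made-up
helper, not a kernel function):

#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for jiffies, nohz.next_blocked and nohz.has_blocked. */
#define LOAD_AVG_PERIOD_MS	32

static unsigned long now_ms;
static unsigned long next_blocked_ms;
static bool has_blocked = true;

/* Return true when a blocked-load update is due; re-arm the window. */
static bool blocked_load_update_due(void)
{
	if (!has_blocked)
		return false;
	if ((long)(now_ms - next_blocked_ms) < 0)
		return false;	/* rate limited: window not yet elapsed */
	next_blocked_ms = now_ms + LOAD_AVG_PERIOD_MS;
	return true;
}

int main(void)
{
	/* A newidle/ilb opportunity every 10ms; only ~every 32ms passes. */
	for (now_ms = 0; now_ms <= 100; now_ms += 10)
		printf("t=%3lums -> update=%d\n", now_ms,
		       blocked_load_update_due());
	return 0;
}

So however often CPUs enter newidle, at most one update pass per
LOAD_AVG_PERIOD should go through.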