Received: by 2002:a25:23cc:0:0:0:0:0 with SMTP id j195csp1720985ybj; Wed, 6 May 2020 04:12:34 -0700 (PDT) X-Google-Smtp-Source: APiQypIDQLfB4glwLxp2p5HdtD7Xcp5L9182DTrbS+WEpBTokyk3c7rT0ru/a3g2331AW7Fx2P0u X-Received: by 2002:a05:6402:319c:: with SMTP id di28mr6664069edb.185.1588763554450; Wed, 06 May 2020 04:12:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1588763554; cv=none; d=google.com; s=arc-20160816; b=CSwNU2PjbfLcMZdSOs5Lx/rXfastGsxzfhj6cYBysqsCE/SIKEbY6N89iW73LzkMy9 RxLR4wNBBSCFZqN1vVO/gzq9WfLXdGNGIk91xkRL1P0XqPUZE7hyVTo/xLpE6lU4f3yq unETIRYXak7LqQZX2uHAPDy6lvOCxYXysrs5BNS5/n2xQOHEu0ChupFf8m3Yt3K+tgLc hYEhmH/loLpgY0/FO4u0vLlM42knKNfuyXBE9IttOUC76l7bQxc6X4Iy44KTZp9iLJ/j /GVb9AksFUveHdc1jB+5JhiACKgFGgwA/qOCxIIIqn6iKMuVVSoQ5t3uPpf7oxQYilyn GH8Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:date:in-reply-to :subject:cc:to:from:user-agent:references; bh=0casZcseyD0uo2F5o8vQpR3g+QfKfyI4qcYd6GOpa8A=; b=O7KzWZdBQAqcAud0YX2GGJ/9O7dnBsvhIEefRG8kMOc8ceM7MoCNSxrk0gsYxUQeCA BGtjzOvWaW/OE6hdqHGTH9IcCCIUd6iClNNW7nB+iLbKhfLlGGcHNx7FzttiAaufI36H FZUc5hkSCAcaSx/6vY1uTYYx57VLvF8grkXcVqESwGY8iPPk8HjKq5+4AkT/m/9LBM8Z SnObORJLoH7JbHvruFhddv8qS+Uq0EmuSjbUGBXy4q3QGaiefgFmcjud/r7hbpwPIBMu 8XjxUVhiScoRZ+fIxArJq2zmnx2ENgwoExbjZAC//LkKFlF1vvdpuoOUsK0t0cHhFIvS Gdnw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id o4si860784edb.178.2020.05.06.04.12.11; Wed, 06 May 2020 04:12:34 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729163AbgEFK3P (ORCPT + 99 others); Wed, 6 May 2020 06:29:15 -0400 Received: from foss.arm.com ([217.140.110.172]:33274 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728338AbgEFK3O (ORCPT ); Wed, 6 May 2020 06:29:14 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D5F8730E; Wed, 6 May 2020 03:29:13 -0700 (PDT) Received: from e113632-lin (e113632-lin.cambridge.arm.com [10.1.194.46]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 8658D3F71F; Wed, 6 May 2020 03:29:12 -0700 (PDT) References: <20200503083407.GA27766@iZj6chx1xj0e0buvshuecpZ> <20200505134056.GA31680@iZj6chx1xj0e0buvshuecpZ> <20200505142711.GA12952@vingu-book> User-agent: mu4e 0.9.17; emacs 26.3 From: Valentin Schneider To: Vincent Guittot Cc: Peng Liu , dietmar.eggemann@arm.com, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, linux-kernel@vger.kernel.org Subject: Re: [PATCH] sched/fair: Fix nohz.next_balance update In-reply-to: <20200505142711.GA12952@vingu-book> Date: Wed, 06 May 2020 11:29:10 +0100 Message-ID: MIME-Version: 1.0 Content-Type: text/plain Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 05/05/20 15:27, Vincent Guittot wrote: > So I would be in favor of something as simple as : > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 04098d678f3b..e028bc1c4744 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -10457,6 +10457,14 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, > } > } > > + /* > + * next_balance will be updated only when there is a need. > + * When the CPU is attached to null domain for ex, it will not be > + * updated. > + */ > + if (likely(update_next_balance)) > + nohz.next_balance = next_balance; > + > /* Newly idle CPU doesn't need an update */ > if (idle != CPU_NEWLY_IDLE) { > update_blocked_averages(this_cpu); > @@ -10477,14 +10485,6 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, > if (has_blocked_load) > WRITE_ONCE(nohz.has_blocked, 1); > > - /* > - * next_balance will be updated only when there is a need. > - * When the CPU is attached to null domain for ex, it will not be > - * updated. > - */ > - if (likely(update_next_balance)) > - nohz.next_balance = next_balance; > - > return ret; > } > But then we may skip an update if we goto abort, no? Imagine we have just NOHZ_STATS_KICK, so we don't call any rebalance_domains(), and then as we go through the last NOHZ CPU in the loop we hit need_resched(). We would end in the abort part without any update to nohz.next_balance, despite having accumulated relevant data in the local next_balance variable. Also note that in this case, nohz_idle_balance() will still return true. If we rip out just the one update we need from rebalance_domains(), then perhaps we could go with what Peng was initially suggesting? i.e. something like the below. --- diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 46b7bd41573f..0a292e0a0731 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -9934,22 +9934,8 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle) * When the cpu is attached to null domain for ex, it will not be * updated. */ - if (likely(update_next_balance)) { + if (likely(update_next_balance)) rq->next_balance = next_balance; - -#ifdef CONFIG_NO_HZ_COMMON - /* - * If this CPU has been elected to perform the nohz idle - * balance. Other idle CPUs have already rebalanced with - * nohz_idle_balance() and nohz.next_balance has been - * updated accordingly. This CPU is now running the idle load - * balance for itself and we need to update the - * nohz.next_balance accordingly. - */ - if ((idle == CPU_IDLE) && time_after(nohz.next_balance, rq->next_balance)) - nohz.next_balance = rq->next_balance; -#endif - } } static inline int on_null_domain(struct rq *rq) @@ -10315,6 +10301,11 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, if (flags & NOHZ_BALANCE_KICK) rebalance_domains(this_rq, CPU_IDLE); + if (time_after(next_balance, this_rq->next_balance)) { + next_balance = this_rq->next_balance; + update_next_balance = 1; + } + WRITE_ONCE(nohz.next_blocked, now + msecs_to_jiffies(LOAD_AVG_PERIOD)); @@ -10551,6 +10542,17 @@ static __latent_entropy void run_rebalance_domains(struct softirq_action *h) /* normal load balance */ update_blocked_averages(this_rq->cpu); rebalance_domains(this_rq, idle); + +#ifdef CONFIG_NO_HZ_COMMON + /* + * NOHZ idle CPUs will be rebalanced with nohz_idle_balance() and thus + * nohz.next_balance will be updated accordingly. If there was no NOHZ + * kick, then we just need to update nohz.next_balance wrt *this* CPU. + */ + if ((idle == CPU_IDLE) && + time_after(nohz.next_balance, this_rq->next_balance)) + nohz.next_balance = this_rq->next_balance; +#endif } /* ---