Date: Tue, 20 Feb 2018 12:59:17 +0100
From: Peter Zijlstra
To: Vincent Guittot
Cc: mingo@kernel.org, linux-kernel@vger.kernel.org, valentin.schneider@arm.com,
	morten.rasmussen@foss.arm.com, brendan.jackman@arm.com,
	dietmar.eggemann@arm.com
Subject: Re: [PATCH v5 3/3] sched: update blocked load when newly idle
Message-ID: <20180220115917.GA25201@hirez.programming.kicks-ass.net>
References: <1518622006-16089-1-git-send-email-vincent.guittot@linaro.org>
 <1518622006-16089-4-git-send-email-vincent.guittot@linaro.org>
In-Reply-To: <1518622006-16089-4-git-send-email-vincent.guittot@linaro.org>

On Wed, Feb 14, 2018 at 04:26:46PM +0100, Vincent Guittot wrote:
> When NEWLY_IDLE load balance is not triggered, we might need to update the
> blocked load anyway. We can kick an ilb so an idle CPU will take care of
> updating blocked load or we can try to update them locally before entering
> idle. In the latter case, we reuse part of the nohz_idle_balance.

So I still don't like this, but then I couldn't come up with anything I
liked better either. I munged the patch a bit: I pulled the code movement
out into a separate patch and otherwise reduced the #ifdeffery a bit.

---
Subject: sched: update blocked load when newly idle
From: Vincent Guittot
Date: Wed, 14 Feb 2018 16:26:46 +0100

When NEWLY_IDLE load balance is not triggered, we might need to update the
blocked load anyway. We can kick an ilb so an idle CPU will take care of
updating the blocked load, or we can try to update it locally before
entering idle. In the latter case, we reuse part of nohz_idle_balance().
Cc: valentin.schneider@arm.com
Cc: mingo@kernel.org
Cc: brendan.jackman@arm.com
Cc: dietmar.eggemann@arm.com
Cc: morten.rasmussen@foss.arm.com
Signed-off-by: Vincent Guittot
Signed-off-by: Peter Zijlstra (Intel)
Link: http://lkml.kernel.org/r/1518622006-16089-4-git-send-email-vincent.guittot@linaro.org
---
 kernel/sched/fair.c | 105 +++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 87 insertions(+), 18 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9394,10 +9394,14 @@ void nohz_balance_enter_idle(int cpu)
 }
 
 /*
- * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the
- * rebalancing for all the cpus for whom scheduler ticks are stopped.
+ * Internal function that runs load balance for all idle CPUs. The load balance
+ * can be a simple update of blocked load or a complete load balance with
+ * task movement, depending on the flags.
+ * The function returns false if the loop has stopped before running
+ * through all idle CPUs.
  */
-static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
+static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
+			       enum cpu_idle_type idle)
 {
 	/* Earliest time when we have to do rebalance again */
 	unsigned long now = jiffies;
@@ -9405,20 +9409,10 @@ static bool nohz_idle_balance(struct rq
 	bool has_blocked_load = false;
 	int update_next_balance = 0;
 	int this_cpu = this_rq->cpu;
-	unsigned int flags;
 	int balance_cpu;
+	int ret = false;
 	struct rq *rq;
 
-	if (!(atomic_read(nohz_flags(this_cpu)) & NOHZ_KICK_MASK))
-		return false;
-
-	if (idle != CPU_IDLE) {
-		atomic_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
-		return false;
-	}
-
-	flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
-
 	SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
 
 	/*
@@ -9462,10 +9456,10 @@ static bool nohz_idle_balance(struct rq
 		if (time_after_eq(jiffies, rq->next_balance)) {
 			struct rq_flags rf;
 
-			rq_lock_irq(rq, &rf);
+			rq_lock_irqsave(rq, &rf);
 			update_rq_clock(rq);
 			cpu_load_update_idle(rq);
-			rq_unlock_irq(rq, &rf);
+			rq_unlock_irqrestore(rq, &rf);
 
 			if (flags & NOHZ_BALANCE_KICK)
 				rebalance_domains(rq, CPU_IDLE);
@@ -9477,13 +9471,21 @@ static bool nohz_idle_balance(struct rq
 		}
 	}
 
-	update_blocked_averages(this_cpu);
+	/* Newly idle CPU doesn't need an update */
+	if (idle != CPU_NEWLY_IDLE) {
+		update_blocked_averages(this_cpu);
+		has_blocked_load |= this_rq->has_blocked_load;
+	}
+
 	if (flags & NOHZ_BALANCE_KICK)
 		rebalance_domains(this_rq, CPU_IDLE);
 
 	WRITE_ONCE(nohz.next_blocked,
		   now + msecs_to_jiffies(LOAD_AVG_PERIOD));
 
+	/* The full idle balance loop has been done */
+	ret = true;
+
 abort:
 	/* There is still blocked load, enable periodic update */
 	if (has_blocked_load)
@@ -9497,15 +9499,79 @@ static bool nohz_idle_balance(struct rq
 	if (likely(update_next_balance))
 		nohz.next_balance = next_balance;
 
+	return ret;
+}
+
+/*
+ * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the
+ * rebalancing for all the cpus for whom scheduler ticks are stopped.
+ */
+static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
+{
+	int this_cpu = this_rq->cpu;
+	unsigned int flags;
+
+	if (!(atomic_read(nohz_flags(this_cpu)) & NOHZ_KICK_MASK))
+		return false;
+
+	if (idle != CPU_IDLE) {
+		atomic_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+		return false;
+	}
+
+	/*
+	 * barrier, pairs with nohz_balance_enter_idle(), ensures ...
+	 */
+	flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+	if (!(flags & NOHZ_KICK_MASK))
+		return false;
+
+	_nohz_idle_balance(this_rq, flags, idle);
+
 	return true;
 }
+
+static void nohz_newidle_balance(struct rq *this_rq)
+{
+	int this_cpu = this_rq->cpu;
+
+	/*
+	 * This CPU doesn't want to be disturbed by scheduler
+	 * housekeeping
+	 */
+	if (!housekeeping_cpu(this_cpu, HK_FLAG_SCHED))
+		return;
+
+	/* Will wake up very soon. No time for doing anything else */
+	if (this_rq->avg_idle < sysctl_sched_migration_cost)
+		return;
+
+	/* Don't need to update blocked load of idle CPUs */
+	if (!READ_ONCE(nohz.has_blocked) ||
+	    time_before(jiffies, READ_ONCE(nohz.next_blocked)))
+		return;
+
+	raw_spin_unlock(&this_rq->lock);
+	/*
+	 * This CPU is going to be idle and blocked load of idle CPUs
+	 * need to be updated. Run the ilb locally as it is a good
+	 * candidate for ilb instead of waking up another idle CPU.
+	 * Kick a normal ilb if we failed to do the update.
+	 */
+	if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
+		kick_ilb(NOHZ_STATS_KICK);
+	raw_spin_lock(&this_rq->lock);
+}
+
 #else /* !CONFIG_NO_HZ_COMMON */
 static inline void nohz_balancer_kick(struct rq *rq) { }
 
-static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
+static inline bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 {
 	return false;
 }
+
+static inline void nohz_newidle_balance(struct rq *this_rq) { }
 #endif /* CONFIG_NO_HZ_COMMON */
 
 /*
@@ -9542,12 +9608,15 @@ static int idle_balance(struct rq *this_
 	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
 	    !this_rq->rd->overload) {
+
 		rcu_read_lock();
 		sd = rcu_dereference_check_sched_domain(this_rq->sd);
 		if (sd)
 			update_next_balance(sd, &next_balance);
 		rcu_read_unlock();
 
+		nohz_newidle_balance(this_rq);
+
 		goto out;
 	}