From: Valentin Schneider <valentin.schneider@arm.com>
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, Ingo Molnar, Vincent Guittot, Dietmar Eggemann
Subject: [PATCH v2 2/2] sched/fair: Trigger nohz.next_balance updates when a CPU goes NOHZ-idle
Date: Mon, 19 Jul 2021 11:31:17 +0100
Message-Id: <20210719103117.3624936-3-valentin.schneider@arm.com>
In-Reply-To: <20210719103117.3624936-1-valentin.schneider@arm.com>
References: <20210719103117.3624936-1-valentin.schneider@arm.com>
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

Consider a system with some NOHZ-idle CPUs, such that

	nohz.idle_cpus_mask = S
	nohz.next_balance = T

When a new CPU k goes NOHZ idle
(nohz_balance_enter_idle()), we end up with:

	nohz.idle_cpus_mask = S ∪ {k}
	nohz.next_balance = T

Note that nohz.next_balance hasn't changed - it won't be updated until a
NOHZ balance is triggered. This is problematic if the newly NOHZ-idle CPU
has an earlier rq.next_balance than the other NOHZ-idle CPUs, IOW if:

	cpu_rq(k).next_balance < nohz.next_balance

In such scenarios, the existing nohz.next_balance will prevent any NOHZ
balance from happening, which itself will prevent nohz.next_balance from
being updated to this new cpu_rq(k).next_balance. Unnecessary load balance
delays of over 12ms caused by this were observed on an arm64 RB5 board.

Use the new nohz.needs_update flag to mark the presence of newly-idle CPUs
that need their rq->next_balance to be collated into nohz.next_balance.
Trigger a NOHZ_NEXT_KICK when the flag is set.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/sched/fair.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5c88698c3664..b5a4ea7715b9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5698,6 +5698,7 @@ static struct {
 	cpumask_var_t idle_cpus_mask;
 	atomic_t nr_cpus;
 	int has_blocked; /* Idle CPUS has blocked load */
+	int needs_update; /* Newly idle CPUs need their next_balance collated */
 	unsigned long next_balance; /* in jiffy units */
 	unsigned long next_blocked; /* Next update of blocked load in jiffies */
 } nohz ____cacheline_aligned;
@@ -10351,6 +10352,9 @@ static void nohz_balancer_kick(struct rq *rq)
 unlock:
 	rcu_read_unlock();
 out:
+	if (READ_ONCE(nohz.needs_update))
+		flags |= NOHZ_NEXT_KICK;
+
 	if (flags)
 		kick_ilb(flags);
 }
@@ -10447,12 +10451,13 @@ void nohz_balance_enter_idle(int cpu)
 	/*
 	 * Ensures that if nohz_idle_balance() fails to observe our
 	 * @idle_cpus_mask store, it must observe the @has_blocked
-	 * store.
+	 * and @needs_update stores.
 	 */
 	smp_mb__after_atomic();

 	set_cpu_sd_state_idle(cpu);

+	WRITE_ONCE(nohz.needs_update, 1);
 out:
 	/*
 	 * Each time a cpu enter idle, we assume that it has blocked load and
@@ -10501,13 +10506,17 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
 	/*
 	 * We assume there will be no idle load after this update and clear
 	 * the has_blocked flag. If a cpu enters idle in the mean time, it will
-	 * set the has_blocked flag and trig another update of idle load.
+	 * set the has_blocked flag and trigger another update of idle load.
 	 * Because a cpu that becomes idle, is added to idle_cpus_mask before
 	 * setting the flag, we are sure to not clear the state and not
 	 * check the load of an idle cpu.
+	 *
+	 * Same applies to idle_cpus_mask vs needs_update.
 	 */
 	if (flags & NOHZ_STATS_KICK)
 		WRITE_ONCE(nohz.has_blocked, 0);
+	if (flags & NOHZ_NEXT_KICK)
+		WRITE_ONCE(nohz.needs_update, 0);

 	/*
 	 * Ensures that if we miss the CPU, we must see the has_blocked
@@ -10531,6 +10540,8 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
 		if (need_resched()) {
 			if (flags & NOHZ_STATS_KICK)
 				has_blocked_load = true;
+			if (flags & NOHZ_NEXT_KICK)
+				WRITE_ONCE(nohz.needs_update, 1);
 			goto abort;
 		}
-- 
2.25.1