Date: Wed, 18 May 2022 11:46:52 +0100
From: Mel Gorman <mgorman@techsingularity.net>
To: Peter Zijlstra
Cc: Ingo Molnar, Vincent Guittot, Valentin Schneider, Aubrey Li, LKML
Subject: Re: [PATCH 3/4] sched/numa: Apply imbalance limitations consistently
Message-ID: <20220518104652.GO3441@techsingularity.net>
References: <20220511143038.4620-1-mgorman@techsingularity.net>
 <20220511143038.4620-4-mgorman@techsingularity.net>
 <20220518093156.GD10117@worktop.programming.kicks-ass.net>
In-Reply-To: <20220518093156.GD10117@worktop.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 18, 2022 at 11:31:56AM +0200, Peter Zijlstra wrote:
> On Wed, May 11, 2022 at 03:30:37PM +0100, Mel Gorman wrote:
>
> > @@ -9108,6 +9108,24 @@ static inline bool allow_numa_imbalance(int running, int imb_numa_nr)
> >  	return running <= imb_numa_nr;
> >  }
> >
> > +#define NUMA_IMBALANCE_MIN 2
> > +
> > +static inline long adjust_numa_imbalance(int imbalance,
> > +				int dst_running, int imb_numa_nr)
> > +{
> > +	if (!allow_numa_imbalance(dst_running, imb_numa_nr))
> > +		return imbalance;
> > +
> > +	/*
> > +	 * Allow a small imbalance based on a simple pair of communicating
> > +	 * tasks that remain local when the destination is lightly loaded.
> > +	 */
> > +	if (imbalance <= NUMA_IMBALANCE_MIN)
> > +		return 0;
> > +
> > +	return imbalance;
> > +}
>
> > @@ -9334,24 +9356,6 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
> >  	}
> >  }
> >
> > -#define NUMA_IMBALANCE_MIN 2
> > -
> > -static inline long adjust_numa_imbalance(int imbalance,
> > -				int dst_running, int imb_numa_nr)
> > -{
> > -	if (!allow_numa_imbalance(dst_running, imb_numa_nr))
> > -		return imbalance;
> > -
> > -	/*
> > -	 * Allow a small imbalance based on a simple pair of communicating
> > -	 * tasks that remain local when the destination is lightly loaded.
> > -	 */
> > -	if (imbalance <= NUMA_IMBALANCE_MIN)
> > -		return 0;
> > -
> > -	return imbalance;
> > -}
>
> If we're going to move that one up and remove the only other caller of
> allow_numa_imbalance() we might as well move it up further still and
> fold the functions.
>
> Hmm?
>

Yes, that would be fine and makes sense. I remember thinking that they
should be folded and then failed to follow through.

> (Although I do wonder about that 25% figure in the comment; that doesn't
> seem to relate to any actual code anymore)
>

You're right, by the end of the series it's completely inaccurate and
currently it's not accurate if there are multiple LLCs per node. I
adjusted the wording to "Allow a NUMA imbalance if busy CPUs is less
than the maximum threshold. Above this threshold, individual tasks may
be contending for both memory bandwidth and any shared HT resources."
Diff between v1 and v2 is now below

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 602c05b22805..51fde61ec756 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1536,8 +1536,31 @@ struct task_numa_env {
 static unsigned long cpu_load(struct rq *rq);
 static unsigned long cpu_runnable(struct rq *rq);
 
-static inline long adjust_numa_imbalance(int imbalance,
-					int dst_running, int imb_numa_nr);
+
+#define NUMA_IMBALANCE_MIN 2
+
+static inline long
+adjust_numa_imbalance(int imbalance, int dst_running, int imb_numa_nr)
+{
+	/*
+	 * Allow a NUMA imbalance if busy CPUs is less than the maximum
+	 * threshold. Above this threshold, individual tasks may be contending
+	 * for both memory bandwidth and any shared HT resources. This is an
+	 * approximation as the number of running tasks may not be related to
+	 * the number of busy CPUs due to sched_setaffinity.
+	 */
+	if (dst_running > imb_numa_nr)
+		return imbalance;
+
+	/*
+	 * Allow a small imbalance based on a simple pair of communicating
+	 * tasks that remain local when the destination is lightly loaded.
+	 */
+	if (imbalance <= NUMA_IMBALANCE_MIN)
+		return 0;
+
+	return imbalance;
+}
 
 static inline enum
 numa_type numa_classify(unsigned int imbalance_pct,
@@ -9098,34 +9121,6 @@ static bool update_pick_idlest(struct sched_group *idlest,
 	return true;
 }
 
-/*
- * Allow a NUMA imbalance if busy CPUs is less than 25% of the domain.
- * This is an approximation as the number of running tasks may not be
- * related to the number of busy CPUs due to sched_setaffinity.
- */
-static inline bool allow_numa_imbalance(int running, int imb_numa_nr)
-{
-	return running <= imb_numa_nr;
-}
-
-#define NUMA_IMBALANCE_MIN 2
-
-static inline long adjust_numa_imbalance(int imbalance,
-				int dst_running, int imb_numa_nr)
-{
-	if (!allow_numa_imbalance(dst_running, imb_numa_nr))
-		return imbalance;
-
-	/*
-	 * Allow a small imbalance based on a simple pair of communicating
-	 * tasks that remain local when the destination is lightly loaded.
-	 */
-	if (imbalance <= NUMA_IMBALANCE_MIN)
-		return 0;
-
-	return imbalance;
-}
-
 /*
  * find_idlest_group() finds and returns the least busy CPU group within the
  * domain.
@@ -9448,14 +9443,15 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 		 * idle cpus.
 		 */
 		env->migration_type = migrate_task;
-		env->imbalance = max_t(long, 0, (local->idle_cpus -
-						 busiest->idle_cpus));
+		env->imbalance = max_t(long, 0,
+				(local->idle_cpus - busiest->idle_cpus));
 	}
 
 	/* Consider allowing a small imbalance between NUMA groups */
 	if (env->sd->flags & SD_NUMA) {
 		env->imbalance = adjust_numa_imbalance(env->imbalance,
-			local->sum_nr_running + 1, env->sd->imb_numa_nr);
+				local->sum_nr_running + 1,
+				env->sd->imb_numa_nr);
 	}
 
 	/* Number of tasks to move to restore balance */
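
For anyone who wants to see the effect of the folded helper in isolation,
here is a minimal standalone sketch (plain userspace C, not the kernel code
in the diff above) that mirrors the v2 decision logic. The imb_numa_nr value
and the sample inputs are purely illustrative, not taken from any real
topology:

	#include <stdio.h>

	#define NUMA_IMBALANCE_MIN 2

	/* Userspace mirror of the folded adjust_numa_imbalance() logic. */
	static long adjust_numa_imbalance(int imbalance, int dst_running,
					  int imb_numa_nr)
	{
		/* Busy CPUs above the per-domain threshold: keep the imbalance. */
		if (dst_running > imb_numa_nr)
			return imbalance;

		/* Lightly loaded destination: forgive a communicating pair. */
		if (imbalance <= NUMA_IMBALANCE_MIN)
			return 0;

		return imbalance;
	}

	int main(void)
	{
		int imb_numa_nr = 4;	/* hypothetical per-domain threshold */

		/* Below the threshold, an imbalance of 2 is tolerated: prints 0. */
		printf("%ld\n", adjust_numa_imbalance(2, 3, imb_numa_nr));
		/* Above the threshold, the same imbalance is kept: prints 2. */
		printf("%ld\n", adjust_numa_imbalance(2, 5, imb_numa_nr));
		/* Larger imbalances are never forgiven: prints 4. */
		printf("%ld\n", adjust_numa_imbalance(4, 3, imb_numa_nr));
		return 0;
	}

The point is simply that an imbalance of up to NUMA_IMBALANCE_MIN tasks is
only ignored while the destination's busy CPU count stays at or below the
per-domain threshold; everything else is passed through unchanged.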