Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp17868178rwd; Tue, 27 Jun 2023 08:30:23 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ40Aq9Su//K/S5V5wXsu9n2ITp0rGCgNywGbsuXnRs8V9p1m9d3Wtuw8jvPIjmXPU54eQHb X-Received: by 2002:a17:907:7209:b0:989:1a52:72a1 with SMTP id dr9-20020a170907720900b009891a5272a1mr19825976ejc.28.1687879823353; Tue, 27 Jun 2023 08:30:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687879823; cv=none; d=google.com; s=arc-20160816; b=LZLacx6zzLfFf2Fzix8YuxJhjGMtGiymx8LGpcDoWswaHNo2S1qAusFhWDBuib7Q1i B799wUQTqIL9p8yJM4c6M9/mML6yb89630kVIWn3zovjl2qDpY8v8Tlpm0TQsKheD3Km iSe1T1XGzdEBEDfR9Vyov3zLygZznN4/UfwIN+f+6PECpH+n1SfWMc2AkmBi7sJ8ZlQ/ vp5C2M5BSLk+pLZDiSa+NYjs/jUghMpWZNGB/L3WSrWvuCRYarMaI6fWrCeZkFrmp1P/ Miid7qyKnnxkiLJcHSTH4x1mpHuxfffYBB72m2lV13lEF8yPFxM8zqYcKXCSJX/PZHvn LY+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=RfcNDpkfchZ6l+1xXo16R23gF9ny8X8OnzlBUZJvJr4=; fh=NhtBvSktfbSA+SwxpYZSHOfHHBFQ5Rm15rD4mAt9Q/c=; b=Yi2mmUrSM1R5dhbxu8/dompVgHZINnSMbhauuWV9VkyH4t8ESk2tddn9riemBw/MQB sQWnbGX396WhuugsZ6IDx+fcE55eCQHiE21WaBH/YYihbR9Yxu6hLYte+Ou6urArJMXZ V2JkAcAXrwW52V7xStp05GpoboMSbD7dkespZLjuYM9aAj+PDlJhcowq94JO7UTX5hux hs2+/agvcUrC+Po4swnX1wffN8INDIJ4UubknjQ1WmUIM22CSqwoCqbBxls2YChZuvCc 3+Q09iP97yfEx2gr9lAOpgWaNxV8MYBcL7fSktQiGzcO4vhHOraVZpkdQiS7hvrO7Tkn h72Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ox27-20020a170907101b00b00982b276713csi4124644ejb.880.2023.06.27.08.29.57; Tue, 27 Jun 2023 08:30:23 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232326AbjF0PTU (ORCPT + 99 others); Tue, 27 Jun 2023 11:19:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41930 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232292AbjF0PTT (ORCPT ); Tue, 27 Jun 2023 11:19:19 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 36F3E26B3; Tue, 27 Jun 2023 08:19:17 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DF6492F4; Tue, 27 Jun 2023 08:20:00 -0700 (PDT) Received: from localhost (ionvoi01-desktop.cambridge.arm.com [10.2.78.69]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 67B873F663; Tue, 27 Jun 2023 08:19:16 -0700 (PDT) Date: Tue, 27 Jun 2023 16:19:14 +0100 From: Ionela Voinescu To: Ricardo Neri Cc: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot , Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Zhao Liu , "Yuan, Perry" , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, "Tim C . Chen" , Zhao Liu Subject: Re: [PATCH v4 07/24] sched/fair: Compute IPC class scores for load balancing Message-ID: References: <20230613042422.5344-1-ricardo.neri-calderon@linux.intel.com> <20230613042422.5344-8-ricardo.neri-calderon@linux.intel.com> <20230625201155.GA3902@ranerica-svr.sc.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230625201155.GA3902@ranerica-svr.sc.intel.com> X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hey, On Sunday 25 Jun 2023 at 13:11:55 (-0700), Ricardo Neri wrote: > On Thu, Jun 22, 2023 at 10:02:44AM +0100, Ionela Voinescu wrote: > > On Monday 12 Jun 2023 at 21:24:05 (-0700), Ricardo Neri wrote: > > > When using IPCC scores to break ties between two scheduling groups, it is > > > necessary to consider both the current score and the score that would > > > result after load balancing. > > > > > > Compute the combined IPC class score of a scheduling group and the local > > > scheduling group. Compute both the current score and the prospective score. > > > > > > Collect IPCC statistics only for asym_packing and fully_busy scheduling > > > groups. These are the only cases that use IPCC scores. > > > > > > These IPCC statistics are used during idle load balancing. The candidate > > > scheduling group will have one fewer busy CPU after load balancing. This > > > observation is important for cores with SMT support. > > > > > > The IPCC score of scheduling groups composed of SMT siblings needs to > > > consider that the siblings share CPU resources. When computing the total > > > IPCC score of the scheduling group, divide the score of each sibling by > > > the number of busy siblings. > > > > > > Cc: Ben Segall > > > Cc: Daniel Bristot de Oliveira > > > Cc: Dietmar Eggemann > > > Cc: Ionela Voinescu > > > Cc: Joel Fernandes (Google) > > > Cc: Len Brown > > > Cc: Lukasz Luba > > > Cc: Mel Gorman > > > Cc: Perry Yuan > > > Cc: Rafael J. Wysocki > > > Cc: Srinivas Pandruvada > > > Cc: Steven Rostedt > > > Cc: Tim C. Chen > > > Cc: Valentin Schneider > > > Cc: Zhao Liu > > > Cc: x86@kernel.org > > > Cc: linux-pm@vger.kernel.org > > > Cc: linux-kernel@vger.kernel.org > > > Signed-off-by: Ricardo Neri > > > --- > > > Changes since v3: > > > * None > > > > > > Changes since v2: > > > * Also collect IPCC stats for fully_busy sched groups. > > > * Restrict use of IPCC stats to SD_ASYM_PACKING. (Ionela) > > > * Handle errors of arch_get_ipcc_score(). (Ionela) > > > > > > Changes since v1: > > > * Implemented cleanups and reworks from PeterZ. I took all his > > > suggestions, except the computation of the IPC score before and after > > > load balancing. We are computing not the average score, but the *total*. > > > * Check for the SD_SHARE_CPUCAPACITY to compute the throughput of the SMT > > > siblings of a physical core. > > > * Used the new interface names. > > > * Reworded commit message for clarity. > > > --- > > > kernel/sched/fair.c | 68 +++++++++++++++++++++++++++++++++++++++++++++ > > > 1 file changed, 68 insertions(+) > > > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > > index c0cab5e501b6..a51c65c9335f 100644 > > > --- a/kernel/sched/fair.c > > > +++ b/kernel/sched/fair.c > > > @@ -9114,6 +9114,8 @@ struct sg_lb_stats { > > > unsigned long min_score; /* Min(score(rq->curr->ipcc)) */ > > > unsigned short min_ipcc; /* Class of the task with the minimum IPCC score in the rq */ > > > unsigned long sum_score; /* Sum(score(rq->curr->ipcc)) */ > > > + long ipcc_score_after; /* Prospective IPCC score after load balancing */ > > > + unsigned long ipcc_score_before; /* IPCC score before load balancing */ > > > #endif > > > }; > > > > > > @@ -9452,6 +9454,62 @@ static void update_sg_lb_ipcc_stats(int dst_cpu, struct sg_lb_stats *sgs, > > > } > > > } > > > > > > +static void update_sg_lb_stats_scores(struct sg_lb_stats *sgs, > > > + struct sched_group *sg, > > > + struct lb_env *env) > > > +{ > > > + unsigned long score_on_dst_cpu, before; > > > + int busy_cpus; > > > + long after; > > > + > > > + if (!sched_ipcc_enabled()) > > > + return; > > > + > > > + /* > > > + * IPCC scores are only useful during idle load balancing. For now, > > > + * only asym_packing uses IPCC scores. > > > + */ > > > + if (!(env->sd->flags & SD_ASYM_PACKING) || > > > + env->idle == CPU_NOT_IDLE) > > > + return; > > > + > > > + /* > > > + * IPCC scores are used to break ties only between these types of > > > + * groups. > > > + */ > > > + if (sgs->group_type != group_fully_busy && > > > + sgs->group_type != group_asym_packing) > > > + return; > > > + > > > + busy_cpus = sgs->group_weight - sgs->idle_cpus; > > > + > > > + /* No busy CPUs in the group. No tasks to move. */ > > > + if (!busy_cpus) > > > + return; > > > + > > > + score_on_dst_cpu = arch_get_ipcc_score(sgs->min_ipcc, env->dst_cpu); > > > + > > > + /* > > > + * Do not use IPC scores. sgs::ipcc_score_{after, before} will be zero > > > + * and not used. > > > + */ > > > + if (IS_ERR_VALUE(score_on_dst_cpu)) > > > + return; > > > + > > > + before = sgs->sum_score; > > > + after = before - sgs->min_score; > > > > I don't believe this can end up being negative as the sum of all > > scores should be higher or equal to the min score, right? > > Yes, I agree. `after` cannot be negative. > > > > > I'm just wondering if ipcc_score_after can be made unsigned long as well, > > just for consistency. > > Sure. I can make it of type unsigned long as well. > > > > > > + > > > + /* SMT siblings share throughput. */ > > > + if (busy_cpus > 1 && sg->flags & SD_SHARE_CPUCAPACITY) { > > > + before /= busy_cpus; > > > + /* One sibling will become idle after load balance. */ > > > + after /= busy_cpus - 1; > > > + } > > > + > > > + sgs->ipcc_score_after = after + score_on_dst_cpu; > > > + sgs->ipcc_score_before = before; > > > > Shouldn't the score_on_dst_cpu be added to "after" before being divided > > between the SMT siblings? > > No, because ipcc_score_after represents the joint score of the busiest > core and the destination core after load balance has taken place. The > destination core was previously idle and now contributes to the joint > score. > Right! score_on_dst_cpu does not contribute to the per-cpu throughput of the busiest core, but it reflects the improvement in score gained by the move to the destination. Thanks, Ionela. > Thanks and BR, > Ricardo