From: Vincent Guittot
Date: Fri, 24 Nov 2023 16:56:04 +0100
Subject: Re: [PATCH v2] sched/fair: Use all little CPUs for CPU-bound workload
To: Pierre Gondois
Cc: linux-kernel@vger.kernel.org, Qais Yousef, Ingo Molnar, Peter Zijlstra,
    Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Daniel Bristot de Oliveira, Valentin Schneider
References: <20231124153323.3202444-1-pierre.gondois@arm.com>
In-Reply-To: <20231124153323.3202444-1-pierre.gondois@arm.com>

On Fri, 24 Nov 2023 at 16:33, Pierre Gondois wrote:
>
> Running n CPU-bound tasks on an n CPUs platform:
> - with asymmetric CPU capacity
> - not having SD_SHARE_PKG_RESOURCES flag set at the DIE
>   sched domain level (i.e. not DynamIQ systems)

Nit: SD_SHARE_PKG_RESOURCES is never set at the DIE level. In the case of
DynamIQ systems, all CPUs are in the same MC level, which has the
SD_SHARE_PKG_RESOURCES flag.

> might result in a task placement where two tasks run on a big CPU
> and none on a little CPU. This placement could be more optimal by
> using all CPUs.
>
> Testing platform:
>   Juno-r2:
>     - 2 big CPUs (1-2), maximum capacity of 1024
>     - 4 little CPUs (0,3-5), maximum capacity of 383
>
> Testing workload ([1]):
>   Spawn 6 CPU-bound tasks. During the first 100ms (step 1), each tasks
>   is affine to a CPU, except for:
>     - one little CPU which is left idle.
>     - one big CPU which has 2 tasks affine.
>   After the 100ms (step 2), remove the cpumask affinity.
>
> Before patch:
>   During step 2, the load balancer running from the idle CPU tags sched
>   domains as:
>   - little CPUs: 'group_has_spare'. Indeed, 3 CPU-bound tasks run on a
>     4 CPUs sched-domain, and the idle CPU provides enough spare
>     capacity.
>   - big CPUs: 'group_overloaded'. Indeed, 3 tasks run on a 2 CPUs
>     sched-domain, so the following path is used:
>       group_is_overloaded()
>       \-if (sgs->sum_nr_running <= sgs->group_weight) return true;
>
>     The following path which would change the migration type to
>     'migrate_task' is not taken:
>       calculate_imbalance()
>       \-if (env->idle != CPU_NOT_IDLE && env->imbalance == 0)
>     as the local group has some spare capacity, so the imbalance
>     is not 0.
>
>   The migration type requested is 'migrate_util' and the busiest
>   runqueue is the big CPU's runqueue having 2 tasks (each having a
>   utilization of 512). The idle little CPU cannot pull one of these
>   task as its capacity is too small for the task. The following path
>   is used:
>     detach_tasks()
>     \-case migrate_util:
>       \-if (util > env->imbalance) goto next;
>
> After patch:
>   As the number of failed balancing attempts grows (with
>   'nr_balance_failed'), progressively make it easier to migrate
>   a big task to the idling little CPU. A similar mechanism is
>   used for the 'migrate_load' migration type.
>
> Improvement:
>   Running the testing workload [1] with the step 2 representing
>   a ~10s load for a big CPU:
>   Before patch: ~19.3s
>   After patch: ~18s (-6.7%)
>
> Similar issue reported at:
>   https://lore.kernel.org/lkml/20230716014125.139577-1-qyousef@layalina.io/
>
> v1:
>   https://lore.kernel.org/all/20231110125902.2152380-1-pierre.gondois@arm.com/
>
> Suggested-by: Vincent Guittot
> Signed-off-by: Pierre Gondois

Reviewed-by: Vincent Guittot

> ---
>  kernel/sched/fair.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index df348aa55d3c..53c18fd23ae7 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8907,7 +8907,7 @@ static int detach_tasks(struct lb_env *env)
>                 case migrate_util:
>                         util = task_util_est(p);
>
> -                       if (util > env->imbalance)
> +                       if (shr_bound(util, env->sd->nr_balance_failed) > env->imbalance)
>                                 goto next;
>
>                         env->imbalance -= util;
> --
> 2.25.1
>
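
For readers following the one-line change above, here is a minimal userspace
sketch of how the patched 'migrate_util' check relaxes as balancing attempts
fail. It is an illustration only, not the kernel code: shr_bound_ul() is a
stand-in for the kernel's shr_bound() helper, and skip_task_before(),
skip_task_after() and attempts_until_migratable() are hypothetical names
introduced here. The imbalance values used in the note below are assumptions,
since the thread does not state the actual value of env->imbalance.

```c
#include <limits.h>
#include <stdbool.h>

/* Largest shift that is still defined for an unsigned long. */
#define MAX_SHIFT ((unsigned int)(sizeof(unsigned long) * CHAR_BIT - 1))

/*
 * Userspace stand-in for the kernel's shr_bound(): shift right, but clamp
 * the shift count so it never reaches the width of the type, which would
 * be undefined behaviour in C.
 */
static unsigned long shr_bound_ul(unsigned long val, unsigned int shift)
{
	return val >> (shift < MAX_SHIFT ? shift : MAX_SHIFT);
}

/* Pre-patch check: the task is skipped whenever util exceeds the imbalance. */
static bool skip_task_before(unsigned long util, unsigned long imbalance)
{
	return util > imbalance;
}

/* Post-patch check: util is halved once per failed balancing attempt. */
static bool skip_task_after(unsigned long util, unsigned long imbalance,
			    unsigned int nr_balance_failed)
{
	return shr_bound_ul(util, nr_balance_failed) > imbalance;
}

/*
 * Hypothetical helper: number of failed attempts after which the
 * post-patch check stops skipping the task. The loop is bounded by
 * MAX_SHIFT so it always terminates.
 */
static unsigned int attempts_until_migratable(unsigned long util,
					      unsigned long imbalance)
{
	unsigned int failed = 0;

	while (failed < MAX_SHIFT && skip_task_after(util, imbalance, failed))
		failed++;

	return failed;
}
```

With a task utilization of 512 (as in the commit message) and an assumed
imbalance of 383, the pre-patch check skips the task on every pass, while the
post-patch check accepts it after one failed attempt, since 512 >> 1 = 256.
With a smaller assumed imbalance of 100, three failed attempts are needed
(512, 256, 128, then 64).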