Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp1598580rwd; Tue, 13 Jun 2023 11:15:30 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4G/iv5bBlgMGanerJSneHys0gn1WCpF7GEUvKfb7L5tv5KArw9/yxFxwSNt+jO4H/wspTY X-Received: by 2002:a17:902:e802:b0:1b3:de47:fafc with SMTP id u2-20020a170902e80200b001b3de47fafcmr4277746plg.7.1686680130385; Tue, 13 Jun 2023 11:15:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686680130; cv=none; d=google.com; s=arc-20160816; b=Gowfm3XIue1xMWBvYAOp+46KwruQCACWY91KIQTBAtuthZ+h+Phk+onAofZTaAlP+S TwE3fvlOqkYwsOBl+T85yx0KWMM3Jvp1PG/lhKUfajxN9Xvevv+L12bHWjOAzhRhp0Sk jz2CaDtOpI1QaoYPho8FNQ4jcvG0lya1mifnEqTgBEc3gcNjf+c75CzcJWvFuO505Nrq 16g9lDkSUuwtCKLoLSnnBiab94MKwCWgiFGsrBAZJo1nrVLeWhdipP3BJOL044zNTFtG FVU7bxYguQmU8g0jtdPaniSw25aJCuFVl0I7C2kEhP58Y40rbnZcBOeHolqKh8CWzeed g4/w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:user-agent :content-transfer-encoding:references:in-reply-to:date:cc:to:from :subject:message-id:dkim-signature; bh=8xokuZhvpghp2Y7wi40uppzZc7OzhKtApAvlTTDGLIo=; b=MQQSAN7is4fyFHGy57VnicD/LPIdoJu2RBiMS/BKZ2w/ChA6UEP+CRzssKbLcJf3j0 TI8W3Rbhtyi5YtC1uehpGuMJTa3ARJt1QEA++C8dFj6rAK57beDHx+gHq+VgfpJqF4qn A7w2/qpbhYzcyCTh3KX8arjfuSqfnEyipjufJ5DGHc5EG7IfzMIpP0t6kWPDHJYB6YFv Nd+wazij9tqPLGz3IWx/wrc0rTfYso1KsmNCijGDoFGmFhiUuUsgMl38fhOFpekedz/N MlfYoapsbSg7KeKnlaHw18JHRjTXE5eFVkr8N0raQ3I7YbGeqrvA3AXKLdbGR/WM38ME vBWw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="d/8fjFgA"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id c17-20020a170902d49100b001b02658db01si9523658plg.580.2023.06.13.11.15.17; Tue, 13 Jun 2023 11:15:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="d/8fjFgA"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234951AbjFMRrn (ORCPT + 99 others); Tue, 13 Jun 2023 13:47:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41216 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229552AbjFMRrm (ORCPT ); Tue, 13 Jun 2023 13:47:42 -0400 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1B34013E for ; Tue, 13 Jun 2023 10:47:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1686678461; x=1718214461; h=message-id:subject:from:to:cc:date:in-reply-to: references:content-transfer-encoding:mime-version; bh=moJn4Wg/QkVVRa5xXN16iaIWdvxmWOJHmMLub2pzT1k=; b=d/8fjFgAHBqbLSPDgJa3IGcGGWFmam2pAoYHUcgZPI2qeXtmwezONOxv lQrJWoex/wj1w5De55QO//5ZesGHKOiy4wwnNwxBwyieG03GiSfVgSVxD vLsX/fWXKbDVAwIF8nzHZYkUqGE/vCS+L9qTmz4P6CWXbagMyOGmQcGTg q6Xn52cv6LYk43ZlZ59DaxAN7iaz/8DoioOmnjRdg97SgFQPe2LjtPpvz kb2TIO3nJjh2vArRAXZqMc1aJirpu2oK6GzRQtdZOgJ0k2c/OjZfP7poD etXaR3vxzB7iDrCrNxLeJ06iK1XNWJcSeRRjSeXyGcL0ghYnL304U7CfR g==; X-IronPort-AV: E=McAfee;i="6600,9927,10740"; a="355908277" X-IronPort-AV: E=Sophos;i="6.00,240,1681196400"; d="scan'208";a="355908277" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jun 2023 10:46:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10740"; a="711741019" X-IronPort-AV: E=Sophos;i="6.00,240,1681196400"; d="scan'208";a="711741019" Received: from srekam-mobl.amr.corp.intel.com (HELO [10.212.191.248]) ([10.212.191.248]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jun 2023 10:46:37 -0700 Message-ID: <321a474bfa562164a56f504144d6b33eb2f7acbd.camel@linux.intel.com> Subject: Re: [Patch v2 3/6] sched/fair: Implement prefer sibling imbalance calculation between asymmetric groups From: Tim Chen To: Peter Zijlstra Cc: Juri Lelli , Vincent Guittot , Ricardo Neri , "Ravi V . Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J . Wysocki" , Srinivas Pandruvada , Steven Rostedt , Valentin Schneider , Ionela Voinescu , x86@kernel.org, linux-kernel@vger.kernel.org, Shrikanth Hegde , Srikar Dronamraju , naveen.n.rao@linux.vnet.ibm.com, Yicong Yang , Barry Song , Chen Yu , Hillf Danton Date: Tue, 13 Jun 2023 10:46:36 -0700 In-Reply-To: <20230612120528.GL4253@hirez.programming.kicks-ass.net> References: <20230612120528.GL4253@hirez.programming.kicks-ass.net> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.44.4 (3.44.4-2.fc36) MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 2023-06-12 at 14:05 +0200, Peter Zijlstra wrote: > On Thu, Jun 08, 2023 at 03:32:29PM -0700, Tim Chen wrote: >=20 > >=20 > > + if (env->sd->flags & SD_ASYM_PACKING) { > > + int limit; > > + > > + if (!busiest->sum_nr_running) > > + goto out; >=20 > This seems out-of-place, shouldn't we have terminate sooner if busiest > is empty? Yes. Should move this check to the beginning. >=20 > > + > > + if (sched_asym_prefer(env->dst_cpu, sds->busiest->asym_prefer_cpu)) = { > > + /* Don't leave preferred core idle */ > > + if (imbalance =3D=3D 0 && local->sum_nr_running < ncores_local) > > + imbalance =3D 1; > > + goto out; > > + } > > + > > + /* Limit tasks moved from preferred group, don't leave cores idle */ > > + limit =3D busiest->sum_nr_running; > > + lsub_positive(&limit, ncores_busiest); > > + if (imbalance > limit) > > + imbalance =3D limit; >=20 > How does this affect the server parts that have larger than single core > turbo domains? Are you thinking about the case where the local group is completely empty so there's turbo headroom and we should move at least one task, even though CPU in busiest group has higher priority? In other words, are you suggesting we should add if (imbalance =3D=3D 0 && busiest->sum_nr_running > 0 && local->sum_nr_running =3D=3D 0) imbalance =3D 1; =09 >=20 > > + > > + goto out; > > + } > > + > > + /* Take advantage of resource in an empty sched group */ > > + if (imbalance =3D=3D 0 && local->sum_nr_running =3D=3D 0 && > > + busiest->sum_nr_running > 1) > > + imbalance =3D 1; > > +out: > > + return imbalance << 1; > > +} >=20 >=20 > But basically you have: >=20 > LcBn - BcLn > imb =3D ----------- > LcBc >=20 > Which makes sense, except you then return: >=20 > imb * 2 >=20 > which then made me wonder about rounding. >=20 > Do we want to to add (LcBc -1) or (LcBc/2) to resp. ceil() or round() > the thing before division? Because currently it uses floor(). >=20 > If you evaludate it like: >=20 >=20 > 2 * (LcBn - BcLn) > imb =3D ----------------- > LcBc >=20 > The result is different from what you have now. If I do the rounding after multiplying imb by two (call it imb_new), the difference with imb I am returning now (call it imb_old) will be at most 1. Note that imb_old returned is always a multiple of 2. I will be using imb in calculate_imbalance() and divide it by 2 there to get the number tasks to move from busiest group. So when there is a difference of 1 between imb_old and imb_new, the difference will be trimmed off after the division of 2. We will get the same number of tasks to move with either imb_old or imb_new in calculate_imbalance() so the two computations will arrive at the same result eventually. >=20 > What actual behaviour is desired in these low imbalance cases? and can > you write a comment as to why we do as we do etc..? I do not keep imb as=20 2 * (LcBn - BcLn) imb =3D ----------------- LcBc as it is easier to leave out the factor of 2 in the middle of sibling_imblalance() computation so I can directly interpret imb as the number of tasks to move, and add the factor of two when I actually need to return the imbalance. Would you like me to add this reasoning in the comments? Thanks. Tim =20