From: Qais Yousef
To: Ingo Molnar, "Peter Zijlstra (Intel)", Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, Xuewen Yan, Lukasz Luba, Wei Wang, Jonathan JMChen, Hank, Qais Yousef
Subject: [PATCH v2 8/9] sched/fair: Detect capacity inversion
Date: Thu, 4 Aug 2022 15:36:08 +0100
Message-Id: <20220804143609.515789-9-qais.yousef@arm.com>
In-Reply-To: <20220804143609.515789-1-qais.yousef@arm.com>
References: <20220804143609.515789-1-qais.yousef@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Check each performance domain to see if thermal pressure is causing its
capacity to be lower than that of another performance domain.

We assume that each performance domain has CPUs with the same
capacities, which is similar to an assumption made in energy_model.c.
We also assume that thermal pressure impacts all CPUs in a performance
domain equally.

If there are multiple performance domains with the same capacity_orig,
we will trigger a capacity inversion if the domain is under thermal
pressure.

The new cpu_in_capacity_inversion() should help users know when
information about capacity_orig is not reliable, so they can opt in to
using the inverted capacity as the 'actual' capacity_orig.
Signed-off-by: Qais Yousef
---
 kernel/sched/fair.c  | 63 +++++++++++++++++++++++++++++++++++++++++---
 kernel/sched/sched.h | 19 +++++++++++++
 2 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 59ba7106ddc6..cb32dc9a057f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8659,16 +8659,73 @@ static unsigned long scale_rt_capacity(int cpu)
 
 static void update_cpu_capacity(struct sched_domain *sd, int cpu)
 {
+	unsigned long capacity_orig = arch_scale_cpu_capacity(cpu);
 	unsigned long capacity = scale_rt_capacity(cpu);
 	struct sched_group *sdg = sd->groups;
+	struct rq *rq = cpu_rq(cpu);
 
-	cpu_rq(cpu)->cpu_capacity_orig = arch_scale_cpu_capacity(cpu);
+	rq->cpu_capacity_orig = capacity_orig;
 
 	if (!capacity)
 		capacity = 1;
 
-	cpu_rq(cpu)->cpu_capacity = capacity;
-	trace_sched_cpu_capacity_tp(cpu_rq(cpu));
+	rq->cpu_capacity = capacity;
+
+	/*
+	 * Detect if the performance domain is in capacity inversion state.
+	 *
+	 * Capacity inversion happens when another perf domain with equal or
+	 * lower capacity_orig_of() ends up having higher capacity than this
+	 * domain after subtracting thermal pressure.
+	 *
+	 * We only take into account thermal pressure in this detection as
+	 * it's the only metric that actually results in *real* reduction of
+	 * capacity due to performance points (OPPs) being dropped/becoming
+	 * unreachable due to thermal throttling.
+	 *
+	 * We assume:
+	 *   * That all cpus in a perf domain have the same capacity_orig
+	 *     (same uArch).
+	 *   * Thermal pressure will impact all cpus in this perf domain
+	 *     equally.
+	 */
+	if (static_branch_unlikely(&sched_asym_cpucapacity)) {
+		unsigned long inv_cap = capacity_orig - thermal_load_avg(rq);
+		struct perf_domain *pd = rcu_dereference(rq->rd->pd);
+
+		rq->cpu_capacity_inverted = 0;
+
+		for (; pd; pd = pd->next) {
+			struct cpumask *pd_span = perf_domain_span(pd);
+			unsigned long pd_cap_orig, pd_cap;
+
+			cpu = cpumask_any(pd_span);
+			pd_cap_orig = arch_scale_cpu_capacity(cpu);
+
+			if (capacity_orig < pd_cap_orig)
+				continue;
+
+			/*
+			 * Handle the case where multiple perf domains have
+			 * the same capacity_orig but one of them is under
+			 * higher thermal pressure. We record it as capacity
+			 * inversion.
+			 */
+			if (capacity_orig == pd_cap_orig) {
+				pd_cap = pd_cap_orig - thermal_load_avg(cpu_rq(cpu));
+
+				if (pd_cap > inv_cap) {
+					rq->cpu_capacity_inverted = inv_cap;
+					break;
+				}
+			} else if (pd_cap_orig > inv_cap) {
+				rq->cpu_capacity_inverted = inv_cap;
+				break;
+			}
+		}
+	}
+
+	trace_sched_cpu_capacity_tp(rq);
 
 	sdg->sgc->capacity = capacity;
 	sdg->sgc->min_capacity = capacity;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index caf017f7def6..541a70fa55b3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1033,6 +1033,7 @@ struct rq {
 
 	unsigned long		cpu_capacity;
 	unsigned long		cpu_capacity_orig;
+	unsigned long		cpu_capacity_inverted;
 
 	struct callback_head	*balance_callback;
 
@@ -2865,6 +2866,24 @@ static inline unsigned long capacity_orig_of(int cpu)
 	return cpu_rq(cpu)->cpu_capacity_orig;
 }
 
+/*
+ * Returns the inverted capacity if the CPU is in capacity inversion state,
+ * 0 otherwise.
+ *
+ * Capacity inversion detection only considers thermal impact where actual
+ * performance points (OPPs) get dropped.
+ *
+ * Capacity inversion state happens when another performance domain that has
+ * equal or lower capacity_orig_of() becomes effectively larger than the perf
+ * domain this CPU belongs to due to thermal pressure throttling it hard.
+ *
+ * See comment in update_cpu_capacity().
+ */
+static inline unsigned long cpu_in_capacity_inversion(int cpu)
+{
+	return cpu_rq(cpu)->cpu_capacity_inverted;
+}
+
 /**
  * enum cpu_util_type - CPU utilization type
  * @FREQUENCY_UTIL:	Utilization used to select frequency
-- 
2.25.1