Date: Tue, 20 Dec 2022 11:51:30 +0000
From: Qais Yousef
To: Lukasz Luba
Cc: Ingo Molnar, Peter Zijlstra, Dietmar Eggemann, "Rafael J. Wysocki",
 Viresh Kumar, linux-pm@vger.kernel.org, Vincent Guittot,
 linux-kernel@vger.kernel.org, Wei Wang, Xuewen Yan, Hank, Jonathan JMChen
Subject: Re: [RFC PATCH 3/3] sched/fair: Traverse cpufreq policies to detect capacity inversion
Message-ID: <20221220115130.lhhakj36kn3opqtz@airbuntu>
References: <20221127141742.1644023-4-qyousef@layalina.io>
 <20221203143323.w32boxa6asqvvdnp@airbuntu>
 <20221205110159.nd5igwvsaj55jar7@airbuntu>
 <20221208140526.vvmjxlz6akgqyoma@airbuntu>
 <20221209164739.GA24368@vingu-book>
 <20221212184317.sntxy3h6k44oz4mo@airbuntu>
 <19bd3f60-63ea-4ccc-b5a2-6507276c8f0d@arm.com>
In-Reply-To: <19bd3f60-63ea-4ccc-b5a2-6507276c8f0d@arm.com>

On 12/13/22 17:42, Lukasz Luba wrote:
> Hi Qais,
>
> I thought I could help with this issue.

Thanks Lukasz!

> On 12/12/22 18:43, Qais Yousef wrote:
> > On 12/09/22 17:47, Vincent Guittot wrote:
>
> [...]
>
> > > > > > > This patch loops over all cpufreq policies in sched softirq; how can this
> > > > > > > be sane? And not only in EAS mode but also in the default asymmetric
> > > > > >
> > > > > > Hmm, I'm still puzzled. Why is it not sane to do it here, but okay to do it
> > > > > > in the wakeup path in feec()?
> > > > >
> > > > > feec() should be considered an exception, not the default rule.
> > > > > Things like the above, which run for_each loops on an external subsystem,
> > > > > should be prevented, and the fact that feec() loops over all PDs doesn't
> > > > > mean we can put that everywhere else.
> > > >
> > > > Fair enough.
> > > > But really understanding the root cause behind this limitation
> > > > will be very helpful. I don't have the same appreciation of why this is
> > > > a problem, and shedding more light will help me think more about it in
> > > > the future.
> > >
> > > Take the example of 1k cores with per-cpu policy. Do you really think a
> > > for_each_cpufreq_policy would be reasonable?
> >
> > Hmm, I don't think such an HMP system makes sense to ever exist.
> >
> > That system would have to be a multi-socket system, and I doubt inversion
> > detection is something of value there.
> >
> > Point taken anyway. Let's find another way to do this.
>
> Another way might be to use the 'update' code path, which sets this
> information source, for the thermal pressure. That code path isn't as
> hot as this one in the task scheduler. Furthermore, we would also
> have time there to handle CPU hotplug callbacks properly.
>
> So something like this is what I have in mind:
>
> ------------------------------8<-----------------------------
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index e7d6e6657ffa..7f372a93e21b 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -16,6 +16,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -153,6 +154,33 @@ void topology_set_freq_scale(const struct cpumask *cpus, unsigned long cur_freq,
>  DEFINE_PER_CPU(unsigned long, cpu_scale) = SCHED_CAPACITY_SCALE;
>  EXPORT_PER_CPU_SYMBOL_GPL(cpu_scale);
>
> +static struct cpumask highest_capacity_mask;
> +static unsigned int max_possible_capacity;
> +static DEFINE_MUTEX(max_capacity_lock);
> +
> +static void max_capacity_update(const struct cpumask *cpus,
> +				unsigned long capacity)
> +{
> +	mutex_lock(&max_capacity_lock);
> +
> +	if (max_possible_capacity < capacity) {
> +		max_possible_capacity = capacity;
> +
> +		cpumask_clear(&highest_capacity_mask);
> +
> +		cpumask_or(&highest_capacity_mask,
> +			   &highest_capacity_mask, cpus);
> +	}
> +
> +	mutex_unlock(&max_capacity_lock);
> +}
> +
> +bool topology_test_max_cpu_capacity(unsigned int cpu)
> +{
> +	return cpumask_test_cpu(cpu, &highest_capacity_mask);
> +}
> +EXPORT_SYMBOL_GPL(topology_test_max_cpu_capacity);
> +
>  void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity)
>  {
>  	per_cpu(cpu_scale, cpu) = capacity;
> @@ -203,6 +231,8 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
>
>  	for_each_cpu(cpu, cpus)
>  		WRITE_ONCE(per_cpu(thermal_pressure, cpu), th_pressure);
> +
> +	max_capacity_update(cpus, capacity);
>  }
>  EXPORT_SYMBOL_GPL(topology_update_thermal_pressure);
>
> --------------------------->8--------------------------------
>
> We could use RCU if there is a potential to read racy data
> while the updater modifies the mask in the meantime. The mutex is there to
> serialize the thermal writers, which might be kicked for two
> policies at the same time.
>
> If you like, I can develop and test such code in arch_topology.c.

As we discussed offline, Vincent is keen on decoupling the util_fits_cpu()
logic from HMP - which means I need to reword this differently.

Let's keep this on the back burner in case we need to revisit it again.

Appreciate the proposal!

Many thanks

--
Qais Yousef
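As an aside, the update-and-test scheme in the snippet quoted above can be sketched in plain userspace C. This is only an illustrative analogue, not the kernel code: a pthread mutex and a plain bitmask stand in for the kernel's mutex and struct cpumask, and the function names merely mirror the proposal.

```c
#include <pthread.h>
#include <stdbool.h>

/* One bit per CPU stands in for struct cpumask. */
static unsigned long highest_capacity_mask;
static unsigned long max_possible_capacity;
static pthread_mutex_t max_capacity_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Analogue of max_capacity_update(): when a strictly higher capacity is
 * reported, the mask is reset to the reporting CPUs; reports of equal or
 * lower capacity leave it untouched. The mutex serializes concurrent
 * writers, e.g. thermal updates kicked for two policies at the same time.
 */
static void max_capacity_update(unsigned long cpus, unsigned long capacity)
{
	pthread_mutex_lock(&max_capacity_lock);
	if (max_possible_capacity < capacity) {
		max_possible_capacity = capacity;
		/* cpumask_clear() followed by cpumask_or() into an empty
		 * mask collapses to a plain copy. */
		highest_capacity_mask = cpus;
	}
	pthread_mutex_unlock(&max_capacity_lock);
}

/* Analogue of topology_test_max_cpu_capacity(). */
static bool test_max_cpu_capacity(unsigned int cpu)
{
	return highest_capacity_mask & (1UL << cpu);
}
```

For example, after reporting capacity 512 for CPUs 0-3 (mask 0x0f) and then 1024 for CPUs 4-7 (mask 0xf0), only CPUs 4-7 would test positive. The readers here are unsynchronized, which is exactly the "racy data" concern raised above that RCU would address.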