Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp5718428rwb; Mon, 5 Dec 2022 03:12:32 -0800 (PST) X-Google-Smtp-Source: AA0mqf7NWsNefekWRa/m9lQpRbbhufnFoKccvBJd9PojWQktdIStY6sc/3YUTGzkduQzO2KM1Veu X-Received: by 2002:a17:903:2487:b0:189:bdd4:1d67 with SMTP id p7-20020a170903248700b00189bdd41d67mr13842667plw.12.1670238752329; Mon, 05 Dec 2022 03:12:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1670238752; cv=none; d=google.com; s=arc-20160816; b=097ZXobCLfjcARTHv9o5XcVXxE4gSvJB/QNHulrSqszqmvKv83VEuKBPY/B50kiR7w ZCx3KmdgxgUINH+1+ml2wQgGcI3zaHrAcbu2Gw3Y5skAzRKadhIfkQWRhJ1wG1guwyxB vtSNcjXNUin8x0NF5kW2VVeXI78SN81KdXAhNwSR73lh3nMOmt5So+CzbIFd8bx+YwW5 guLAAhUMzC1lxYgmngwA6lljdcXSrsxY2vu+C69QPTRi/5hqndFa/m1yR3cblyCvVTEV Vrwl3gvXDYCT7SA3FKLV5AKGoi9jiIMJ/zXLj3PqWjXSxC9zt8HV7QYhUoVYG0bMhIfU nuuQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=50n9+I4ZxZRBd4Gzgxnr+F5t6glYKeTWrpBn2vhsZRE=; b=EqqwibTNLDZMKVPHDUsB0EjW4TUjopjiXZWzlwIzyop4u/VOgwpnvHak4c2bK3e9iW nLteq5w35eveJ5v7wCmhXMCHCPRW1R6+9B0w/p25og8oOX7hTmQWiZyLyBDvaetG0H4I A6db8WdEJ9uqP8uo96NHZ/msrwoSZ7bERld+iUAkOQgmE39Suo1/1CXlUKZsVzZOEMol xmQiHzhOOUPDBTuyraM1pbsLxNE36jwR5R1Ox3cYtFy8Ud2N6Sor17ZVltYOERIucV86 Gr0uAvhRJW0Wn5pTjVYVcaocrVZ0K7NG2/CUzaiAJD50lHSnJwflfSorSpAb/lDJJDV4 GMYA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@layalina-io.20210112.gappssmtp.com header.s=20210112 header.b=k6LIQd5I; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id pj1-20020a17090b4f4100b00218fa7c0396si15400469pjb.105.2022.12.05.03.12.21; Mon, 05 Dec 2022 03:12:32 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@layalina-io.20210112.gappssmtp.com header.s=20210112 header.b=k6LIQd5I; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230493AbiLELCI (ORCPT + 82 others); Mon, 5 Dec 2022 06:02:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56032 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230353AbiLELCF (ORCPT ); Mon, 5 Dec 2022 06:02:05 -0500 Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com [IPv6:2a00:1450:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CE700F034 for ; Mon, 5 Dec 2022 03:02:03 -0800 (PST) Received: by mail-wr1-x436.google.com with SMTP id y16so18006595wrm.2 for ; Mon, 05 Dec 2022 03:02:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=layalina-io.20210112.gappssmtp.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=50n9+I4ZxZRBd4Gzgxnr+F5t6glYKeTWrpBn2vhsZRE=; b=k6LIQd5Iev9CRSssx9gOdJxTMz6B/GodnspKI9plJT4mnElcDfYL/C6O2GzZSQG4uJ kZV8xkD0fGRN5PWuAqtrKW+FOq7Va24G6hgmtAmtVa9bzmEhHe2UqfjagkG7EC8jyNbN GPC+MaUgIo8FSovEfpkMbBit5cKQjAzi7v8EvH7Zk+5SG9xuMujJCzjSL3+vP6PDs+7x QC0DV55+7C6Nx5Fr7dFDrhGvgQw/4Xeqi6+MCNYCrGy1GrO9s5H6GRmHY6SiZmWd8jZ5 +Ge5s0T1GJTM4K4+uC29OEvM09ycL7f9KdrUq0Kak+zWvtd4oI0edjm+6k5Ym9e3GyQH xrdA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=50n9+I4ZxZRBd4Gzgxnr+F5t6glYKeTWrpBn2vhsZRE=; b=OmtkJCrPFmf8bUoPZKAaZj6KkFqrg3AwgCaIjHqnNZIt2qBCkAtfW4t04h9Uvt3r48 QsmTVERAWjUescEAR2Ewyt9tcVwgYy4GfSEAR9qkVMm+kUcyiwgxoqrJsxDngRUYZ8qK IgWkmtBVNVrGvownIna2oZwt/oBNesJ+FxOqNPXRSiRDrSekZVw2w3nqMAmIX7sha9Qq UicNTbhLULcW+YSN2nEyo0xWvyEQjYYztaGI52Nr7k9+FFkGmly+fDktvZrsTFiaOYWx eGa1qa8bCskhKhzANV4OWmzAktP+nhCV8Py0MZjdGgGSR8jyhVe8tSCkFWq0KJvR9lpg kFSw== X-Gm-Message-State: ANoB5plXC/6HL4bkhIGzQhhKRBPjTEN8PfnJAGkIRrdztXQ7Bpl9cuwp 3BvTZYKgYup1B0bPbqgSn+I6vw== X-Received: by 2002:adf:e305:0:b0:236:6089:cc5e with SMTP id b5-20020adfe305000000b002366089cc5emr40746437wrj.118.1670238122223; Mon, 05 Dec 2022 03:02:02 -0800 (PST) Received: from airbuntu (host86-130-134-87.range86-130.btcentralplus.com. [86.130.134.87]) by smtp.gmail.com with ESMTPSA id j11-20020a05600c190b00b003b47e75b401sm23742162wmq.37.2022.12.05.03.02.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Dec 2022 03:02:01 -0800 (PST) Date: Mon, 5 Dec 2022 11:01:59 +0000 From: Qais Yousef To: Vincent Guittot Cc: Ingo Molnar , Peter Zijlstra , Dietmar Eggemann , "Rafael J. Wysocki" , Viresh Kumar , linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, Lukasz Luba , Wei Wang , Xuewen Yan , Hank , Jonathan JMChen Subject: Re: [RFC PATCH 3/3] sched/fair: Traverse cpufreq policies to detect capacity inversion Message-ID: <20221205110159.nd5igwvsaj55jar7@airbuntu> References: <20221127141742.1644023-1-qyousef@layalina.io> <20221127141742.1644023-4-qyousef@layalina.io> <20221203143323.w32boxa6asqvvdnp@airbuntu> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 12/04/22 12:35, Vincent Guittot wrote: > On Sat, 3 Dec 2022 at 15:33, Qais Yousef wrote: > > > > On 12/02/22 15:57, Vincent Guittot wrote: > > > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > > > index 7c0dd57e562a..4bbbca85134b 100644 > > > > --- a/kernel/sched/fair.c > > > > +++ b/kernel/sched/fair.c > > > > @@ -8856,23 +8856,20 @@ static void update_cpu_capacity(struct sched_domain *sd, int cpu) > > > > * * Thermal pressure will impact all cpus in this perf domain > > > > * equally. > > > > */ > > > > - if (sched_energy_enabled()) { > > > > + if (static_branch_unlikely(&sched_asym_cpucapacity)) { > > > > unsigned long inv_cap = capacity_orig - thermal_load_avg(rq); > > > > - struct perf_domain *pd = rcu_dereference(rq->rd->pd); > > > > + struct cpufreq_policy *policy, __maybe_unused *policy_n; > > > > > > > > rq->cpu_capacity_inverted = 0; > > > > > > > > - SCHED_WARN_ON(!rcu_read_lock_held()); > > > > - > > > > - for (; pd; pd = pd->next) { > > > > - struct cpumask *pd_span = perf_domain_span(pd); > > > > + for_each_active_policy_safe(policy, policy_n) { > > > > > > So you are looping all cpufreq policy (and before the perf domain) in > > > the period load balance. That' really not something we should or want > > > to do > > > > Why is it not acceptable in the period load balance but acceptable in the hot > > wake up path in feec()? What's the difference? > > This patch loops on all cpufreq policy in sched softirq, how can this > be sane ? and not only in eas mode but also in the default asymmetric Hmm I'm still puzzled. Why it's not sane to do it here but it's okay to do it in the wake up path in feec()? AFAICT the number of cpufreq policies and perf domains have 1:1 mapping. In feec() we not only loop perf domains, but we go through each cpu of each domain to find max_spare_cap then compute_energy(). Which is more intensive. In worst case scenario where there's a perf domain for each cpu, then we'll loop through every CPU *and* compute energy its energy cost in the wake up path. Why it's deemed acceptable in wake up path which is more critical AFAIU but not period load balance which is expected to do more work? What am I missing? > performance one. > > This inverted detection doesn't look like the right way to fix your > problem IMO. That being said, i agree that I haven't made any other > proposal apart that I think that you should use a different rules for > task and for overutilized and part of your problem comes from this. We discussed this before; I need to revisit the thread but I can't see how overutilized different than task will fix the issue. They should be unified by design. I'm all ears if there's a simpler way to address the problem :-) The problem is that thermal pressure on big cpu is not important from uclamp perspective until it is in inversion state. It is quite common to have a system where the medium capacity is in 500 range. If the big is under thermal pressure that it drops to 800, then it is still a fitting CPU from uclamp perspective. Keep in mind uclamp_min is useful for tasks whose utilization is small So we need to be selective when thermal pressure is actually helping out or just creating unnecessary problems. The only other option I had in mind was to do the detection when we update the thermal_pressure in the topology code. But that didn't look better alternative to me. > > Then this make eas and util_fits_cpu even more Arm specific and I What is the Arm specific part about it? Why it wouldn't work on non-Arm systems? > still hope to merge sched_asym_cpucapacity and asym_packing a some > levels because they looks more and more similar but each side is > trying to add some SoC specific policy Oh, it seems Intel relies on asym_packing for their hybrid support approach? I think sched_asym_cpucapacity was designed to be generic. If I gathered correctly lack of support for SMT and inability to provide energy model outside of DT were some required extensions. To be honest, I personally think EAS can be useful on SMP systems and it would be nice to enable it outside of sched_asym_cpucapacity. I'm interested to hear more about this unification idea actually. If you feel a bit chatty to describe in more detail how do you see this being unified, that could be enlightening for some of us who work in this area :-) Thanks! -- Qais Yousef