Date: Sun, 24 Sep 2023 18:23:01 +0100
From: Qais Yousef
To: Vincent Guittot
Cc: "Rafael J. Wysocki", Viresh Kumar, Ingo Molnar, Peter Zijlstra,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	Dietmar Eggemann, Lukasz Luba
Subject: Re: [PATCH 2/4] sched: cpufreq: Fix apply_dvfs_headroom() escaping uclamp constraints
Message-ID: <20230924172301.7lqdcsnpqk7trtno@airbuntu>
References: <20230820210640.585311-3-qyousef@layalina.io>
 <20230829163740.uadhv2jfjuumqk3w@airbuntu>
 <20230907215555.exjxho34ntkjmn6r@airbuntu>
 <20230910174638.qe7jqq6mq36brh6o@airbuntu>
 <20230916192509.bportepj7dbgp6ro@airbuntu>

On 09/24/23 09:58, Vincent Guittot wrote:
> > Shouldn't it be (568+128)*1.25 = 870? Which is almost the 860 above. We
> > clamped the 812 to 800, and rounding errors almost account for the 10
> > points difference between 870 and 860.
>
> no, I voluntarily use 568 + 128*1.25. I added the dvfs headroom for irq
> just to make sure you wouldn't point out that I had removed the irq
> headroom, and to focus on the use case, but it might have created more
> confusion.
>
> My example above demonstrates that only taking care of the cases with
> null irq pressure is not enough; you can still end up above 800.
>
> IIUC your point with uclamp_max: it is a performance limit that you
> don't want to cross because of CFS. This means that we should not go
> above 800 in my example because of cfs utilization: irq needs 128 and
> CFS asks for 568, so the system needs 696, which is below the 800
> uclamp. Even if you add the dvfs headroom on irq, the system is still
> below 800. Only when you add the dvfs headroom to cfs do you go above
> 800, but that's not needed because uclamp says that you should not go

Yep, absolutely. It seems we agree that CFS shouldn't go above 800 if it is
capped, even if there's headroom; the question you have is about the way this
is being applied. As long as we agree on that part, which is the fundamental
behavior question I thought was the pain point, the implementation details
are certainly something I can improve on.

> above 800 because of CFS, so we should stay at 800, whereas both the
> current formula and your new formula return a value above 800

I'm not sure how to handle irq, rt and dl here, to be honest. They seem to
have been treated as an 'additional' demand on top of CFS. So yes, we'll go
above, but irq and dl have no knowledge of uclamp_max. RT does, and will be
capped just like CFS.

I kept the current behavior the same, but I did wonder about them too in
patch 4. So in a system where CFS, RT, DL and IRQ are all active and both CFS
and RT have a cap of 800, they won't ask for more. But once we add IRQ and DL
on top, we'll go above. You think we shouldn't?
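To put the numbers from the example side by side, here is a small stand-alone
sketch; the 1.25 factor and the helper below are illustrative assumptions, not
the actual kernel helpers:

	#include <stdio.h>

	/* Simplified ~1.25x DVFS headroom, for illustration only. */
	static unsigned long headroom(unsigned long util)
	{
		return util + (util >> 2);
	}

	int main(void)
	{
		unsigned long cfs = 568, irq = 128, uclamp_max = 800;

		/* current formula: headroom applied to the whole sum */
		printf("headroom(cfs + irq) = %lu\n", headroom(cfs + irq));	/* 870 > 800 */

		/* headroom on irq only, CFS left at its capped request */
		printf("cfs + headroom(irq) = %lu\n", cfs + headroom(irq));	/* 728 < 800 */

		printf("uclamp_max          = %lu\n", uclamp_max);
		return 0;
	}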
See below for a suggestion.

> > I am still not sure if you mean we are mixing up the code and we need
> > better abstraction or something else.
> >
> > Besides the abstraction problem, which I agree with, I can't see what I am
> > mixing up yet :( Sorry, I think I need more of a helping hand to see it.
>
> There is a mix between actual utilization and performance limit, and
> when we add both we lose important information, as highlighted by
> my example. If the current formula is not correct because we can go
> above the uclamp_max value, your proposal is not better. And the root
> cause is mainly that we are adding utilization to a performance limit
> (i.e. uclamp)
>
> That's why I said that we need a new interface to enable cpufreq to
> not blindly apply its headroom but to make smarter decisions at the
> cpufreq level

Okay, I see. I tend to agree here too.

The question is: should cpufreq take each util (cfs, rt, dl, irq) as input and
make the decision on its own, or should the scheduler add them up and pass the
aggregated value? If the latter, how can cpufreq know how to apply the limit?
From what I see, all these decisions have to happen in the same function, not
be split up.

It seems the sticking point is how we interpret irq pressure with uclamp. It
seems you think we should apply any uclamp capping to it as well, which I
think would make sense. And DL bandwidth we need to max() with the aggregated
value. So I think the formula could be

	util = cfs + rt pressure + irq pressure

	unsigned long cpufreq_convert_util_to_freq(rq, util, dl_bw)
	{
		eff_util = apply_dvfs_headroom(util);
		eff_util = uclamp_rq_util_with(rq, eff_util, NULL);
		eff_util = max(eff_util, dl_bw);
	}

so we add the utilization of cfs, rt and irq (as per the current formula), and
then let cpufreq do the headroom and limit management.

I changed the way we handle dl_bw, as it is actually a request to run at
a specific level and not really a pressure, so we max() it with eff_util. If
there's a DL task on the rq then it'd be running, and the frequency it needs
is defined by its bandwidth.

We could also keep it as it is with

	unsigned long cpufreq_convert_util_to_freq(rq, util, dl_bw)
	{
		eff_util = apply_dvfs_headroom(util);
		eff_util = uclamp_rq_util_with(rq, eff_util, NULL);
		eff_util += dl_bw;
	}

RT has uclamp knowledge, so it'll either run at max or at whatever value it
might have requested via uclamp_min. But DL doesn't set any uclamp_min and
must be either added or max()ed. I'm not sure which is more correct yet, but
maybe adding is actually better, to ensure the CPU runs high enough to handle
all the tasks on the rq.

What do you think?


Thanks!

--
Qais Yousef
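For what it's worth, a small stand-alone sketch of how the two variants above
differ; the helpers are simplified stand-ins (only the uclamp_max side is
modelled) and the numbers are assumed, so this is not the actual
scheduler/cpufreq code:

	#include <stdio.h>

	#define UCLAMP_MAX	800UL	/* assumed rq uclamp_max */

	/* Simplified ~1.25x headroom, standing in for apply_dvfs_headroom(). */
	static unsigned long dvfs_headroom(unsigned long util)
	{
		return util + (util >> 2);
	}

	/* Stand-in for uclamp_rq_util_with(); only models the uclamp_max cap. */
	static unsigned long uclamp_cap(unsigned long util)
	{
		return util > UCLAMP_MAX ? UCLAMP_MAX : util;
	}

	/* Variant 1: max() the capped, headroomed util with the DL bandwidth. */
	static unsigned long variant_max(unsigned long util, unsigned long dl_bw)
	{
		unsigned long eff_util = uclamp_cap(dvfs_headroom(util));

		return eff_util > dl_bw ? eff_util : dl_bw;
	}

	/* Variant 2: add the DL bandwidth on top instead. */
	static unsigned long variant_add(unsigned long util, unsigned long dl_bw)
	{
		return uclamp_cap(dvfs_headroom(util)) + dl_bw;
	}

	int main(void)
	{
		unsigned long util = 568 + 128;	/* cfs + irq pressure, rt = 0 */
		unsigned long dl_bw = 200;	/* assumed DL bandwidth */

		/* variant_max() stays at the 800 cap; variant_add() goes to 1000. */
		printf("max(): %lu, add: %lu\n",
		       variant_max(util, dl_bw), variant_add(util, dl_bw));
		return 0;
	}

With these numbers, max() keeps the request at the uclamp cap while adding
pushes it above it; which behavior is wanted is exactly the open question
discussed above.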