Date: Tue, 12 Dec 2023 10:47:19 +0000
Subject: Re: [PATCH 1/4] sched/fair: Be less aggressive in calling cpufreq_update_util()
To: Qais Yousef, Ingo Molnar, Peter
    Zijlstra, "Rafael J. Wysocki", Viresh Kumar, Vincent Guittot,
    Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Lukasz Luba,
    Wei Wang, Rick Yiu, Chung-Kai Mei
References: <20231208015242.385103-1-qyousef@layalina.io>
    <20231208015242.385103-2-qyousef@layalina.io>
From: Hongyan Xia
In-Reply-To: <20231208015242.385103-2-qyousef@layalina.io>
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/12/2023 01:52, Qais Yousef wrote:
> Due to the way code is structured, it makes a lot of sense to trigger
> cpufreq_update_util() from update_load_avg(). But this is too aggressive
> as in most cases we are iterating through entities in a loop to
> update_load_avg() in the hierarchy. So we end up sending too many
> request in an loop as we're updating the hierarchy.

Do you mean the for_each_sched_entity(se) loop? I think we update CPU
frequency only once at the root CFS?

> Combine this with the rate limit in schedutil, we could end up
> prematurely send up a wrong frequency update before we have actually
> updated all entities appropriately.
>
> Be smarter about it by limiting the trigger to perform frequency updates
> after all accounting logic has done. This ended up being in the
> following points:
>
> 	1. enqueue/dequeue_task_fair()
> 	2. throttle/unthrottle_cfs_rq()
> 	3. attach/detach_task_cfs_rq()
> 	4. task_tick_fair()
> 	5. __sched_group_set_shares()
>
> This is not 100% ideal still due to other limitations that might be
> a bit harder to handle. Namely we can end up with premature update
> request in the following situations:
>
> 	a. Simultaneous task enqueue on the CPU where 2nd task is bigger and
> 	   requires higher freq. The trigger to cpufreq_update_util() by the
> 	   first task will lead to dropping the 2nd request until tick. Or
> 	   another CPU in the same policy trigger a freq update.
>
> 	b. CPUs sharing a policy can end up with the same race in a but the
> 	   simultaneous enqueue happens on different CPUs in the same policy.
>
> The above though are limitations in the governor/hardware, and from
> scheduler point of view at least that's the best we can do. The
> governor might consider smarter logic to aggregate near simultaneous
> request and honour the higher one.
>
> Signed-off-by: Qais Yousef (Google)
> ---
>  kernel/sched/fair.c | 55 ++++++++++++---------------------------------
>  1 file changed, 14 insertions(+), 41 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b83448be3f79..f99910fc6705 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3997,29 +3997,6 @@ static inline void update_cfs_group(struct sched_entity *se)
>  }
>  #endif /* CONFIG_FAIR_GROUP_SCHED */
>
> -static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)
> -{
> -	struct rq *rq = rq_of(cfs_rq);
> -
> -	if (&rq->cfs == cfs_rq) {

Here. I think this restricts frequency updates to the root CFS?

> -		/*
> -		 * There are a few boundary cases this might miss but it should
> -		 * get called often enough that that should (hopefully) not be
> -		 * a real problem.
> -		 *
> -		 * It will not get called when we go idle, because the idle
> -		 * thread is a different class (!fair), nor will the utilization
> -		 * number include things like RT tasks.
> -		 *
> -		 * As is, the util number is not freq-invariant (we'd have to
> -		 * implement arch_scale_freq_capacity() for that).
> -		 *
> -		 * See cpu_util_cfs().
> -		 */
> -		cpufreq_update_util(rq, flags);
> -	}
> -}
> -
>  #ifdef CONFIG_SMP
>  static inline bool load_avg_is_decayed(struct sched_avg *sa)
>  {
> [...]
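
To make the root-CFS question concrete, here is a minimal userspace sketch of what I mean. It is not kernel code: the structs are cut-down stand-ins, and model_cfs_rq_util_change() and model_count_freq_updates() are hypothetical names made up for this illustration. It only assumes what the quoted hunk shows, namely that the update is requested when &rq->cfs == cfs_rq, and that the hierarchy walk follows se->parent the way for_each_sched_entity() does.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real definitions. */
struct rq;

struct cfs_rq {
	struct rq *rq;			/* runqueue this cfs_rq belongs to */
};

struct rq {
	struct cfs_rq cfs;		/* root cfs_rq, embedded in the rq */
	int freq_updates;		/* counts modelled cpufreq_update_util() calls */
};

struct sched_entity {
	struct sched_entity *parent;	/* NULL at the top of the hierarchy */
	struct cfs_rq *cfs_rq;		/* cfs_rq this entity is queued on */
};

/* Model of the removed helper: only the root cfs_rq requests an update. */
static void model_cfs_rq_util_change(struct cfs_rq *cfs_rq)
{
	struct rq *rq = cfs_rq->rq;

	if (&rq->cfs == cfs_rq)
		rq->freq_updates++;
}

#define MODEL_MAX_DEPTH 8

/*
 * Build a chain of `depth` entities, walk it from the leaf upwards and
 * return how many frequency updates were requested during the walk.
 */
int model_count_freq_updates(int depth)
{
	struct rq rq = { .freq_updates = 0 };
	struct cfs_rq group_rq[MODEL_MAX_DEPTH];
	struct sched_entity se[MODEL_MAX_DEPTH];
	struct sched_entity *pos;
	int i;

	assert(depth >= 1 && depth <= MODEL_MAX_DEPTH);
	rq.cfs.rq = &rq;

	/* se[0] is the leaf; se[depth - 1] sits on the root cfs_rq. */
	for (i = 0; i < depth; i++) {
		group_rq[i].rq = &rq;
		se[i].parent = (i + 1 < depth) ? &se[i + 1] : NULL;
		se[i].cfs_rq = (i + 1 < depth) ? &group_rq[i] : &rq.cfs;
	}

	/* Same shape as for_each_sched_entity(se): follow ->parent. */
	for (pos = &se[0]; pos; pos = pos->parent)
		model_cfs_rq_util_change(pos->cfs_rq);

	return rq.freq_updates;
}
```

In this model, calling model_count_freq_updates() with any depth from 1 to MODEL_MAX_DEPTH requests exactly one update per walk, because only the last entity in the chain is queued on the root cfs_rq. It says nothing about other call sites or about the schedutil rate limit.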