Date: Mon, 23 Oct 2023 11:04:49 -0400
Subject: Re: [RFC PATCH v2 1/2] sched/fair: Introduce UTIL_FITS_CAPACITY feature (v2)
From: Mathieu Desnoyers
To: Dietmar Eggemann, Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Valentin Schneider,
    Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
    Vincent Guittot, Juri Lelli, Swapnil Sapkal, Aaron Lu, Chen Yu,
    Tim Chen, K Prateek Nayak, "Gautham R . Shenoy", x86@kernel.org
References: <20231019160523.1582101-1-mathieu.desnoyers@efficios.com>
    <20231019160523.1582101-2-mathieu.desnoyers@efficios.com>

On 2023-10-23 10:11, Dietmar Eggemann wrote:
> On 19/10/2023 18:05, Mathieu Desnoyers wrote:

[...]

>>
>> +static unsigned long scale_rt_capacity(int cpu);
>> +
>> +/*
>> + * Returns true if adding the task utilization to the estimated
>> + * utilization of the runnable tasks on @cpu does not exceed the
>> + * capacity of @cpu.
>> + *
>> + * This considers only the utilization of _runnable_ tasks on the @cpu
>> + * runqueue, excluding blocked and sleeping tasks. This is achieved by
>> + * using the runqueue util_est.enqueued.
>> + */
>> +static inline bool task_fits_remaining_cpu_capacity(unsigned long task_util,
>> +                                                    int cpu)
>
> This is almost like the existing task_fits_cpu(p, cpu) (used in Capacity
> Aware Scheduling (CAS) for Asymmetric CPU capacity systems) except the
> latter only uses `util = task_util_est(p)` and deals with uclamp as well
> and only tests whether p could fit on the CPU.

This is indeed a major difference between how asym capacity check works
and what is introduced here: asym capacity check only checks whether the
given task theoretically fits in the cpu if that cpu was completely
idle, without considering the current cpu utilization.

My approach is to consider the current util_est of the cpu to check
whether the task fits in the remaining capacity.

I did not want to use the existing task_fits_cpu() helper because the
notions of uclamp bounds appear to be heavily tied to the fact that it
checks whether the task fits in an _idle_ runqueue, whereas the check I
am introducing here is much more restrictive: it checks that the task
fits on the runqueue within the remaining capacity.

>
> Or like find_energy_efficient_cpu() (feec(), used in
> Energy-Aware-Scheduling (EAS)) which uses cpu_util(cpu, p, cpu, 0) to get:
>
>   max(util_avg(CPU + p), util_est(CPU + p))

I've tried using cpu_util(), but unfortunately anything that considers
blocked/sleeping tasks in its utilization total does not work for my
use-case.

From cpu_util():

  * CPU utilization is the sum of running time of runnable tasks plus the
  * recent utilization of currently non-runnable tasks on that CPU.

>
> feec()
>     ...
>     for (; pd; pd = pd->next)
>         ...
>         util = cpu_util(cpu, p, cpu, 0);
>         ...
>         fits = util_fits_cpu(util, util_min, util_max, cpu)
>                                    ^^^^^^^^^^^^^^^^^^
>                                    not used when uclamp is not active (1)
>             ...
>             capacity = capacity_of(cpu)
>             fits = fits_capacity(util, capacity)
>             if (!uclamp_is_used())      (1)
>                 return fits
>
> So not introducing new functions like task_fits_remaining_cpu_capacity()
> in this area and using existing one would be good.

If the notion of uclamp is not tied to the way asym capacity check is
done against a theoretically idle runqueue, I'd be OK with using this,
but so far both appear to be very much tied.

When I stumbled on this fundamental difference between asym cpu capacity
check and the check introduced here, I've started wondering whether the
asym cpu capacity check would benefit from considering the target cpu
current utilization as well.

>
>> +{
>> +        unsigned long total_util;
>> +
>> +        if (!sched_util_fits_capacity_active())
>> +                return false;
>> +        total_util = READ_ONCE(cpu_rq(cpu)->cfs.avg.util_est.enqueued) + task_util;
>> +        return fits_capacity(total_util, scale_rt_capacity(cpu));
>
> Why not use:
>
> static unsigned long capacity_of(int cpu)
>     return cpu_rq(cpu)->cpu_capacity;
>
> which is maintained in update_cpu_capacity() as scale_rt_capacity(cpu)?

The reason for preferring scale_rt_capacity(cpu) over capacity_of(cpu)
is that update_cpu_capacity() only runs periodically every
balance-interval, therefore providing a coarse-grained remaining
capacity approximation with respect to irq, rt, dl, and thermal
utilization.

If it turns out that being coarse-grained is good enough, we may be able
to save some cycles by using capacity_of(), but not without carefully
considering the impacts of being imprecise.

>
> [...]
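
To make the difference between the two checks concrete, here is a rough
userspace sketch (not kernel code). The helper names fits_idle_cpu() and
fits_remaining_capacity() are made up for illustration, and the capacity
and util_est values are passed in as plain parameters instead of being
read from the runqueue. Only the margin in fits_capacity() mirrors the
kernel macro (util * 1280 < capacity * 1024, i.e. roughly 20% headroom).

```c
#include <stdbool.h>

/*
 * Same headroom rule as the kernel's fits_capacity() macro:
 * util fits if util * 1280 < capacity * 1024.
 */
static bool fits_capacity(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/*
 * Hypothetical helper: asym-style check. Would the task fit if the
 * CPU were completely idle? Current CPU utilization is ignored.
 */
static bool fits_idle_cpu(unsigned long task_util, unsigned long capacity)
{
	return fits_capacity(task_util, capacity);
}

/*
 * Hypothetical helper: check discussed in this thread. Does the task
 * fit next to the utilization of the tasks that are already runnable?
 * enqueued_util_est stands in for cfs.avg.util_est.enqueued.
 */
static bool fits_remaining_capacity(unsigned long task_util,
				    unsigned long enqueued_util_est,
				    unsigned long capacity)
{
	return fits_capacity(enqueued_util_est + task_util, capacity);
}
```

With a capacity of 1024 and a task utilization of 300, the idle check
passes regardless of load, whereas the remaining-capacity check passes
with 400 already enqueued and fails with 600 already enqueued.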
>
>> @@ -7173,7 +7200,8 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>>          if (recent_used_cpu != prev &&
>>              recent_used_cpu != target &&
>>              cpus_share_cache(recent_used_cpu, target) &&
>> -            (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
>> +            (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu) ||
>> +             task_fits_remaining_cpu_capacity(task_util, recent_used_cpu)) &&
>>              cpumask_test_cpu(recent_used_cpu, p->cpus_ptr) &&
>>              asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) {
>>                  return recent_used_cpu;
>> diff --git a/kernel/sched/features.h b/kernel/sched/features.h
>> index ee7f23c76bd3..9a84a1401123 100644
>> --- a/kernel/sched/features.h
>> +++ b/kernel/sched/features.h
>> @@ -97,6 +97,12 @@ SCHED_FEAT(WA_BIAS, true)
>>  SCHED_FEAT(UTIL_EST, true)
>>  SCHED_FEAT(UTIL_EST_FASTUP, true)
>
> IMHO, asymmetric CPU capacity systems would have to disable the sched
> feature UTIL_FITS_CAPACITY. Otherwise CAS could deliver different
> results. task_fits_remaining_cpu_capacity() and asym_fits_cpu() work
> slightly different.

I don't think they should be mutually exclusive. We should look into the
differences between those two more closely to make them work nicely
together instead. For instance, why does asym capacity only consider
whether tasks fit in a theoretically idle runqueue, when it could use
the current utilization of the runqueue to check that the task fits in
the remaining capacity?

Unfortunately I don't have a machine with asym cpu to test locally.

Thanks for your feedback!

Mathieu

>
> [...]
>

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com