Date: Tue, 19 Mar 2024 17:05:26 +0000
From: Hongyan Xia
Subject: Re: [RFC PATCH v2 1/7] Revert "sched/uclamp: Set max_spare_cap_cpu even if max_spare_cap is 0"
To: Dietmar Eggemann, Ingo Molnar, Peter Zijlstra, Vincent Guittot,
    Juri Lelli, Steven Rostedt, Ben Segall, Mel Gorman,
    Daniel Bristot de Oliveira, Valentin Schneider
Cc: Qais Yousef, Morten Rasmussen, Lukasz Luba, Christian Loehle,
    linux-kernel@vger.kernel.org, David Dai, Saravana Kannan
X-Mailing-List: linux-kernel@vger.kernel.org
In-Reply-To: <37be0494-7e38-4275-b6eb-62a2eb2f6d46@arm.com>

On 19/03/2024 15:34, Dietmar Eggemann wrote:
> On 01/02/2024 14:11, Hongyan Xia wrote:
>> From: Hongyan Xia
>>
>> That commit creates further problems because 0 spare capacity can be
>> either a real indication that the CPU is maxed out, or the CPU is
>> UCLAMP_MAX throttled, but we end up giving all of them a chance which
>> can result in bogus energy calculations. It also tends to schedule
>> tasks on the same CPU and requires load balancing patches. Sum
>> aggregation solves these problems and this patch is not needed.
>>
>> This reverts commit 6b00a40147653c8ea748e8f4396510f252763364.
>
> I assume you did this revert especially for the 'Scenario 5: 8 tasks
> with UCLAMP_MAX of 120' testcase?

More or less. Actually you can already see the problem in Scenario 1.
Ideally the 4 uclamp_max tasks should be evenly distributed on 4 little
CPUs, but from time to time task placement puts more than 1 such task
on the same CPU, leaving some other little CPUs unoccupied.

> IMHO, the issue is especially visible in compute_energy()'s busy_time
> computation with a valid destination CPU (dst_cpu >= 0). I.e. when we
> have to add performance domain (pd) and task busy time.
>
> find_energy_efficient_cpu() (feec())
>
>   for each pd
>     for each cpu in pd
>
>       set {prev_,max}_spare_cap
>
>     bail if prev_ and max_spare_cap < 0 (was == 0 before)
>
>     {base_,prev_,cur_}energy = compute_energy
>
> So with the patch we potentially compute energy for a saturated PD
> according to:
>
> compute_energy()
>
>   if (dst_cpu >= 0)
>     busy_time = min(eenv->pd_cap, eenv->busy_time + eenv->task_busy_time)
>                     <----(a)--->  <--------------(b)------------------->
>
>   energy = em_cpu_energy(pd->em_pd, max_util, busy_time, eenv->cpu_cap)
>
> If (b) > (a) then we're saturated and 'energy' is bogus.

Yeah, I think what's happening is that placing more tasks on the same
CPU doesn't increase the computed energy, so in the end task placement
thinks it's the better decision. The root issue is that once you have
uclamp_max, you can theoretically fit an infinite number of such tasks
on the same CPU.

> The way to fix this is up for discussion:
>
> (1) feec() returning prev_cpu
> (2) feec() returning -1 (forcing wakeup into sis() -> sic())
> (3) using uclamped values for task and rq utilization
>
> None of those have immediately given the desired task placement on
> mainline (2 tasks on each of the 4 little CPUs and no task on the 2 big
> CPUs on my [l B B l l l] w/ CPU capacities = [446 1024 1024 446 446 446]
> machine) you can achieve with uclamp sum aggregation.

Personally, from the results I've seen, I definitely prefer (3),
although (3) has other problems. One is that sum aggregation pushes up
utilization with uclamp_min, but its energy consumption definitely won't
be that high. The real energy is somewhere between its util_avg and
util_avg_uclamp. I haven't seen this as a real problem, but maybe we can
see even better task placement if this is accounted for.

> [...]
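
To make the saturation effect discussed above concrete, here is a
minimal standalone sketch. It is not kernel code: pd_busy_time() only
mirrors the min() clamp quoted from compute_energy(), while toy_energy()
and toy_energy_delta() are hypothetical helpers that assume energy grows
linearly with the clamped busy time, which is a simplification of what
em_cpu_energy() does.

```c
/* Mirrors the quoted clamp: min(pd_cap, busy_time + task_busy_time). */
static unsigned long pd_busy_time(unsigned long pd_cap,
				  unsigned long busy_time,
				  unsigned long task_busy_time)
{
	unsigned long sum = busy_time + task_busy_time;

	return sum < pd_cap ? sum : pd_cap;
}

/*
 * Hypothetical toy model (an assumption, not em_cpu_energy()):
 * energy is a fixed cost per unit of clamped busy time.
 */
static unsigned long toy_energy(unsigned long cost, unsigned long busy_time)
{
	return cost * busy_time;
}

/*
 * Estimated extra energy for placing one more task in the PD.
 * Once the PD is saturated the clamp hides the task entirely,
 * so the estimated delta drops to 0.
 */
static unsigned long toy_energy_delta(unsigned long pd_cap,
				      unsigned long busy_time,
				      unsigned long task_busy_time,
				      unsigned long cost)
{
	/* Energy without the task; the task contributes 0 busy time. */
	unsigned long base = toy_energy(cost,
					pd_busy_time(pd_cap, busy_time, 0));
	/* Energy with the task added, subject to the same clamp. */
	unsigned long with_task = toy_energy(cost,
					     pd_busy_time(pd_cap, busy_time,
							  task_busy_time));

	return with_task - base;
}
```

With an illustrative pd_cap of 1784 (4 little CPUs of capacity 446), a
task of 200 added at busy_time 1000 costs 200 units in this model, but
the same task added at busy_time 1784 costs 0. A placement that keeps
stacking tasks onto an already saturated PD therefore looks free, which
is the bogus result described above.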