Subject: Re: [RFC][PATCH 0/3] New thermal interface allowing IPA to get max power
To: "Rafael J. Wysocki"
Cc: Linux Kernel Mailing List, Linux PM, Viresh Kumar, Daniel Lezcano, Dietmar Eggemann, Amit Kucheria, "Zhang, Rui", Chanwoo Choi, Myungjoo Ham, Kyungmin Park
References: <20210126104001.20361-1-lukasz.luba@arm.com>
From: Lukasz Luba
Date: Mon, 1 Feb 2021 16:37:05 +0000

Hi Rafael,

On 2/1/21 2:19 PM, Rafael J.
Wysocki wrote:
> On Tue, Jan 26, 2021 at 11:40 AM Lukasz Luba wrote:
>>
>> Hi all,
>>
>> This patch set tries to add the missing feature in the Intelligent Power
>> Allocation (IPA) governor which is: frequency limit set by user space.
>> User can set max allowed frequency for a given device which has impact on
>> max allowed power.
>
> If there is more than one frequency that can be limited for the given
> device, are you going to add a limit knob for each of them?

I might have been unclear. I was referring to the normal sysfs
scaling_max_freq, which sets the max frequency for a CPU:

echo XYZ > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq

and similarly for a devfreq device, like a GPU.

>
>> In current design there is no mechanism to figure this
>> out. IPA must know the maximum allowed power for every device. It is then
>> used for proper power split and divvy-up. When the user limit for max
>> frequency is not known, IPA assumes it is the highest possible frequency.
>> It causes wrong power split across the devices.
>
> Do I think correctly that this depends on the Energy Model?

Not directly, but IPA uses the max freq to ask the EM for the max power.
The issue is that I don't know this 'max freq' for a given device, because
the user might have set a limit for that device. In that case IPA still
blindly picks the power for the highest frequency.

>
>> This new mechanism provides the max allowed frequency to the thermal
>> framework and then max allowed power to the IPA.
>> The implementation is done in this way because currently there is no way
>> to retrieve the limits from the PM QoS, without uncapping the local
>> thermal limit and reading the next value.
>
> The above is unclear. What PM QoS limit are you referring to in the
> first place?

The PM QoS which we use in thermal for setting the frequency limits, for
cpufreq_cooling [1] and for devfreq_cooling [2]. I am able to read that
PM QoS value, but it is the lowest of all requests, which is not
necessarily the one set by the user.
Example:
2000MHz
1800MHz <----- user set this to 'max freq'
1400MHz <----- thermal set that to 'max freq'
then PM QoS would give me 1400MHz, because it is the effective limit for
the max freq. That's why I said that PM QoS is not able to give me the
user limit, unless I revert the capping for that device in IPA.

>
>> It would be a heavy way of
>> doing these things, since it should be done every polling time (e.g. 50ms).
>> Also, the value stored in PM QoS can be different than the real OPP 'rate'
>> so still would need conversion into proper OPP for comparison with EM.
>> Furthermore, uncapping the device in thermal just to check the user freq
>> limit is not the safest way.
>> Thus, this simple implementation moves the calculation of the proper
>> frequency to the sysfs write code, since it's called less often. The value
>> is then used as-is in the thermal framework without any hassle.
>>
>> As it's a RFC, it still misses the cpufreq sysfs implementation,
>
> What exactly do you mean by this?

I haven't modified cpufreq.c and cpufreq_cooling.c, because maybe for CPUs
there is a way to solve it differently, or you might not want to modify
the CPU code at all.

>
>> but would be addressed if all agree.
>
> Depending on the answers above.
>
> But my general comment would be that it might turn out to be
> unrealistic to expect user space to know what frequency limit to use
> to get the desired result in terms of constraining power.
>

There are scenarios where middleware (which is aware of what is in the
foreground on mobile) might limit the GPU max freq, to avoid burning
power on the highest OPPs.

Regards,
Lukasz

[1] https://elixir.bootlin.com/linux/latest/source/drivers/thermal/cpufreq_cooling.c#L443
[2] https://elixir.bootlin.com/linux/latest/source/drivers/thermal/devfreq_cooling.c#L106
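
The 2000/1800/1400 MHz example in this mail can be modelled in a few lines of Python. This is only a sketch of the reasoning, not kernel code: the function names, the `EM_TABLE` layout and the power figures are all made up for illustration, and it assumes that several 'max freq' requests resolve to the lowest one, as described in the mail.

```python
# Illustrative model only, not kernel code. All names and the power
# figures in EM_TABLE are hypothetical; the frequencies come from the
# example in the mail.

# (freq_khz, power_mw), sorted by frequency; power values are made up.
EM_TABLE = [(1_400_000, 600), (1_800_000, 1000), (2_000_000, 1400)]


def effective_max_freq(requests_khz):
    """Resolve several 'max freq' requests: the lowest ceiling wins."""
    return min(requests_khz.values())


def max_power_mw(em_table, freq_limit_khz):
    """Power of the highest EM entry at or below freq_limit_khz."""
    allowed = [power for freq, power in em_table if freq <= freq_limit_khz]
    # Fall back to the lowest entry if the limit is below every frequency.
    return allowed[-1] if allowed else em_table[0][1]


requests = {"user": 1_800_000, "thermal": 1_400_000}

# Reading the aggregate only reveals the thermal cap, not the user limit.
aggregate = effective_max_freq(requests)

# Power budget assumed when the user limit is unknown (highest frequency) ...
assumed = max_power_mw(EM_TABLE, EM_TABLE[-1][0])
# ... versus the budget that respects the user limit.
actual = max_power_mw(EM_TABLE, requests["user"])

print(aggregate, assumed, actual)  # prints: 1400000 1400 1000
```

With these invented numbers the aggregate shows only the 1400 MHz thermal cap, while the power budget taken from the highest frequency (1400 mW) overstates what the device may draw under the 1800 MHz user limit (1000 mW).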