Date: Thu, 17 Feb 2022 09:14:17 -0800
From: Matthias Kaehlcke
To: Doug Anderson
Cc: Lukasz Luba, Daniel Lezcano, LKML, Linux PM, amit daniel kachhap,
 Viresh Kumar, "Rafael J. Wysocki", Amit Kucheria, Zhang Rui,
 Dietmar Eggemann, Pierre.Gondois@arm.com, Stephen Boyd, Rajendra Nayak,
 Bjorn Andersson, jorcrous@amazon.com, Rob Clark
Subject: Re: [PATCH 1/2] thermal: cooling: Check Energy Model type in
 cpufreq_cooling and devfreq_cooling
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 17, 2022 at 08:37:39AM -0800, Doug Anderson wrote:
> Hi,
>
> On Thu, Feb 17, 2022 at 2:47 AM Lukasz Luba wrote:
> >
> > Hi Daniel,
> >
> > On 2/17/22 10:10 AM, Daniel Lezcano wrote:
> > > On 16/02/2022 18:33, Doug Anderson wrote:
> > >> Hi,
> > >>
> > >> On Wed, Feb 16, 2022 at 7:35 AM Lukasz Luba wrote:
> > >>>
> > >>> Hi Matthias,
> > >>>
> > >>> On 2/9/22 10:17 PM, Matthias Kaehlcke wrote:
> > >>>> On Wed, Feb 09, 2022 at 11:16:36AM +0000, Lukasz Luba wrote:
> > >>>>>
> > >>>>>
> > >>>>> On 2/8/22 5:25 PM, Matthias Kaehlcke wrote:
> > >>>>>> On Tue, Feb 08, 2022 at 09:32:28AM +0000, Lukasz Luba wrote:
> > >>>>>>>
> > >>>>>>>
> > >>>
> > >>> [snip]
> > >>>
> > >>>>>>> Could you point me to those devices please?
> > >>>>>>
> > >>>>>> arch/arm64/boot/dts/qcom/sc7180-trogdor-*
> > >>>>>>
> > >>>>>> Though as per above they shouldn't be impacted by your change,
> > >>>>>> since the CPUs always pretend to use milli-Watts.
> > >>>>>>
> > >>>>>> [skipped some questions/answers since sc7180 isn't actually
> > >>>>>> impacted by the change]
> > >>>>>
> > >>>>> Thank you Matthias. I will investigate your setup to get better
> > >>>>> understanding.
> > >>>>
> > >>>> Thanks!
> > >>>>
> > >>>
> > >>> I've checked those DT files and related code.
> > >>> As you already said, this patch is safe for them.
> > >>> So we can apply it IMO.
> > >>>
> > >>>
> > >>> -------------Off-topic------------------
> > >>> Not in $subject comments:
> > >>>
> > >>> AFAICS based on two files which define thermal zones:
> > >>> sc7180-trogdor-homestar.dtsi
> > >>> sc7180-trogdor-coachz.dtsi
> > >>>
> > >>> only the 'big' cores are used as cooling devices in the
> > >>> 'skin_temp_thermal' - the CPU6 and CPU7.
> > >>>
> > >>> I assume you don't want to model at all the power usage
> > >>> from the Little cluster (which is quite big: 6 CPUs), do you?
> > >>> I can see that the Little CPUs have small dyn-power-coeff
> > >>> ~30% of the big and lower max freq, but still might be worth
> > >>> to add them to IPA. You might give them more 'weight', to
> > >>> make sure they receive more power during power split.
> > >>>
> > >>> You also don't have GPU cooling device in that thermal zone.
> > >>> Based on my experience if your GPU is a power hungry one,
> > >>> e.g. 2-4Watts, you might get better results when you model
> > >>> this 'hot' device (which impacts your temp sensor reported value).
> > >>
> > >> I think the two boards you point at (homestar and coachz) are just the
> > >> two that override the default defined in the SoC dtsi file. If you
> > >> look in sc7180.dtsi you'll see 'gpuss1-thermal' which has a cooling
> > >> map. You can also see the cooling maps for the littles.
> > >>
> > >> I guess we don't have a `dynamic-power-coefficient` for the GPU,
> > >> though? Seems like we should, but I haven't dug through all the code
> > >> here...
> > >
> > > The dynamic-power-coefficient is available for OPPs which includes
> > > CPUfreq and devfreq. As the GPU is managed by devfreq, setting the
> > > dynamic-power-coefficient makes the energy model available for it.
> > >
> > > However, the OPPs must define the frequency and the voltage. That is the
> > > case for most platforms except on QCom platform.
> > >
> > > That may not be specified as it uses a frequency index and the hardware
> > > does the voltage change in our back. The QCom cpufreq backend get the
> > > voltage table from a register (or whatever) and completes the voltage
> > > values for the OPPs, thus adding the information which is missing in the
> > > device tree. The energy model can then initializes itself and allows the
> > > usage of the Energy Aware Scheduler.
> > >
> > > However this piece of code is missing for the GPU part.
> > >
> >
> > Thank you for joining the discussion. I don't know about that Qcom
> > GPU voltage information is missing.
> >
> > If the voltage is not available (only the frequencies), there is
> > another way. There is an 'advanced' EM which uses registration function:
> > em_dev_register_perf_domain(). It uses a local driver callback to get
> > power for each found frequency. It has benefit because there is no
> > restriction to 'fit' into the math formula, instead just avg power
> > values can be feed into EM. It's called 'advanced' EM [1].
>
> It seems like there _should_ be a way to get the voltage out for GPU
> operating points, like is done with cpufreq in
> qcom_cpufreq_hw_read_lut(), but it might need someone with Qualcomm
> documentation to help with it. Maybe Rajendra would be able to help?
> Adding Jordon and Rob to this conversation in case they're aware of
> anything.
>
> As you said, we could just list a power for each frequency, though.
>
> I'm actually not sure which one would be more accurate across a range
> of devices with different "corners": specifying a dynamic power
> coefficient used for all "corners" and then using the actual voltage
> and doing the math, or specifying a power number for each frequency
> and ignoring the actual voltage used. In any case we're trying to get
> ballpark numbers and not every device will be exactly the same, so
> probably it doesn't matter that much.
>
>
> > Now we hit (again) the DT & EM issue (it's an old one, IIRC Morten
> > was proposing from ~2014 this upstream, but EAS wasn't merged back
> > then):
> > where to store these power-freq values, which are then used by the
> > callback. We have the 'dynamic-power-coefficient' in DT, but
> > it has limitations. It would be good to have this simple array
> > attached to the GPU/CPU node. IMHO it meet the requirement of DT,
> > it describes the HW (it would have HZ and Watts values).
> >
> > Doug, Matthias could you have a look at that function and its
> > usage, please [1]?
> > If you guys would support me in this, I would start, with an RFC
> > proposal, a discussion on LKML.
> >
> > [1]
> > https://elixir.bootlin.com/linux/v5.17-rc4/source/Documentation/power/energy-model.rst#L87
>
> Matthias: I think you've spent more time on the thermal stuff than me
> so I'll assume you'll follow-up here. If not then please yell!
>
> Ideally, though, someone from Qualcomm would jump in an own this.
> Basically it allows more intelligently throttling the GPU and CPU
> together in tandem instead of treating them separately IIUC, right?

Yes, I think that for the em_dev_register_perf_domain() route, support
from Qualcomm would be needed, since "Drivers must provide a callback
function returning <frequency, power> tuples for each performance state."
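
To put rough numbers on the two options discussed above, here is a small
userspace sketch (Python, not kernel code). It assumes the coefficient is
given in uW/MHz/V^2, as for 'dynamic-power-coefficient', and estimates
power as coefficient * V^2 * f. The alternative simply looks up a
per-frequency power value, which is the kind of data a callback passed to
em_dev_register_perf_domain() would have to supply. The function names,
the coefficient of 200, the voltages and the 115 mW table entry are all
invented for illustration; none of them are sc7180 data.

```python
# Illustration only: hypothetical helpers and made-up numbers,
# not kernel code and not measurements from any real SoC.


def coeff_power_mw(coeff_uw_per_mhz_v2, freq_hz, volt_uv):
    """Estimate dynamic power in mW as coeff * V^2 * f.

    With the coefficient in uW/MHz/V^2, voltage in mV and frequency
    in MHz, the product has to be divided by 10^9 to end up in mW.
    Integer arithmetic is used on purpose, so results are truncated.
    """
    mv = volt_uv // 1000
    mhz = freq_hz // 1_000_000
    return coeff_uw_per_mhz_v2 * mv * mv * mhz // 1_000_000_000


def table_power_mw(table, freq_hz):
    """Look up power from a per-frequency table, ignoring voltage.

    'table' maps frequency in Hz to power in mW. Returns the
    (frequency, power) pair of the lowest listed state that is at or
    above the requested frequency.
    """
    for f in sorted(table):
        if f >= freq_hz:
            return f, table[f]
    raise ValueError("frequency is above the highest listed state")


# Hypothetical GPU with one 800 MHz state that runs at a different
# voltage depending on the "corner" of the individual part.
COEFF = 200                      # uW/MHz/V^2, invented
FREQ_HZ = 800_000_000
CORNER_UV = {"low-voltage part": 800_000, "high-voltage part": 900_000}
POWER_TABLE = {400_000_000: 40, 800_000_000: 115}  # Hz -> mW, invented

if __name__ == "__main__":
    for name, uv in CORNER_UV.items():
        print(name, "formula:", coeff_power_mw(COEFF, FREQ_HZ, uv), "mW")
    print("table:", table_power_mw(POWER_TABLE, FREQ_HZ)[1], "mW")
```

With these invented numbers the formula gives 102 mW at 800 mV and 129 mW
at 900 mV for the same 800 MHz state, while the table reports 115 mW for
both parts. That is the trade-off in a nutshell: the formula follows the
voltage actually used but has to fit C * V^2 * f, while the table can hold
any measured average but is blind to per-device voltage differences.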