From: Doug Anderson
Date: Thu, 17 Feb 2022 08:37:39 -0800
Subject: Re: [PATCH 1/2] thermal: cooling: Check Energy Model type in cpufreq_cooling and devfreq_cooling
To: Lukasz Luba
Cc: Daniel Lezcano, Matthias Kaehlcke, LKML, Linux PM, amit daniel kachhap, Viresh Kumar, "Rafael J.
Wysocki", Amit Kucheria, Zhang Rui, Dietmar Eggemann, Pierre.Gondois@arm.com, Stephen Boyd, Rajendra Nayak, Bjorn Andersson, jorcrous@amazon.com, Rob Clark
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On Thu, Feb 17, 2022 at 2:47 AM Lukasz Luba wrote:
>
> Hi Daniel,
>
> On 2/17/22 10:10 AM, Daniel Lezcano wrote:
> > On 16/02/2022 18:33, Doug Anderson wrote:
> >> Hi,
> >>
> >> On Wed, Feb 16, 2022 at 7:35 AM Lukasz Luba wrote:
> >>>
> >>> Hi Matthias,
> >>>
> >>> On 2/9/22 10:17 PM, Matthias Kaehlcke wrote:
> >>>> On Wed, Feb 09, 2022 at 11:16:36AM +0000, Lukasz Luba wrote:
> >>>>>
> >>>>>
> >>>>> On 2/8/22 5:25 PM, Matthias Kaehlcke wrote:
> >>>>>> On Tue, Feb 08, 2022 at 09:32:28AM +0000, Lukasz Luba wrote:
> >>>>>>>
> >>>>>>>
> >>>
> >>> [snip]
> >>>
> >>>>>>> Could you point me to those devices please?
> >>>>>>
> >>>>>> arch/arm64/boot/dts/qcom/sc7180-trogdor-*
> >>>>>>
> >>>>>> Though as per above they shouldn't be impacted by your change,
> >>>>>> since the CPUs always pretend to use milli-Watts.
> >>>>>>
> >>>>>> [skipped some questions/answers since sc7180 isn't actually
> >>>>>> impacted by the change]
> >>>>>
> >>>>> Thank you Matthias. I will investigate your setup to get better
> >>>>> understanding.
> >>>>
> >>>> Thanks!
> >>>>
> >>>
> >>> I've checked those DT files and related code.
> >>> As you already said, this patch is safe for them.
> >>> So we can apply it IMO.
> >>>
> >>>
> >>> -------------Off-topic------------------
> >>> Not in $subject comments:
> >>>
> >>> AFAICS based on two files which define thermal zones:
> >>> sc7180-trogdor-homestar.dtsi
> >>> sc7180-trogdor-coachz.dtsi
> >>>
> >>> only the 'big' cores are used as cooling devices in the
> >>> 'skin_temp_thermal' - the CPU6 and CPU7.
> >>>
> >>> I assume you don't want to model at all the power usage
> >>> from the Little cluster (which is quite big: 6 CPUs), do you?
> >>> I can see that the Little CPUs have small dyn-power-coeff
> >>> ~30% of the big and lower max freq, but still might be worth
> >>> to add them to IPA. You might give them more 'weight', to
> >>> make sure they receive more power during power split.
> >>>
> >>> You also don't have GPU cooling device in that thermal zone.
> >>> Based on my experience if your GPU is a power hungry one,
> >>> e.g. 2-4Watts, you might get better results when you model
> >>> this 'hot' device (which impacts your temp sensor reported value).
> >>
> >> I think the two boards you point at (homestar and coachz) are just the
> >> two that override the default defined in the SoC dtsi file. If you
> >> look in sc7180.dtsi you'll see 'gpuss1-thermal' which has a cooling
> >> map. You can also see the cooling maps for the littles.
> >>
> >> I guess we don't have a `dynamic-power-coefficient` for the GPU,
> >> though? Seems like we should, but I haven't dug through all the code
> >> here...
> >
> > The dynamic-power-coefficient is available for OPPs which includes
> > CPUfreq and devfreq. As the GPU is managed by devfreq, setting the
> > dynamic-power-coefficient makes the energy model available for it.
> >
> > However, the OPPs must define the frequency and the voltage. That is the
> > case for most platforms except on QCom platform.
> >
> > That may not be specified as it uses a frequency index and the hardware
> > does the voltage change in our back. The QCom cpufreq backend get the
> > voltage table from a register (or whatever) and completes the voltage
> > values for the OPPs, thus adding the information which is missing in the
> > device tree. The energy model can then initializes itself and allows the
> > usage of the Energy Aware Scheduler.
> >
> > However this piece of code is missing for the GPU part.
> >
>
> Thank you for joining the discussion. I don't know about that Qcom
> GPU voltage information is missing.
>
> If the voltage is not available (only the frequencies), there is
> another way. There is an 'advanced' EM which uses registration function:
> em_dev_register_perf_domain(). It uses a local driver callback to get
> power for each found frequency. It has benefit because there is no
> restriction to 'fit' into the math formula, instead just avg power
> values can be feed into EM. It's called 'advanced' EM [1].

It seems like there _should_ be a way to get the voltage out for GPU
operating points, as is done for cpufreq in
qcom_cpufreq_hw_read_lut(), but it might need someone with Qualcomm
documentation to help with it. Maybe Rajendra would be able to help?
Adding Jordan and Rob to this conversation in case they're aware of
anything.

As you said, we could just list a power for each frequency, though.
I'm actually not sure which one would be more accurate across a range
of devices with different "corners": specifying a dynamic power
coefficient used for all "corners" and then using the actual voltage
and doing the math, or specifying a power number for each frequency
and ignoring the actual voltage used. In any case we're trying to get
ballpark numbers and not every device will be exactly the same, so it
probably doesn't matter that much.

> Now we hit (again) the DT & EM issue (it's an old one, IIRC Morten
> was proposing from ~2014 this upstream, but EAS wasn't merged back
> then):
> where to store these power-freq values, which are then used by the
> callback. We have the 'dynamic-power-coefficient' in DT, but
> it has limitations. It would be good to have this simple array
> attached to the GPU/CPU node. IMHO it meet the requirement of DT,
> it describes the HW (it would have HZ and Watts values).
>
> Doug, Matthias could you have a look at that function and its
> usage, please [1]?
> If you guys would support me in this, I would start, with an RFC
> proposal, a discussion on LKML.
>
> [1]
> https://elixir.bootlin.com/linux/v5.17-rc4/source/Documentation/power/energy-model.rst#L87

Matthias: I think you've spent more time on the thermal stuff than me
so I'll assume you'll follow up here. If not then please yell!

Ideally, though, someone from Qualcomm would jump in and own this.
Basically, IIUC, it allows throttling the GPU and CPU together more
intelligently instead of treating them separately, right?

-Doug
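
For reference, the two ways of estimating power weighed in this thread can be sketched in a few lines of Python. This is not kernel code and does not use the Energy Model API; it only contrasts the arithmetic. The formula P = C * f * V^2 (C in uW/MHz/V^2, f in MHz, V in volts) follows the 'dynamic-power-coefficient' DT binding. Everything else is an assumption made up for illustration: the function names, the coefficient, the voltages, and the per-frequency table.

```python
# Illustrative only: every number and name below is hypothetical and
# does not describe sc7180 or any other real hardware.

def power_from_coefficient(coeff_uw_per_mhz_v2, freq_mhz, voltage_mv):
    """Estimate dynamic power in microwatts as P = C * f * V^2.

    C is in uW/MHz/V^2 and f in MHz. V is passed in millivolts so the
    math stays in integers, hence the division by 1000^2.
    """
    return coeff_uw_per_mhz_v2 * freq_mhz * voltage_mv * voltage_mv // 1_000_000


def power_from_table(table_uw, freq_mhz):
    """Return the listed average power (microwatts) for one frequency."""
    return table_uw[freq_mhz]


COEFF = 100  # made-up coefficient shared by all parts

# Approach 1: one coefficient, actual voltage read back per device.
# Two hypothetical "corners" need different voltages at 800 MHz.
high_voltage_part = power_from_coefficient(COEFF, 800, 900)
low_voltage_part = power_from_coefficient(COEFF, 800, 800)

# Approach 2: one listed power per frequency, voltage ignored.
GPU_POWER_TABLE_UW = {400: 20_000, 800: 60_000}
table_estimate = power_from_table(GPU_POWER_TABLE_UW, 800)

print(high_voltage_part, low_voltage_part, table_estimate)
```

With these made-up inputs the coefficient approach gives a different estimate for each part at the same frequency, while the table gives one number for both. Which is closer to reality depends on how well the single coefficient or the single table matches the individual device.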