From: Dmitry Osipenko
Date: Mon, 16 Jan 2023 15:16:48 +0300
Subject: Re: [Patch v1 08/10] cpufreq: tegra194: add OPP support and set bandwidth
To: Sumit Gupta, Dmitry Osipenko, treding@nvidia.com, krzysztof.kozlowski@linaro.org, viresh.kumar@linaro.org, rafael@kernel.org, jonathanh@nvidia.com, robh+dt@kernel.org, linux-kernel@vger.kernel.org, linux-tegra@vger.kernel.org, linux-pm@vger.kernel.org, devicetree@vger.kernel.org
Cc: sanjayc@nvidia.com, ksitaraman@nvidia.com, ishah@nvidia.com, bbasu@nvidia.com, Rajkumar Kasirajan
Message-ID: <8bd5cf36-e1fb-305c-08c5-3bbc80204866@collabora.com>
In-Reply-To: <8e6d7dd3-1bdc-ee4b-0c1e-1ae9cd8e4f29@nvidia.com>
References: <20221220160240.27494-1-sumitg@nvidia.com> <20221220160240.27494-9-sumitg@nvidia.com> <4e3e4485-ba22-eb47-fb95-e8d626160bc6@gmail.com> <8e6d7dd3-1bdc-ee4b-0c1e-1ae9cd8e4f29@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/13/23 16:50, Sumit Gupta wrote:
>
> On 22/12/22 21:16, Dmitry Osipenko wrote:
>> External email: Use caution opening links or attachments
>>
>> 20.12.2022 19:02, Sumit Gupta wrote:
>>> Add support to use the OPP table from DT in the Tegra194 cpufreq driver.
>>> Tegra SoCs receive the frequency lookup table (LUT) from BPMP-FW.
>>> Cross-check the OPPs present in DT against the LUT from BPMP-FW and
>>> enable only those DT OPPs which are also present in the LUT.
>>>
>>> The OPP table in DT has a CPU frequency to bandwidth mapping, where
>>> the bandwidth value is per MC channel. DRAM bandwidth depends on the
>>> number of MC channels, which can vary with the boot configuration.
>>> This per-channel bandwidth from the OPP table will later be converted
>>> by the MC driver to the final bandwidth value by multiplying it with
>>> the number of channels before sending the request to BPMP-FW.
>>>
>>> If the OPP table is not present in DT, then use the LUT from BPMP-FW
>>> directly as the frequency table and do not do DRAM frequency scaling,
>>> which is the same as the current behavior.
>>>
>>> Now that the CPU frequency table is controlled through the OPP table
>>> in DT, keeping fewer entries in the table will create fewer frequency
>>> steps and scale faster to high frequencies if required.
>>
>> It's not exactly clear what you're doing here. Are you going to scale
>> memory BW based on CPU freq? If yes, then this is wrong because CPU freq
>> is independent from the memory subsystem.
>>
>> All Tegra30+ SoCs have an ACTMON hardware unit that monitors CPU memory
>> activity, and CPU memory BW should be scaled based on the CPU memory
>> events counter. We have the ACTMON devfreq driver for older SoCs. I have
>> no clue how ACTMON can be accessed on T186+, perhaps there should be a
>> BPMP FW API for that.
>>
>
> Yes, we are scaling the memory BW based on CPU freq.
> I referred to the patch set below for the previous generation of Tegra
> SoCs which you mentioned and tried to trace the history.
>
> https://patchwork.ozlabs.org/project/linux-tegra/patch/1418719298-25314-3-git-send-email-tomeu.vizoso@collabora.com/
>
> In new Tegra SoCs, actmon counter control and usage has been moved to
> BPMP-FW, where only the 'MCALL' counter is used and 'MCCPU' is not being
> used. Using the actmon counter was a reactive way to scale the frequency,
> which is less effective due to averaging over a time period.
> We are now using the proactive way, where clients tell their bandwidth
> needs to help achieve better performance.

You don't know what bandwidth the CPU needs, you are trying to guess it.
Using cpufreq for memory bandwidth scaling is a bad decision. You'll be
wasting memory power 90% of the time because cpufreq has no relation to
the DRAM, your heuristics will be wrong and won't do anything good
compared to using ACTMON. The L2 CPU cache + memory prefetching hides
memory from the CPU. And cpufreq should be less reactive than ACTMON in
general.

Scaling memory freq based on cpufreq is what the downstream NV kernel did
10+ years ago for the oldest Tegra generations. Today upstream has all the
necessary infrastructure for doing memory bandwidth scaling properly, and
we are even using h/w memory counters on T20. It's strange that you want
to bring this downstream archaism to modern upstream for the latest Tegra
generations. If you can skip the BPMP-FW and use ACTMON directly from the
kernel, then that's what I suggest doing.

--
Best regards,
Dmitry
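
[For illustration, a minimal, self-contained sketch of the cross-check and
per-channel bandwidth conversion described in the changelog quoted above.
This is not the driver's actual code: the table contents, the helper names,
and the two-channel count are hypothetical.]

/* Illustration only: keep just the DT OPP frequencies that also appear in
 * the BPMP-FW LUT, and scale the per-MC-channel bandwidth by the channel
 * count before it would be forwarded to the firmware.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct dt_opp {
	uint64_t freq_hz;        /* CPU frequency from the DT OPP entry */
	uint32_t bw_per_ch_kbps; /* per-MC-channel bandwidth (opp-peak-kBps) */
	bool enabled;
};

/* Hypothetical LUT as it might arrive from BPMP-FW. */
static const uint64_t bpmp_lut_hz[] = { 806400000, 1190400000, 1574400000 };

/* DT OPPs: the 998 MHz entry is absent from the LUT and must stay disabled. */
static struct dt_opp dt_opps[] = {
	{  806400000,  800000, false },
	{  998400000, 1000000, false },
	{ 1574400000, 2000000, false },
};

int main(void)
{
	const unsigned int num_mc_channels = 2; /* assumed boot configuration */
	size_t i, j;

	for (i = 0; i < sizeof(dt_opps) / sizeof(dt_opps[0]); i++) {
		for (j = 0; j < sizeof(bpmp_lut_hz) / sizeof(bpmp_lut_hz[0]); j++) {
			if (dt_opps[i].freq_hz == bpmp_lut_hz[j]) {
				dt_opps[i].enabled = true;
				break;
			}
		}

		/* The MC driver would do this multiplication before sending
		 * the request to BPMP-FW. */
		uint64_t total_bw = (uint64_t)dt_opps[i].bw_per_ch_kbps *
				    num_mc_channels;

		printf("%llu Hz: %s, total bw %llu kBps\n",
		       (unsigned long long)dt_opps[i].freq_hz,
		       dt_opps[i].enabled ? "enabled" : "disabled",
		       (unsigned long long)total_bw);
	}
	return 0;
}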
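
[For context on the "clients tell their bandwidth needs" point, a rough
sketch of how a consumer driver typically makes a proactive bandwidth
request through the upstream interconnect framework. The path name
"cpu-mem", the function name, and the avg/peak split are placeholder
assumptions, not taken from this patch series.]

/* Sketch of the proactive pattern under discussion: a consumer asks the
 * interconnect framework for memory bandwidth explicitly instead of having
 * it inferred from activity counters.
 */
#include <linux/device.h>
#include <linux/err.h>
#include <linux/interconnect.h>

static int example_request_cpu_bw(struct device *dev, u32 peak_kbps)
{
	struct icc_path *path;
	int err;

	/* Look up the interconnect path described in the consumer's DT node. */
	path = of_icc_get(dev, "cpu-mem");
	if (IS_ERR(path))
		return PTR_ERR(path);

	/* Tell the provider (ultimately the memory controller / firmware)
	 * the average and peak bandwidth this client needs, in kBps. */
	err = icc_set_bw(path, peak_kbps / 2, peak_kbps);

	icc_put(path);
	return err;
}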