2024-05-22 08:16:08

by Shivnandan Kumar

[permalink] [raw]
Subject: [PATCH] soc: qcom: icc-bwmon: Update zone1_thres_count to 3

Update zone1_thres_count to 3 from 16 so that
driver can reduce bus vote in 3 sample windows instead
of waiting for 16 windows. This is in line with downstream
implementation.

Signed-off-by: Shivnandan Kumar <[email protected]>
---
drivers/soc/qcom/icc-bwmon.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/soc/qcom/icc-bwmon.c b/drivers/soc/qcom/icc-bwmon.c
index 656706259353..f1065427bb80 100644
--- a/drivers/soc/qcom/icc-bwmon.c
+++ b/drivers/soc/qcom/icc-bwmon.c
@@ -815,7 +815,7 @@ static const struct icc_bwmon_data msm8998_bwmon_data = {
static const struct icc_bwmon_data sdm845_cpu_bwmon_data = {
.sample_ms = 4,
.count_unit_kb = 64,
- .zone1_thres_count = 16,
+ .zone1_thres_count = 3,
.zone3_thres_count = 1,
.quirks = BWMON_HAS_GLOBAL_IRQ,
.regmap_fields = sdm845_cpu_bwmon_reg_fields,
@@ -834,7 +834,7 @@ static const struct icc_bwmon_data sdm845_llcc_bwmon_data = {
static const struct icc_bwmon_data sc7280_llcc_bwmon_data = {
.sample_ms = 4,
.count_unit_kb = 64,
- .zone1_thres_count = 16,
+ .zone1_thres_count = 3,
.zone3_thres_count = 1,
.quirks = BWMON_NEEDS_FORCE_CLEAR,
.regmap_fields = sdm845_llcc_bwmon_reg_fields,
--
2.25.1



2024-05-22 08:30:18

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH] soc: qcom: icc-bwmon: Update zone1_thres_count to 3

On 22/05/2024 10:15, Shivnandan Kumar wrote:
> Update zone1_thres_count to 3 from 16 so that
> driver can reduce bus vote in 3 sample windows instead
> of waiting for 16 windows. This is in line with downstream
> implementation.
>

This might make bwmon quite jittery. I don't think downstream is the
source of truth here. Please provide some measurements *on mainline tree*.

Best regards,
Krzysztof


2024-05-22 09:05:42

by Shivnandan Kumar

[permalink] [raw]
Subject: Re: [PATCH] soc: qcom: icc-bwmon: Update zone1_thres_count to 3



On 5/22/2024 1:58 PM, Krzysztof Kozlowski wrote:
> On 22/05/2024 10:15, Shivnandan Kumar wrote:
>> Update zone1_thres_count to 3 from 16 so that
>> driver can reduce bus vote in 3 sample windows instead
>> of waiting for 16 windows. This is in line with downstream
>> implementation.
>>
>
> This might make bwmon quite jittery. I don't think downstream is the
> source of truth here. Please provide some measurements *on mainline tree*.
>

Hi Krzysztof,

The 16-window (64 ms) waiting time is too long to reduce the bus vote.
At higher FPS, there will be multiple frames in 64ms e.g. 4 frames at
60FPS in 64ms. Hence, delay of 64ms in decision making will lead to
higher power regression. I’ve tested this change for 4K video playback
on mainline tree, and there’s a significant power-saving.

I propose to make it a tunable,so that user space can tune it
based on runtime depending on fps.

USECASE zone1_thres_count=16 zone1_thres_count=3
4K video playback 236.15 mA 203.15 mA

Thanks,
Shivnandan

> Best regards,
> Krzysztof
>

2024-05-22 09:19:56

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH] soc: qcom: icc-bwmon: Update zone1_thres_count to 3

On 22/05/2024 11:05, Shivnandan Kumar wrote:
>
>
> On 5/22/2024 1:58 PM, Krzysztof Kozlowski wrote:
>> On 22/05/2024 10:15, Shivnandan Kumar wrote:
>>> Update zone1_thres_count to 3 from 16 so that
>>> driver can reduce bus vote in 3 sample windows instead
>>> of waiting for 16 windows. This is in line with downstream
>>> implementation.
>>>
>>
>> This might make bwmon quite jittery. I don't think downstream is the
>> source of truth here. Please provide some measurements *on mainline tree*.
>>
>
> Hi Krzysztof,
>
> The 16-window (64 ms) waiting time is too long to reduce the bus vote.
> At higher FPS, there will be multiple frames in 64ms e.g. 4 frames at
> 60FPS in 64ms. Hence, delay of 64ms in decision making will lead to
> higher power regression. I’ve tested this change for 4K video playback
> on mainline tree, and there’s a significant power-saving.

Please include it, with measurement below, in the commit msg.

> I propose to make it a tunable,so that user space can tune it
> based on runtime depending on fps.>
> USECASE zone1_thres_count=16 zone1_thres_count=3
> 4K video playback 236.15 mA 203.15 mA
>
> Thanks,
> Shivnandan
>
>> Best regards,
>> Krzysztof
>>

Best regards,
Krzysztof


2024-05-22 16:58:18

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH] soc: qcom: icc-bwmon: Update zone1_thres_count to 3

On Wed, May 22, 2024 at 02:35:21PM +0530, Shivnandan Kumar wrote:
>
>
> On 5/22/2024 1:58 PM, Krzysztof Kozlowski wrote:
> > On 22/05/2024 10:15, Shivnandan Kumar wrote:
> > > Update zone1_thres_count to 3 from 16 so that
> > > driver can reduce bus vote in 3 sample windows instead
> > > of waiting for 16 windows. This is in line with downstream
> > > implementation.
> > >
> >
> > This might make bwmon quite jittery. I don't think downstream is the
> > source of truth here. Please provide some measurements *on mainline tree*.
> >
>
> Hi Krzysztof,
>
> The 16-window (64 ms) waiting time is too long to reduce the bus vote.
> At higher FPS, there will be multiple frames in 64ms e.g. 4 frames at 60FPS
> in 64ms. Hence, delay of 64ms in decision making will lead to higher power
> regression. I’ve tested this change for 4K video playback on mainline tree,
> and there’s a significant power-saving.
>

As requested by Krzysztof already, update your commit message to capture
such motivation.

Please read and follow this:
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes

> I propose to make it a tunable,so that user space can tune it
> based on runtime depending on fps.
>

I presume that in e.g. Android there could be some sort of power HAL
that tweaks this value dynamically? In a general purpose system, how do
we make sure that this value stays relevant for multiple types of use
cases?

> USECASE zone1_thres_count=16 zone1_thres_count=3
> 4K video playback 236.15 mA 203.15 mA

4k video playback is a fairly specific (and generally unusual) use case.
Is there any impact (negative or positive) for other use
cases/workloads?

Regards,
Bjorn

>
> Thanks,
> Shivnandan
>
> > Best regards,
> > Krzysztof
> >