2020-05-26 15:19:44

by Benjamin GAIGNARD

Subject: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS

A first round [1] of discussions and suggestions has already taken place on
this series, but without finding a solution to the problem. I am resending it
to make progress on this topic.

When streaming starts from the sensor, the CPU load can remain very low
because almost all of the capture pipeline is done in hardware (i.e. without
using the CPU), which leads the cpufreq governor to believe that it can use
lower frequencies. If the governor picks too low a frequency, that becomes a
problem when we need to acknowledge the interrupt during the blanking time.
The time available to ack the interrupt and perform all the other actions
before the next frame is very short and doesn't allow the cpufreq governor to
provide the required burst of power. That led to dropping half of the frames.

To avoid this problem, the DCMI driver informs the cpufreq governors by adding
a cpufreq minimum load QoS request.

Benjamin

[1] https://lkml.org/lkml/2020/4/24/360

Benjamin Gaignard (3):
PM: QoS: Introduce cpufreq minimum load QoS
cpufreq: governor: Use minimum load QoS
media: stm32-dcmi: Inform cpufreq governors about cpu load needs

drivers/cpufreq/cpufreq_governor.c | 5 +
drivers/media/platform/stm32/stm32-dcmi.c | 8 ++
include/linux/pm_qos.h | 12 ++
kernel/power/qos.c | 213 ++++++++++++++++++++++++++++++
4 files changed, 238 insertions(+)

--
2.15.0


2020-05-26 15:20:59

by Benjamin GAIGNARD

Subject: [RFC 2/3] cpufreq: governor: Use minimum load QoS

Make sure that the returned load is above the system-wide minimum
load QoS.
Devices could set this specific QoS to inform governors about their
need in terms of CPU load when computing it from idle time isn't accurate.

Signed-off-by: Benjamin Gaignard <[email protected]>
---
drivers/cpufreq/cpufreq_governor.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index f99ae45efaea..1494e5e4c788 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -118,6 +118,7 @@ unsigned int dbs_update(struct cpufreq_policy *policy)
 	unsigned int ignore_nice = dbs_data->ignore_nice_load;
 	unsigned int max_load = 0, idle_periods = UINT_MAX;
 	unsigned int sampling_rate, io_busy, j;
+	unsigned int qos_min_load;

 	/*
 	 * Sometimes governors may use an additional multiplier to increase
@@ -225,6 +226,10 @@ unsigned int dbs_update(struct cpufreq_policy *policy)

 	policy_dbs->idle_periods = idle_periods;

+	qos_min_load = cpufreq_minload_qos_limit();
+	if (qos_min_load > max_load)
+		max_load = qos_min_load;
+
 	return max_load;
 }
 EXPORT_SYMBOL_GPL(dbs_update);
--
2.15.0
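
For illustration only, the effect of the clamp above can be sketched in
user space. This is not part of the patch: minload_qos_limit() is a
hypothetical stand-in for cpufreq_minload_qos_limit(), the
load-to-frequency mapping is a simplified assumption loosely modelled on an
ondemand-style governor, and the threshold and frequencies used with it are
made-up values.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for cpufreq_minload_qos_limit() from patch 1/3:
 * the system-wide minimum load request, in percent (0 = no request).
 */
static unsigned int qos_min_load_percent;

static unsigned int minload_qos_limit(void)
{
	return qos_min_load_percent;
}

/* Same comparison as the one added at the end of dbs_update(). */
static unsigned int clamp_load(unsigned int measured_load)
{
	unsigned int qos_min_load = minload_qos_limit();

	if (qos_min_load > measured_load)
		measured_load = qos_min_load;
	return measured_load;
}

/*
 * Assumed, simplified mapping from load to frequency: jump to the maximum
 * above up_threshold, otherwise scale linearly between min and max.
 */
static unsigned int pick_freq_khz(unsigned int load, unsigned int up_threshold,
				  unsigned int min_khz, unsigned int max_khz)
{
	assert(load <= 100 && min_khz <= max_khz);

	if (load > up_threshold)
		return max_khz;
	return min_khz + load * (max_khz - min_khz) / 100;
}
```

With no request, a measured load of 5% stays at 5% and the model picks a
frequency close to the minimum. With a 60% request in place, the same 5%
measurement is reported as 60%, so the governor model selects a higher
frequency even though the CPU is mostly idle.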

2020-05-27 15:34:21

by Valentin Schneider

Subject: Re: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS


Hi Benjamin,

On 26/05/20 16:16, Benjamin Gaignard wrote:
> A first round [1] of discussions and suggestions have already be done on
> this series but without found a solution to the problem. I resend it to
> progress on this topic.
>

Apologies for sleeping on that previous thread.

So what had been suggested over there was to use uclamp to boost the
frequency of the handling thread; however if you use threaded IRQs you
get RT threads, which already get the max frequency by default (at least
with schedutil).

Does that not work for you, and if so, why?

> When start streaming from the sensor the CPU load could remain very low
> because almost all the capture pipeline is done in hardware (i.e. without
> using the CPU) and let believe to cpufreq governor that it could use lower
> frequencies. If the governor decides to use a too low frequency that
> becomes a problem when we need to acknowledge the interrupt during the
> blanking time.
> The delay to ack the interrupt and perform all the other actions before
> the next frame is very short and doesn't allow to the cpufreq governor to
> provide the required burst of power. That led to drop the half of the frames.
>
> To avoid this problem, DCMI driver informs the cpufreq governors by adding
> a cpufreq minimum load QoS resquest.
>
> Benjamin
>
> [1] https://lkml.org/lkml/2020/4/24/360
>
> Benjamin Gaignard (3):
> PM: QoS: Introduce cpufreq minimum load QoS
> cpufreq: governor: Use minimum load QoS
> media: stm32-dcmi: Inform cpufreq governors about cpu load needs
>
> drivers/cpufreq/cpufreq_governor.c | 5 +
> drivers/media/platform/stm32/stm32-dcmi.c | 8 ++
> include/linux/pm_qos.h | 12 ++
> kernel/power/qos.c | 213 ++++++++++++++++++++++++++++++
> 4 files changed, 238 insertions(+)

2020-05-27 15:41:27

by Benjamin GAIGNARD

Subject: Re: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS



On 5/27/20 12:09 PM, Valentin Schneider wrote:
> Hi Benjamin,
>
> On 26/05/20 16:16, Benjamin Gaignard wrote:
>> A first round [1] of discussions and suggestions have already be done on
>> this series but without found a solution to the problem. I resend it to
>> progress on this topic.
>>
> Apologies for sleeping on that previous thread.
>
> So what had been suggested over there was to use uclamp to boost the
> frequency of the handling thread; however if you use threaded IRQs you
> get RT threads, which already get the max frequency by default (at least
> with schedutil).
>
> Does that not work for you, and if so, why?
That doesn't work because almost everything is done by the hardware blocks
without loading the CPU, so the thread isn't running. I have done the tests
with the schedutil and ondemand governors (ondemand is the one I'm targeting).
I have no issues when using the performance governor because it always keeps
the highest frequency.


>
>> When start streaming from the sensor the CPU load could remain very low
>> because almost all the capture pipeline is done in hardware (i.e. without
>> using the CPU) and let believe to cpufreq governor that it could use lower
>> frequencies. If the governor decides to use a too low frequency that
>> becomes a problem when we need to acknowledge the interrupt during the
>> blanking time.
>> The delay to ack the interrupt and perform all the other actions before
>> the next frame is very short and doesn't allow to the cpufreq governor to
>> provide the required burst of power. That led to drop the half of the frames.
>>
>> To avoid this problem, DCMI driver informs the cpufreq governors by adding
>> a cpufreq minimum load QoS resquest.
>>
>> Benjamin
>>
>> [1] https://lkml.org/lkml/2020/4/24/360
>>
>> Benjamin Gaignard (3):
>> PM: QoS: Introduce cpufreq minimum load QoS
>> cpufreq: governor: Use minimum load QoS
>> media: stm32-dcmi: Inform cpufreq governors about cpu load needs
>>
>> drivers/cpufreq/cpufreq_governor.c | 5 +
>> drivers/media/platform/stm32/stm32-dcmi.c | 8 ++
>> include/linux/pm_qos.h | 12 ++
>> kernel/power/qos.c | 213 ++++++++++++++++++++++++++++++
>> 4 files changed, 238 insertions(+)

2020-05-27 15:50:46

by Vincent Guittot

Subject: Re: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS

On Wed, 27 May 2020 at 13:17, Benjamin GAIGNARD
<[email protected]> wrote:
>
>
>
> On 5/27/20 12:09 PM, Valentin Schneider wrote:
> > Hi Benjamin,
> >
> > On 26/05/20 16:16, Benjamin Gaignard wrote:
> >> A first round [1] of discussions and suggestions have already be done on
> >> this series but without found a solution to the problem. I resend it to
> >> progress on this topic.
> >>
> > Apologies for sleeping on that previous thread.
> >
> > So what had been suggested over there was to use uclamp to boost the
> > frequency of the handling thread; however if you use threaded IRQs you
> > get RT threads, which already get the max frequency by default (at least
> > with schedutil).
> >
> > Does that not work for you, and if so, why?
> That doesn't work because almost everything is done by the hardware blocks
> without charge the CPU so the thread isn't running. I have done the
> tests with schedutil
> and ondemand scheduler (which is the one I'm targeting). I have no
> issues when using
> performance scheduler because it always keep the highest frequencies.

IMHO, the only way to ensure a min frequency for anything other than a
thread is to use freq_qos_add_request(), just like the cpufreq cooling
device does but for the opposite QoS. This can be applied only to the
frequency domain of the CPU which handles the interrupt.
Have you also checked the wakeup latency of your idle state?

>
>
> >
> >> When start streaming from the sensor the CPU load could remain very low
> >> because almost all the capture pipeline is done in hardware (i.e. without
> >> using the CPU) and let believe to cpufreq governor that it could use lower
> >> frequencies. If the governor decides to use a too low frequency that
> >> becomes a problem when we need to acknowledge the interrupt during the
> >> blanking time.
> >> The delay to ack the interrupt and perform all the other actions before
> >> the next frame is very short and doesn't allow to the cpufreq governor to
> >> provide the required burst of power. That led to drop the half of the frames.
> >>
> >> To avoid this problem, DCMI driver informs the cpufreq governors by adding
> >> a cpufreq minimum load QoS resquest.
> >>
> >> Benjamin
> >>
> >> [1] https://lkml.org/lkml/2020/4/24/360
> >>
> >> Benjamin Gaignard (3):
> >> PM: QoS: Introduce cpufreq minimum load QoS
> >> cpufreq: governor: Use minimum load QoS
> >> media: stm32-dcmi: Inform cpufreq governors about cpu load needs
> >>
> >> drivers/cpufreq/cpufreq_governor.c | 5 +
> >> drivers/media/platform/stm32/stm32-dcmi.c | 8 ++
> >> include/linux/pm_qos.h | 12 ++
> >> kernel/power/qos.c | 213 ++++++++++++++++++++++++++++++
> >> 4 files changed, 238 insertions(+)

2020-05-27 15:50:56

by Valentin Schneider

Subject: Re: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS


On 27/05/20 12:17, Benjamin GAIGNARD wrote:
> On 5/27/20 12:09 PM, Valentin Schneider wrote:
>> Hi Benjamin,
>>
>> On 26/05/20 16:16, Benjamin Gaignard wrote:
>>> A first round [1] of discussions and suggestions have already be done on
>>> this series but without found a solution to the problem. I resend it to
>>> progress on this topic.
>>>
>> Apologies for sleeping on that previous thread.
>>
>> So what had been suggested over there was to use uclamp to boost the
>> frequency of the handling thread; however if you use threaded IRQs you
>> get RT threads, which already get the max frequency by default (at least
>> with schedutil).
>>
>> Does that not work for you, and if so, why?
>
> That doesn't work because almost everything is done by the hardware blocks
> without charge the CPU so the thread isn't running.

I'm not sure I follow; the frequency of the CPU doesn't matter while
your hardware blocks are spinning, right? AIUI what matters is running
your interrupt handler / action at max freq, which you get if you use
threaded IRQs and schedutil.

I think it would help if you could clarify which tasks / parts of your
pipeline you need running at high frequencies. The point is that setting
a QoS request affects all tasks, whereas we could be smarter and only
boost the required tasks.

> I have done the
> tests with schedutil
> and ondemand scheduler (which is the one I'm targeting). I have no
> issues when using
> performance scheduler because it always keep the highest frequencies.
>
>
>>
>>> When start streaming from the sensor the CPU load could remain very low
>>> because almost all the capture pipeline is done in hardware (i.e. without
>>> using the CPU) and let believe to cpufreq governor that it could use lower
>>> frequencies. If the governor decides to use a too low frequency that
>>> becomes a problem when we need to acknowledge the interrupt during the
>>> blanking time.
>>> The delay to ack the interrupt and perform all the other actions before
>>> the next frame is very short and doesn't allow to the cpufreq governor to
>>> provide the required burst of power. That led to drop the half of the frames.
>>>
>>> To avoid this problem, DCMI driver informs the cpufreq governors by adding
>>> a cpufreq minimum load QoS resquest.
>>>
>>> Benjamin
>>>
>>> [1] https://lkml.org/lkml/2020/4/24/360
>>>
>>> Benjamin Gaignard (3):
>>> PM: QoS: Introduce cpufreq minimum load QoS
>>> cpufreq: governor: Use minimum load QoS
>>> media: stm32-dcmi: Inform cpufreq governors about cpu load needs
>>>
>>> drivers/cpufreq/cpufreq_governor.c | 5 +
>>> drivers/media/platform/stm32/stm32-dcmi.c | 8 ++
>>> include/linux/pm_qos.h | 12 ++
>>> kernel/power/qos.c | 213 ++++++++++++++++++++++++++++++
>>> 4 files changed, 238 insertions(+)

2020-05-27 15:53:05

by Benjamin GAIGNARD

Subject: Re: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS



On 5/27/20 2:14 PM, Valentin Schneider wrote:
> On 27/05/20 12:17, Benjamin GAIGNARD wrote:
>> On 5/27/20 12:09 PM, Valentin Schneider wrote:
>>> Hi Benjamin,
>>>
>>> On 26/05/20 16:16, Benjamin Gaignard wrote:
>>>> A first round [1] of discussions and suggestions have already be done on
>>>> this series but without found a solution to the problem. I resend it to
>>>> progress on this topic.
>>>>
>>> Apologies for sleeping on that previous thread.
>>>
>>> So what had been suggested over there was to use uclamp to boost the
>>> frequency of the handling thread; however if you use threaded IRQs you
>>> get RT threads, which already get the max frequency by default (at least
>>> with schedutil).
>>>
>>> Does that not work for you, and if so, why?
>> That doesn't work because almost everything is done by the hardware blocks
>> without charge the CPU so the thread isn't running.
> I'm not sure I follow; the frequency of the CPU doesn't matter while
> your hardware blocks are spinning, right? AIUI what matters is running
> your interrupt handler / action at max freq, which you get if you use
> threaded IRQs and schedutil.
Yes, but not limited to schedutil.
Given the latency needed to change frequencies, I think it could already be
too late to change the CPU frequency when handling the threaded interrupt.
>
> I think it would help if you could clarify which tasks / parts of your
> pipeline you need running at high frequencies. The point is that setting
> a QoS request affects all tasks, whereas we could be smarter and only
> boost the required tasks.
What makes us drop frames is that the threaded IRQ is scheduled too late.
The non-threaded part of the interrupt handler, where we clear the interrupt
flags, is going fine, but the threaded part is not.
>
>> I have done the
>> tests with schedutil
>> and ondemand scheduler (which is the one I'm targeting). I have no
>> issues when using
>> performance scheduler because it always keep the highest frequencies.
>>
>>
>>>> When start streaming from the sensor the CPU load could remain very low
>>>> because almost all the capture pipeline is done in hardware (i.e. without
>>>> using the CPU) and let believe to cpufreq governor that it could use lower
>>>> frequencies. If the governor decides to use a too low frequency that
>>>> becomes a problem when we need to acknowledge the interrupt during the
>>>> blanking time.
>>>> The delay to ack the interrupt and perform all the other actions before
>>>> the next frame is very short and doesn't allow to the cpufreq governor to
>>>> provide the required burst of power. That led to drop the half of the frames.
>>>>
>>>> To avoid this problem, DCMI driver informs the cpufreq governors by adding
>>>> a cpufreq minimum load QoS resquest.
>>>>
>>>> Benjamin
>>>>
>>>> [1] https://lkml.org/lkml/2020/4/24/360
>>>>
>>>> Benjamin Gaignard (3):
>>>> PM: QoS: Introduce cpufreq minimum load QoS
>>>> cpufreq: governor: Use minimum load QoS
>>>> media: stm32-dcmi: Inform cpufreq governors about cpu load needs
>>>>
>>>> drivers/cpufreq/cpufreq_governor.c | 5 +
>>>> drivers/media/platform/stm32/stm32-dcmi.c | 8 ++
>>>> include/linux/pm_qos.h | 12 ++
>>>> kernel/power/qos.c | 213 ++++++++++++++++++++++++++++++
>>>> 4 files changed, 238 insertions(+)

2020-05-27 16:00:32

by Benjamin GAIGNARD

Subject: Re: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS



On 5/27/20 2:48 PM, Benjamin GAIGNARD wrote:
>
>
> On 5/27/20 2:22 PM, Vincent Guittot wrote:
>> On Wed, 27 May 2020 at 13:17, Benjamin GAIGNARD
>> <[email protected]> wrote:
>>>
>>>
>>> On 5/27/20 12:09 PM, Valentin Schneider wrote:
>>>> Hi Benjamin,
>>>>
>>>> On 26/05/20 16:16, Benjamin Gaignard wrote:
>>>>> A first round [1] of discussions and suggestions have already be
>>>>> done on
>>>>> this series but without found a solution to the problem. I resend
>>>>> it to
>>>>> progress on this topic.
>>>>>
>>>> Apologies for sleeping on that previous thread.
>>>>
>>>> So what had been suggested over there was to use uclamp to boost the
>>>> frequency of the handling thread; however if you use threaded IRQs you
>>>> get RT threads, which already get the max frequency by default (at
>>>> least
>>>> with schedutil).
>>>>
>>>> Does that not work for you, and if so, why?
>>> That doesn't work because almost everything is done by the hardware
>>> blocks
>>> without charge the CPU so the thread isn't running. I have done the
>>> tests with schedutil
>>> and ondemand scheduler (which is the one I'm targeting). I have no
>>> issues when using
>>> performance scheduler because it always keep the highest frequencies.
>> IMHO, the only way to ensure a min frequency for anything else than a
>> thread is to use freq_qos_add_request() just like cpufreq cooling
>> device but for the opposite QoS. This can be applied only on the
>> frequency domain of the CPU which handles the interrupt.
> I will give a try with this idea.
> Thanks.

Adding freq_qos_add_request(FREQ_QOS_MIN) when starting to stream frames
solves my problem. I remove the request at the end of streaming to restore
the default value.

Benjamin


>> Have you also checked the wakeup latency of your idle state ?
> It just could go in WFI so latency should be minimal.
>>
>>>
>>>>> When start streaming from the sensor the CPU load could remain
>>>>> very low
>>>>> because almost all the capture pipeline is done in hardware (i.e.
>>>>> without
>>>>> using the CPU) and let believe to cpufreq governor that it could
>>>>> use lower
>>>>> frequencies. If the governor decides to use a too low frequency that
>>>>> becomes a problem when we need to acknowledge the interrupt during
>>>>> the
>>>>> blanking time.
>>>>> The delay to ack the interrupt and perform all the other actions
>>>>> before
>>>>> the next frame is very short and doesn't allow to the cpufreq
>>>>> governor to
>>>>> provide the required burst of power. That led to drop the half of
>>>>> the frames.
>>>>>
>>>>> To avoid this problem, DCMI driver informs the cpufreq governors
>>>>> by adding
>>>>> a cpufreq minimum load QoS resquest.
>>>>>
>>>>> Benjamin
>>>>>
>>>>> [1] https://lkml.org/lkml/2020/4/24/360
>>>>>
>>>>> Benjamin Gaignard (3):
>>>>>     PM: QoS: Introduce cpufreq minimum load QoS
>>>>>     cpufreq: governor: Use minimum load QoS
>>>>>     media: stm32-dcmi: Inform cpufreq governors about cpu load needs
>>>>>
>>>>>    drivers/cpufreq/cpufreq_governor.c        |   5 +
>>>>>    drivers/media/platform/stm32/stm32-dcmi.c |   8 ++
>>>>>    include/linux/pm_qos.h                    |  12 ++
>>>>>    kernel/power/qos.c                        | 213
>>>>> ++++++++++++++++++++++++++++++
>>>>>    4 files changed, 238 insertions(+)
>

2020-05-27 16:03:40

by Rafael J. Wysocki

Subject: Re: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS

On Wed, May 27, 2020 at 4:54 PM Benjamin GAIGNARD
<[email protected]> wrote:
>
>
>
> On 5/27/20 2:48 PM, Benjamin GAIGNARD wrote:
> >
> >
> > On 5/27/20 2:22 PM, Vincent Guittot wrote:
> >> On Wed, 27 May 2020 at 13:17, Benjamin GAIGNARD
> >> <[email protected]> wrote:
> >>>
> >>>
> >>> On 5/27/20 12:09 PM, Valentin Schneider wrote:
> >>>> Hi Benjamin,
> >>>>
> >>>> On 26/05/20 16:16, Benjamin Gaignard wrote:
> >>>>> A first round [1] of discussions and suggestions have already be
> >>>>> done on
> >>>>> this series but without found a solution to the problem. I resend
> >>>>> it to
> >>>>> progress on this topic.
> >>>>>
> >>>> Apologies for sleeping on that previous thread.
> >>>>
> >>>> So what had been suggested over there was to use uclamp to boost the
> >>>> frequency of the handling thread; however if you use threaded IRQs you
> >>>> get RT threads, which already get the max frequency by default (at
> >>>> least
> >>>> with schedutil).
> >>>>
> >>>> Does that not work for you, and if so, why?
> >>> That doesn't work because almost everything is done by the hardware
> >>> blocks
> >>> without charge the CPU so the thread isn't running. I have done the
> >>> tests with schedutil
> >>> and ondemand scheduler (which is the one I'm targeting). I have no
> >>> issues when using
> >>> performance scheduler because it always keep the highest frequencies.
> >> IMHO, the only way to ensure a min frequency for anything else than a
> >> thread is to use freq_qos_add_request() just like cpufreq cooling
> >> device but for the opposite QoS. This can be applied only on the
> >> frequency domain of the CPU which handles the interrupt.
> > I will give a try with this idea.
> > Thanks.
>
> Adding freq_qos_add_request(FREQ_QOS_MIN) when starting streaming frames
> solve my problem. I remove the request at the end of the streaming to
> restore
> the default value.

You may as well add the request once at init time, with the request
value initially set to PM_QOS_MIN_FREQUENCY_DEFAULT_VALUE, and update
it as needed going forward.
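
As a rough user-space model of that add-once, update-later pattern (not
kernel code): every model_* name below is hypothetical,
MODEL_MIN_FREQ_DEFAULT merely stands in for the default value named above,
and the frequencies are made up. The model assumes that the effective
minimum for a frequency domain is the largest of its outstanding minimum
requests.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_REQUESTS	8
#define MODEL_MIN_FREQ_DEFAULT	0	/* placeholder for "no minimum requested" */

/* One minimum-frequency request, owned by whoever registered it. */
struct model_request {
	int value_khz;
};

/* One frequency domain holding the requests registered against it. */
struct model_domain {
	struct model_request *reqs[MODEL_MAX_REQUESTS];
	size_t count;
};

static int model_add_request(struct model_domain *d, struct model_request *r,
			     int value_khz)
{
	assert(d && r);

	if (d->count >= MODEL_MAX_REQUESTS)
		return -1;
	r->value_khz = value_khz;
	d->reqs[d->count++] = r;
	return 0;
}

static void model_update_request(struct model_request *r, int value_khz)
{
	r->value_khz = value_khz;
}

/* Assumed aggregation: the highest requested minimum wins. */
static int model_effective_min(const struct model_domain *d)
{
	int min_khz = MODEL_MIN_FREQ_DEFAULT;
	size_t i;

	for (i = 0; i < d->count; i++)
		if (d->reqs[i]->value_khz > min_khz)
			min_khz = d->reqs[i]->value_khz;
	return min_khz;
}

/* Driver-style hooks: register once, then only update the value. */
static int model_probe(struct model_domain *d, struct model_request *r)
{
	return model_add_request(d, r, MODEL_MIN_FREQ_DEFAULT);
}

static void model_start_streaming(struct model_request *r, int needed_khz)
{
	model_update_request(r, needed_khz);
}

static void model_stop_streaming(struct model_request *r)
{
	model_update_request(r, MODEL_MIN_FREQ_DEFAULT);
}
```

In this model the request is registered once at probe time with the default
value, so it constrains nothing until streaming starts. Start and stop then
only change the value, and other requesters on the same domain keep their
own minimum when the camera drops its request.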

2020-05-27 17:46:26

by Benjamin GAIGNARD

Subject: Re: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS



On 5/27/20 2:22 PM, Vincent Guittot wrote:
> On Wed, 27 May 2020 at 13:17, Benjamin GAIGNARD
> <[email protected]> wrote:
>>
>>
>> On 5/27/20 12:09 PM, Valentin Schneider wrote:
>>> Hi Benjamin,
>>>
>>> On 26/05/20 16:16, Benjamin Gaignard wrote:
>>>> A first round [1] of discussions and suggestions have already be done on
>>>> this series but without found a solution to the problem. I resend it to
>>>> progress on this topic.
>>>>
>>> Apologies for sleeping on that previous thread.
>>>
>>> So what had been suggested over there was to use uclamp to boost the
>>> frequency of the handling thread; however if you use threaded IRQs you
>>> get RT threads, which already get the max frequency by default (at least
>>> with schedutil).
>>>
>>> Does that not work for you, and if so, why?
>> That doesn't work because almost everything is done by the hardware blocks
>> without charge the CPU so the thread isn't running. I have done the
>> tests with schedutil
>> and ondemand scheduler (which is the one I'm targeting). I have no
>> issues when using
>> performance scheduler because it always keep the highest frequencies.
> IMHO, the only way to ensure a min frequency for anything else than a
> thread is to use freq_qos_add_request() just like cpufreq cooling
> device but for the opposite QoS. This can be applied only on the
> frequency domain of the CPU which handles the interrupt.
I will give this idea a try.
Thanks.
> Have you also checked the wakeup latency of your idle state ?
It can only go into WFI, so latency should be minimal.
>
>>
>>>> When start streaming from the sensor the CPU load could remain very low
>>>> because almost all the capture pipeline is done in hardware (i.e. without
>>>> using the CPU) and let believe to cpufreq governor that it could use lower
>>>> frequencies. If the governor decides to use a too low frequency that
>>>> becomes a problem when we need to acknowledge the interrupt during the
>>>> blanking time.
>>>> The delay to ack the interrupt and perform all the other actions before
>>>> the next frame is very short and doesn't allow to the cpufreq governor to
>>>> provide the required burst of power. That led to drop the half of the frames.
>>>>
>>>> To avoid this problem, DCMI driver informs the cpufreq governors by adding
>>>> a cpufreq minimum load QoS resquest.
>>>>
>>>> Benjamin
>>>>
>>>> [1] https://lkml.org/lkml/2020/4/24/360
>>>>
>>>> Benjamin Gaignard (3):
>>>> PM: QoS: Introduce cpufreq minimum load QoS
>>>> cpufreq: governor: Use minimum load QoS
>>>> media: stm32-dcmi: Inform cpufreq governors about cpu load needs
>>>>
>>>> drivers/cpufreq/cpufreq_governor.c | 5 +
>>>> drivers/media/platform/stm32/stm32-dcmi.c | 8 ++
>>>> include/linux/pm_qos.h | 12 ++
>>>> kernel/power/qos.c | 213 ++++++++++++++++++++++++++++++
>>>> 4 files changed, 238 insertions(+)

2020-05-27 18:48:33

by Valentin Schneider

Subject: Re: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS


On 27/05/20 14:11, Benjamin GAIGNARD wrote:
> On 5/27/20 2:14 PM, Valentin Schneider wrote:
>> On 27/05/20 12:17, Benjamin GAIGNARD wrote:
>>> On 5/27/20 12:09 PM, Valentin Schneider wrote:
>>>> Hi Benjamin,
>>>>
>>>> On 26/05/20 16:16, Benjamin Gaignard wrote:
>>>>> A first round [1] of discussions and suggestions have already be done on
>>>>> this series but without found a solution to the problem. I resend it to
>>>>> progress on this topic.
>>>>>
>>>> Apologies for sleeping on that previous thread.
>>>>
>>>> So what had been suggested over there was to use uclamp to boost the
>>>> frequency of the handling thread; however if you use threaded IRQs you
>>>> get RT threads, which already get the max frequency by default (at least
>>>> with schedutil).
>>>>
>>>> Does that not work for you, and if so, why?
>>> That doesn't work because almost everything is done by the hardware blocks
>>> without charge the CPU so the thread isn't running.
>> I'm not sure I follow; the frequency of the CPU doesn't matter while
>> your hardware blocks are spinning, right? AIUI what matters is running
>> your interrupt handler / action at max freq, which you get if you use
>> threaded IRQs and schedutil.
> Yes but not limited to schedutil.
> Given the latency needed to change of frequencies I think it could
> already too late
> to change the CPU frequency when handling the threaded interrupt.

Right, on my Juno the transition latency (i.e. worst case) is about
1.2ms; I can see that eating into your time budget, depending on the
framerate you're going for.
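
To put a rough number on that budget, here is a small arithmetic sketch.
Only the 1.2ms figure comes from the paragraph above. The frame rates and
window lengths fed to these helpers are assumptions chosen for illustration,
and the helper names are hypothetical.

```c
#include <assert.h>

/* Frame period in microseconds for a given frame rate (integer division). */
static unsigned int frame_period_us(unsigned int fps)
{
	assert(fps > 0);
	return 1000000u / fps;
}

/* Share of one frame period taken by a frequency transition, in per mille. */
static unsigned int latency_share_permille(unsigned int latency_us,
					   unsigned int fps)
{
	return latency_us * 1000u / frame_period_us(fps);
}

/* Non-zero if the transition completes within the available window. */
static int fits_in_window(unsigned int latency_us, unsigned int window_us)
{
	return latency_us <= window_us;
}
```

At an assumed 30 frames per second, a 1200us transition takes about 3.6% of
the whole frame period, and about 7.2% at 60 frames per second. The window
that matters here is the blanking time, which is only part of the frame
period, so the real margin is tighter. A hypothetical 1000us window would
already be shorter than the transition itself.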

Vincent's got a point, if you can limit that max-freq-hold to a single
frequency domain, that would probably be a tad better.

Thanks for persisting through my questioning :-)

>>
>> I think it would help if you could clarify which tasks / parts of your
>> pipeline you need running at high frequencies. The point is that setting
>> a QoS request affects all tasks, whereas we could be smarter and only
>> boost the required tasks.
> What make us drop frames is that the threaded IRQ is scheduled too late.
> The not thread part of the interrupt handler where we clear the
> interrupt flags
> is going fine but the thread part not.
>>
>>> I have done the
>>> tests with schedutil
>>> and ondemand scheduler (which is the one I'm targeting). I have no
>>> issues when using
>>> performance scheduler because it always keep the highest frequencies.
>>>
>>>
>>>>> When start streaming from the sensor the CPU load could remain very low
>>>>> because almost all the capture pipeline is done in hardware (i.e. without
>>>>> using the CPU) and let believe to cpufreq governor that it could use lower
>>>>> frequencies. If the governor decides to use a too low frequency that
>>>>> becomes a problem when we need to acknowledge the interrupt during the
>>>>> blanking time.
>>>>> The delay to ack the interrupt and perform all the other actions before
>>>>> the next frame is very short and doesn't allow to the cpufreq governor to
>>>>> provide the required burst of power. That led to drop the half of the frames.
>>>>>
>>>>> To avoid this problem, DCMI driver informs the cpufreq governors by adding
>>>>> a cpufreq minimum load QoS resquest.
>>>>>
>>>>> Benjamin
>>>>>
>>>>> [1] https://lkml.org/lkml/2020/4/24/360
>>>>>
>>>>> Benjamin Gaignard (3):
>>>>> PM: QoS: Introduce cpufreq minimum load QoS
>>>>> cpufreq: governor: Use minimum load QoS
>>>>> media: stm32-dcmi: Inform cpufreq governors about cpu load needs
>>>>>
>>>>> drivers/cpufreq/cpufreq_governor.c | 5 +
>>>>> drivers/media/platform/stm32/stm32-dcmi.c | 8 ++
>>>>> include/linux/pm_qos.h | 12 ++
>>>>> kernel/power/qos.c | 213 ++++++++++++++++++++++++++++++
>>>>> 4 files changed, 238 insertions(+)