2023-12-15 13:05:33

by Anna-Maria Behnsen

Subject: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

When there is no cpuidle driver, the system tries to stop the tick even if
the system is fully loaded. But stopping the tick is not free and it
decreases performance on a fully loaded system. As there is no (cpuidle)
framework which brings the CPU into a power saving state when nothing needs
to be done, there is also no power saving benefit when stopping the tick.

Therefore do not stop the tick when there is no cpuidle driver.

Signed-off-by: Anna-Maria Behnsen <[email protected]>
---
kernel/sched/idle.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 565f8374ddbb..fd111686aaf3 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -165,8 +165,6 @@ static void cpuidle_idle_call(void)
 	 */
 
 	if (cpuidle_not_available(drv, dev)) {
-		tick_nohz_idle_stop_tick();
-
 		default_idle_call();
 		goto exit_idle;
 	}
--
2.39.2



2023-12-21 15:30:02

by Pierre Gondois

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

Hello Anna-Maria,

On 12/15/23 14:05, Anna-Maria Behnsen wrote:
> When there is no cpuidle driver, the system tries to stop the tick even if
> the system is fully loaded. But stopping the tick is not for free and it
> decreases performance on a fully loaded system. As there is no (cpuidle)
> framework which brings CPU in a power saving state when nothing needs to be
> done, there is also no power saving benefit when stopping the tick.

Just in case it wasn't taken into consideration:
-
Stopping the tick isn't free on a busy system, but it should also cost
something to regularly handle ticks on each CPU of an idle system.

FWIU, stopping the tick also adds the CPU to the 'nohz.idle_cpus_mask'
mask, which helps the idle load balancer pick an idle CPU to do load
balancing for all the idle CPUs (cf. kick_ilb()).

It seems better to do one periodic balancing for all the idle CPUs rather
than periodically waking-up all CPUs to try to balance.

-
I would have assumed that if the system was fully loaded, ticks would
not be stopped, or maybe I misunderstood the case.
I assume the wake-up latency would be improved if the tick doesn't
have to be set up again.

Regards,
Pierre

>
> Therefore do not stop the tick when there is no cpuidle driver.
>
> Signed-off-by: Anna-Maria Behnsen <[email protected]>
> ---
> kernel/sched/idle.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 565f8374ddbb..fd111686aaf3 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -165,8 +165,6 @@ static void cpuidle_idle_call(void)
> */
>
> if (cpuidle_not_available(drv, dev)) {
> - tick_nohz_idle_stop_tick();
> -
> default_idle_call();
> goto exit_idle;
> }

2024-01-09 16:25:09

by Anna-Maria Behnsen

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

Hello Pierre,

Pierre Gondois <[email protected]> writes:

> Hello Anna-Maria,
>
> On 12/15/23 14:05, Anna-Maria Behnsen wrote:
>> When there is no cpuidle driver, the system tries to stop the tick even if
>> the system is fully loaded. But stopping the tick is not for free and it
>> decreases performance on a fully loaded system. As there is no (cpuidle)
>> framework which brings CPU in a power saving state when nothing needs to be
>> done, there is also no power saving benefit when stopping the tick.
>
> Just in case is wasn't taken into consideration:
> -
> Stopping the tick isn't free on a busy system, but it should also cost
> something to regularly handle ticks on each CPU of an idle system.
>
> FWIU, disabling the ticks also allows to add a CPU to the 'nohz.idle_cpus_mask'
> mask, which helps the idle load balancer picking an idle CPU to do load
> balancing for all the idle CPUs (cf. kick_ilb()).
>
> It seems better to do one periodic balancing for all the idle CPUs rather
> than periodically waking-up all CPUs to try to balance.
>
> -
> I would have assumed that if the system was fully loaded, ticks would
> not be stopped, or maybe I misunderstood the case.
> I assume the wake-up latency would be improved if the tick doesn't
> have to be re-setup again.
>

Your answer confuses me a little...

When there is a cpuidle driver, trying to stop the tick is not done
unconditionally. It is only done when the CPU is in a state where it
could go into a deeper C-state - this is decided by the cpuidle
driver/governor.

When there is no cpuidle driver, there is no instance which could bring
the CPU into a deeper C state. But at the moment the code does
unconditionally try to stop the tick. So the aim of the patch is to
remove this unconditional stop of the tick.

And NOHZ is independent of the cpuidle infrastructure. But when there is
no cpuidle driver, it doesn't make sense to use NOHZ either.
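
The two cases described above can be put side by side in a small
user-space sketch. This is only a toy model, not kernel code: the struct
and both function names are made up for illustration, and the real
decision in cpuidle_idle_call() involves more state than two booleans.

```c
#include <stdbool.h>

/* Hypothetical toy model of one idle entry; these names are not kernel symbols. */
struct idle_entry {
	bool have_cpuidle_driver;	/* models !cpuidle_not_available(drv, dev) */
	bool governor_wants_stop;	/* models the governor's stop-tick decision */
};

/* Current behaviour: without a driver, stopping the tick is always attempted. */
static bool tries_tick_stop_before(const struct idle_entry *e)
{
	if (!e->have_cpuidle_driver)
		return true;
	return e->governor_wants_stop;
}

/* Behaviour with the patch: without a driver, the tick is left alone. */
static bool tries_tick_stop_after(const struct idle_entry *e)
{
	if (!e->have_cpuidle_driver)
		return false;
	return e->governor_wants_stop;
}
```

In this model the two functions differ only when have_cpuidle_driver is
false, which is the one case the patch touches.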

Thanks,

Anna-Maria




2024-01-10 10:20:42

by Pierre Gondois

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

Hello Anna-Maria,

On 1/9/24 17:24, Anna-Maria Behnsen wrote:
> Hello Pierre,
>
> Pierre Gondois <[email protected]> writes:
>
>> Hello Anna-Maria,
>>
>> On 12/15/23 14:05, Anna-Maria Behnsen wrote:
>>> When there is no cpuidle driver, the system tries to stop the tick even if
>>> the system is fully loaded. But stopping the tick is not for free and it
>>> decreases performance on a fully loaded system. As there is no (cpuidle)
>>> framework which brings CPU in a power saving state when nothing needs to be
>>> done, there is also no power saving benefit when stopping the tick.
>>
>> Just in case is wasn't taken into consideration:
>> -
>> Stopping the tick isn't free on a busy system, but it should also cost
>> something to regularly handle ticks on each CPU of an idle system.
>>
>> FWIU, disabling the ticks also allows to add a CPU to the 'nohz.idle_cpus_mask'
>> mask, which helps the idle load balancer picking an idle CPU to do load
>> balancing for all the idle CPUs (cf. kick_ilb()).
>>
>> It seems better to do one periodic balancing for all the idle CPUs rather
>> than periodically waking-up all CPUs to try to balance.
>>
>> -
>> I would have assumed that if the system was fully loaded, ticks would
>> not be stopped, or maybe I misunderstood the case.
>> I assume the wake-up latency would be improved if the tick doesn't
>> have to be re-setup again.
>>
>
> Your answer confuses me a little...
>
> When there is a cpuidle driver, trying to stop the tick is not done
> unconditionally. It is only done when the CPU is in a state that it
> could go into a deeper C sleep - this is decided by cpuidle
> driver/governor.

Yes right.

>
> When there is no cpuidle driver, there is no instance which could bring
> the CPU into a deeper C state. But at the moment the code does
> unconditionally try to stop the tick. So the aim of the patch is to
> remove this unconditional stop of the tick.

I agree that the absence of a cpuidle driver prevents the CPU from reaching
deep idle states. FWIU, there are however still benefits in stopping the tick
on such a platform.
-
I agree that bringing the tick up/down costs something and that removing
tick_nohz_idle_stop_tick() can improve performance, but I assumed stopping
the tick had some benefit regarding energy consumption.
Keeping the tick forever on an idle CPU should not be useful.
-
About nohz.idle_cpus_mask, I was referring to the following path:
do_idle()
\-cpuidle_idle_call()
  \-tick_nohz_idle_stop_tick()
    \-nohz_balance_enter_idle()
      \-cpumask_set_cpu(cpu, nohz.idle_cpus_mask);
      \-atomic_inc(&nohz.nr_cpus);

Removing tick_nohz_idle_stop_tick() also means not using nohz.idle_cpus_mask
and the logic around it to find an idle CPU to balance tasks.
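
To make the bookkeeping in the path above more concrete, here is a rough
user-space sketch. It is a toy stand-in only: struct nohz_model and both
helpers are invented for illustration, the mask is a plain 64-bit word
instead of a cpumask, and the real idle load balancer applies more
conditions when picking a CPU.

```c
#include <stdint.h>

/* Toy stand-in for the scheduler's nohz bookkeeping; not the kernel's structure. */
struct nohz_model {
	uint64_t idle_cpus_mask;	/* bit n set: CPU n stopped its tick while idle */
	int nr_cpus;			/* number of bits set in idle_cpus_mask */
};

/* Models nohz_balance_enter_idle(): record the CPU once. cpu must be 0..63. */
static void model_enter_idle(struct nohz_model *nohz, int cpu)
{
	uint64_t bit = UINT64_C(1) << cpu;

	if (nohz->idle_cpus_mask & bit)
		return;
	nohz->idle_cpus_mask |= bit;
	nohz->nr_cpus++;
}

/* Pick one recorded idle CPU to balance for the others; -1 if none is recorded. */
static int model_pick_ilb(const struct nohz_model *nohz)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (nohz->idle_cpus_mask & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}
```

If model_enter_idle() is never called, as would be the case for the
no-driver path once the tick stop is removed, model_pick_ilb() has no
candidate to return.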

Hope the re-phrasing makes the 2 points a bit clearer,
Regards,
Pierre


>
> And NOHZ is independant on the cpuidle infrastructure. But when there is
> no cpuidle driver, it doesn't makes sense to use then also NOHZ.
>
> Thanks,
>
> Anna-Maria
>
>
>

2024-01-12 10:56:59

by Anna-Maria Behnsen

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

Pierre Gondois <[email protected]> writes:

> Hello Anna-Maria,
>
> On 1/9/24 17:24, Anna-Maria Behnsen wrote:
>>
>> When there is no cpuidle driver, there is no instance which could bring
>> the CPU into a deeper C state. But at the moment the code does
>> unconditionally try to stop the tick. So the aim of the patch is to
>> remove this unconditional stop of the tick.
>
> I agree that the absence of cpuidle driver prevents from reaching deep
> idle states. FWIU, there is however still benefits in stopping the tick
> on such platform.

What's the benefit?

Thanks,

Anna-Maria


2024-01-12 13:40:17

by Pierre Gondois

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

Hello Anna-Maria,

On 1/12/24 11:56, Anna-Maria Behnsen wrote:
> Pierre Gondois <[email protected]> writes:
>
>> Hello Anna-Maria,
>>
>> On 1/9/24 17:24, Anna-Maria Behnsen wrote:
>>>
>>> When there is no cpuidle driver, there is no instance which could bring
>>> the CPU into a deeper C state. But at the moment the code does
>>> unconditionally try to stop the tick. So the aim of the patch is to
>>> remove this unconditional stop of the tick.
>>
>> I agree that the absence of cpuidle driver prevents from reaching deep
>> idle states. FWIU, there is however still benefits in stopping the tick
>> on such platform.
>
> What's the benefit?

I did the following test:
- on an arm64 Juno-r2 platform (2 big A-72 and 4 little A-53 CPUs)
- booting with 'cpuidle.off=1'
- using the energy counters of the platform
(the counters measure energy for the whole cluster of big/little CPUs)
- letting the platform idle for 10s

Without patch:
| | big-CPUs | little-CPUs |
|:------|-------------:|------------:|
| count | 10 | 10 |
| mean | 0.353266 | 0.33399 |
| std | 0.000254574 | 0.00206803 |
| min | 0.352991 | 0.332145 |
| 25% | 0.353039 | 0.332506 |
| 50% | 0.353267 | 0.333089 |
| 75% | 0.353412 | 0.335231 |
| max | 0.353737 | 0.337964 |

With patch:
| | big-CPUs | little-CPUs |
|:------|-------------:|-------------:|
| count | 10 | 10 |
| mean | 0.375086 | 0.352451 |
| std | 0.000299919 | 0.000752727 |
| min | 0.374527 | 0.351743 |
| 25% | 0.374872 | 0.35181 |
| 50% | 0.37512 | 0.352063 |
| 75% | 0.375335 | 0.353256 |
| max | 0.375485 | 0.353461 |

So the energy consumption would be up:
- ~6% for the big CPUs
- ~10% for the little CPUs
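
For reference, a relative increase can be derived from the mean rows of
the two tables with a helper like the one below. The function name is
made up for this sketch, and it only uses the means, not the per-run
samples.

```c
/* Relative increase, in percent, when going from `before` to `after`. */
static double pct_increase(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Applied to the means above, pct_increase(0.353266, 0.375086) gives
roughly 6.2 for the big cluster and pct_increase(0.33399, 0.352451)
gives roughly 5.5 for the little cluster.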

Regards,
Pierre


>
> Thanks,
>
> Anna-Maria
>

2024-01-12 14:53:18

by Thomas Gleixner

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

On Fri, Jan 12 2024 at 14:39, Pierre Gondois wrote:
> On 1/12/24 11:56, Anna-Maria Behnsen wrote:
>> Pierre Gondois <[email protected]> writes:
>>> I agree that the absence of cpuidle driver prevents from reaching deep
>>> idle states. FWIU, there is however still benefits in stopping the tick
>>> on such platform.
>>
>> What's the benefit?
>
> I did the following test:
> - on an arm64 Juno-r2 platform (2 big A-72 and 4 little A-53 CPUs)
> - booting with 'cpuidle.off=1'
> - using the energy counters of the platforms
> (the counters measure energy for the whole cluster of big/little CPUs)
> - letting the platform idling during 10s
>
> So the energy consumption would be up:
> - ~6% for the big CPUs
> - ~10% for the litte CPUs

Fair enough, but what's the actual usecase?

NOHZ w/o cpuidle driver seems a rather academic exercise to me.

Thanks,

tglx

2024-01-15 12:41:06

by Pierre Gondois

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

Hello Thomas,

On 1/12/24 15:52, Thomas Gleixner wrote:
> On Fri, Jan 12 2024 at 14:39, Pierre Gondois wrote:
>> On 1/12/24 11:56, Anna-Maria Behnsen wrote:
>>> Pierre Gondois <[email protected]> writes:
>>>> I agree that the absence of cpuidle driver prevents from reaching deep
>>>> idle states. FWIU, there is however still benefits in stopping the tick
>>>> on such platform.
>>>
>>> What's the benefit?
>>
>> I did the following test:
>> - on an arm64 Juno-r2 platform (2 big A-72 and 4 little A-53 CPUs)
>> - booting with 'cpuidle.off=1'
>> - using the energy counters of the platforms
>> (the counters measure energy for the whole cluster of big/little CPUs)
>> - letting the platform idling during 10s
>>
>> So the energy consumption would be up:
>> - ~6% for the big CPUs
>> - ~10% for the litte CPUs
>
> Fair enough, but what's the actual usecase?
>
> NOHZ w/o cpuidle driver seems a rather academic exercise to me.

I thought Anna-Maria had a use-case for this.
I just wanted to point out that this patch could potentially
increase the energy consumption for her use-case, nothing more,

Regards,
Pierre

>
> Thanks,
>
> tglx

2024-01-15 13:14:07

by Anna-Maria Behnsen

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

Pierre Gondois <[email protected]> writes:

> Hello Thomas,
>
> On 1/12/24 15:52, Thomas Gleixner wrote:
>> On Fri, Jan 12 2024 at 14:39, Pierre Gondois wrote:
>>> On 1/12/24 11:56, Anna-Maria Behnsen wrote:
>>>> Pierre Gondois <[email protected]> writes:
>>>>> I agree that the absence of cpuidle driver prevents from reaching deep
>>>>> idle states. FWIU, there is however still benefits in stopping the tick
>>>>> on such platform.
>>>>
>>>> What's the benefit?
>>>
>>> I did the following test:
>>> - on an arm64 Juno-r2 platform (2 big A-72 and 4 little A-53 CPUs)
>>> - booting with 'cpuidle.off=1'
>>> - using the energy counters of the platforms
>>> (the counters measure energy for the whole cluster of big/little CPUs)
>>> - letting the platform idling during 10s
>>>
>>> So the energy consumption would be up:
>>> - ~6% for the big CPUs
>>> - ~10% for the litte CPUs
>>
>> Fair enough, but what's the actual usecase?
>>
>> NOHZ w/o cpuidle driver seems a rather academic exercise to me.
>
> I thought Anna-Maria had a use-case for this.
> I just wanted to point out that this patch could potentially
> increase the energy consumption for her use-case, nothing more,
>

I saw tons of calls trying to stop the tick on a loaded system - which
decreased performance. Deep sleep states were disabled (by accident) in
the BIOS but NOHZ was enabled. So my proposal is to remove this
unconditional call trying to stop the tick.

Thanks,

Anna-Maria

2024-01-15 13:29:46

by Vincent Guittot

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

On Mon, 15 Jan 2024 at 13:40, Pierre Gondois <[email protected]> wrote:
>
> Hello Thomas,
>
> On 1/12/24 15:52, Thomas Gleixner wrote:
> > On Fri, Jan 12 2024 at 14:39, Pierre Gondois wrote:
> >> On 1/12/24 11:56, Anna-Maria Behnsen wrote:
> >>> Pierre Gondois <[email protected]> writes:
> >>>> I agree that the absence of cpuidle driver prevents from reaching deep
> >>>> idle states. FWIU, there is however still benefits in stopping the tick
> >>>> on such platform.
> >>>
> >>> What's the benefit?
> >>
> >> I did the following test:
> >> - on an arm64 Juno-r2 platform (2 big A-72 and 4 little A-53 CPUs)
> >> - booting with 'cpuidle.off=1'
> >> - using the energy counters of the platforms
> >> (the counters measure energy for the whole cluster of big/little CPUs)
> >> - letting the platform idling during 10s
> >>
> >> So the energy consumption would be up:
> >> - ~6% for the big CPUs
> >> - ~10% for the litte CPUs
> >
> > Fair enough, but what's the actual usecase?
> >
> > NOHZ w/o cpuidle driver seems a rather academic exercise to me.

Don't know if it's really a valid use case, but can't we have VMs in
such a configuration?
NOHZ enabled and no cpuidle driver, as the VM doesn't manage HW anyway?

>
> I thought Anna-Maria had a use-case for this.
> I just wanted to point out that this patch could potentially
> increase the energy consumption for her use-case, nothing more,
>
> Regards,
> Pierre
>
> >
> > Thanks,
> >
> > tglx

2024-01-15 15:42:54

by Pierre Gondois

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver



On 1/15/24 14:29, Vincent Guittot wrote:
> On Mon, 15 Jan 2024 at 13:40, Pierre Gondois <[email protected]> wrote:
>>
>> Hello Thomas,
>>
>> On 1/12/24 15:52, Thomas Gleixner wrote:
>>> On Fri, Jan 12 2024 at 14:39, Pierre Gondois wrote:
>>>> On 1/12/24 11:56, Anna-Maria Behnsen wrote:
>>>>> Pierre Gondois <[email protected]> writes:
>>>>>> I agree that the absence of cpuidle driver prevents from reaching deep
>>>>>> idle states. FWIU, there is however still benefits in stopping the tick
>>>>>> on such platform.
>>>>>
>>>>> What's the benefit?
>>>>
>>>> I did the following test:
>>>> - on an arm64 Juno-r2 platform (2 big A-72 and 4 little A-53 CPUs)
>>>> - booting with 'cpuidle.off=1'
>>>> - using the energy counters of the platforms
>>>> (the counters measure energy for the whole cluster of big/little CPUs)
>>>> - letting the platform idling during 10s
>>>>
>>>> So the energy consumption would be up:
>>>> - ~6% for the big CPUs
>>>> - ~10% for the litte CPUs
>>>
>>> Fair enough, but what's the actual usecase?
>>>
>>> NOHZ w/o cpuidle driver seems a rather academic exercise to me.
>
> Don't know if it's really a valid use case but can't we have VMs in
> such a configuration ?
> NOHZ enabled and no cpuidle driver as VM doesn't manage HW anyway ?

Yes right,
I tried with a kvmtool generated VM and it seemed to be the case:

$ grep . /sys/devices/system/cpu/cpuidle/*
/sys/devices/system/cpu/cpuidle/available_governors:menu
/sys/devices/system/cpu/cpuidle/current_driver:none
/sys/devices/system/cpu/cpuidle/current_governor:menu
/sys/devices/system/cpu/cpuidle/current_governor_ro:menu
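
The same check could be done from a program. The helper below is a
minimal sketch: its name and return convention are made up here, and it
assumes the file holds a single line that reads "none" when no driver is
registered, as in the output above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if the file at `path` names a driver, 0 if it reads "none",
 * and -1 if the file cannot be opened or read.
 */
static int cpuidle_driver_present(const char *path)
{
	char buf[64];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';	/* drop the trailing newline, if any */
	return strcmp(buf, "none") != 0;
}
```

On the VM above, calling it with
"/sys/devices/system/cpu/cpuidle/current_driver" would be expected to
return 0.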


>
>>
>> I thought Anna-Maria had a use-case for this.
>> I just wanted to point out that this patch could potentially
>> increase the energy consumption for her use-case, nothing more,
>>
>> Regards,
>> Pierre
>>
>>>
>>> Thanks,
>>>
>>> tglx

2024-01-22 10:26:46

by Anna-Maria Behnsen

Subject: Re: [PATCH] sched/idle: Prevent stopping the tick when there is no cpuidle driver

Hi,

Pierre Gondois <[email protected]> writes:
> On 1/15/24 14:29, Vincent Guittot wrote:
>> On Mon, 15 Jan 2024 at 13:40, Pierre Gondois <[email protected]> wrote:
>>>
>>> Hello Thomas,
>>>
>>> On 1/12/24 15:52, Thomas Gleixner wrote:
>>>> On Fri, Jan 12 2024 at 14:39, Pierre Gondois wrote:
>>>>> On 1/12/24 11:56, Anna-Maria Behnsen wrote:
>>>>>> Pierre Gondois <[email protected]> writes:
>>>>>>> I agree that the absence of cpuidle driver prevents from reaching deep
>>>>>>> idle states. FWIU, there is however still benefits in stopping the tick
>>>>>>> on such platform.
>>>>>>
>>>>>> What's the benefit?
>>>>>
>>>>> I did the following test:
>>>>> - on an arm64 Juno-r2 platform (2 big A-72 and 4 little A-53 CPUs)
>>>>> - booting with 'cpuidle.off=1'
>>>>> - using the energy counters of the platforms
>>>>> (the counters measure energy for the whole cluster of big/little CPUs)
>>>>> - letting the platform idling during 10s
>>>>>
>>>>> So the energy consumption would be up:
>>>>> - ~6% for the big CPUs
>>>>> - ~10% for the litte CPUs
>>>>
>>>> Fair enough, but what's the actual usecase?
>>>>
>>>> NOHZ w/o cpuidle driver seems a rather academic exercise to me.
>>
>> Don't know if it's really a valid use case but can't we have VMs in
>> such a configuration ?
>> NOHZ enabled and no cpuidle driver as VM doesn't manage HW anyway ?
>
> Yes right,
> I tried with a kvmtool generated VM and it seemed to be the case:
>
> $ grep . /sys/devices/system/cpu/cpuidle/*
> /sys/devices/system/cpu/cpuidle/available_governors:menu
> /sys/devices/system/cpu/cpuidle/current_driver:none
> /sys/devices/system/cpu/cpuidle/current_governor:menu
> /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
>

So it's not on me to decide whether it is valid to skip stopping the
tick in this setting or not. I observed this unconditional call (which
is not free) on a fully loaded system, where it decreases performance.

If there is a reasonable condition that could be added for stopping the
tick, this might also be a good solution or even a better solution. But
only checking whether a cpuidle driver is available or not and then
unconditionally stopping the tick doesn't make sense IMHO.

Thanks,

Anna-Maria