2021-06-02 18:21:36

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

Hi All,

This series of patches addresses some theoretical shortcoming in the
TEO (Timer Events Oriented) cpuidle governor by reworking its idle
state selection logic to some extent.

Patches [1-2/5] are introductory cleanups and the substantial changes are
made in patches [3-4/5] (please refer to the changelogs of these two
patches for details). The last patch only deals with documentation.

Even though this work is mostly based on theoretical considerations, it
shows a measurable reduction of the number of cases in which the shallowest
idle state is selected while it would be more beneficial to select a deeper
one or the deepest idle state is selected while it would be more beneficial to
select a shallower one, which should be a noticeable improvement.

Thanks!




2021-06-02 18:23:13

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v1 1/5] cpuidle: teo: Cosmetic modifications of teo_update()

From: "Rafael J. Wysocki" <[email protected]>

Rename a local variable in teo_update() so that its purpose is better
reflected by its name and use one more local variable in the loop
over the CPU idle states in that function to make the code somewhat
easier to read.

No functional impact.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/cpuidle/governors/teo.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/cpuidle/governors/teo.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/teo.c
+++ linux-pm/drivers/cpuidle/governors/teo.c
@@ -117,7 +117,7 @@ static DEFINE_PER_CPU(struct teo_cpu, te
static void teo_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
{
struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu);
- int i, idx_hit = 0, idx_timer = 0;
+ int i, idx_timer = 0, idx_duration = 0;
unsigned int hits, misses;
u64 measured_ns;

@@ -156,14 +156,15 @@ static void teo_update(struct cpuidle_dr
* states matching the sleep length and the measured idle duration.
*/
for (i = 0; i < drv->state_count; i++) {
+ s64 target_residency_ns = drv->states[i].target_residency_ns;
unsigned int early_hits = cpu_data->states[i].early_hits;

cpu_data->states[i].early_hits -= early_hits >> DECAY_SHIFT;

- if (drv->states[i].target_residency_ns <= cpu_data->sleep_length_ns) {
+ if (target_residency_ns <= cpu_data->sleep_length_ns) {
idx_timer = i;
- if (drv->states[i].target_residency_ns <= measured_ns)
- idx_hit = i;
+ if (target_residency_ns <= measured_ns)
+ idx_duration = i;
}
}

@@ -181,11 +182,11 @@ static void teo_update(struct cpuidle_dr
misses = cpu_data->states[idx_timer].misses;
misses -= misses >> DECAY_SHIFT;

- if (idx_timer == idx_hit) {
+ if (idx_timer == idx_duration) {
hits += PULSE;
} else {
misses += PULSE;
- cpu_data->states[idx_hit].early_hits += PULSE;
+ cpu_data->states[idx_duration].early_hits += PULSE;
}

cpu_data->states[idx_timer].misses = misses;



2021-07-04 21:01:54

by Doug Smythies

[permalink] [raw]
Subject: RE: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

Hi Rafael,

On 2021.06.02 11:14 Rafael J. Wysocki wrote:

> Hi All,
>
> This series of patches addresses some theoretical shortcoming in the
> TEO (Timer Events Oriented) cpuidle governor by reworking its idle
> state selection logic to some extent.
>
> Patches [1-2/5] are introductory cleanups and the substantial changes are
> made in patches [3-4/5] (please refer to the changelogs of these two
> patches for details). The last patch only deals with documentation.
>
> Even though this work is mostly based on theoretical considerations, it
> shows a measurable reduction of the number of cases in which the
shallowest
> idle state is selected while it would be more beneficial to select a
deeper
> one or the deepest idle state is selected while it would be more
beneficial to
> select a shallower one, which should be a noticeable improvement.

Do you have any test results to share? Or test methods that I can try?
I have done a few tests, and generally don't notice much difference.
Perhaps an increase in idle state 2 below (was to shallow) numbers.
I am searching for some results that would offset the below:

The difficulty I am having with this patch set is the additional overhead
which becomes significant at the extremes, where idle state 0 is dominant.
Throughout the history of teo, I have used multiple one core pipe-tests
for this particular test. Some results:

CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
HWP: disabled
CPU frequency scaling driver: intel_pstate, active, powersave
Pipe-tests are run forever, printing average loop time for the
Last 2.5 million loops. 1021 of those are again averaged.
Total = 2.5525e9 loops
The power and idle data is sampled for 100 minutes.

Note 1: other tests were also done and also with passive,
schedutil, but it isn't relevant for this test because the
CPU frequency stays pinned at maximum.

Note 2: I use TCC offset for thermal throttling, but I disabled it
for these tests, because the temperature needed to go higher
than my normal throttling point.

Idle configuration 1: As a COMETLAKE processor, with 4 idle states.
Kernel 5.13-RC4.

Before patch set average:
2.8014 uSec/loop
113.9 watts
Idle state 0 residency: 9.450%
Idle state 0 entries per minute: 256,812,896.6

After patch set average:
2.8264 uSec/loop, 0.89% slower
114.0 watts
Idle state 0 residency: 8.677%
Idle state 0 entries per minute: 254,560,049.9

Menu governor:
2.8051 uSec/loop, 0.13% slower
113.9 watts
Idle state 0 residency: 8.437%
Idle state 0 entries per minute: 256,436,417.2

O.K., perhaps not so bad, but also not many idle states.

Idle configuration 2: As a SKYLAKE processor, with 9 idle states.
i.e.:
/drivers/idle/intel_idle.c
static const struct x86_cpu_id intel_idle_ids[] __initconst
...
X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &idle_cpu_skx),
+ X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE, &idle_cpu_skl),

Purpose: To demonstrate increasing overhead as a function of number
of idle states.
Kernel 5.13.

Before patch set average:
2.8394 uSec/loop
114.2 watts
Idle state 0 residency: 7.212%
Idle state 0 entries per minute: 253,391,954.3

After patch set average:
2.9103 uSec/loop, 2.5% slower
114.4 watts, 0.18% more
Idle state 0 residency: 6.152%, 14.7% less.
Idle state 0 entries per minute: 244,024,752.1

Menu governor:
2.8141 uSec/loop, 0.89% faster
113.9 watts, 0.26% less
Idle state 0 residency: 7.167%, 0.6% less
Idle state 0 entries per minute: 255,650,610.7

Another potentially interesting test was the ebizzy test:
Records per second, averaged over many tests, varying
threads and intervals:

passive, schedutil:
Before: 6771.977
After: 5502.643, -18.7%
Menu: 10728.89, +58.4%

Active, powersave:
Before: 8361.82
After: 8463.31, +1.2%
Menu: 8225.58, -1.6%

I think it has more to do with CPU scaling governors
than this patch set, so:

performance:
Before: 12137.33
After: 12083.26, -0.4%
Menu: 11983.73, -1.3%

These and other test results available here:
(encoded to prevent a barrage of bots)

double u double u double u dot smythies dot com
/~doug/linux/idle/teo-2021-06/

... a day later ...

I might have an answer to my own question.
By switching to cross core pipe-tests, and only loading down one
CPU per core, I was able to get a lot more activity in other idle states.
The test runs for 100 minutes, and the results change with time, but
I'll leave that investigation for another day (there is no throttling):

1st 50 tests:
Before: 3.888 uSec/loop
After: 3.764 uSec/loop
Menu: 3.464 uSec/loop

Tests 50 to 100:
Before: 4.329 uSec/loop
After: 3.919 uSec/loop
Menu: 3.514 uSec/loop

Tests 200 to 250:
Before: 5.089 uSec/loop
After: 4.364 uSec/loop
Menu: 4.619 uSec/loop

Tests 280 to 330:
Before: 5.142 uSec/loop
After: 4.464 uSec/loop
Menu: 4.619 uSec/loop

Notice that the "after" this patch set is applied eventually does
better than using the menu governor. Its processor package power
always remains less, than the menu governor.

The results can be viewed graphically at the above link, but the
most dramatic results are:

Idle state 3 above % goes from 70% to 5%.
Idle state 2 below % goes from 13% to less than 1%.

... Doug


Attachments:
winmail.dat (4.88 kB)

2021-07-05 13:27:26

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

On Sun, Jul 4, 2021 at 11:01 PM Doug Smythies <[email protected]> wrote:
>
> Hi Rafael,
>
> On 2021.06.02 11:14 Rafael J. Wysocki wrote:
>
> > Hi All,
> >
> > This series of patches addresses some theoretical shortcoming in the
> > TEO (Timer Events Oriented) cpuidle governor by reworking its idle
> > state selection logic to some extent.
> >
> > Patches [1-2/5] are introductory cleanups and the substantial changes are
> > made in patches [3-4/5] (please refer to the changelogs of these two
> > patches for details). The last patch only deals with documentation.
> >
> > Even though this work is mostly based on theoretical considerations, it
> > shows a measurable reduction of the number of cases in which the
> shallowest
> > idle state is selected while it would be more beneficial to select a
> deeper
> > one or the deepest idle state is selected while it would be more
> beneficial to
> > select a shallower one, which should be a noticeable improvement.
>
> Do you have any test results to share? Or test methods that I can try?
> I have done a few tests, and generally don't notice much difference.
> Perhaps an increase in idle state 2 below (was to shallow) numbers.
> I am searching for some results that would offset the below:
>
> The difficulty I am having with this patch set is the additional overhead
> which becomes significant at the extremes, where idle state 0 is dominant.
> Throughout the history of teo, I have used multiple one core pipe-tests
> for this particular test. Some results:
>
> CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
> HWP: disabled
> CPU frequency scaling driver: intel_pstate, active, powersave
> Pipe-tests are run forever, printing average loop time for the
> Last 2.5 million loops. 1021 of those are again averaged.
> Total = 2.5525e9 loops
> The power and idle data is sampled for 100 minutes.
>
> Note 1: other tests were also done and also with passive,
> schedutil, but it isn't relevant for this test because the
> CPU frequency stays pinned at maximum.
>
> Note 2: I use TCC offset for thermal throttling, but I disabled it
> for these tests, because the temperature needed to go higher
> than my normal throttling point.
>
> Idle configuration 1: As a COMETLAKE processor, with 4 idle states.
> Kernel 5.13-RC4.
>
> Before patch set average:
> 2.8014 uSec/loop
> 113.9 watts
> Idle state 0 residency: 9.450%
> Idle state 0 entries per minute: 256,812,896.6
>
> After patch set average:
> 2.8264 uSec/loop, 0.89% slower
> 114.0 watts
> Idle state 0 residency: 8.677%
> Idle state 0 entries per minute: 254,560,049.9
>
> Menu governor:
> 2.8051 uSec/loop, 0.13% slower
> 113.9 watts
> Idle state 0 residency: 8.437%
> Idle state 0 entries per minute: 256,436,417.2
>
> O.K., perhaps not so bad, but also not many idle states.
>
> Idle configuration 2: As a SKYLAKE processor, with 9 idle states.
> i.e.:
> /drivers/idle/intel_idle.c
> static const struct x86_cpu_id intel_idle_ids[] __initconst
> ...
> X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &idle_cpu_skx),
> + X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE, &idle_cpu_skl),
>
> Purpose: To demonstrate increasing overhead as a function of number
> of idle states.
> Kernel 5.13.
>
> Before patch set average:
> 2.8394 uSec/loop
> 114.2 watts
> Idle state 0 residency: 7.212%
> Idle state 0 entries per minute: 253,391,954.3
>
> After patch set average:
> 2.9103 uSec/loop, 2.5% slower
> 114.4 watts, 0.18% more
> Idle state 0 residency: 6.152%, 14.7% less.
> Idle state 0 entries per minute: 244,024,752.1
>
> Menu governor:
> 2.8141 uSec/loop, 0.89% faster
> 113.9 watts, 0.26% less
> Idle state 0 residency: 7.167%, 0.6% less
> Idle state 0 entries per minute: 255,650,610.7
>
> Another potentially interesting test was the ebizzy test:
> Records per second, averaged over many tests, varying
> threads and intervals:
>
> passive, schedutil:
> Before: 6771.977
> After: 5502.643, -18.7%
> Menu: 10728.89, +58.4%
>
> Active, powersave:
> Before: 8361.82
> After: 8463.31, +1.2%
> Menu: 8225.58, -1.6%
>
> I think it has more to do with CPU scaling governors
> than this patch set, so:
>
> performance:
> Before: 12137.33
> After: 12083.26, -0.4%
> Menu: 11983.73, -1.3%
>
> These and other test results available here:
> (encoded to prevent a barrage of bots)
>
> double u double u double u dot smythies dot com
> /~doug/linux/idle/teo-2021-06/
>
> ... a day later ...
>
> I might have an answer to my own question.
> By switching to cross core pipe-tests, and only loading down one
> CPU per core, I was able to get a lot more activity in other idle states.
> The test runs for 100 minutes, and the results change with time, but
> I'll leave that investigation for another day (there is no throttling):
>
> 1st 50 tests:
> Before: 3.888 uSec/loop
> After: 3.764 uSec/loop
> Menu: 3.464 uSec/loop
>
> Tests 50 to 100:
> Before: 4.329 uSec/loop
> After: 3.919 uSec/loop
> Menu: 3.514 uSec/loop
>
> Tests 200 to 250:
> Before: 5.089 uSec/loop
> After: 4.364 uSec/loop
> Menu: 4.619 uSec/loop
>
> Tests 280 to 330:
> Before: 5.142 uSec/loop
> After: 4.464 uSec/loop
> Menu: 4.619 uSec/loop
>
> Notice that the "after" this patch set is applied eventually does
> better than using the menu governor. Its processor package power
> always remains less, than the menu governor.

That's good news, thanks!

> The results can be viewed graphically at the above link, but the
> most dramatic results are:
>
> Idle state 3 above % goes from 70% to 5%.
> Idle state 2 below % goes from 13% to less than 1%.

This also looks promising.

Thank you for all of the results!

2021-07-27 20:09:34

by Doug Smythies

[permalink] [raw]
Subject: RE: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

Hi Rafael,

Further to my reply of 2021.07.04 on this, I have
continued to work with and test this patch set.

On 2021.06.02 11:14 Rafael J. Wysocki wrote:

>This series of patches addresses some theoretical shortcoming in the
> TEO (Timer Events Oriented) cpuidle governor by reworking its idle
> state selection logic to some extent.
>
> Patches [1-2/5] are introductory cleanups and the substantial changes are
> made in patches [3-4/5] (please refer to the changelogs of these two
> patches for details). The last patch only deals with documentation.
>
> Even though this work is mostly based on theoretical considerations, it
> shows a measurable reduction of the number of cases in which the shallowest
> idle state is selected while it would be more beneficial to select a deeper
> one or the deepest idle state is selected while it would be more beneficial to
> select a shallower one, which should be a noticeable improvement.

I am concentrating in the idle state 0 and 1 area.
When I disable idle state 0, the expectation is its
usage will fall to idle state 1. It doesn't.

Conditions:
CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
HWP: disabled
CPU frequency scaling driver: intel_pstate, active
CPU frequency scaling governor: performance.
Idle configuration: As a COMETLAKE processor, with 4 idle states.
Sample time for below: 1 minute.
Workflow: Cross core named pipe token passing, 12 threads.

Kernel 5.14-rc3: idle: teo governor

All idle states enabled: PASS
Processor: 97 watts
Idle state 0 entries: 811151
Idle state 1 entries: 140300776
Idle state 2 entries: 889
Idle state 3 entries: 8

Idle state 0 disabled: FAIL <<<<<
Processor: 96 watts
Idle state 0 entries: 0
Idle state 1 entries: 65599283
Idle state 2 entries: 364399
Idle state 3 entries: 65112651

Kernel 5.14-rc3: idle: menu governor

All idle states enabled: PASS
Processor: 102 watts
Idle state 0 entries: 169320747
Idle state 1 entries: 1860110
Idle state 2 entries: 14
Idle state 3 entries: 54

Idle state 0 disabled: PASS
Processor: 96.7 watts
Idle state 0 entries: 0
Idle state 1 entries: 141936790
Idle state 2 entries: 0
Idle state 3 entries: 6

Prior to this patch set:
Kernel 5.13: idle: teo governor

All idle states enabled: PASS
Processor: 97 watts
Idle state 0 entries: 446735
Idle state 1 entries: 140903027
Idle state 2 entries: 0
Idle state 3 entries: 0

Idle state 0 disabled: PASS
Processor: 96 watts
Idle state 0 entries: 0
Idle state 1 entries: 139308125
Idle state 2 entries: 0
Idle state 3 entries: 0

I haven't tried to isolate the issue in the code, yet.
Nor have I explored to determine if there might
be other potential idle state disabled issues.

... Doug



2021-07-28 13:58:23

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

On Tue, Jul 27, 2021 at 10:06 PM Doug Smythies <[email protected]> wrote:
>
> Hi Rafael,
>
> Further to my reply of 2021.07.04 on this, I have
> continued to work with and test this patch set.
>
> On 2021.06.02 11:14 Rafael J. Wysocki wrote:
>
> >This series of patches addresses some theoretical shortcoming in the
> > TEO (Timer Events Oriented) cpuidle governor by reworking its idle
> > state selection logic to some extent.
> >
> > Patches [1-2/5] are introductory cleanups and the substantial changes are
> > made in patches [3-4/5] (please refer to the changelogs of these two
> > patches for details). The last patch only deals with documentation.
> >
> > Even though this work is mostly based on theoretical considerations, it
> > shows a measurable reduction of the number of cases in which the shallowest
> > idle state is selected while it would be more beneficial to select a deeper
> > one or the deepest idle state is selected while it would be more beneficial to
> > select a shallower one, which should be a noticeable improvement.
>
> I am concentrating in the idle state 0 and 1 area.
> When I disable idle state 0, the expectation is its
> usage will fall to idle state 1. It doesn't.
>
> Conditions:
> CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
> HWP: disabled
> CPU frequency scaling driver: intel_pstate, active
> CPU frequency scaling governor: performance.
> Idle configuration: As a COMETLAKE processor, with 4 idle states.
> Sample time for below: 1 minute.
> Workflow: Cross core named pipe token passing, 12 threads.
>
> Kernel 5.14-rc3: idle: teo governor
>
> All idle states enabled: PASS
> Processor: 97 watts
> Idle state 0 entries: 811151
> Idle state 1 entries: 140300776
> Idle state 2 entries: 889
> Idle state 3 entries: 8
>
> Idle state 0 disabled: FAIL <<<<<
> Processor: 96 watts
> Idle state 0 entries: 0
> Idle state 1 entries: 65599283
> Idle state 2 entries: 364399
> Idle state 3 entries: 65112651

This looks odd.

Thanks for the report, I'll take a look at this.

2021-07-28 17:48:55

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

On Wednesday, July 28, 2021 3:52:51 PM CEST Rafael J. Wysocki wrote:
> On Tue, Jul 27, 2021 at 10:06 PM Doug Smythies <[email protected]> wrote:
> >
> > Hi Rafael,
> >
> > Further to my reply of 2021.07.04 on this, I have
> > continued to work with and test this patch set.
> >
> > On 2021.06.02 11:14 Rafael J. Wysocki wrote:
> >
> > >This series of patches addresses some theoretical shortcoming in the
> > > TEO (Timer Events Oriented) cpuidle governor by reworking its idle
> > > state selection logic to some extent.
> > >
> > > Patches [1-2/5] are introductory cleanups and the substantial changes are
> > > made in patches [3-4/5] (please refer to the changelogs of these two
> > > patches for details). The last patch only deals with documentation.
> > >
> > > Even though this work is mostly based on theoretical considerations, it
> > > shows a measurable reduction of the number of cases in which the shallowest
> > > idle state is selected while it would be more beneficial to select a deeper
> > > one or the deepest idle state is selected while it would be more beneficial to
> > > select a shallower one, which should be a noticeable improvement.
> >
> > I am concentrating in the idle state 0 and 1 area.
> > When I disable idle state 0, the expectation is its
> > usage will fall to idle state 1. It doesn't.
> >
> > Conditions:
> > CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
> > HWP: disabled
> > CPU frequency scaling driver: intel_pstate, active
> > CPU frequency scaling governor: performance.
> > Idle configuration: As a COMETLAKE processor, with 4 idle states.
> > Sample time for below: 1 minute.
> > Workflow: Cross core named pipe token passing, 12 threads.
> >
> > Kernel 5.14-rc3: idle: teo governor
> >
> > All idle states enabled: PASS
> > Processor: 97 watts
> > Idle state 0 entries: 811151
> > Idle state 1 entries: 140300776
> > Idle state 2 entries: 889
> > Idle state 3 entries: 8
> >
> > Idle state 0 disabled: FAIL <<<<<
> > Processor: 96 watts
> > Idle state 0 entries: 0
> > Idle state 1 entries: 65599283
> > Idle state 2 entries: 364399
> > Idle state 3 entries: 65112651
>
> This looks odd.
>
> Thanks for the report, I'll take a look at this.

I have found an issue in the code that may be responsible for the
observed behavior and should be addressed by the appended patch (not
tested yet).

Basically, the "disabled" check in the second loop over states in
teo_select() needs to exclude the first enabled state, because
there are no more states to check after that.

Plus the time span check needs to be done when the given state
is about to be selected, because otherwise the function may end up
returning a state for which the sums are too low.

Thanks!

---
drivers/cpuidle/governors/teo.c | 26 ++++++++++++++------------
1 file changed, 14 insertions(+), 12 deletions(-)

Index: linux-pm/drivers/cpuidle/governors/teo.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/teo.c
+++ linux-pm/drivers/cpuidle/governors/teo.c
@@ -404,25 +404,27 @@ static int teo_select(struct cpuidle_dri
intercept_sum += bin->intercepts;
recent_sum += bin->recent;

- if (dev->states_usage[i].disable)
+ if (dev->states_usage[i].disable && i > idx0)
continue;

span_ns = teo_middle_of_bin(i, drv);
- if (!teo_time_ok(span_ns)) {
- /*
- * The current state is too shallow, so select
- * the first enabled deeper state.
- */
- duration_ns = last_enabled_span_ns;
- idx = last_enabled_idx;
- break;
- }

if ((!alt_recent || 2 * recent_sum > idx_recent_sum) &&
(!alt_intercepts ||
2 * intercept_sum > idx_intercept_sum)) {
- idx = i;
- duration_ns = span_ns;
+ if (!teo_time_ok(span_ns) ||
+ dev->states_usage[i].disable) {
+ /*
+ * The current state is too shallow or
+ * disabled, so select the first enabled
+ * deeper state.
+ */
+ duration_ns = last_enabled_span_ns;
+ idx = last_enabled_idx;
+ } else {
+ idx = i;
+ duration_ns = span_ns;
+ }
break;
}





2021-07-29 06:36:41

by Doug Smythies

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

On Wed, Jul 28, 2021 at 10:47 AM Rafael J. Wysocki <[email protected]> wrote:
>
> On Wednesday, July 28, 2021 3:52:51 PM CEST Rafael J. Wysocki wrote:
> > On Tue, Jul 27, 2021 at 10:06 PM Doug Smythies <[email protected]> wrote:
> > >
> > > Hi Rafael,
> > >
> > > Further to my reply of 2021.07.04 on this, I have
> > > continued to work with and test this patch set.
> > >
> > > On 2021.06.02 11:14 Rafael J. Wysocki wrote:
> > >
> > > >This series of patches addresses some theoretical shortcoming in the
> > > > TEO (Timer Events Oriented) cpuidle governor by reworking its idle
> > > > state selection logic to some extent.
> > > >
> > > > Patches [1-2/5] are introductory cleanups and the substantial changes are
> > > > made in patches [3-4/5] (please refer to the changelogs of these two
> > > > patches for details). The last patch only deals with documentation.
> > > >
> > > > Even though this work is mostly based on theoretical considerations, it
> > > > shows a measurable reduction of the number of cases in which the shallowest
> > > > idle state is selected while it would be more beneficial to select a deeper
> > > > one or the deepest idle state is selected while it would be more beneficial to
> > > > select a shallower one, which should be a noticeable improvement.
> > >
> > > I am concentrating in the idle state 0 and 1 area.
> > > When I disable idle state 0, the expectation is its
> > > usage will fall to idle state 1. It doesn't.
> > >
> > > Conditions:
> > > CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
> > > HWP: disabled
> > > CPU frequency scaling driver: intel_pstate, active
> > > CPU frequency scaling governor: performance.
> > > Idle configuration: As a COMETLAKE processor, with 4 idle states.
> > > Sample time for below: 1 minute.
> > > Workflow: Cross core named pipe token passing, 12 threads.
> > >
> > > Kernel 5.14-rc3: idle: teo governor
> > >
> > > All idle states enabled: PASS
> > > Processor: 97 watts
> > > Idle state 0 entries: 811151
> > > Idle state 1 entries: 140300776
> > > Idle state 2 entries: 889
> > > Idle state 3 entries: 8
> > >
> > > Idle state 0 disabled: FAIL <<<<<
> > > Processor: 96 watts
> > > Idle state 0 entries: 0
> > > Idle state 1 entries: 65599283
> > > Idle state 2 entries: 364399
> > > Idle state 3 entries: 65112651
> >
> > This looks odd.
> >
> > Thanks for the report, I'll take a look at this.
>
> I have found an issue in the code that may be responsible for the
> observed behavior and should be addressed by the appended patch (not
> tested yet).
>
> Basically, the "disabled" check in the second loop over states in
> teo_select() needs to exclude the first enabled state, because
> there are no more states to check after that.
>
> Plus the time span check needs to be done when the given state
> is about to be selected, because otherwise the function may end up
> returning a state for which the sums are too low.
>
> Thanks!
>
> ---
> drivers/cpuidle/governors/teo.c | 26 ++++++++++++++------------
> 1 file changed, 14 insertions(+), 12 deletions(-)
>
> Index: linux-pm/drivers/cpuidle/governors/teo.c
> ===================================================================
> --- linux-pm.orig/drivers/cpuidle/governors/teo.c
> +++ linux-pm/drivers/cpuidle/governors/teo.c
> @@ -404,25 +404,27 @@ static int teo_select(struct cpuidle_dri
> intercept_sum += bin->intercepts;
> recent_sum += bin->recent;
>
> - if (dev->states_usage[i].disable)
> + if (dev->states_usage[i].disable && i > idx0)
> continue;
>
> span_ns = teo_middle_of_bin(i, drv);
> - if (!teo_time_ok(span_ns)) {
> - /*
> - * The current state is too shallow, so select
> - * the first enabled deeper state.
> - */
> - duration_ns = last_enabled_span_ns;
> - idx = last_enabled_idx;
> - break;
> - }
>
> if ((!alt_recent || 2 * recent_sum > idx_recent_sum) &&
> (!alt_intercepts ||
> 2 * intercept_sum > idx_intercept_sum)) {
> - idx = i;
> - duration_ns = span_ns;
> + if (!teo_time_ok(span_ns) ||
> + dev->states_usage[i].disable) {
> + /*
> + * The current state is too shallow or
> + * disabled, so select the first enabled
> + * deeper state.
> + */
> + duration_ns = last_enabled_span_ns;
> + idx = last_enabled_idx;
> + } else {
> + idx = i;
> + duration_ns = span_ns;
> + }
> break;
> }

Hi Rafael,

I tried the patch and when I disabled idle state 0
got, very similar to before:

Idle state 0 disabled: FAIL
Processor: 95 watts
Idle state 0 entries: 0
Idle state 1 entries: 65,475,534
Idle state 2 entries: 333144
Idle state 3 entries: 65,247,048

However, I accidently left it for about 30 minutes
and noticed:

Idle state 0 disabled:
Processor: 83 watts
Idle state 0 entries: 0
Idle state 1 entries: 88,706,831
Idle state 2 entries: 100
Idle state 3 entries: 662

I went back to unmodified kernel 5.13-rc3 and
let it run longer with idle state 0 disabled, and
after 30 minutes it had changed but nowhere
near as much:

Idle state 0 disabled:
Processor: 87 watts
Idle state 0 entries: 0
Idle state 1 entries: 70,361,020
Idle state 2 entries: 71219
Idle state 3 entries: 27,249,975

... Doug

2021-07-29 15:28:32

by Doug Smythies

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

On Wed, Jul 28, 2021 at 11:34 PM Doug Smythies <[email protected]> wrote:
>
> On Wed, Jul 28, 2021 at 10:47 AM Rafael J. Wysocki <[email protected]> wrote:
> >
> > On Wednesday, July 28, 2021 3:52:51 PM CEST Rafael J. Wysocki wrote:
> > > On Tue, Jul 27, 2021 at 10:06 PM Doug Smythies <[email protected]> wrote:
> > > >
> > > > Hi Rafael,
> > > >
> > > > Further to my reply of 2021.07.04 on this, I have
> > > > continued to work with and test this patch set.
> > > >
> > > > On 2021.06.02 11:14 Rafael J. Wysocki wrote:
> > > >
> > > > >This series of patches addresses some theoretical shortcoming in the
> > > > > TEO (Timer Events Oriented) cpuidle governor by reworking its idle
> > > > > state selection logic to some extent.
> > > > >
> > > > > Patches [1-2/5] are introductory cleanups and the substantial changes are
> > > > > made in patches [3-4/5] (please refer to the changelogs of these two
> > > > > patches for details). The last patch only deals with documentation.
> > > > >
> > > > > Even though this work is mostly based on theoretical considerations, it
> > > > > shows a measurable reduction of the number of cases in which the shallowest
> > > > > idle state is selected while it would be more beneficial to select a deeper
> > > > > one or the deepest idle state is selected while it would be more beneficial to
> > > > > select a shallower one, which should be a noticeable improvement.
> > > >
> > > > I am concentrating in the idle state 0 and 1 area.
> > > > When I disable idle state 0, the expectation is its
> > > > usage will fall to idle state 1. It doesn't.
> > > >
> > > > Conditions:
> > > > CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
> > > > HWP: disabled
> > > > CPU frequency scaling driver: intel_pstate, active
> > > > CPU frequency scaling governor: performance.
> > > > Idle configuration: As a COMETLAKE processor, with 4 idle states.
> > > > Sample time for below: 1 minute.
> > > > Workflow: Cross core named pipe token passing, 12 threads.
> > > >
> > > > Kernel 5.14-rc3: idle: teo governor
> > > >
> > > > All idle states enabled: PASS
> > > > Processor: 97 watts
> > > > Idle state 0 entries: 811151
> > > > Idle state 1 entries: 140300776
> > > > Idle state 2 entries: 889
> > > > Idle state 3 entries: 8
> > > >
> > > > Idle state 0 disabled: FAIL <<<<<
> > > > Processor: 96 watts
> > > > Idle state 0 entries: 0
> > > > Idle state 1 entries: 65599283
> > > > Idle state 2 entries: 364399
> > > > Idle state 3 entries: 65112651
> > >
> > > This looks odd.
> > >
> > > Thanks for the report, I'll take a look at this.
> >
> > I have found an issue in the code that may be responsible for the
> > observed behavior and should be addressed by the appended patch (not
> > tested yet).
> >
> > Basically, the "disabled" check in the second loop over states in
> > teo_select() needs to exclude the first enabled state, because
> > there are no more states to check after that.
> >
> > Plus the time span check needs to be done when the given state
> > is about to be selected, because otherwise the function may end up
> > returning a state for which the sums are too low.
> >
> > Thanks!
> >
> > ---
> > drivers/cpuidle/governors/teo.c | 26 ++++++++++++++------------
> > 1 file changed, 14 insertions(+), 12 deletions(-)
> >
> > Index: linux-pm/drivers/cpuidle/governors/teo.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpuidle/governors/teo.c
> > +++ linux-pm/drivers/cpuidle/governors/teo.c
> > @@ -404,25 +404,27 @@ static int teo_select(struct cpuidle_dri
> > intercept_sum += bin->intercepts;
> > recent_sum += bin->recent;
> >
> > - if (dev->states_usage[i].disable)
> > + if (dev->states_usage[i].disable && i > idx0)
> > continue;
> >
> > span_ns = teo_middle_of_bin(i, drv);
> > - if (!teo_time_ok(span_ns)) {
> > - /*
> > - * The current state is too shallow, so select
> > - * the first enabled deeper state.
> > - */
> > - duration_ns = last_enabled_span_ns;
> > - idx = last_enabled_idx;
> > - break;
> > - }
> >
> > if ((!alt_recent || 2 * recent_sum > idx_recent_sum) &&
> > (!alt_intercepts ||
> > 2 * intercept_sum > idx_intercept_sum)) {
> > - idx = i;
> > - duration_ns = span_ns;
> > + if (!teo_time_ok(span_ns) ||
> > + dev->states_usage[i].disable) {
> > + /*
> > + * The current state is too shallow or
> > + * disabled, so select the first enabled
> > + * deeper state.
> > + */
> > + duration_ns = last_enabled_span_ns;
> > + idx = last_enabled_idx;
> > + } else {
> > + idx = i;
> > + duration_ns = span_ns;
> > + }
> > break;
> > }
>
> Hi Rafael,
>
> I tried the patch and when I disabled idle state 0
> got, very similar to before:
>
> Idle state 0 disabled: FAIL
> Processor: 95 watts
> Idle state 0 entries: 0
> Idle state 1 entries: 65,475,534
> Idle state 2 entries: 333144
> Idle state 3 entries: 65,247,048
>
> However, I accidently left it for about 30 minutes
> and noticed:
>
> Idle state 0 disabled:
> Processor: 83 watts
> Idle state 0 entries: 0
> Idle state 1 entries: 88,706,831
> Idle state 2 entries: 100
> Idle state 3 entries: 662
>
> I went back to unmodified kernel 5.13-rc3 and

Sorry, 5.14-rc3.

> let it run longer with idle state 0 disabled, and
> after 30 minutes it had changed but nowhere
> near as much:
>
> Idle state 0 disabled:
> Processor: 87 watts
> Idle state 0 entries: 0
> Idle state 1 entries: 70,361,020
> Idle state 2 entries: 71219
> Idle state 3 entries: 27,249,975

Addendum: So far the workflow used for this
thread has been event based. If I switch to
a timer based workflow, everything works as
expected for both kernels, 5.14-rc3 unmodified
and modified with the patch from herein.

... Doug

2021-07-29 16:19:04

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

On Thursday, July 29, 2021 8:34:37 AM CEST Doug Smythies wrote:
> On Wed, Jul 28, 2021 at 10:47 AM Rafael J. Wysocki <[email protected]> wrote:
> >
> > On Wednesday, July 28, 2021 3:52:51 PM CEST Rafael J. Wysocki wrote:
> > > On Tue, Jul 27, 2021 at 10:06 PM Doug Smythies <[email protected]> wrote:
> > > >
> > > > Hi Rafael,
> > > >
> > > > Further to my reply of 2021.07.04 on this, I have
> > > > continued to work with and test this patch set.
> > > >
> > > > On 2021.06.02 11:14 Rafael J. Wysocki wrote:
> > > >
> > > > >This series of patches addresses some theoretical shortcoming in the
> > > > > TEO (Timer Events Oriented) cpuidle governor by reworking its idle
> > > > > state selection logic to some extent.
> > > > >
> > > > > Patches [1-2/5] are introductory cleanups and the substantial changes are
> > > > > made in patches [3-4/5] (please refer to the changelogs of these two
> > > > > patches for details). The last patch only deals with documentation.
> > > > >
> > > > > Even though this work is mostly based on theoretical considerations, it
> > > > > shows a measurable reduction of the number of cases in which the shallowest
> > > > > idle state is selected while it would be more beneficial to select a deeper
> > > > > one or the deepest idle state is selected while it would be more beneficial to
> > > > > select a shallower one, which should be a noticeable improvement.
> > > >
> > > > I am concentrating in the idle state 0 and 1 area.
> > > > When I disable idle state 0, the expectation is its
> > > > usage will fall to idle state 1. It doesn't.
> > > >
> > > > Conditions:
> > > > CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
> > > > HWP: disabled
> > > > CPU frequency scaling driver: intel_pstate, active
> > > > CPU frequency scaling governor: performance.
> > > > Idle configuration: As a COMETLAKE processor, with 4 idle states.
> > > > Sample time for below: 1 minute.
> > > > Workflow: Cross core named pipe token passing, 12 threads.
> > > >
> > > > Kernel 5.14-rc3: idle: teo governor
> > > >
> > > > All idle states enabled: PASS
> > > > Processor: 97 watts
> > > > Idle state 0 entries: 811151
> > > > Idle state 1 entries: 140300776
> > > > Idle state 2 entries: 889
> > > > Idle state 3 entries: 8
> > > >
> > > > Idle state 0 disabled: FAIL <<<<<
> > > > Processor: 96 watts
> > > > Idle state 0 entries: 0
> > > > Idle state 1 entries: 65599283
> > > > Idle state 2 entries: 364399
> > > > Idle state 3 entries: 65112651
> > >
> > > This looks odd.
> > >
> > > Thanks for the report, I'll take a look at this.
> >
> > I have found an issue in the code that may be responsible for the
> > observed behavior and should be addressed by the appended patch (not
> > tested yet).
> >
> > Basically, the "disabled" check in the second loop over states in
> > teo_select() needs to exclude the first enabled state, because
> > there are no more states to check after that.
> >
> > Plus the time span check needs to be done when the given state
> > is about to be selected, because otherwise the function may end up
> > returning a state for which the sums are too low.
> >
> > Thanks!
> >
> > ---
> > drivers/cpuidle/governors/teo.c | 26 ++++++++++++++------------
> > 1 file changed, 14 insertions(+), 12 deletions(-)
> >
> > Index: linux-pm/drivers/cpuidle/governors/teo.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpuidle/governors/teo.c
> > +++ linux-pm/drivers/cpuidle/governors/teo.c
> > @@ -404,25 +404,27 @@ static int teo_select(struct cpuidle_dri
> > intercept_sum += bin->intercepts;
> > recent_sum += bin->recent;
> >
> > - if (dev->states_usage[i].disable)
> > + if (dev->states_usage[i].disable && i > idx0)
> > continue;
> >
> > span_ns = teo_middle_of_bin(i, drv);
> > - if (!teo_time_ok(span_ns)) {
> > - /*
> > - * The current state is too shallow, so select
> > - * the first enabled deeper state.
> > - */
> > - duration_ns = last_enabled_span_ns;
> > - idx = last_enabled_idx;
> > - break;
> > - }
> >
> > if ((!alt_recent || 2 * recent_sum > idx_recent_sum) &&
> > (!alt_intercepts ||
> > 2 * intercept_sum > idx_intercept_sum)) {
> > - idx = i;
> > - duration_ns = span_ns;
> > + if (!teo_time_ok(span_ns) ||
> > + dev->states_usage[i].disable) {
> > + /*
> > + * The current state is too shallow or
> > + * disabled, so select the first enabled
> > + * deeper state.
> > + */
> > + duration_ns = last_enabled_span_ns;
> > + idx = last_enabled_idx;
> > + } else {
> > + idx = i;
> > + duration_ns = span_ns;
> > + }
> > break;
> > }
>
> Hi Rafael,

Hi Doug,

Thanks for the feedback, much appreciated!

> I tried the patch and when I disabled idle state 0
> got, very similar to before:
>
> Idle state 0 disabled: FAIL
> Processor: 95 watts
> Idle state 0 entries: 0
> Idle state 1 entries: 65,475,534
> Idle state 2 entries: 333144
> Idle state 3 entries: 65,247,048
>
> However, I accidently left it for about 30 minutes
> and noticed:
>
> Idle state 0 disabled:
> Processor: 83 watts
> Idle state 0 entries: 0
> Idle state 1 entries: 88,706,831
> Idle state 2 entries: 100
> Idle state 3 entries: 662

This means that idle state 0 data are disregarded after disabling it
and that most likely is because the second loop in teo_select() should
be over all states down to idle state 0 (not only down to the first
enabled one).

So below is an updated patch (not tested yet).

---
drivers/cpuidle/governors/teo.c | 28 +++++++++++++++-------------
1 file changed, 15 insertions(+), 13 deletions(-)

Index: linux-pm/drivers/cpuidle/governors/teo.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/teo.c
+++ linux-pm/drivers/cpuidle/governors/teo.c
@@ -397,32 +397,34 @@ static int teo_select(struct cpuidle_dri
intercept_sum = 0;
recent_sum = 0;

- for (i = idx - 1; i >= idx0; i--) {
+ for (i = idx - 1; i >= 0; i--) {
struct teo_bin *bin = &cpu_data->state_bins[i];
s64 span_ns;

intercept_sum += bin->intercepts;
recent_sum += bin->recent;

- if (dev->states_usage[i].disable)
+ if (dev->states_usage[i].disable && i > 0)
continue;

span_ns = teo_middle_of_bin(i, drv);
- if (!teo_time_ok(span_ns)) {
- /*
- * The current state is too shallow, so select
- * the first enabled deeper state.
- */
- duration_ns = last_enabled_span_ns;
- idx = last_enabled_idx;
- break;
- }

if ((!alt_recent || 2 * recent_sum > idx_recent_sum) &&
(!alt_intercepts ||
2 * intercept_sum > idx_intercept_sum)) {
- idx = i;
- duration_ns = span_ns;
+ if (!teo_time_ok(span_ns) ||
+ dev->states_usage[i].disable) {
+ /*
+ * The current state is too shallow or
+ * disabled, so select the first enabled
+ * deeper state.
+ */
+ duration_ns = last_enabled_span_ns;
+ idx = last_enabled_idx;
+ } else {
+ idx = i;
+ duration_ns = span_ns;
+ }
break;
}





2021-07-29 16:21:54

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

On Thu, Jul 29, 2021 at 5:24 PM Doug Smythies <[email protected]> wrote:
>
> On Wed, Jul 28, 2021 at 11:34 PM Doug Smythies <[email protected]> wrote:
> >
> > On Wed, Jul 28, 2021 at 10:47 AM Rafael J. Wysocki <[email protected]> wrote:
> > >
> > > On Wednesday, July 28, 2021 3:52:51 PM CEST Rafael J. Wysocki wrote:
> > > > On Tue, Jul 27, 2021 at 10:06 PM Doug Smythies <[email protected]> wrote:
> > > > >
> > > > > Hi Rafael,
> > > > >
> > > > > Further to my reply of 2021.07.04 on this, I have
> > > > > continued to work with and test this patch set.
> > > > >
> > > > > On 2021.06.02 11:14 Rafael J. Wysocki wrote:
> > > > >
> > > > > >This series of patches addresses some theoretical shortcoming in the
> > > > > > TEO (Timer Events Oriented) cpuidle governor by reworking its idle
> > > > > > state selection logic to some extent.
> > > > > >
> > > > > > Patches [1-2/5] are introductory cleanups and the substantial changes are
> > > > > > made in patches [3-4/5] (please refer to the changelogs of these two
> > > > > > patches for details). The last patch only deals with documentation.
> > > > > >
> > > > > > Even though this work is mostly based on theoretical considerations, it
> > > > > > shows a measurable reduction of the number of cases in which the shallowest
> > > > > > idle state is selected while it would be more beneficial to select a deeper
> > > > > > one or the deepest idle state is selected while it would be more beneficial to
> > > > > > select a shallower one, which should be a noticeable improvement.
> > > > >
> > > > > I am concentrating in the idle state 0 and 1 area.
> > > > > When I disable idle state 0, the expectation is its
> > > > > usage will fall to idle state 1. It doesn't.
> > > > >
> > > > > Conditions:
> > > > > CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
> > > > > HWP: disabled
> > > > > CPU frequency scaling driver: intel_pstate, active
> > > > > CPU frequency scaling governor: performance.
> > > > > Idle configuration: As a COMETLAKE processor, with 4 idle states.
> > > > > Sample time for below: 1 minute.
> > > > > Workflow: Cross core named pipe token passing, 12 threads.
> > > > >
> > > > > Kernel 5.14-rc3: idle: teo governor
> > > > >
> > > > > All idle states enabled: PASS
> > > > > Processor: 97 watts
> > > > > Idle state 0 entries: 811151
> > > > > Idle state 1 entries: 140300776
> > > > > Idle state 2 entries: 889
> > > > > Idle state 3 entries: 8
> > > > >
> > > > > Idle state 0 disabled: FAIL <<<<<
> > > > > Processor: 96 watts
> > > > > Idle state 0 entries: 0
> > > > > Idle state 1 entries: 65599283
> > > > > Idle state 2 entries: 364399
> > > > > Idle state 3 entries: 65112651
> > > >
> > > > This looks odd.
> > > >
> > > > Thanks for the report, I'll take a look at this.
> > >
> > > I have found an issue in the code that may be responsible for the
> > > observed behavior and should be addressed by the appended patch (not
> > > tested yet).
> > >
> > > Basically, the "disabled" check in the second loop over states in
> > > teo_select() needs to exclude the first enabled state, because
> > > there are no more states to check after that.
> > >
> > > Plus the time span check needs to be done when the given state
> > > is about to be selected, because otherwise the function may end up
> > > returning a state for which the sums are too low.
> > >
> > > Thanks!
> > >
> > > ---
> > > drivers/cpuidle/governors/teo.c | 26 ++++++++++++++------------
> > > 1 file changed, 14 insertions(+), 12 deletions(-)
> > >
> > > Index: linux-pm/drivers/cpuidle/governors/teo.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/cpuidle/governors/teo.c
> > > +++ linux-pm/drivers/cpuidle/governors/teo.c
> > > @@ -404,25 +404,27 @@ static int teo_select(struct cpuidle_dri
> > > intercept_sum += bin->intercepts;
> > > recent_sum += bin->recent;
> > >
> > > - if (dev->states_usage[i].disable)
> > > + if (dev->states_usage[i].disable && i > idx0)
> > > continue;
> > >
> > > span_ns = teo_middle_of_bin(i, drv);
> > > - if (!teo_time_ok(span_ns)) {
> > > - /*
> > > - * The current state is too shallow, so select
> > > - * the first enabled deeper state.
> > > - */
> > > - duration_ns = last_enabled_span_ns;
> > > - idx = last_enabled_idx;
> > > - break;
> > > - }
> > >
> > > if ((!alt_recent || 2 * recent_sum > idx_recent_sum) &&
> > > (!alt_intercepts ||
> > > 2 * intercept_sum > idx_intercept_sum)) {
> > > - idx = i;
> > > - duration_ns = span_ns;
> > > + if (!teo_time_ok(span_ns) ||
> > > + dev->states_usage[i].disable) {
> > > + /*
> > > + * The current state is too shallow or
> > > + * disabled, so select the first enabled
> > > + * deeper state.
> > > + */
> > > + duration_ns = last_enabled_span_ns;
> > > + idx = last_enabled_idx;
> > > + } else {
> > > + idx = i;
> > > + duration_ns = span_ns;
> > > + }
> > > break;
> > > }
> >
> > Hi Rafael,
> >
> > I tried the patch and when I disabled idle state 0
> > got, very similar to before:
> >
> > Idle state 0 disabled: FAIL
> > Processor: 95 watts
> > Idle state 0 entries: 0
> > Idle state 1 entries: 65,475,534
> > Idle state 2 entries: 333144
> > Idle state 3 entries: 65,247,048
> >
> > However, I accidently left it for about 30 minutes
> > and noticed:
> >
> > Idle state 0 disabled:
> > Processor: 83 watts
> > Idle state 0 entries: 0
> > Idle state 1 entries: 88,706,831
> > Idle state 2 entries: 100
> > Idle state 3 entries: 662
> >
> > I went back to unmodified kernel 5.13-rc3 and
>
> Sorry, 5.14-rc3.
>
> > let it run longer with idle state 0 disabled, and
> > after 30 minutes it had changed but nowhere
> > near as much:
> >
> > Idle state 0 disabled:
> > Processor: 87 watts
> > Idle state 0 entries: 0
> > Idle state 1 entries: 70,361,020
> > Idle state 2 entries: 71219
> > Idle state 3 entries: 27,249,975
>
> Addendum: So far the workflow used for this
> thread has been event based. If I switch to
> a timer based workflow, everything works as
> expected for both kernels, 5.14-rc3 unmodified
> and modified with the patch from herein.

Yes, the affected case is when the governor selects states that are
shallower than indicated by the time till the next timer.

2021-07-30 03:38:47

by Doug Smythies

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

On Thu, Jul 29, 2021 at 9:14 AM Rafael J. Wysocki <[email protected]> wrote:
>
... [snip]...
>
> This means that idle state 0 data are disregarded after disabling it
> and that most likely is because the second loop in teo_select() should
> be over all states down to idle state 0 (not only down to the first
> enabled one).
>
> So below is an updated patch (not tested yet).

Hi Rafael,

This updated patch works great / solves the problem.
Tested-by: Doug Smythies <[email protected]>

Thank you very much.

... Doug

>
> ---
> drivers/cpuidle/governors/teo.c | 28 +++++++++++++++-------------
> 1 file changed, 15 insertions(+), 13 deletions(-)
>
> Index: linux-pm/drivers/cpuidle/governors/teo.c
> ===================================================================
> --- linux-pm.orig/drivers/cpuidle/governors/teo.c
> +++ linux-pm/drivers/cpuidle/governors/teo.c
> @@ -397,32 +397,34 @@ static int teo_select(struct cpuidle_dri
> intercept_sum = 0;
> recent_sum = 0;
>
> - for (i = idx - 1; i >= idx0; i--) {
> + for (i = idx - 1; i >= 0; i--) {
> struct teo_bin *bin = &cpu_data->state_bins[i];
> s64 span_ns;
>
> intercept_sum += bin->intercepts;
> recent_sum += bin->recent;
>
> - if (dev->states_usage[i].disable)
> + if (dev->states_usage[i].disable && i > 0)
> continue;
>
> span_ns = teo_middle_of_bin(i, drv);
> - if (!teo_time_ok(span_ns)) {
> - /*
> - * The current state is too shallow, so select
> - * the first enabled deeper state.
> - */
> - duration_ns = last_enabled_span_ns;
> - idx = last_enabled_idx;
> - break;
> - }
>
> if ((!alt_recent || 2 * recent_sum > idx_recent_sum) &&
> (!alt_intercepts ||
> 2 * intercept_sum > idx_intercept_sum)) {
> - idx = i;
> - duration_ns = span_ns;
> + if (!teo_time_ok(span_ns) ||
> + dev->states_usage[i].disable) {
> + /*
> + * The current state is too shallow or
> + * disabled, so select the first enabled
> + * deeper state.
> + */
> + duration_ns = last_enabled_span_ns;
> + idx = last_enabled_idx;
> + } else {
> + idx = i;
> + duration_ns = span_ns;
> + }
> break;
> }

2021-07-30 13:27:33

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic

On Fri, Jul 30, 2021 at 5:36 AM Doug Smythies <[email protected]> wrote:
>
> On Thu, Jul 29, 2021 at 9:14 AM Rafael J. Wysocki <[email protected]> wrote:
> >
> ... [snip]...
> >
> > This means that idle state 0 data are disregarded after disabling it
> > and that most likely is because the second loop in teo_select() should
> > be over all states down to idle state 0 (not only down to the first
> > enabled one).
> >
> > So below is an updated patch (not tested yet).
>
> Hi Rafael,
>
> This updated patch works great / solves the problem.
> Tested-by: Doug Smythies <[email protected]>
>
> Thank you very much.

Thank you and you're welcome!

I've found a small issue in the patch though (it needs to check the
time span before setting the "last enabled" index), so let me submit a
proper patch with a changelog based on this one.