2017-09-22 23:05:16

by Leo Yan

Subject: [PATCH 1/2] sched/fair: make capacity_margin __read_mostly

The variable 'capacity_margin' is read far more often than it is written
when calculating the capacity margin, so put it into the __read_mostly
section.

Cc: Dietmar Eggemann <[email protected]>
Cc: Morten Rasmussen <[email protected]>
Cc: Chris Redpath <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Vincent Guittot <[email protected]>
Signed-off-by: Leo Yan <[email protected]>
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 70ba32e..ad03bf4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -129,7 +129,7 @@ unsigned int sysctl_sched_cfs_bandwidth_slice = 5000UL;
*
* (default: ~20%)
*/
-unsigned int capacity_margin = 1280;
+unsigned int capacity_margin __read_mostly = 1280;

static inline void update_load_add(struct load_weight *lw, unsigned long inc)
{
--
2.7.4


2017-09-22 23:05:20

by Leo Yan

Subject: [PATCH 2/2] cpufreq: schedutil: consolidate capacity margin calculation

The CFS scheduler class uses the variable 'capacity_margin' to calculate
the capacity margin, and the schedutil governor needs to apply the same
margin for the frequency tipping point. The two currently use separate
formulas:

CFS: U` = U * capacity_margin / 1024 = U * 1.25
Schedutil: U` = U + (U >> 2) = U + U * 0.25 = U * 1.25

This patch consolidates the capacity margin calculation by making
schedutil use the same formula as the CFS class. This avoids a mismatch
between schedutil and the CFS class if 'capacity_margin' is changed to
another value.

Cc: Dietmar Eggemann <[email protected]>
Cc: Morten Rasmussen <[email protected]>
Cc: Chris Redpath <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Vincent Guittot <[email protected]>
Signed-off-by: Leo Yan <[email protected]>
---
kernel/sched/cpufreq_schedutil.c | 5 +++--
kernel/sched/sched.h | 1 +
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 9209d83..067abbe 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -155,7 +155,8 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
*
* next_freq = C * curr_freq * util_raw / max
*
- * Take C = 1.25 for the frequency tipping point at (util / max) = 0.8.
+ * Take C = capacity_margin / 1024 = 1.25, so it's for the frequency tipping
+ * point at (util / max) = 0.8.
*
* The lowest driver-supported frequency which is equal or greater than the raw
* next_freq (as calculated above) is returned, subject to policy min/max and
@@ -168,7 +169,7 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
unsigned int freq = arch_scale_freq_invariant() ?
policy->cpuinfo.max_freq : policy->cur;

- freq = (freq + (freq >> 2)) * util / max;
+ freq = (freq * capacity_margin / 1024) * util / max;

if (freq == sg_policy->cached_raw_freq && sg_policy->next_freq != UINT_MAX)
return sg_policy->next_freq;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 14db76c..cf75bdc 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -52,6 +52,7 @@ struct cpuidle_state;
#define TASK_ON_RQ_MIGRATING 2

extern __read_mostly int scheduler_running;
+extern unsigned int capacity_margin __read_mostly;

extern unsigned long calc_load_update;
extern atomic_long_t calc_load_tasks;
--
2.7.4
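
As a rough illustration of the consolidation described in the commit message, the following userspace sketch compares the hard-coded shift form with the capacity_margin form. The function names are hypothetical, and the margin is passed as a parameter here (an assumption made for testability) rather than read from the kernel variable.

```c
#include <assert.h>

/* Pre-patch schedutil form: freq plus a hard-coded 25% margin. */
static unsigned int next_freq_shift(unsigned int freq, unsigned long util,
				    unsigned long max)
{
	assert(max != 0);
	return (freq + (freq >> 2)) * util / max;
}

/*
 * Patched form: the margin is a fixed-point factor scaled by 1024,
 * so margin == 1280 corresponds to 1.25.
 */
static unsigned int next_freq_margin(unsigned int freq, unsigned int margin,
				     unsigned long util, unsigned long max)
{
	assert(max != 0);
	return (freq * margin / 1024) * util / max;
}
```

With margin == 1280 both functions return the same value, because freq + (freq >> 2) and freq * 1280 / 1024 both evaluate to floor(5 * freq / 4). With any other margin only the second form follows the change. Note that freq * margin is an unsigned int multiplication, so where unsigned int is 32 bits it would wrap for freq above roughly 3355443 at margin 1280.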

2017-09-25 08:15:40

by Peter Zijlstra

Subject: Re: [PATCH 2/2] cpufreq: schedutil: consolidate capacity margin calculation

On Sat, Sep 23, 2017 at 07:04:44AM +0800, Leo Yan wrote:
> + freq = (freq * capacity_margin / 1024) * util / max;

We have SCHED_CAPACITY_SCALE for that..
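
For readers unfamiliar with the constant, here is a minimal userspace sketch of how the margin relates to SCHED_CAPACITY_SCALE and to the 0.8 tipping point mentioned in the patch comment. The two macros are local stand-ins that mirror the kernel definitions, and tipping_point_util() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/* Local stand-ins mirroring the kernel's fixed-point capacity constants. */
#define SCHED_CAPACITY_SHIFT	10
#define SCHED_CAPACITY_SCALE	(1L << SCHED_CAPACITY_SHIFT)

/*
 * Utilization at which the margin-scaled request reaches the maximum:
 * util / max == SCHED_CAPACITY_SCALE / margin (0.8 for margin == 1280).
 */
static unsigned long tipping_point_util(unsigned long max, unsigned int margin)
{
	assert(margin != 0);
	return max * SCHED_CAPACITY_SCALE / margin;
}
```

For max == 1024 and margin == 1280 this gives 819, that is, about 80% of the maximum capacity.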

2017-09-25 13:55:16

by Patrick Bellasi

Subject: Re: [PATCH 2/2] cpufreq: schedutil: consolidate capacity margin calculation

On 23-Sep 07:04, Leo Yan wrote:
> The CFS scheduler class uses the variable 'capacity_margin' to calculate
> the capacity margin, and the schedutil governor needs to apply the same
> margin for the frequency tipping point. The two currently use separate
> formulas:
>
> CFS: U` = U * capacity_margin / 1024 = U * 1.25
> Schedutil: U` = U + (U >> 2) = U + U * 0.25 = U * 1.25
>
> This patch consolidates the capacity margin calculation by making
> schedutil use the same formula as the CFS class. This avoids a mismatch
> between schedutil and the CFS class if 'capacity_margin' is changed to
> another value.
>
> Cc: Dietmar Eggemann <[email protected]>
> Cc: Morten Rasmussen <[email protected]>
> Cc: Chris Redpath <[email protected]>
> Cc: Joel Fernandes <[email protected]>
> Cc: Vincent Guittot <[email protected]>
> Signed-off-by: Leo Yan <[email protected]>
> ---
> kernel/sched/cpufreq_schedutil.c | 5 +++--
> kernel/sched/sched.h | 1 +
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 9209d83..067abbe 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -155,7 +155,8 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> *
> * next_freq = C * curr_freq * util_raw / max
> *
> - * Take C = 1.25 for the frequency tipping point at (util / max) = 0.8.
> + * Take C = capacity_margin / 1024 = 1.25, so it's for the frequency tipping
> + * point at (util / max) = 0.8.
> *
> * The lowest driver-supported frequency which is equal or greater than the raw
> * next_freq (as calculated above) is returned, subject to policy min/max and
> @@ -168,7 +169,7 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> unsigned int freq = arch_scale_freq_invariant() ?
> policy->cpuinfo.max_freq : policy->cur;
>
> - freq = (freq + (freq >> 2)) * util / max;
> + freq = (freq * capacity_margin / 1024) * util / max;

The compiler should be smart enough, but perhaps it would be better to use:

freq *= (capacity_margin >> SCHED_CAPACITY_SHIFT);
freq *= util / max;

> if (freq == sg_policy->cached_raw_freq && sg_policy->next_freq != UINT_MAX)
> return sg_policy->next_freq;
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 14db76c..cf75bdc 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -52,6 +52,7 @@ struct cpuidle_state;
> #define TASK_ON_RQ_MIGRATING 2
>
> extern __read_mostly int scheduler_running;
> +extern unsigned int capacity_margin __read_mostly;
>
> extern unsigned long calc_load_update;
> extern atomic_long_t calc_load_tasks;
> --
> 2.7.4
>

--
#include <best/regards.h>

Patrick Bellasi

2017-09-28 06:33:41

by Leo Yan

Subject: Re: [PATCH 2/2] cpufreq: schedutil: consolidate capacity margin calculation

On Mon, Sep 25, 2017 at 02:55:07PM +0100, Patrick Bellasi wrote:
> On 23-Sep 07:04, Leo Yan wrote:
> > The CFS scheduler class uses the variable 'capacity_margin' to calculate
> > the capacity margin, and the schedutil governor needs to apply the same
> > margin for the frequency tipping point. The two currently use separate
> > formulas:
> >
> > CFS: U` = U * capacity_margin / 1024 = U * 1.25
> > Schedutil: U` = U + (U >> 2) = U + U * 0.25 = U * 1.25
> >
> > This patch consolidates the capacity margin calculation by making
> > schedutil use the same formula as the CFS class. This avoids a mismatch
> > between schedutil and the CFS class if 'capacity_margin' is changed to
> > another value.
> >
> > Cc: Dietmar Eggemann <[email protected]>
> > Cc: Morten Rasmussen <[email protected]>
> > Cc: Chris Redpath <[email protected]>
> > Cc: Joel Fernandes <[email protected]>
> > Cc: Vincent Guittot <[email protected]>
> > Signed-off-by: Leo Yan <[email protected]>
> > ---
> > kernel/sched/cpufreq_schedutil.c | 5 +++--
> > kernel/sched/sched.h | 1 +
> > 2 files changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 9209d83..067abbe 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -155,7 +155,8 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> > *
> > * next_freq = C * curr_freq * util_raw / max
> > *
> > - * Take C = 1.25 for the frequency tipping point at (util / max) = 0.8.
> > + * Take C = capacity_margin / 1024 = 1.25, so it's for the frequency tipping
> > + * point at (util / max) = 0.8.
> > *
> > * The lowest driver-supported frequency which is equal or greater than the raw
> > * next_freq (as calculated above) is returned, subject to policy min/max and
> > @@ -168,7 +169,7 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> > unsigned int freq = arch_scale_freq_invariant() ?
> > policy->cpuinfo.max_freq : policy->cur;
> >
> > - freq = (freq + (freq >> 2)) * util / max;
> > + freq = (freq * capacity_margin / 1024) * util / max;
>
> The compiler should be smart enough, but perhaps it would be better to use:
>
> freq *= (capacity_margin >> SCHED_CAPACITY_SHIFT);
> freq *= util / max;

Thanks for the suggestion. The '*=' operator has lower precedence than the
other operators, so I will fix it as below:

freq = freq * capacity_margin >> SCHED_CAPACITY_SHIFT;
freq = freq * util / max;
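
To illustrate why the order of operations matters here, the following userspace sketch contrasts the two forms under integer arithmetic. The function names are hypothetical, SCHED_CAPACITY_SHIFT is a local stand-in for the kernel macro, and the margin is passed as a parameter instead of being read from capacity_margin.

```c
#include <assert.h>

#define SCHED_CAPACITY_SHIFT	10	/* local stand-in for the kernel macro */

/* Compound-assignment form: each right-hand side is evaluated first. */
static unsigned int scale_rhs_first(unsigned int freq, unsigned int margin,
				    unsigned long util, unsigned long max)
{
	assert(max != 0);
	freq *= (margin >> SCHED_CAPACITY_SHIFT);	/* 1280 >> 10 == 1 */
	freq *= util / max;				/* 0 whenever util < max */
	return freq;
}

/* Multiply first, then shift or divide, so the fraction is not lost early. */
static unsigned int scale_multiply_first(unsigned int freq, unsigned int margin,
					 unsigned long util, unsigned long max)
{
	assert(max != 0);
	freq = freq * margin >> SCHED_CAPACITY_SHIFT;
	freq = freq * util / max;
	return freq;
}
```

With freq == 1200000, margin == 1280, util == 819 and max == 1024, the first form yields 0 because both right-hand sides truncate before the multiplication, while the second yields 1199707.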

> > if (freq == sg_policy->cached_raw_freq && sg_policy->next_freq != UINT_MAX)
> > return sg_policy->next_freq;
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index 14db76c..cf75bdc 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -52,6 +52,7 @@ struct cpuidle_state;
> > #define TASK_ON_RQ_MIGRATING 2
> >
> > extern __read_mostly int scheduler_running;
> > +extern unsigned int capacity_margin __read_mostly;
> >
> > extern unsigned long calc_load_update;
> > extern atomic_long_t calc_load_tasks;
> > --
> > 2.7.4
> >
>
> --
> #include <best/regards.h>
>
> Patrick Bellasi