2013-05-23 08:35:45

by Lukasz Majewski

Subject: [PATCH 1/2] sched: Use do_div() for 64 bit division at power utilization calculation (putil)

Explicit casting is now done when the power usage variable (putil) is calculated.
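
For reference, a minimal sketch of the do_div() contract this relies on; the
helper name and the literal shift value are illustrative, not part of the
patch. On 32-bit kernels a plain "/" on a u64 would emit a libgcc call
(__udivdi3) that the kernel does not provide, so do_div() is used instead: it
divides the 64-bit dividend in place and returns the 32-bit remainder, which
is why the value is first copied into a u64 temporary.

#include <asm/div64.h>

/* Illustrative only: mirrors the putil computation in the hunk below. */
static unsigned int task_util_scaled(u32 sum, u32 period)
{
        u64 tmp = (u64)sum << 10;       /* SCHED_POWER_SHIFT is 10 upstream */

        do_div(tmp, period + 1);        /* tmp now holds the quotient */
        return (unsigned int)tmp;
}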

Signed-off-by: Lukasz Majewski <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
This patch was developed on top of Alex's repository:
https://github.com/alexshi/power-scheduling/commits/power-scheduling
---
kernel/sched/fair.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f0ead38..14d29b3 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3471,8 +3471,10 @@ find_leader_cpu(struct sched_group *group, struct task_struct *p, int this_cpu,
int leader_cpu = -1;
int i;
/* percentage of the task's util */
- unsigned putil = (u64)(p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT)
- / (p->se.avg.runnable_avg_period + 1);
+ unsigned putil;
+ u64 tmp = (u64) p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT;
+ do_div(tmp, (p->se.avg.runnable_avg_period + 1));
+ putil = (unsigned) tmp;

/* bias toward local cpu */
if (cpumask_test_cpu(this_cpu, tsk_cpus_allowed(p)) &&
--
1.7.10.4


2013-05-23 08:36:01

by Lukasz Majewski

Subject: [PATCH 2/2] sched:fix: Signed overflow prevention for vacancy calculation

A nasty bug in the vacancy calculation has been fixed.
In the if statement, FULL_UTIL is a large constant, max_cfs_util()
returns an unsigned value, and putil is also defined as unsigned. The
whole expression is therefore evaluated as unsigned and wraps around
instead of going negative, so the condition is always true.

As observed, this was the reason for frequent jumps of processes
between CPUs.
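
For reference, a minimal userspace sketch of the wraparound; the values are
made up and full_util merely stands in for FULL_UTIL. With unsigned operands
the subtraction wraps to a huge positive value instead of going negative,
which is why the fix evaluates the vacancy as a signed int before comparing.

#include <stdio.h>

int main(void)
{
        unsigned int full_util = 1024;  /* stand-in for FULL_UTIL */
        unsigned int cfs_util = 900;    /* stand-in for max_cfs_util() */
        unsigned int putil = 100;

        /* 1024 - 900 - 400 wraps to 4294967020u, so this branch is taken */
        if (full_util - cfs_util - (putil << 2) > 0)
                printf("unsigned: %u\n", full_util - cfs_util - (putil << 2));

        /* the fix: compute the vacancy as a signed quantity */
        int vacancy = (int)(full_util - cfs_util - (putil << 2));
        printf("signed vacancy: %d\n", vacancy);        /* prints -276 */

        return 0;
}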

Signed-off-by: Lukasz Majewski <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
This patch was developed on top of Alex's repository:
https://github.com/alexshi/power-scheduling/commits/power-scheduling
---
kernel/sched/fair.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 14d29b3..8b07f6c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3476,9 +3476,10 @@ find_leader_cpu(struct sched_group *group, struct task_struct *p, int this_cpu,
do_div(tmp, (p->se.avg.runnable_avg_period + 1));
putil = (unsigned) tmp;

+ vacancy = (int) (FULL_UTIL - max_cfs_util(this_cpu, wakeup) -
+ (putil << 2));
/* bias toward local cpu */
- if (cpumask_test_cpu(this_cpu, tsk_cpus_allowed(p)) &&
- FULL_UTIL - max_cfs_util(this_cpu, wakeup) - (putil << 2) > 0)
+ if (cpumask_test_cpu(this_cpu, tsk_cpus_allowed(p)) && vacancy > 0)
return this_cpu;

/* Traverse only the allowed CPUs */
--
1.7.10.4

2013-05-30 01:49:36

by Alex Shi

Subject: Re: [PATCH 1/2] sched: Use do_div() for 64 bit division at power utilization calculation (putil)

On 05/23/2013 04:34 PM, Lukasz Majewski wrote:
> Explicit casting is now done when the power usage variable (putil) is calculated.
>
> Signed-off-by: Lukasz Majewski <[email protected]>
> Signed-off-by: Kyungmin Park <[email protected]>
> ---
> This patch was developed on top of Alex's repository:
> https://github.com/alexshi/power-scheduling/commits/power-scheduling
> ---
> kernel/sched/fair.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>


Thanks for catching this issue. It seems div_u64 is a better fit, and there
are two more instances of the same bug. So, could I rewrite the patch as
follows?
---

From 9f72c25607351981898d99822f5a66e0ca67a3da Mon Sep 17 00:00:00 2001
From: Alex Shi <[email protected]>
Date: Wed, 29 May 2013 11:09:39 +0800
Subject: [PATCH 1/2] sched: fix cast on power utilization calculation and use
div_u64

Explicit casting is now done when the power usage variable (putil) is
calculated.
div_u64() is optimized for a u32 divisor.
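
A minimal sketch of the div_u64() usage assumed here; the wrapper is
illustrative, not from the patch. div_u64() (from <linux/math64.h>) takes the
u32 divisor by value and returns the quotient, so unlike do_div() no mutable
u64 temporary is needed, and on 32-bit it avoids the slower full 64-by-64
division path.

#include <linux/math64.h>

/* Illustrative only: the putil computation written as a helper. */
static unsigned int task_putil(u32 avg_sum, u32 avg_period)
{
        /* (sum << SCHED_POWER_SHIFT) / (period + 1), a 64-by-32 division */
        return (unsigned int)div_u64((u64)avg_sum << 10, avg_period + 1);
}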

Signed-off-by: Lukasz Majewski <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
Signed-off-by: Alex Shi <[email protected]>
---
kernel/sched/fair.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 09ae48a..3a4917c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1504,8 +1504,8 @@ static inline void update_rq_runnable_avg(struct rq *rq, int runnable)
__update_tg_runnable_avg(&rq->avg, &rq->cfs);

period = rq->avg.runnable_avg_period ? rq->avg.runnable_avg_period : 1;
- rq->util = (u64)(rq->avg.runnable_avg_sum << SCHED_POWER_SHIFT)
- / period;
+ rq->util = div_u64(((u64)rq->avg.runnable_avg_sum << SCHED_POWER_SHIFT),
+ period);
}

/* Add the load generated by se into cfs_rq's child load-average */
@@ -3407,8 +3407,8 @@ static int is_sd_full(struct sched_domain *sd,
/* p maybe a new forked task */
putil = FULL_UTIL;
else
- putil = (u64)(p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT)
- / (p->se.avg.runnable_avg_period + 1);
+ putil = div_u64(((u64)p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT),
+ p->se.avg.runnable_avg_period + 1);

/* Try to collect the domain's utilization */
group = sd->groups;
@@ -3463,9 +3463,11 @@ find_leader_cpu(struct sched_group *group, struct task_struct *p, int this_cpu,
int vacancy, min_vacancy = INT_MAX;
int leader_cpu = -1;
int i;
+
/* percentage of the task's util */
- unsigned putil = (u64)(p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT)
- / (p->se.avg.runnable_avg_period + 1);
+ unsigned putil;
+ putil = div_u64(((u64)p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT),
+ p->se.avg.runnable_avg_period + 1);

/* bias toward local cpu */
if (cpumask_test_cpu(this_cpu, tsk_cpus_allowed(p)) &&
--
1.7.12


--
Thanks
Alex

2013-05-30 01:51:36

by Alex Shi

Subject: Re: [PATCH 2/2] sched:fix: Signed overflow prevention for vacancy calculation

On 05/23/2013 04:34 PM, Lukasz Majewski wrote:
> A nasty bug in the vacancy calculation has been fixed.
> In the if statement, FULL_UTIL is a large constant, max_cfs_util()
> returns an unsigned value, and putil is also defined as unsigned. The
> whole expression is therefore evaluated as unsigned and wraps around
> instead of going negative, so the condition is always true.
>
> As observed, this was the reason for frequent jumps of processes
> between CPUs.
>
> Signed-off-by: Lukasz Majewski <[email protected]>
> Signed-off-by: Kyungmin Park <[email protected]>

Thanks for catching this!
I made a small change; would you like to keep the authorship?

---

From 4f7f2d091b7210e0be689d3e063d3cc82da1b1af Mon Sep 17 00:00:00 2001
From: Lukasz Majewski <[email protected]>
Date: Thu, 30 May 2013 09:21:46 +0800
Subject: [PATCH 2/2] sched:fix: Signed overflow prevention for vacancy
calculation

A nasty bug in the vacancy calculation has been fixed.
In the if statement, FULL_UTIL is a large constant, max_cfs_util()
returns an unsigned value, and putil is also defined as unsigned. The
whole expression is therefore evaluated as unsigned and wraps around
instead of going negative, so the condition is always true.

As observed, this was the reason for frequent jumps of processes
between CPUs.

Signed-off-by: Lukasz Majewski <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
Signed-off-by: Alex Shi <[email protected]>
---
kernel/sched/fair.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3a4917c..56a9d16 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3468,10 +3468,10 @@ find_leader_cpu(struct sched_group *group, struct task_struct *p, int this_cpu,
unsigned putil;
putil = div_u64(((u64)p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT),
p->se.avg.runnable_avg_period + 1);
+ vacancy = FULL_UTIL - max_cfs_util(this_cpu, wakeup) - (putil << 2);

/* bias toward local cpu */
- if (cpumask_test_cpu(this_cpu, tsk_cpus_allowed(p)) &&
- FULL_UTIL - max_cfs_util(this_cpu, wakeup) - (putil << 2) > 0)
+ if (cpumask_test_cpu(this_cpu, tsk_cpus_allowed(p)) && vacancy > 0)
return this_cpu;

/* Traverse only the allowed CPUs */
--
1.7.12


--
Thanks
Alex

2013-05-31 06:15:04

by Lukasz Majewski

Subject: Re: [PATCH 1/2] sched: Use do_div() for 64 bit division at power utilization calculation (putil)

Hi Alex,

> On 05/23/2013 04:34 PM, Lukasz Majewski wrote:
> > Explicit casting is now done when the power usage variable (putil)
> > is calculated.
> >
> > Signed-off-by: Lukasz Majewski <[email protected]>
> > Signed-off-by: Kyungmin Park <[email protected]>
> > ---
> > This patch was developed on top of Alex's repository:
> > https://github.com/alexshi/power-scheduling/commits/power-scheduling
> > ---
> > kernel/sched/fair.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
>
>
> Thanks for catching this issue. It seems div_u64 is a better fit, and
> there are two more instances of the same bug. So, could I rewrite the
> patch as follows?

Yes, no problem.

> ---
>
> From 9f72c25607351981898d99822f5a66e0ca67a3da Mon Sep 17 00:00:00 2001
> From: Alex Shi <[email protected]>
> Date: Wed, 29 May 2013 11:09:39 +0800
> Subject: [PATCH 1/2] sched: fix cast on power utilization calculation
> and use div_u64
>
> Explicit casting is now done when the power usage variable (putil) is
> calculated.
> div_u64() is optimized for a u32 divisor.
>
> Signed-off-by: Lukasz Majewski <[email protected]>
> Signed-off-by: Kyungmin Park <[email protected]>
> Signed-off-by: Alex Shi <[email protected]>
> ---
> kernel/sched/fair.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 09ae48a..3a4917c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1504,8 +1504,8 @@ static inline void update_rq_runnable_avg(struct rq *rq, int runnable)
> __update_tg_runnable_avg(&rq->avg, &rq->cfs);
>
> period = rq->avg.runnable_avg_period ? rq->avg.runnable_avg_period : 1;
> - rq->util = (u64)(rq->avg.runnable_avg_sum << SCHED_POWER_SHIFT)
> - / period;
> + rq->util = div_u64(((u64)rq->avg.runnable_avg_sum << SCHED_POWER_SHIFT),
> + period);
> }
>
> /* Add the load generated by se into cfs_rq's child load-average */
> @@ -3407,8 +3407,8 @@ static int is_sd_full(struct sched_domain *sd,
> /* p maybe a new forked task */
> putil = FULL_UTIL;
> else
> - putil = (u64)(p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT)
> - / (p->se.avg.runnable_avg_period + 1);
> + putil = div_u64(((u64)p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT),
> + p->se.avg.runnable_avg_period + 1);
>
> /* Try to collect the domain's utilization */
> group = sd->groups;
> @@ -3463,9 +3463,11 @@ find_leader_cpu(struct sched_group *group, struct task_struct *p, int this_cpu,
> int vacancy, min_vacancy = INT_MAX;
> int leader_cpu = -1;
> int i;
> +
> /* percentage of the task's util */
> - unsigned putil = (u64)(p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT)
> - / (p->se.avg.runnable_avg_period + 1);
> + unsigned putil;
> + putil = div_u64(((u64)p->se.avg.runnable_avg_sum << SCHED_POWER_SHIFT),
> + p->se.avg.runnable_avg_period + 1);
>
> /* bias toward local cpu */
> if (cpumask_test_cpu(this_cpu, tsk_cpus_allowed(p)) &&



--
Best regards,

Lukasz Majewski

Samsung R&D Poland (SRPOL) | Linux Platform Group