2013-08-19 16:16:37

by Peter Zijlstra

Subject: [PATCH 01/10] sched: Remove one division operation in find_busiest_queue()

From: Joonsoo Kim <[email protected]>

Remove one division operation in find_busiest_queue() by using
crosswise multiplication:

wl_i / power_i > wl_j / power_j :=
wl_i * power_j > wl_j * power_i

Signed-off-by: Joonsoo Kim <[email protected]>
[peterz: expanded changelog]
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
kernel/sched/fair.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5018,7 +5018,7 @@ static struct rq *find_busiest_queue(str
struct sched_group *group)
{
struct rq *busiest = NULL, *rq;
- unsigned long max_load = 0;
+ unsigned long busiest_load = 0, busiest_power = SCHED_POWER_SCALE;
int i;

for_each_cpu(i, sched_group_cpus(group)) {
@@ -5049,10 +5049,9 @@ static struct rq *find_busiest_queue(str
* the load can be moved away from the cpu that is potentially
* running at a lower capacity.
*/
- wl = (wl * SCHED_POWER_SCALE) / power;
-
- if (wl > max_load) {
- max_load = wl;
+ if (wl * busiest_power > busiest_load * power) {
+ busiest_load = wl;
+ busiest_power = power;
busiest = rq;
}
}
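
To illustrate the changelog, here is a minimal standalone sketch (not the kernel code) that runs the old division-based selection and the new cross-multiplied selection over the same inputs. The struct cpu_sample type, both function names, and the SCHED_POWER_SCALE value of 1024 are assumptions made for this example; the sketch also assumes every power is non-zero and that the products fit in an unsigned long.

```c
#include <assert.h>
#include <stddef.h>

#define SCHED_POWER_SCALE 1024UL /* assumed value for this sketch */

/* Hypothetical stand-in for the per-cpu values read inside the loop. */
struct cpu_sample {
	unsigned long wl;    /* weighted load of the cpu */
	unsigned long power; /* cpu power, assumed > 0 */
};

/* Old scheme: scale each load by its power, one division per cpu. */
static int busiest_by_division(const struct cpu_sample *s, size_t n)
{
	unsigned long max_load = 0;
	int busiest = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long wl;

		assert(s[i].power > 0);
		wl = (s[i].wl * SCHED_POWER_SCALE) / s[i].power;
		if (wl > max_load) {
			max_load = wl;
			busiest = (int)i;
		}
	}
	return busiest;
}

/*
 * New scheme: compare wl_i / power_i against wl_j / power_j as
 * wl_i * power_j > wl_j * power_i, where j is the previous maximum.
 */
static int busiest_by_cross_mult(const struct cpu_sample *s, size_t n)
{
	unsigned long busiest_load = 0, busiest_power = 1;
	int busiest = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(s[i].power > 0);
		if (s[i].wl * busiest_power > busiest_load * s[i].power) {
			busiest_load = s[i].wl;
			busiest_power = s[i].power;
			busiest = (int)i;
		}
	}
	return busiest;
}
```

Both functions return the index of the selected entry, or -1 when every load is zero. Because the division version truncates, the two can disagree on near-ties; the cross-multiplied comparison is the exact one.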


2013-08-22 08:59:02

by Paul Turner

Subject: Re: [PATCH 01/10] sched: Remove one division operation in find_busiest_queue()

On Mon, Aug 19, 2013 at 9:00 AM, Peter Zijlstra <[email protected]> wrote:
> From: Joonsoo Kim <[email protected]>
>
> Remove one division operation in find_busiest_queue() by using
> crosswise multiplication:
>
> wl_i / power_i > wl_j / power_j :=
> wl_i * power_j > wl_j * power_i
>
> Signed-off-by: Joonsoo Kim <[email protected]>
> [peterz: expanded changelog]
> Signed-off-by: Peter Zijlstra <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> ---
> kernel/sched/fair.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5018,7 +5018,7 @@ static struct rq *find_busiest_queue(str
> struct sched_group *group)
> {
> struct rq *busiest = NULL, *rq;
> - unsigned long max_load = 0;
> + unsigned long busiest_load = 0, busiest_power = SCHED_POWER_SCALE;

Initializing this to SCHED_POWER_SCALE assigns a meaning that isn't
really there. How about just 1?

> int i;
>
> for_each_cpu(i, sched_group_cpus(group)) {
> @@ -5049,10 +5049,9 @@ static struct rq *find_busiest_queue(str
> * the load can be moved away from the cpu that is potentially
> * running at a lower capacity.
> */
> - wl = (wl * SCHED_POWER_SCALE) / power;
> -
> - if (wl > max_load) {
> - max_load = wl;

A comment wouldn't hurt here.

> + if (wl * busiest_power > busiest_load * power) {
> + busiest_load = wl;
> + busiest_power = power;
> busiest = rq;
> }
> }
>
>

2013-08-22 10:25:26

by Peter Zijlstra

Subject: Re: [PATCH 01/10] sched: Remove one division operation in find_busiest_queue()

On Thu, Aug 22, 2013 at 01:58:28AM -0700, Paul Turner wrote:

> > wl_i / power_i > wl_j / power_j :=
> > wl_i * power_j > wl_j * power_i
> >
> > struct rq *busiest = NULL, *rq;
> > - unsigned long max_load = 0;
> > + unsigned long busiest_load = 0, busiest_power = SCHED_POWER_SCALE;
>
> Initializing this to SCHED_POWER_SCALE assigns a meaning that isn't
> really there. How about just 1?

Right, 1 works. With busiest_load starting at 0, the first comparison
reduces to wl * busiest_power > 0, so all we really need is for wl to be > 0.
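
A small sketch of why the initial value is arbitrary, assuming a hypothetical pick_busiest() helper and cpu_sample type that are not part of the patch: while busiest_load is still 0, the right-hand side of the comparison is 0, so any non-zero init_power admits the first cpu with wl > 0, and after that the initial value is overwritten.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu values used by the loop. */
struct cpu_sample {
	unsigned long wl;    /* weighted load of the cpu */
	unsigned long power; /* cpu power, assumed > 0 */
};

/*
 * Cross-multiplied selection with a caller-chosen initial
 * busiest_power, to show the result does not depend on it.
 */
static int pick_busiest(const struct cpu_sample *s, size_t n,
			unsigned long init_power)
{
	unsigned long busiest_load = 0, busiest_power = init_power;
	int busiest = -1;
	size_t i;

	assert(init_power > 0);
	for (i = 0; i < n; i++) {
		/* With busiest_load == 0 this is just wl * busiest_power > 0. */
		if (s[i].wl * busiest_power > busiest_load * s[i].power) {
			busiest_load = s[i].wl;
			busiest_power = s[i].power;
			busiest = (int)i;
		}
	}
	return busiest;
}
```

Calling pick_busiest() with init_power of 1 or 1024 on the same samples selects the same entry, which is the point being made about SCHED_POWER_SCALE carrying no real meaning here.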

> > int i;
> >
> > for_each_cpu(i, sched_group_cpus(group)) {
> > @@ -5049,10 +5049,9 @@ static struct rq *find_busiest_queue(str
> > * the load can be moved away from the cpu that is potentially
> > * running at a lower capacity.
> > */
> > - wl = (wl * SCHED_POWER_SCALE) / power;
> > -
> > - if (wl > max_load) {
> > - max_load = wl;
>
> A comment wouldn't hurt here.

Agreed, something like so?

/*
 * Since we're looking for max(wl_i / power_i), crosswise multiplication
 * to rid ourselves of the division works out to:
 * wl_i * power_j > wl_j * power_i, where j is our previous maximum.
 */

> > + if (wl * busiest_power > busiest_load * power) {
> > + busiest_load = wl;
> > + busiest_power = power;
> > busiest = rq;
> > }
> > }
> >
> >