2012-02-06 09:38:44

by Michael wang

Subject: [PATCH] sched: Avoid unnecessary work in reweight_entity

From: Michael Wang <[email protected]>

In the original code, using account_entity_dequeue and account_entity_enqueue
as a pair does some unnecessary work; this patch combines their work
to save time.

Signed-off-by: Michael Wang <[email protected]>
---
kernel/sched/fair.c | 32 ++++++++++++++++++++++++--------
kernel/sched/sched.h | 8 ++++++++
2 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 84adb2d..e380518 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -782,11 +782,23 @@ add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
{
cfs_rq->task_weight += weight;
}
+
+static void
+sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
+{
+ cfs_rq->task_weight -= sub;
+ cfs_rq->task_weight += add;
+}
#else
static inline void
add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
{
}
+
+static void
+sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
+{
+}
#endif

static void
@@ -938,20 +950,24 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
{
}
# endif /* CONFIG_SMP */
+
static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
unsigned long weight)
{
+ /* commit outstanding execution time */
+ if (cfs_rq->curr == se)
+ update_curr(cfs_rq);
+
if (se->on_rq) {
- /* commit outstanding execution time */
- if (cfs_rq->curr == se)
- update_curr(cfs_rq);
- account_entity_dequeue(cfs_rq, se);
+ update_load_sub_add(&cfs_rq->load, se->load.weight, weight);
+ if (!parent_entity(se)) {
+ update_load_sub_add(&rq_of(cfs_rq), se->load.weight, weight);
+ }
+ if (entity_is_task(se)) {
+ sub_add_cfs_task_weight(cfs_rq, se->load.weight, weight);
+ }
}
-
update_load_set(&se->load, weight);
-
- if (se->on_rq)
- account_entity_enqueue(cfs_rq, se);
}

static void update_cfs_shares(struct cfs_rq *cfs_rq)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 98c0c26..ec4430f 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -779,6 +779,14 @@ static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
lw->inv_weight = 0;
}

+static inline void update_load_sub_add(struct load_weight *lw, unsigned long dec,
+ unsigned long inc)
+{
+ lw->weight -= dec;
+ lw->weight += inc;
+ lw->inv_weight = 0;
+}
+
static inline void update_load_set(struct load_weight *lw, unsigned long w)
{
lw->weight = w;
--
1.7.4.1


2012-02-06 10:31:48

by Michael wang

Subject: Re: [PATCH] sched: Avoid unnecessary work in reweight_entity

On 02/06/2012 05:37 PM, Michael Wang wrote:

> From: Michael Wang <[email protected]>
>
> In the original code, using account_entity_dequeue and account_entity_enqueue
> as a pair does some unnecessary work; this patch combines their work
> to save time.
>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> kernel/sched/fair.c | 32 ++++++++++++++++++++++++--------
> kernel/sched/sched.h | 8 ++++++++
> 2 files changed, 32 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 84adb2d..e380518 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -782,11 +782,23 @@ add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
> {
> cfs_rq->task_weight += weight;
> }
> +
> +static void
> +sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
> +{
> + cfs_rq->task_weight -= sub;
> + cfs_rq->task_weight += add;
> +}
> #else
> static inline void
> add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
> {
> }
> +
> +static void
> +sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
> +{
> +}
> #endif
>
> static void
> @@ -938,20 +950,24 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
> {
> }
> # endif /* CONFIG_SMP */
> +
> static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
> unsigned long weight)
> {
> + /* commit outstanding execution time */
> + if (cfs_rq->curr == se)
> + update_curr(cfs_rq);

I suppose "se is curr" implies "se is already on rq", and "se is not on
rq" implies "se cannot be curr", so I reordered the logic.

> +

> if (se->on_rq) {
> - /* commit outstanding execution time */
> - if (cfs_rq->curr == se)
> - update_curr(cfs_rq);
> - account_entity_dequeue(cfs_rq, se);
> + update_load_sub_add(&cfs_rq->load, se->load.weight, weight);
> + if (!parent_entity(se)) {
> + update_load_sub_add(&rq_of(cfs_rq), se->load.weight, weight);
> + }
> + if (entity_is_task(se)) {
> + sub_add_cfs_task_weight(cfs_rq, se->load.weight, weight);
> + }
> }
> -
> update_load_set(&se->load, weight);

I wonder whether this reordering could introduce a mistake, but I think this
function should already be protected.

Thanks,
Michael Wang

> -
> - if (se->on_rq)
> - account_entity_enqueue(cfs_rq, se);
> }
>
> static void update_cfs_shares(struct cfs_rq *cfs_rq)
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 98c0c26..ec4430f 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -779,6 +779,14 @@ static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
> lw->inv_weight = 0;
> }
>
> +static inline void update_load_sub_add(struct load_weight *lw, unsigned long dec,
> + unsigned long inc)
> +{
> + lw->weight -= dec;
> + lw->weight += inc;
> + lw->inv_weight = 0;
> +}
> +
> static inline void update_load_set(struct load_weight *lw, unsigned long w)
> {
> lw->weight = w;

2012-02-07 02:18:47

by Michael wang

Subject: Re: [PATCH] sched: Avoid unnecessary work in reweight_entity

On 02/06/2012 05:37 PM, Michael Wang wrote:

> From: Michael Wang <[email protected]>
>
> In the original code, using account_entity_dequeue and account_entity_enqueue
> as a pair does some unnecessary work; this patch combines their work
> to save time.
>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> kernel/sched/fair.c | 32 ++++++++++++++++++++++++--------
> kernel/sched/sched.h | 8 ++++++++
> 2 files changed, 32 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 84adb2d..e380518 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -782,11 +782,23 @@ add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
> {
> cfs_rq->task_weight += weight;
> }
> +
> +static void
> +sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
> +{
> + cfs_rq->task_weight -= sub;
> + cfs_rq->task_weight += add;
> +}
> #else
> static inline void
> add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
> {
> }
> +
> +static void
> +sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
> +{
> +}
> #endif
>
> static void
> @@ -938,20 +950,24 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
> {
> }
> # endif /* CONFIG_SMP */
> +
> static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
> unsigned long weight)
> {
> + /* commit outstanding execution time */
> + if (cfs_rq->curr == se)
> + update_curr(cfs_rq);
> +
> if (se->on_rq) {
> - /* commit outstanding execution time */
> - if (cfs_rq->curr == se)
> - update_curr(cfs_rq);
> - account_entity_dequeue(cfs_rq, se);
> + update_load_sub_add(&cfs_rq->load, se->load.weight, weight);
> + if (!parent_entity(se)) {
> + update_load_sub_add(&rq_of(cfs_rq), se->load.weight, weight);

Sorry for making a silly mistake here; I should have tried to build it first.
I will send a corrected patch soon.

Regards,
Michael Wang

> + }
> + if (entity_is_task(se)) {
> + sub_add_cfs_task_weight(cfs_rq, se->load.weight, weight);
> + }
> }
> -
> update_load_set(&se->load, weight);
> -
> - if (se->on_rq)
> - account_entity_enqueue(cfs_rq, se);
> }
>
> static void update_cfs_shares(struct cfs_rq *cfs_rq)
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 98c0c26..ec4430f 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -779,6 +779,14 @@ static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
> lw->inv_weight = 0;
> }
>
> +static inline void update_load_sub_add(struct load_weight *lw, unsigned long dec,
> + unsigned long inc)
> +{
> + lw->weight -= dec;
> + lw->weight += inc;
> + lw->inv_weight = 0;
> +}
> +
> static inline void update_load_set(struct load_weight *lw, unsigned long w)
> {
> lw->weight = w;

2012-02-07 02:37:05

by Michael wang

Subject: [PATCH v2] Avoid unnecessary work in reweight_entity

From: Michael Wang <[email protected]>

In the original code, using account_entity_dequeue and account_entity_enqueue
as a pair does some unnecessary work, such as removing a node from a list
and adding it back again immediately.

Because no real enqueue or dequeue happens here, it is also not
suitable to use account_entity_dequeue and account_entity_enqueue here.

This patch combines their work in order to save time.

v2:
Add more description and correct the wrong code.

Signed-off-by: Michael Wang <[email protected]>
---
kernel/sched/fair.c | 32 ++++++++++++++++++++++++--------
kernel/sched/sched.h | 8 ++++++++
2 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 84adb2d..050d66b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -782,11 +782,23 @@ add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
{
cfs_rq->task_weight += weight;
}
+
+static void
+sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
+{
+ cfs_rq->task_weight -= sub;
+ cfs_rq->task_weight += add;
+}
#else
static inline void
add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
{
}
+
+static void
+sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
+{
+}
#endif

static void
@@ -938,20 +950,24 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
{
}
# endif /* CONFIG_SMP */
+
static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
unsigned long weight)
{
+ /* commit outstanding execution time */
+ if (cfs_rq->curr == se)
+ update_curr(cfs_rq);
+
if (se->on_rq) {
- /* commit outstanding execution time */
- if (cfs_rq->curr == se)
- update_curr(cfs_rq);
- account_entity_dequeue(cfs_rq, se);
+ update_load_sub_add(&cfs_rq->load, se->load.weight, weight);
+ if (!parent_entity(se)) {
+ update_load_sub_add(&rq_of(cfs_rq)->load, se->load.weight, weight);
+ }
+ if (entity_is_task(se)) {
+ sub_add_cfs_task_weight(cfs_rq, se->load.weight, weight);
+ }
}
-
update_load_set(&se->load, weight);
-
- if (se->on_rq)
- account_entity_enqueue(cfs_rq, se);
}

static void update_cfs_shares(struct cfs_rq *cfs_rq)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 98c0c26..ec4430f 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -779,6 +779,14 @@ static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
lw->inv_weight = 0;
}

+static inline void update_load_sub_add(struct load_weight *lw, unsigned long dec,
+ unsigned long inc)
+{
+ lw->weight -= dec;
+ lw->weight += inc;
+ lw->inv_weight = 0;
+}
+
static inline void update_load_set(struct load_weight *lw, unsigned long w)
{
lw->weight = w;
--
1.7.4.1

2012-02-08 02:11:45

by Michael wang

Subject: [PATCH v3] sched: Avoid unnecessary work in reweight_entity

From: Michael Wang <[email protected]>

In the original code, using account_entity_dequeue and account_entity_enqueue
as a pair does some unnecessary work, such as removing a node from a list
and adding it back again immediately.

And because no real enqueue or dequeue happens here, it is not
suitable to use account_entity_dequeue and account_entity_enqueue here.

This patch combines their work in order to save time.

v2:
Add more description and correct the wrong code.

v3:
Mark add_cfs_task_weight and sub_add_cfs_task_weight as inline.

Signed-off-by: Michael Wang <[email protected]>
---
kernel/sched/fair.c | 34 +++++++++++++++++++++++++---------
kernel/sched/sched.h | 8 ++++++++
2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7c6414f..8726750 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -777,16 +777,28 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
*/

#if defined CONFIG_SMP && defined CONFIG_FAIR_GROUP_SCHED
-static void
+static inline void
add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
{
cfs_rq->task_weight += weight;
}
+
+static inline void
+sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
+{
+ cfs_rq->task_weight -= sub;
+ cfs_rq->task_weight += add;
+}
#else
static inline void
add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
{
}
+
+static inline void
+sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
+{
+}
#endif

static void
@@ -938,20 +950,24 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
{
}
# endif /* CONFIG_SMP */
+
static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
unsigned long weight)
{
+ /* commit outstanding execution time */
+ if (cfs_rq->curr == se)
+ update_curr(cfs_rq);
+
if (se->on_rq) {
- /* commit outstanding execution time */
- if (cfs_rq->curr == se)
- update_curr(cfs_rq);
- account_entity_dequeue(cfs_rq, se);
+ update_load_sub_add(&cfs_rq->load, se->load.weight, weight);
+ if (!parent_entity(se)) {
+ update_load_sub_add(&rq_of(cfs_rq)->load, se->load.weight, weight);
+ }
+ if (entity_is_task(se)) {
+ sub_add_cfs_task_weight(cfs_rq, se->load.weight, weight);
+ }
}
-
update_load_set(&se->load, weight);
-
- if (se->on_rq)
- account_entity_enqueue(cfs_rq, se);
}

static void update_cfs_shares(struct cfs_rq *cfs_rq)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 98c0c26..ec4430f 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -779,6 +779,14 @@ static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
lw->inv_weight = 0;
}

+static inline void update_load_sub_add(struct load_weight *lw, unsigned long dec,
+ unsigned long inc)
+{
+ lw->weight -= dec;
+ lw->weight += inc;
+ lw->inv_weight = 0;
+}
+
static inline void update_load_set(struct load_weight *lw, unsigned long w)
{
lw->weight = w;
--
1.7.4.1

2012-02-16 14:14:42

by Michael wang

Subject: [PATCH v4] sched: Avoid unnecessary work in reweight_entity

From: Michael Wang <[email protected]>

In the original code, using account_entity_dequeue and account_entity_enqueue
as a pair does some unnecessary work, such as removing a node from a list
and adding it back again immediately.

And because no real enqueue or dequeue happens here, it is not
suitable to use account_entity_dequeue and account_entity_enqueue here.

This patch combines their work in order to save time.

v2:
Add more description and correct the wrong code.

v3:
Mark add_cfs_task_weight and sub_add_cfs_task_weight as inline.

v4:
Now we invoke update_curr only when really necessary.

Signed-off-by: Michael Wang <[email protected]>
---
kernel/sched/fair.c | 27 +++++++++++++++++++++------
kernel/sched/sched.h | 8 ++++++++
2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7c6414f..fda63ec 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -777,16 +777,28 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
*/

#if defined CONFIG_SMP && defined CONFIG_FAIR_GROUP_SCHED
-static void
+static inline void
add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
{
cfs_rq->task_weight += weight;
}
+
+static inline void
+sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
+{
+ cfs_rq->task_weight -= sub;
+ cfs_rq->task_weight += add;
+}
#else
static inline void
add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
{
}
+
+static inline void
+sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
+{
+}
#endif

static void
@@ -938,6 +950,7 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
{
}
# endif /* CONFIG_SMP */
+
static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
unsigned long weight)
{
@@ -945,13 +958,15 @@ static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
/* commit outstanding execution time */
if (cfs_rq->curr == se)
update_curr(cfs_rq);
- account_entity_dequeue(cfs_rq, se);
+ update_load_sub_add(&cfs_rq->load, se->load.weight, weight);
+ if (!parent_entity(se)) {
+ update_load_sub_add(&rq_of(cfs_rq)->load, se->load.weight, weight);
+ }
+ if (entity_is_task(se)) {
+ sub_add_cfs_task_weight(cfs_rq, se->load.weight, weight);
+ }
}
-
update_load_set(&se->load, weight);
-
- if (se->on_rq)
- account_entity_enqueue(cfs_rq, se);
}

static void update_cfs_shares(struct cfs_rq *cfs_rq)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 98c0c26..ec4430f 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -779,6 +779,14 @@ static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
lw->inv_weight = 0;
}

+static inline void update_load_sub_add(struct load_weight *lw, unsigned long dec,
+ unsigned long inc)
+{
+ lw->weight -= dec;
+ lw->weight += inc;
+ lw->inv_weight = 0;
+}
+
static inline void update_load_set(struct load_weight *lw, unsigned long w)
{
lw->weight = w;
--
1.7.4.1

2012-02-16 14:29:38

by Peter Zijlstra

Subject: Re: [PATCH v4] sched: Avoid unnecessary work in reweight_entity

On Thu, 2012-02-16 at 22:14 +0800, Michael Wang wrote:
> From: Michael Wang <[email protected]>
>
> In the original code, using account_entity_dequeue and account_entity_enqueue
> as a pair does some unnecessary work, such as removing a node from a list
> and adding it back again immediately.
>
> And because no real enqueue or dequeue happens here, it is not
> suitable to use account_entity_dequeue and account_entity_enqueue here.
>
> This patch combines their work in order to save time.

And yet no numbers on how much it saves :-(

> Signed-off-by: Michael Wang <[email protected]>
> ---
> kernel/sched/fair.c | 27 +++++++++++++++++++++------
> kernel/sched/sched.h | 8 ++++++++
> 2 files changed, 29 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 7c6414f..fda63ec 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -777,16 +777,28 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> */
>
> #if defined CONFIG_SMP && defined CONFIG_FAIR_GROUP_SCHED
> -static void
> +static inline void
> add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
> {
> cfs_rq->task_weight += weight;
> }
> +
> +static inline void
> +sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
> +{
> + cfs_rq->task_weight -= sub;
> + cfs_rq->task_weight += add;
> +}

This is pointless, just use: add_cfs_task_weight(cfs_rq, add - sub);
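A standalone sketch (not from the original mail; plain user-space C, and the
variable names are only illustrative) of why the single delta add is equivalent
to the sub-then-add pair: unsigned arithmetic is modular, so subtracting then
adding, and adding the (possibly wrapped) difference, always land on the same
value, even when add < sub.

#include <assert.h>

int main(void)
{
	unsigned long w = 3072, sub = 2048, add = 1024;

	/* sub_add_cfs_task_weight() style: subtract, then add */
	unsigned long two_step = w - sub + add;

	/*
	 * add_cfs_task_weight(cfs_rq, add - sub) style: one delta add.
	 * add - sub wraps around when add < sub, but the final sum is
	 * identical modulo ULONG_MAX + 1.
	 */
	unsigned long one_step = w + (add - sub);

	assert(two_step == one_step);	/* holds for any unsigned values */
	return 0;
}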

> #else
> static inline void
> add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
> {
> }
> +
> +static inline void
> +sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
> +{
> +}
> #endif
>
> static void
> @@ -938,6 +950,7 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
> {
> }
> # endif /* CONFIG_SMP */
> +
> static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
> unsigned long weight)
> {
> @@ -945,13 +958,15 @@ static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
> /* commit outstanding execution time */
> if (cfs_rq->curr == se)
> update_curr(cfs_rq);
> - account_entity_dequeue(cfs_rq, se);
> + update_load_sub_add(&cfs_rq->load, se->load.weight, weight);
> + if (!parent_entity(se)) {
> + update_load_sub_add(&rq_of(cfs_rq)->load, se->load.weight, weight);

same here:
update_load_add(&rq_of(cfs_rq)->load, weight - se->load.weight);

Also superfluous use of curly-braces here.

> + }
> + if (entity_is_task(se)) {
> + sub_add_cfs_task_weight(cfs_rq, se->load.weight, weight);
> + }

idem.

> }
> -
> update_load_set(&se->load, weight);
> -
> - if (se->on_rq)
> - account_entity_enqueue(cfs_rq, se);
> }
>
> static void update_cfs_shares(struct cfs_rq *cfs_rq)
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 98c0c26..ec4430f 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -779,6 +779,14 @@ static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
> lw->inv_weight = 0;
> }
>
> +static inline void update_load_sub_add(struct load_weight *lw, unsigned long dec,
> + unsigned long inc)
> +{
> + lw->weight -= dec;
> + lw->weight += inc;
> + lw->inv_weight = 0;
> +}

again, pointless stuff.

> static inline void update_load_set(struct load_weight *lw, unsigned long w)
> {
> lw->weight = w;

2012-02-16 14:33:19

by Michael wang

Subject: Re: [PATCH v4] sched: Avoid unnecessary work in reweight_entity

Hi, All

I have corrected the issue of invoking update_curr too many times; it was caused
by the context in dequeue_entity, where se will be cfs_rq->curr while se->on_rq
is 0.

I suppose this patch can avoid some work, such as:

1. reduce the times we check "se->on_rq" from 2 to 1.
2. reduce the times we check "parent_entity(se)" from 2 to 1.
3. reduce the times we check "entity_is_task(se)" from 2 to 1.
4. reduce the times "cfs_rq->load.inv_weight" is set from 4 to 2.
5. reduce the times "rq_of(cfs_rq)->load.inv_weight" is set from 2 to 1.
6. no need to use "list_add(&se->group_node, &cfs_rq->tasks);" any more.
7. no need to use "list_del_init(&se->group_node);" any more.
8. no need to do "cfs_rq->nr_running++" and "cfs_rq->nr_running--" any more.

Please tell me if you find conditions under which it will not work or even hurts
performance; any comments are appreciated :)

Best Regards.
Michael Wang

On 02/16/2012 10:14 PM, Michael Wang wrote:

> From: Michael Wang <[email protected]>
>
> In the original code, using account_entity_dequeue and account_entity_enqueue
> as a pair does some unnecessary work, such as removing a node from a list
> and adding it back again immediately.
>
> And because no real enqueue or dequeue happens here, it is not
> suitable to use account_entity_dequeue and account_entity_enqueue here.
>
> This patch combines their work in order to save time.
>
> v2:
> Add more description and correct the wrong code.
>
> v3:
> Mark add_cfs_task_weight and sub_add_cfs_task_weight as inline.
>
> v4:
> Now we invoke update_curr only when really necessary.
>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> kernel/sched/fair.c | 27 +++++++++++++++++++++------
> kernel/sched/sched.h | 8 ++++++++
> 2 files changed, 29 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 7c6414f..fda63ec 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -777,16 +777,28 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> */
>
> #if defined CONFIG_SMP && defined CONFIG_FAIR_GROUP_SCHED
> -static void
> +static inline void
> add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
> {
> cfs_rq->task_weight += weight;
> }
> +
> +static inline void
> +sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
> +{
> + cfs_rq->task_weight -= sub;
> + cfs_rq->task_weight += add;
> +}
> #else
> static inline void
> add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
> {
> }
> +
> +static inline void
> +sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
> +{
> +}
> #endif
>
> static void
> @@ -938,6 +950,7 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
> {
> }
> # endif /* CONFIG_SMP */
> +
> static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
> unsigned long weight)
> {
> @@ -945,13 +958,15 @@ static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
> /* commit outstanding execution time */
> if (cfs_rq->curr == se)
> update_curr(cfs_rq);
> - account_entity_dequeue(cfs_rq, se);
> + update_load_sub_add(&cfs_rq->load, se->load.weight, weight);
> + if (!parent_entity(se)) {
> + update_load_sub_add(&rq_of(cfs_rq)->load, se->load.weight, weight);
> + }
> + if (entity_is_task(se)) {
> + sub_add_cfs_task_weight(cfs_rq, se->load.weight, weight);
> + }
> }
> -
> update_load_set(&se->load, weight);
> -
> - if (se->on_rq)
> - account_entity_enqueue(cfs_rq, se);
> }
>
> static void update_cfs_shares(struct cfs_rq *cfs_rq)
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 98c0c26..ec4430f 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -779,6 +779,14 @@ static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
> lw->inv_weight = 0;
> }
>
> +static inline void update_load_sub_add(struct load_weight *lw, unsigned long dec,
> + unsigned long inc)
> +{
> + lw->weight -= dec;
> + lw->weight += inc;
> + lw->inv_weight = 0;
> +}
> +
> static inline void update_load_set(struct load_weight *lw, unsigned long w)
> {
> lw->weight = w;

2012-02-17 02:28:41

by Michael wang

Subject: Re: [PATCH v4] sched: Avoid unnecessary work in reweight_entity

Hi, Peter

Thanks for your reply.

On 02/16/2012 10:29 PM, Peter Zijlstra wrote:

> On Thu, 2012-02-16 at 22:14 +0800, Michael Wang wrote:
>> From: Michael Wang <[email protected]>
>>
>> In the original code, using account_entity_dequeue and account_entity_enqueue
>> as a pair does some unnecessary work, such as removing a node from a list
>> and adding it back again immediately.
>>
>> And because no real enqueue or dequeue happens here, it is not
>> suitable to use account_entity_dequeue and account_entity_enqueue here.
>>
>> This patch combines their work in order to save time.
>
> And yet no numbers on how much it saves :-(
>


That's right...
Actually I don't know how to prove the performance improvement; please
give me some suggestions and I will try them :)

>> Signed-off-by: Michael Wang <[email protected]>
>> ---
>> kernel/sched/fair.c | 27 +++++++++++++++++++++------
>> kernel/sched/sched.h | 8 ++++++++
>> 2 files changed, 29 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 7c6414f..fda63ec 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -777,16 +777,28 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
>> */
>>
>> #if defined CONFIG_SMP && defined CONFIG_FAIR_GROUP_SCHED
>> -static void
>> +static inline void
>> add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
>> {
>> cfs_rq->task_weight += weight;
>> }
>> +
>> +static inline void
>> +sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
>> +{
>> + cfs_rq->task_weight -= sub;
>> + cfs_rq->task_weight += add;
>> +}
>
> This is pointless, just use: add_cfs_task_weight(cfs_rq, add - sub);
>


Sorry for the silly code; v5 will correct it and address all the issues mentioned below.

Regards,
Michael Wang

>> #else
>> static inline void
>> add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
>> {
>> }
>> +
>> +static inline void
>> +sub_add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long sub, unsigned long add)
>> +{
>> +}
>> #endif
>>
>> static void
>> @@ -938,6 +950,7 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
>> {
>> }
>> # endif /* CONFIG_SMP */
>> +
>> static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
>> unsigned long weight)
>> {
>> @@ -945,13 +958,15 @@ static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
>> /* commit outstanding execution time */
>> if (cfs_rq->curr == se)
>> update_curr(cfs_rq);
>> - account_entity_dequeue(cfs_rq, se);
>> + update_load_sub_add(&cfs_rq->load, se->load.weight, weight);
>> + if (!parent_entity(se)) {
>> + update_load_sub_add(&rq_of(cfs_rq)->load, se->load.weight, weight);
>
> same here:
> update_load_add(&rq_of(cfs_rq)->load, weight - se->load.weight);
>
> Also superfluous use of curly-braces here.
>
>> + }
>> + if (entity_is_task(se)) {
>> + sub_add_cfs_task_weight(cfs_rq, se->load.weight, weight);
>> + }
>
> idem.
>
>> }
>> -
>> update_load_set(&se->load, weight);
>> -
>> - if (se->on_rq)
>> - account_entity_enqueue(cfs_rq, se);
>> }
>>
>> static void update_cfs_shares(struct cfs_rq *cfs_rq)
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index 98c0c26..ec4430f 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -779,6 +779,14 @@ static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
>> lw->inv_weight = 0;
>> }
>>
>> +static inline void update_load_sub_add(struct load_weight *lw, unsigned long dec,
>> + unsigned long inc)
>> +{
>> + lw->weight -= dec;
>> + lw->weight += inc;
>> + lw->inv_weight = 0;
>> +}
>
> again, pointless stuff.
>
>> static inline void update_load_set(struct load_weight *lw, unsigned long w)
>> {
>> lw->weight = w;
>

2012-02-17 06:04:34

by Michael wang

Subject: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

From: Michael Wang <[email protected]>

In the original code, using account_entity_dequeue and account_entity_enqueue
as a pair does some unnecessary work, such as removing a node from a list
and adding it back again immediately.

And because no real enqueue or dequeue happens here, it is not
suitable to use account_entity_dequeue and account_entity_enqueue here.

This patch combines their work in order to save time.

since v1:
Add more description.
Mark add_cfs_task_weight as inline.
Correct useless and wrong code.
Add a check for the case where there is no change in weight.

Signed-off-by: Michael Wang <[email protected]>
---
kernel/sched/fair.c | 16 ++++++++++------
1 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7c6414f..20aa355 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -777,7 +777,7 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
*/

#if defined CONFIG_SMP && defined CONFIG_FAIR_GROUP_SCHED
-static void
+static inline void
add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
{
cfs_rq->task_weight += weight;
@@ -938,20 +938,24 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
{
}
# endif /* CONFIG_SMP */
+
static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
unsigned long weight)
{
+ if (weight == se->load.weight)
+ return;
+
if (se->on_rq) {
/* commit outstanding execution time */
if (cfs_rq->curr == se)
update_curr(cfs_rq);
- account_entity_dequeue(cfs_rq, se);
+ update_load_add(&cfs_rq->load, weight - se->load.weight);
+ if (!parent_entity(se))
+ update_load_add(&rq_of(cfs_rq)->load, weight - se->load.weight);
+ if (entity_is_task(se))
+ add_cfs_task_weight(cfs_rq, weight - se->load.weight);
}
-
update_load_set(&se->load, weight);
-
- if (se->on_rq)
- account_entity_enqueue(cfs_rq, se);
}

static void update_cfs_shares(struct cfs_rq *cfs_rq)
--
1.7.4.1

2012-02-18 01:44:13

by Michael wang

Subject: Re: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

Hi, Peter

This is v5, and I found there are cases where weight == se->load.weight in
reweight_entity (very often, according to the log), so I added a check for that.

And I tried to use perf to see whether it can help to prove the improvement, but got:

invalid or unsupported event: 'sched:sched_switch'

when using the command "perf sched record".

And someone also told me that even perf sched may not be able to prove anything,
because what I did may only improve performance a little, which cannot be easily
detected :(

So maybe we can prove this patch's performance improvement logically; the summary is:

1. reduce the times we check "se->on_rq" from 2 to 1.
2. reduce the times we check "parent_entity(se)" from 2 to 1.
3. reduce the times we check "entity_is_task(se)" from 2 to 1.
4. reduce the times "cfs_rq->load.inv_weight" is set from 4 to 2.
5. reduce the times "rq_of(cfs_rq)->load.inv_weight" is set from 2 to 1.
6. no need to use "list_add(&se->group_node, &cfs_rq->tasks);" any more.
7. no need to use "list_del_init(&se->group_node);" any more.
8. no need to do "cfs_rq->nr_running++" and "cfs_rq->nr_running--" any more.

The above summary is already in the previous mail; for v5, we get one more:

9. do nothing if the new weight is the same as the old weight.

And as reweight_entity is invoked very often, I think this patch can help
performance; although there are no numbers, we can prove it logically :)

Best regards,
Michael Wang

On 02/17/2012 02:03 PM, Michael Wang wrote:

> From: Michael Wang <[email protected]>
>
> In the original code, using account_entity_dequeue and account_entity_enqueue
> as a pair does some unnecessary work, such as removing a node from a list
> and adding it back again immediately.
>
> And because no real enqueue or dequeue happens here, it is not
> suitable to use account_entity_dequeue and account_entity_enqueue here.
>
> This patch combines their work in order to save time.
>
> since v1:
> Add more description.
> Mark add_cfs_task_weight as inline.
> Correct useless and wrong code.
> Add a check for the case where there is no change in weight.
>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> kernel/sched/fair.c | 16 ++++++++++------
> 1 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 7c6414f..20aa355 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -777,7 +777,7 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> */
>
> #if defined CONFIG_SMP && defined CONFIG_FAIR_GROUP_SCHED
> -static void
> +static inline void
> add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
> {
> cfs_rq->task_weight += weight;
> @@ -938,20 +938,24 @@ static inline void update_entity_shares_tick(struct cfs_rq *cfs_rq)
> {
> }
> # endif /* CONFIG_SMP */
> +
> static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
> unsigned long weight)
> {
> + if (weight == se->load.weight)
> + return;
> +
> if (se->on_rq) {
> /* commit outstanding execution time */
> if (cfs_rq->curr == se)
> update_curr(cfs_rq);
> - account_entity_dequeue(cfs_rq, se);
> + update_load_add(&cfs_rq->load, weight - se->load.weight);
> + if (!parent_entity(se))
> + update_load_add(&rq_of(cfs_rq)->load, weight - se->load.weight);
> + if (entity_is_task(se))
> + add_cfs_task_weight(cfs_rq, weight - se->load.weight);
> }
> -
> update_load_set(&se->load, weight);
> -
> - if (se->on_rq)
> - account_entity_enqueue(cfs_rq, se);
> }
>
> static void update_cfs_shares(struct cfs_rq *cfs_rq)

2012-02-20 13:08:38

by Peter Zijlstra

Subject: Re: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

On Sat, 2012-02-18 at 09:43 +0800, Michael Wang wrote:
> And as reweight_entity is invoked very often, I think this patch can help
> performance; although there are no numbers, we can prove it logically :)

Well, you're probably right (although I think you completely ignored the
optimizing compiler), but still it would be good to get some numbers to
confirm reality :-)

Just pick your favourite cgroup workload/benchmark and run a pre/post
comparison, possibly using perf record.

If all squares up you should see an improvement in your benchmark score
as well as a reduction in the time spent in the function you're
patching.

2012-02-23 10:41:09

by Michael wang

Subject: Re: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

On 02/20/2012 09:08 PM, Peter Zijlstra wrote:

Hi, Peter

Sorry for replying so late; I was blocked by some issues while setting up the
testing environment.

> On Sat, 2012-02-18 at 09:43 +0800, Michael Wang wrote:
>> And as reweight_entity is invoked very often, I think this patch can help
>> performance; although there are no numbers, we can prove it logically :)
>
> Well, you're probably right (although I think you completely ignored the
> optimizing compiler), but still it would be good to get some numbers to
> confirm reality :-)


That's right; if we consider the compiler's optimizations, the logic improvements
I listed may not hold...

>

> Just pick your favourite cgroup workload/benchmark and run a pre/post
> comparison, possibly using perf record.
>
> If all squares up you should see an improvement in your benchmark score
> as well as a reduction in the time spent in the function you're
> patching.


So I created a cpuset cgroup 'rg1' and its sub memory group 'sub', attached
the current shell to 'sub', then used 'time make' (building a kernel) as the
benchmark.

Below is the test result:

'time make':
old
real: 87m53.804s user: 66m41.886s sys: 11m51.792s
new
real: 86m8.485s user: 65m49.211s sys: 11m47.264s

'time make -j14':
old:
real: 42m43.825s user: 124m13.726s sys: 17m57.183s
new
real: 39m0.072s user: 114m33.258s sys: 15m30.662s

I also tried to use 'perf sched record', but I can only record a few seconds;
otherwise the data becomes too big and reports an error. As the record time is
too short, the results vary a lot from run to run, so I failed to use them to
prove the patch :(

I also tried another method: I moved 'reweight_entity' and related functions
to user space and invoked it 10000000 times in 'main'. I have appended part of
the code (really raw...) at the end.

The output of three runs is:

old:
real 0m0.715s 0m0.710s 0m0.739s
user 0m0.716s 0m0.708s 0m0.736s
sys 0m0.000s 0m0.000s 0m0.000s

new:
real 0m0.318s 0m0.308s 0m0.317s
user 0m0.316s 0m0.304s 0m0.316s
sys 0m0.000s 0m0.000s 0m0.000s

It seems the new code can save more than half of the execution time, but we
can also see, after calculation, that what I have done only saves about 0.04us
per call ((0.715s - 0.318s) / 10000000 is roughly 40ns), which is too small...

The user-space test result is not accurate, but at least we can tell that the
new code is faster than the old.

Please tell me if we need to do something else; thanks for your suggestions :)

Regards,
Michael Wang



User space code:

void
account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
update_load_add(&cfs_rq->load, se->load.weight);
if (1)
update_load_add(&cfs_rq->load, se->load.weight);
if (1) {
add_cfs_task_weight(cfs_rq, se->load.weight);
list_add(&se->group_node, &cfs_rq->tasks);
}
cfs_rq->nr_running++;
}

void
account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
update_load_sub(&cfs_rq->load, se->load.weight);
if (1)
update_load_sub(&cfs_rq->load, se->load.weight);
if (1) {
add_cfs_task_weight(cfs_rq, -se->load.weight);
list_del_init(&se->group_node);
}
cfs_rq->nr_running--;
}

void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
unsigned long weight)
{
if (1) {
account_entity_dequeue(cfs_rq, se);
}

update_load_set(&se->load, weight);

if (1)
account_entity_enqueue(cfs_rq, se);
}

void reweight_entity_new(struct cfs_rq *cfs_rq, struct sched_entity *se,
unsigned long weight)
{
if (1) {
update_load_add(&cfs_rq->load, weight - se->load.weight);
if(1)
update_load_add(&cfs_rq->load, weight -
se->load.weight);
if(1)
add_cfs_task_weight(cfs_rq, weight
-se->load.weight);
}
update_load_set(&se->load, weight);
}

int main()
{
struct cfs_rq cfsrq;
struct sched_entity se;
memset(&cfsrq, 0, sizeof(struct cfs_rq));
memset(&se, 0, sizeof(struct sched_entity));
INIT_LIST_HEAD(&se.group_node);
INIT_LIST_HEAD(&cfsrq.tasks);
int i = 10000000;
while(i) {
i--;
reweight_entity_new(&cfsrq, &se, 10);
//reweight_entity(&cfsrq, &se, 10);
}
}
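
For reference, a minimal sketch of the scaffolding the harness above elides;
these simplified stand-ins for the kernel types and helpers are assumptions
(reduced to the fields the benchmark touches, not the real kernel definitions)
and would go at the top of the file so it builds in user space, e.g. with
gcc -O2:

#include <string.h>

/* toy doubly-linked list, just enough for the benchmark */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* reduced stand-ins for the scheduler structures */
struct load_weight { unsigned long weight, inv_weight; };

struct sched_entity {
	struct load_weight load;
	struct list_head group_node;
};

struct cfs_rq {
	struct load_weight load;
	unsigned long task_weight;
	unsigned long nr_running;
	struct list_head tasks;
};

/* user-space copies of the sched.h load helpers used above */
static void update_load_add(struct load_weight *lw, unsigned long inc)
{
	lw->weight += inc;
	lw->inv_weight = 0;
}

static void update_load_sub(struct load_weight *lw, unsigned long dec)
{
	lw->weight -= dec;
	lw->inv_weight = 0;
}

static void update_load_set(struct load_weight *lw, unsigned long w)
{
	lw->weight = w;
	lw->inv_weight = 0;
}

static void add_cfs_task_weight(struct cfs_rq *cfs_rq, unsigned long weight)
{
	cfs_rq->task_weight += weight;
}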

2012-02-24 02:08:21

by Michael wang

Subject: Re: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

On 02/23/2012 06:40 PM, Michael Wang wrote:

> On 02/20/2012 09:08 PM, Peter Zijlstra wrote:
>
> Hi, Peter
>
> Sorry for replying so late; I was blocked by some issues while setting up the
> testing environment.
>
>> On Sat, 2012-02-18 at 09:43 +0800, Michael Wang wrote:
>>> And as reweight_entity is invoked very often, I think this patch can help
>>> performance; although there are no numbers, we can prove it logically :)
>>
>> Well, you're probably right (although I think you completely ignored the
>> optimizing compiler), but still it would be good to get some numbers to
>> confirm reality :-)
>
>
> That's right; if we consider the compiler's optimizations, the logic improvements
> I listed may not hold...
>
>>
>
>> Just pick your favourite cgroup workload/benchmark and run a pre/post
>> comparison, possibly using perf record.
>>
>> If all squares up you should see an improvement in your benchmark score
>> as well as a reduction in the time spent in the function you're
>> patching.
>
>
> So I created a cpuset cgroup 'rg1' and its sub memory group 'sub', attached
> the current shell to 'sub', then used 'time make' (building a kernel) as the
> benchmark.
>
> Below is the test result:
>
> 'time make':
> old
> real: 87m53.804s user: 66m41.886s sys: 11m51.792s
> new
> real: 86m8.485s user: 65m49.211s sys: 11m47.264s
>
> 'time make -j14':
> old:
> real: 42m43.825s user: 124m13.726s sys: 17m57.183s
> new
> real: 39m0.072s user: 114m33.258s sys: 15m30.662s
>


Hi, Peter

Someone notified me that this result is ridiculous; I should have done more
tests, not just one. This is really my fault; please give me more time, and I
will come back with more data so we can use average numbers.

Regards,
Michael Wang

>> I also tried to use 'perf sched record', but I can only record a few seconds;
>> otherwise the data becomes too big and reports an error. As the record time is
>> too short, the results vary a lot from run to run, so I failed to use them to
>> prove the patch :(
>>
>> I also tried another method: I moved 'reweight_entity' and related functions
>> to user space and invoked it 10000000 times in 'main'. I have appended part of
>> the code (really raw...) at the end.
>>
>> The output of three runs is:
>
> old:
> real 0m0.715s 0m0.710s 0m0.739s
> user 0m0.716s 0m0.708s 0m0.736s
> sys 0m0.000s 0m0.000s 0m0.000s
>
> new:
> real 0m0.318s 0m0.308s 0m0.317s
> user 0m0.316s 0m0.304s 0m0.316s
> sys 0m0.000s 0m0.000s 0m0.000s
>
>> It seems the new code can save more than half of the execution time, but we
>> can also see, after calculation, that what I have done only saves about 0.04us
>> per call ((0.715s - 0.318s) / 10000000 is roughly 40ns), which is too small...
>>
>> The user-space test result is not accurate, but at least we can tell that the
>> new code is faster than the old.
>>
>> Please tell me if we need to do something else; thanks for your suggestions :)
>
> Regards,
> Michael Wang
>
>
>
> User space code:
>
> void
> account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
> {
> update_load_add(&cfs_rq->load, se->load.weight);
> if (1)
> update_load_add(&cfs_rq->load, se->load.weight);
> if (1) {
> add_cfs_task_weight(cfs_rq, se->load.weight);
> list_add(&se->group_node, &cfs_rq->tasks);
> }
> cfs_rq->nr_running++;
> }
>
> void
> account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
> {
> update_load_sub(&cfs_rq->load, se->load.weight);
> if (1)
> update_load_sub(&cfs_rq->load, se->load.weight);
> if (1) {
> add_cfs_task_weight(cfs_rq, -se->load.weight);
> list_del_init(&se->group_node);
> }
> cfs_rq->nr_running--;
> }
>
> void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
> unsigned long weight)
> {
> if (1) {
> account_entity_dequeue(cfs_rq, se);
> }
>
> update_load_set(&se->load, weight);
>
> if (1)
> account_entity_enqueue(cfs_rq, se);
> }
>
> void reweight_entity_new(struct cfs_rq *cfs_rq, struct sched_entity *se,
> unsigned long weight)
> {
> if (1) {
> update_load_add(&cfs_rq->load, weight - se->load.weight);
> if(1)
> update_load_add(&cfs_rq->load, weight -
> se->load.weight);
> if(1)
> add_cfs_task_weight(cfs_rq, weight
> -se->load.weight);
> }
> update_load_set(&se->load, weight);
> }
>
> int main()
> {
> struct cfs_rq cfsrq;
> struct sched_entity se;
> memset(&cfsrq, 0, sizeof(struct cfs_rq));
> memset(&se, 0, sizeof(struct sched_entity));
> INIT_LIST_HEAD(&se.group_node);
> INIT_LIST_HEAD(&cfsrq.tasks);
> int i = 10000000;
> while(i) {
> i--;
> reweight_entity_new(&cfsrq, &se, 10);
> //reweight_entity(&cfsrq, &se, 10);
> }
> }
>

2012-02-25 13:56:40

by Michael wang

Subject: Re: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

Hi, Peter

I have collected more testing data; here are the test results:

Machine: ThinkPad T420
OS: Ubuntu 11.10
Benchmark: time make -j14 (build kernel)

pre:

real user sys
1. 38m32.274s 111m53.116s 15m31.866s
2. 38m36.775s 110m49.364s 16m40.927s
3. 38m48.107s 111m29.806s 16m35.794s
4. 38m35.009s 111m41.443s 16m9.705s
5. 38m49.189s 112m14.973s 15m57.924s
6. 38m34.523s 109m0.265s 17m6.584s
7. 38m38.986s 110m3.385s 16m52.735s

Avg 38m47.838s 111m1.765s 16m25.076s

post:

real user sys
1. 39m32.546s 113m28.918s 16m19.217s
2. 38m40.743s 112m9.057s 15m34.246s
3. 39m12.018s 113m9.192s 16m22.933s
4. 38m55.000s 113m56.203s 15m41.343s
5. 38m53.893s 111m35.814s 15m45.303s
6. 38m48.234s 111m17.537s 15m41.287s
7. 39m11.803s 111m2.760s 16m0.012s

Avg 39m2.034s 112m22.783s 15m50.055s

Summary:
              real          user          sys          user + sys
pre  Avg      38m47.838s    111m1.765s    16m25.076s   127m26.841s
post Avg      39m2.034s     112m22.783s   15m50.055s   128m12.838s
              +0.6%         +1.22%        -3.69%       +0.6%

The machine is dedicated, so I think the results should be accurate. What we
can see is that the sys time is clearly reduced; I think this is caused by
"reweight_entity"'s reduced execution time (which we can check in user space).

But I was confused about why the 'user' and 'real' times increased while 'sys'
went down, so I really want to know your opinion on the testing results, and I
would appreciate it if someone is willing to provide their point of view.

Regards,
Michael Wang

On 02/24/2012 10:08 AM, Michael Wang wrote:

> On 02/23/2012 06:40 PM, Michael Wang wrote:
>
>> On 02/20/2012 09:08 PM, Peter Zijlstra wrote:
>>
>> Hi, Peter
>>
>> Sorry for replying so late; I was blocked by some issues while setting up the
>> testing environment.
>>
>>> On Sat, 2012-02-18 at 09:43 +0800, Michael Wang wrote:
>>>> And as reweight_entity is invoked very often, I think this patch can help
>>>> performance; although there are no numbers, we can prove it logically :)
>>>
>>> Well, you're probably right (although I think you completely ignored the
>>> optimizing compiler), but still it would be good to get some numbers to
>>> confirm reality :-)
>>
>>
>> That's right; if we consider the compiler's optimizations, the logic improvements
>> I listed may not hold...
>>
>>>
>>
>>> Just pick your favourite cgroup workload/benchmark and run a pre/post
>>> comparison, possibly using perf record.
>>>
>>> If all squares up you should see an improvement in your benchmark score
>>> as well as a reduction in the time spent in the function you're
>>> patching.
>>
>>
>> So I created a cpuset cgroup 'rg1' and its sub memory group 'sub', attached
>> the current shell to 'sub', then used 'time make' (building a kernel) as the
>> benchmark.
>>
>> Below is the test result:
>>
>> 'time make':
>> old
>> real: 87m53.804s user: 66m41.886s sys: 11m51.792s
>> new
>> real: 86m8.485s user: 65m49.211s sys: 11m47.264s
>>
>> 'time make -j14':
>> old:
>> real: 42m43.825s user: 124m13.726s sys: 17m57.183s
>> new
>> real: 39m0.072s user: 114m33.258s sys: 15m30.662s
>>
>
>
> Hi, Peter
>
> Someone notified me that this result is ridiculous; I should have done more
> tests, not just one. This is really my fault; please give me more time, and I
> will come back with more data so we can use average numbers.
>
> Regards,
> Michael Wang
>
>> I also tried to use 'perf sched record', but I can only record a few seconds;
>> otherwise the data becomes too big and reports an error. As the record time is
>> too short, the results vary a lot from run to run, so I failed to use them to
>> prove the patch :(
>>
>> I also tried another method: I moved 'reweight_entity' and related functions
>> to user space and invoked it 10000000 times in 'main'. I have appended part of
>> the code (really raw...) at the end.
>>
>> The output of three runs is:
>>
>> old:
>> real 0m0.715s 0m0.710s 0m0.739s
>> user 0m0.716s 0m0.708s 0m0.736s
>> sys 0m0.000s 0m0.000s 0m0.000s
>>
>> new:
>> real 0m0.318s 0m0.308s 0m0.317s
>> user 0m0.316s 0m0.304s 0m0.316s
>> sys 0m0.000s 0m0.000s 0m0.000s
>>
>> It seems the new code can save more than half of the execution time, but we
>> can also see, after calculation, that what I have done only saves about 0.04us
>> per call ((0.715s - 0.318s) / 10000000 is roughly 40ns), which is too small...
>>
>> The user-space test result is not accurate, but at least we can tell that the
>> new code is faster than the old.
>>
>> Please tell me if we need to do something else; thanks for your suggestions :)
>>
>> Regards,
>> Michael Wang
>>
>>
>>
>> User space code:
>>
>> void
>> account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
>> {
>> update_load_add(&cfs_rq->load, se->load.weight);
>> if (1)
>> update_load_add(&cfs_rq->load, se->load.weight);
>> if (1) {
>> add_cfs_task_weight(cfs_rq, se->load.weight);
>> list_add(&se->group_node, &cfs_rq->tasks);
>> }
>> cfs_rq->nr_running++;
>> }
>>
>> void
>> account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
>> {
>> update_load_sub(&cfs_rq->load, se->load.weight);
>> if (1)
>> update_load_sub(&cfs_rq->load, se->load.weight);
>> if (1) {
>> add_cfs_task_weight(cfs_rq, -se->load.weight);
>> list_del_init(&se->group_node);
>> }
>> cfs_rq->nr_running--;
>> }
>>
>> void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
>> unsigned long weight)
>> {
>> if (1) {
>> account_entity_dequeue(cfs_rq, se);
>> }
>>
>> update_load_set(&se->load, weight);
>>
>> if (1)
>> account_entity_enqueue(cfs_rq, se);
>> }
>>
>> void reweight_entity_new(struct cfs_rq *cfs_rq, struct sched_entity *se,
>> unsigned long weight)
>> {
>> if (1) {
>> update_load_add(&cfs_rq->load, weight - se->load.weight);
>> if(1)
>> update_load_add(&cfs_rq->load, weight -
>> se->load.weight);
>> if(1)
>> add_cfs_task_weight(cfs_rq, weight
>> -se->load.weight);
>> }
>> update_load_set(&se->load, weight);
>> }
>>
>> int main()
>> {
>> struct cfs_rq cfsrq;
>> struct sched_entity se;
>> memset(&cfsrq, 0, sizeof(struct cfs_rq));
>> memset(&se, 0, sizeof(struct sched_entity));
>> INIT_LIST_HEAD(&se.group_node);
>> INIT_LIST_HEAD(&cfsrq.tasks);
>> int i = 10000000;
>> while(i) {
>> i--;
>> reweight_entity_new(&cfsrq, &se, 10);
>> //reweight_entity(&cfsrq, &se, 10);
>> }
>> }
>>

2012-02-27 04:13:28

by Srivatsa Vaddagiri

Subject: Re: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

* Michael Wang <[email protected]> [2012-02-25 21:56:18]:

> Hi, Peter
>
> I have collected more testing data, here is the test results:
>
> Machine: ThinkPad T420
> OS: Ubuntu 11.10
> Benchmark: time make -j14 (build kernel)

Is that benchmark run in root (cpu) cgroup? If so, reweight_entity() should not
kick in at all.

static void update_cfs_shares(struct cfs_rq *cfs_rq)
{

..

if (!se || throttled_hierarchy(cfs_rq))
return;

..

reweight_entity();
}

If you want to stress reweight_entity(), create several (cpu) cgroups
and launch a workload like kernbench in each of them ..

- vatsa

2012-02-27 05:07:47

by Michael wang

Subject: Re: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

On 02/27/2012 12:12 PM, Srivatsa Vaddagiri wrote:

> * Michael Wang <[email protected]> [2012-02-25 21:56:18]:
>
>> Hi, Peter
>>
>> I have collected more testing data, here is the test results:
>>
>> Machine: ThinkPad T420
>> OS: Ubuntu 11.10
>> Benchmark: time make -j14 (build kernel)
>

Hi, Vatsa

Thanks for your reply :)

> Is that benchmark run in root (cpu) cgroup? If so, reweight_entity() should not
> kick in at all.


That's right; with no child group, 'reweight_entity' won't be called. So I
created a cpuset group under the root group named 'rg1', created a memory group
under 'rg1' named 'sub', and attached the current shell to the 'sub' cgroup.
But I haven't changed any params under the new cgroup; I don't know whether
that will cause some trouble or not. I suppose it will just use some default
values...

>
> static void update_cfs_shares(struct cfs_rq *cfs_rq)
> {
>
> ..
>
> if (!se || throttled_hierarchy(cfs_rq))
> return;
>
> ..
>
> reweight_entity();
> }
>
> If you want to stress reweight_entity(), create several (cpu) cgroups
> and launch a workload like kernbench in each of them ..


Thanks for your suggestion; now I see that using only 1 child group is really
not enough. I think I should try another round of tests with kernbench, and I
was also advised to use oprofile to trace 'reweight_entity'; I hope I can get
some real proof from them.

Regards,
Michael Wang

>
> - vatsa

2012-02-27 05:10:32

by Srivatsa Vaddagiri

Subject: Re: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

* Michael Wang <[email protected]> [2012-02-27 13:07:28]:

> > Is that benchmark run in root (cpu) cgroup? If so, reweight_entity() should not
> > kick in at all.
>
>
> That's right; with no child group, 'reweight_entity' won't be called. So I
> created a cpuset group under the root group named 'rg1', created a memory group
> under 'rg1' named 'sub', and attached the current shell to the 'sub' cgroup.

'cpu' and 'cpuset' cgroup resource controllers are separate. With the above
steps, you would still be running kernbench in the root cgroup as far as the
cpu cgroup controller is concerned ..

- vatsa

2012-02-27 06:22:13

by Michael wang

Subject: Re: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

On 02/27/2012 01:10 PM, Srivatsa Vaddagiri wrote:

> * Michael Wang <[email protected]> [2012-02-27 13:07:28]:
>
>>> Is that benchmark run in root (cpu) cgroup? If so, reweight_entity() should not
>>> kick in at all.
>>
>>
>> That's right; with no child group, 'reweight_entity' won't be called. So I
>> created a cpuset group under the root group named 'rg1', created a memory group
>> under 'rg1' named 'sub', and attached the current shell to the 'sub' cgroup.
>
> 'cpu' and 'cpuset' cgroup resource controllers are separate. With the above
> steps, you would still be running kernbench in the root cgroup as far as the
> cpu cgroup controller is concerned ..


I think I really need to study cgroups first... I've done some totally useless
tests :( but I am still confused about why the sys time was reduced?

And I got a server with Red Hat now, so I will do:

1. cgcreate -g cpu:/subcg1
2. echo $$ > /cgroup/cpu/subcg1/tasks
3. open another shell
4. repeat 1~3 several times, planning to get 7 cpu cgroups with a shell attached to each one.
5. run kernbench in each shell

Please tell me if I missed something :)

Regards,
Michael Wang

>
> - vatsa

2012-02-27 06:44:34

by Srivatsa Vaddagiri

Subject: Re: [PATCH v5] sched: Avoid unnecessary work in reweight_entity

* Michael Wang <[email protected]> [2012-02-27 14:21:57]:
> I think I really need to study cgroups first... I've done some totally useless
> tests :(

Yes!

> but I am still confused about why the sys time was reduced?

Could be just noise ..

> And I got a server with Red Hat now, so I will do:
>
> 1. cgcreate -g cpu:/subcg1
> 2. echo $$ > /cgroup/cpu/subcg1/tasks
> 3. open another shell
> 4. repeat 1~3 several times, planning to get 7 cpu cgroups with a shell attached to each one.
> 5. run kernbench in each shell
>
> Please tell me if I missed something :)

That should stress the cgroup-specific paths of the scheduler!

-vatsa