2022-07-13 04:22:38

by Chengming Zhou

Subject: [PATCH v2 00/10] sched: task load tracking optimization and cleanup

Hi all,

This patch series contains optimizations and cleanups for task load tracking
when a task migrates between CPUs or cgroups, or is switched_from/to_fair().

There are three cases in which detach/attach_entity_load_avg is done for a
fair task (excluding the fork and exit cases):
1. the task migrates to another CPU (on_rq migration or wakeup migration)
2. the task migrates to another cgroup (detach then attach)
3. the task is switched_from/to_fair (detach, later attach)

Patch 1 optimizes the on_rq CPU migration case by combining the detach into
the dequeue, so we no longer need to do detach_entity_cfs_rq() in
migrate_task_rq_fair().
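
To make the idea concrete, here is a minimal standalone sketch of patch 1
(not the actual kernel code: the structs are stripped down, and the
DEQUEUE_MIGRATING flag name and the helpers are hypothetical stand-ins for
illustration). When the dequeue happens because the task is migrating, its
load is detached from the old cfs_rq as part of the dequeue itself, so
migrate_task_rq_fair() no longer has to call detach_entity_cfs_rq():

struct sched_entity { unsigned long load_avg; };
struct cfs_rq { unsigned long load_avg; };

#define DEQUEUE_MIGRATING	0x80	/* hypothetical flag, for this sketch only */

static void detach_load(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
	/* remove the entity's contribution from the old runqueue */
	cfs_rq->load_avg -= se->load_avg;
}

static void dequeue_entity_sketch(struct cfs_rq *cfs_rq, struct sched_entity *se,
				  int flags)
{
	/* ... normal dequeue accounting elided ... */

	/* detach combined into dequeue: no separate detach needed afterwards */
	if (flags & DEQUEUE_MIGRATING)
		detach_load(cfs_rq, se);
}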

Patches 3-4 clean up the cgroup migration case by removing
cpu_cgrp_subsys->fork(), since we already do the same thing in
sched_cgroup_fork().

Patches 1-4 have been reviewed earlier, but they conflict with the current
tip tree, so they are included here as part of this patchset. Sorry for the
inconvenience.

Patches 6-7 use update_load_avg() to do the attach/detach after checking the
sched_avg last_update_time; they are preparation for the following patches.
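
As a rough illustration of that conditional attach (a simplified sketch with
stand-in types; DO_ATTACH is the flag name used in kernel/sched/fair.c, but
everything else here is simplified and not the real implementation): an
entity whose last_update_time is 0 has been detached and still needs an
attach, while a non-zero value means it is already attached, which is what
avoids the double-attach corner case.

struct sched_avg { unsigned long long last_update_time; unsigned long load_avg; };
struct sched_entity { struct sched_avg avg; };
struct cfs_rq { struct sched_avg avg; };

#define DO_ATTACH	0x4

static void attach_load(struct cfs_rq *cfs_rq, struct sched_entity *se,
			unsigned long long now)
{
	cfs_rq->avg.load_avg += se->avg.load_avg;
	se->avg.last_update_time = now;		/* mark the entity as attached */
}

static void update_load_avg_sketch(struct cfs_rq *cfs_rq, struct sched_entity *se,
				   int flags, unsigned long long now)
{
	/* ... decay se->avg and cfs_rq->avg up to 'now' elided ... */

	/* attach only if the entity is not attached yet */
	if ((flags & DO_ATTACH) && !se->avg.last_update_time)
		attach_load(cfs_rq, se, now);
}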

Patches 8-9 fix load tracking for newly forked !fair tasks and for tasks
that are switched_from_fair().

After these changes, the task's sched_avg last_update_time is reset to 0
when it migrates away from a CPU or cgroup, or is switched_from_fair(), so
that the updated sched_avg is saved for the next attach.
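
The detach side of that convention can be pictured as follows (again only a
hedged sketch reusing the stand-in types from the previous example, not the
kernel's actual detach path): after the entity's contribution is removed
from its old cfs_rq, last_update_time is zeroed so that the next conditional
attach knows the saved sched_avg still has to be added.

static void detach_load_sketch(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
	/* remove the entity's contribution from the old runqueue */
	cfs_rq->avg.load_avg -= se->avg.load_avg;

	/* reset so the next update_load_avg(..., DO_ATTACH) re-attaches the saved avg */
	se->avg.last_update_time = 0;
}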

Thanks.

Changes in v2:
- split the task se depth maintenance into a separate patch 3, as suggested
  by Peter.
- reordered patches 6-7 before patches 8-9, since we need update_load_avg()
  to do the conditional attach/detach to avoid corner cases like the
  double-attach problem.

Chengming Zhou (10):
sched/fair: combine detach into dequeue when migrating task
sched/fair: update comments in enqueue/dequeue_entity()
sched/fair: maintain task se depth in set_task_rq()
sched/fair: remove redundant cpu_cgrp_subsys->fork()
sched/fair: reset sched_avg last_update_time before set_task_rq()
sched/fair: delete superfluous SKIP_AGE_LOAD
sched/fair: use update_load_avg() to attach/detach entity load_avg
sched/fair: fix load tracking for new forked !fair task
sched/fair: stop load tracking when task switched_from_fair()
sched/fair: delete superfluous set_task_rq_fair()

kernel/sched/core.c | 27 ++------
kernel/sched/fair.c | 144 ++++++++++------------------------------
kernel/sched/features.h | 1 -
kernel/sched/sched.h | 14 +---
4 files changed, 41 insertions(+), 145 deletions(-)

--
2.36.1


2022-07-13 04:33:32

by Chengming Zhou

Subject: [PATCH v2 03/10] sched/fair: maintain task se depth in set_task_rq()

Previously we only maintained task se depth in task_move_group_fair();
if a !fair task changed its task group, its se depth would not be updated,
so commit eb7a59b2c888 ("sched/fair: Reset se-depth when task switched to FAIR")
fixed the problem by updating se depth in switched_to_fair() too.

This patch moves task se depth maintenance into set_task_rq(), which is
called whenever the task's CPU or cgroup changes, so its depth will always
be correct.

This patch is preparation for the next patch.

Signed-off-by: Chengming Zhou <[email protected]>
---
kernel/sched/fair.c | 8 --------
kernel/sched/sched.h | 1 +
2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2a3e12ead144..bf595b622656 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11539,14 +11539,6 @@ static void attach_entity_cfs_rq(struct sched_entity *se)
{
struct cfs_rq *cfs_rq = cfs_rq_of(se);

-#ifdef CONFIG_FAIR_GROUP_SCHED
- /*
- * Since the real-depth could have been changed (only FAIR
- * class maintain depth value), reset depth properly.
- */
- se->depth = se->parent ? se->parent->depth + 1 : 0;
-#endif
-
/* Synchronize entity with its cfs_rq */
update_load_avg(cfs_rq, se, sched_feat(ATTACH_AGE_LOAD) ? 0 : SKIP_AGE_LOAD);
attach_entity_load_avg(cfs_rq, se);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index aad7f5ee9666..8cc3eb7b86cd 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1940,6 +1940,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
set_task_rq_fair(&p->se, p->se.cfs_rq, tg->cfs_rq[cpu]);
p->se.cfs_rq = tg->cfs_rq[cpu];
p->se.parent = tg->se[cpu];
+ p->se.depth = tg->se[cpu] ? tg->se[cpu]->depth + 1 : 0;
#endif

#ifdef CONFIG_RT_GROUP_SCHED
--
2.36.1

2022-07-14 12:33:46

by Dietmar Eggemann

Subject: Re: [PATCH v2 03/10] sched/fair: maintain task se depth in set_task_rq()

On 13/07/2022 06:04, Chengming Zhou wrote:
> Previously we only maintained task se depth in task_move_group_fair();
> if a !fair task changed its task group, its se depth would not be updated,
> so commit eb7a59b2c888 ("sched/fair: Reset se-depth when task switched to FAIR")
> fixed the problem by updating se depth in switched_to_fair() too.

Maybe it's worth mentioning how the se.depth setting from
task_move_group_fair() and switched_to_fair() went into
attach_task_cfs_rq() with commit daa59407b558 ("sched/fair: Unify
switched_{from,to}_fair() and task_move_group_fair()") and further into
attach_entity_cfs_rq() with commit df217913e72e ("sched/fair: Factorize
attach/detach entity").

> This patch moves task se depth maintenance into set_task_rq(), which is
> called whenever the task's CPU or cgroup changes, so its depth will always
> be correct.
>
> This patch is preparation for the next patch.
>
> Signed-off-by: Chengming Zhou <[email protected]>

Reviewed-by: Dietmar Eggemann <[email protected]>

> ---
> kernel/sched/fair.c | 8 --------
> kernel/sched/sched.h | 1 +
> 2 files changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2a3e12ead144..bf595b622656 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11539,14 +11539,6 @@ static void attach_entity_cfs_rq(struct sched_entity *se)
> {
> struct cfs_rq *cfs_rq = cfs_rq_of(se);
>
> -#ifdef CONFIG_FAIR_GROUP_SCHED
> - /*
> - * Since the real-depth could have been changed (only FAIR
> - * class maintain depth value), reset depth properly.
> - */
> - se->depth = se->parent ? se->parent->depth + 1 : 0;
> -#endif
> -
> /* Synchronize entity with its cfs_rq */
> update_load_avg(cfs_rq, se, sched_feat(ATTACH_AGE_LOAD) ? 0 : SKIP_AGE_LOAD);
> attach_entity_load_avg(cfs_rq, se);
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index aad7f5ee9666..8cc3eb7b86cd 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1940,6 +1940,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
> set_task_rq_fair(&p->se, p->se.cfs_rq, tg->cfs_rq[cpu]);
> p->se.cfs_rq = tg->cfs_rq[cpu];
> p->se.parent = tg->se[cpu];
> + p->se.depth = tg->se[cpu] ? tg->se[cpu]->depth + 1 : 0;
> #endif
>
> #ifdef CONFIG_RT_GROUP_SCHED

2022-07-14 13:48:38

by Chengming Zhou

Subject: Re: [External] Re: [PATCH v2 03/10] sched/fair: maintain task se depth in set_task_rq()

On 2022/7/14 20:30, Dietmar Eggemann wrote:
> On 13/07/2022 06:04, Chengming Zhou wrote:
>> Previously we only maintained task se depth in task_move_group_fair();
>> if a !fair task changed its task group, its se depth would not be updated,
>> so commit eb7a59b2c888 ("sched/fair: Reset se-depth when task switched to FAIR")
>> fixed the problem by updating se depth in switched_to_fair() too.
>
> Maybe it's worth mentioning how the se.depth setting from
> task_move_group_fair() and switched_to_fair() went into
> attach_task_cfs_rq() with commit daa59407b558 ("sched/fair: Unify
> switched_{from,to}_fair() and task_move_group_fair()") and further into
> attach_entity_cfs_rq() with commit df217913e72e ("sched/fair: Factorize
> attach/detach entity").
>

Good point, I will add this part in the next version.

Thanks for your review!

>> This patch moves task se depth maintenance into set_task_rq(), which is
>> called whenever the task's CPU or cgroup changes, so its depth will always
>> be correct.
>>
>> This patch is preparation for the next patch.
>>
>> Signed-off-by: Chengming Zhou <[email protected]>
>
> Reviewed-by: Dietmar Eggemann <[email protected]>
>
>> ---
>> kernel/sched/fair.c | 8 --------
>> kernel/sched/sched.h | 1 +
>> 2 files changed, 1 insertion(+), 8 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 2a3e12ead144..bf595b622656 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -11539,14 +11539,6 @@ static void attach_entity_cfs_rq(struct sched_entity *se)
>> {
>> struct cfs_rq *cfs_rq = cfs_rq_of(se);
>>
>> -#ifdef CONFIG_FAIR_GROUP_SCHED
>> - /*
>> - * Since the real-depth could have been changed (only FAIR
>> - * class maintain depth value), reset depth properly.
>> - */
>> - se->depth = se->parent ? se->parent->depth + 1 : 0;
>> -#endif
>> -
>> /* Synchronize entity with its cfs_rq */
>> update_load_avg(cfs_rq, se, sched_feat(ATTACH_AGE_LOAD) ? 0 : SKIP_AGE_LOAD);
>> attach_entity_load_avg(cfs_rq, se);
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index aad7f5ee9666..8cc3eb7b86cd 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -1940,6 +1940,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
>> set_task_rq_fair(&p->se, p->se.cfs_rq, tg->cfs_rq[cpu]);
>> p->se.cfs_rq = tg->cfs_rq[cpu];
>> p->se.parent = tg->se[cpu];
>> + p->se.depth = tg->se[cpu] ? tg->se[cpu]->depth + 1 : 0;
>> #endif
>>
>> #ifdef CONFIG_RT_GROUP_SCHED
>

2022-07-18 07:32:44

by Vincent Guittot

Subject: Re: [PATCH v2 03/10] sched/fair: maintain task se depth in set_task_rq()

On Wed, 13 Jul 2022 at 06:04, Chengming Zhou
<[email protected]> wrote:
>
> Previously we only maintained task se depth in task_move_group_fair();
> if a !fair task changed its task group, its se depth would not be updated,
> so commit eb7a59b2c888 ("sched/fair: Reset se-depth when task switched to FAIR")
> fixed the problem by updating se depth in switched_to_fair() too.
>
> This patch moves task se depth maintenance into set_task_rq(), which is
> called whenever the task's CPU or cgroup changes, so its depth will always
> be correct.
>
> This patch is preparation for the next patch.
>
> Signed-off-by: Chengming Zhou <[email protected]>

Reviewed-by: Vincent Guittot <[email protected]>

> ---
> kernel/sched/fair.c | 8 --------
> kernel/sched/sched.h | 1 +
> 2 files changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2a3e12ead144..bf595b622656 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11539,14 +11539,6 @@ static void attach_entity_cfs_rq(struct sched_entity *se)
> {
> struct cfs_rq *cfs_rq = cfs_rq_of(se);
>
> -#ifdef CONFIG_FAIR_GROUP_SCHED
> - /*
> - * Since the real-depth could have been changed (only FAIR
> - * class maintain depth value), reset depth properly.
> - */
> - se->depth = se->parent ? se->parent->depth + 1 : 0;
> -#endif
> -
> /* Synchronize entity with its cfs_rq */
> update_load_avg(cfs_rq, se, sched_feat(ATTACH_AGE_LOAD) ? 0 : SKIP_AGE_LOAD);
> attach_entity_load_avg(cfs_rq, se);
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index aad7f5ee9666..8cc3eb7b86cd 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1940,6 +1940,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
> set_task_rq_fair(&p->se, p->se.cfs_rq, tg->cfs_rq[cpu]);
> p->se.cfs_rq = tg->cfs_rq[cpu];
> p->se.parent = tg->se[cpu];
> + p->se.depth = tg->se[cpu] ? tg->se[cpu]->depth + 1 : 0;
> #endif
>
> #ifdef CONFIG_RT_GROUP_SCHED
> --
> 2.36.1
>