Since commit a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to
list on unthrottle") we add cfs_rqs with no runnable tasks but not fully
decayed into the load (leaf) list. We may ignore adding some ancestors
and therefore breaking tmp_alone_branch invariant. This broke LTP test
cfs_bandwidth01 and it was partially fixed in commit fdaba61ef8a2
("sched/fair: Ensure that the CFS parent is added after unthrottling").
I noticed the named test still fails even with the fix (but with low
probability, 1 in ~1000 executions of the test). The reason is when
bailing out of unthrottle_cfs_rq early, we may miss adding ancestors of
the unthrottled cfs_rq, thus, not joining tmp_alone_branch properly.
Fix this by adding ancestors if we notice the unthrottled cfs_rq was
added to the load list.
Fixes: a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
Signed-off-by: Michal Koutný <[email protected]>
---
Changes since v2 [1]:
- jump to completion for loop, don't duplicate it
- singled out of the series (to handle a fix separately from the rest)
[1] https://lore.kernel.org/r/[email protected]/
kernel/sched/fair.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ff69f245b939..f6a05d9b5443 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4936,8 +4936,12 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
/* update hierarchical throttle state */
walk_tg_tree_from(cfs_rq->tg, tg_nop, tg_unthrottle_up, (void *)rq);
- if (!cfs_rq->load.weight)
+ /* Nothing to run but something to decay (on_list)? Complete the branch */
+ if (!cfs_rq->load.weight) {
+ if (cfs_rq->on_list)
+ goto unthrottle_throttle;
return;
+ }
task_delta = cfs_rq->h_nr_running;
idle_task_delta = cfs_rq->idle_h_nr_running;
--
2.32.0
fre. 17. sep. 2021 kl. 16:32 skrev Michal Koutný <[email protected]>:
>
> Since commit a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to
> list on unthrottle") we add cfs_rqs with no runnable tasks but not fully
> decayed into the load (leaf) list. We may ignore adding some ancestors
> and therefore breaking tmp_alone_branch invariant. This broke LTP test
> cfs_bandwidth01 and it was partially fixed in commit fdaba61ef8a2
> ("sched/fair: Ensure that the CFS parent is added after unthrottling").
>
> I noticed the named test still fails even with the fix (but with low
> probability, 1 in ~1000 executions of the test). The reason is when
> bailing out of unthrottle_cfs_rq early, we may miss adding ancestors of
> the unthrottled cfs_rq, thus, not joining tmp_alone_branch properly.
>
> Fix this by adding ancestors if we notice the unthrottled cfs_rq was
> added to the load list.
>
> Fixes: a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
> Signed-off-by: Michal Koutný <[email protected]>
Thanks for catching this!
Reviewed-by: Odin Ugedal <[email protected]>
> Changes since v2 [1]:
> - jump to completion for loop, don't duplicate it
> - singled out of the series (to handle a fix separately from the rest)
>
> [1] https://lore.kernel.org/r/[email protected]/
>
> kernel/sched/fair.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index ff69f245b939..f6a05d9b5443 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4936,8 +4936,12 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> /* update hierarchical throttle state */
> walk_tg_tree_from(cfs_rq->tg, tg_nop, tg_unthrottle_up, (void *)rq);
>
> - if (!cfs_rq->load.weight)
> + /* Nothing to run but something to decay (on_list)? Complete the branch */
> + if (!cfs_rq->load.weight) {
> + if (cfs_rq->on_list)
> + goto unthrottle_throttle;
> return;
> + }
>
> task_delta = cfs_rq->h_nr_running;
> idle_task_delta = cfs_rq->idle_h_nr_running;
> --
> 2.32.0
>
The following commit has been merged into the sched/urgent branch of tip:
Commit-ID: 2630cde26711dab0d0b56a8be1616475be646d13
Gitweb: https://git.kernel.org/tip/2630cde26711dab0d0b56a8be1616475be646d13
Author: Michal Koutný <[email protected]>
AuthorDate: Fri, 17 Sep 2021 17:30:37 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Fri, 01 Oct 2021 13:57:57 +02:00
sched/fair: Add ancestors of unthrottled undecayed cfs_rq
Since commit a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to
list on unthrottle") we add cfs_rqs with no runnable tasks but not fully
decayed into the load (leaf) list. We may ignore adding some ancestors
and therefore breaking tmp_alone_branch invariant. This broke LTP test
cfs_bandwidth01 and it was partially fixed in commit fdaba61ef8a2
("sched/fair: Ensure that the CFS parent is added after unthrottling").
I noticed the named test still fails even with the fix (but with low
probability, 1 in ~1000 executions of the test). The reason is when
bailing out of unthrottle_cfs_rq early, we may miss adding ancestors of
the unthrottled cfs_rq, thus, not joining tmp_alone_branch properly.
Fix this by adding ancestors if we notice the unthrottled cfs_rq was
added to the load list.
Fixes: a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
Signed-off-by: Michal Koutný <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Vincent Guittot <[email protected]>
Reviewed-by: Odin Ugedal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/sched/fair.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ff69f24..f6a05d9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4936,8 +4936,12 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
/* update hierarchical throttle state */
walk_tg_tree_from(cfs_rq->tg, tg_nop, tg_unthrottle_up, (void *)rq);
- if (!cfs_rq->load.weight)
+ /* Nothing to run but something to decay (on_list)? Complete the branch */
+ if (!cfs_rq->load.weight) {
+ if (cfs_rq->on_list)
+ goto unthrottle_throttle;
return;
+ }
task_delta = cfs_rq->h_nr_running;
idle_task_delta = cfs_rq->idle_h_nr_running;