2021-06-21 17:44:42

by Vincent Guittot

[permalink] [raw]
Subject: [PATCH] sched/fair: Ensure that the cfs parent is added after unthrottling

From: Rik van Riel <[email protected]>

Ensure that a cfs will be in the list whenever one of its child is also
in the list.

A warning on rq->tmp_alone_branch != &rq->leaf_cfs_rq_list has been
reported while running LTP test cfs_bandwidth01.

Odin found the root cause:

$ tree /sys/fs/cgroup/ltp/ -d --charset=ascii
/sys/fs/cgroup/ltp/
|-- drain
`-- test-6851
`-- level2
|-- level3a
| |-- worker1
| `-- worker2
`-- level3b
`-- worker3

Timeline (ish):
- worker3 gets throttled
- level3b is decayed, since it has no more load
- level2 get throttled
- worker3 get unthrottled
- level2 get unthrottled
- worker3 is added to list
- level3b is not added to list, since nr_running==0 and is decayed

Fixes: a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
Reported-by: Sachin Sant <[email protected]>
Signed-off-by: Rik van Riel <[email protected]>
Suggested-by: Vincent Guittot <[email protected]>
[Rebased and updated to fix for the reported warning]
Signed-off-by: Vincent Guittot <[email protected]>
Acked-by: Odin Ugedal <[email protected]>
Tested-by: Sachin Sant <[email protected]>
---

kernel/sched/fair.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7b8990fd4896..8cc27b847ad8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3255,6 +3255,31 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)

#ifdef CONFIG_SMP
#ifdef CONFIG_FAIR_GROUP_SCHED
+/*
+ * Because list_add_leaf_cfs_rq always places a child cfs_rq on the list
+ * immediately before a parent cfs_rq, and cfs_rqs are removed from the list
+ * bottom-up, we only have to test whether the cfs_rq before us on the list
+ * is our child.
+ * If cfs_rq is not on the list, test whether a child needs its to be added to
+ * connect a branch to the tree * (see list_add_leaf_cfs_rq() for details).
+ */
+static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
+{
+ struct cfs_rq *prev_cfs_rq;
+ struct list_head *prev;
+
+ if (cfs_rq->on_list) {
+ prev = cfs_rq->leaf_cfs_rq_list.prev;
+ } else {
+ struct rq *rq = rq_of(cfs_rq);
+
+ prev = rq->tmp_alone_branch;
+ }
+
+ prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
+
+ return (prev_cfs_rq->tg->parent == cfs_rq->tg);
+}

static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
{
@@ -3270,6 +3295,9 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
if (cfs_rq->avg.runnable_sum)
return false;

+ if (child_cfs_rq_on_list(cfs_rq))
+ return false;
+
/*
* _avg must be null when _sum are null because _avg = _sum / divider
* Make sure that rounding and/or propagation of PELT values never
--
2.17.1


2021-06-22 12:07:50

by tip-bot2 for Haifeng Xu

[permalink] [raw]
Subject: [tip: sched/urgent] sched/fair: Ensure that the CFS parent is added after unthrottling

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID: 30ca4b4598a9d680917fe24df5451afecc028b5b
Gitweb: https://git.kernel.org/tip/30ca4b4598a9d680917fe24df5451afecc028b5b
Author: Rik van Riel <[email protected]>
AuthorDate: Mon, 21 Jun 2021 19:43:30 +02:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Tue, 22 Jun 2021 14:00:53 +02:00

sched/fair: Ensure that the CFS parent is added after unthrottling

Ensure that a CFS parent will be in the list whenever one of its children is also
in the list.

A warning on rq->tmp_alone_branch != &rq->leaf_cfs_rq_list has been
reported while running LTP test cfs_bandwidth01.

Odin Ugedal found the root cause:

$ tree /sys/fs/cgroup/ltp/ -d --charset=ascii
/sys/fs/cgroup/ltp/
|-- drain
`-- test-6851
`-- level2
|-- level3a
| |-- worker1
| `-- worker2
`-- level3b
`-- worker3

Timeline (ish):
- worker3 gets throttled
- level3b is decayed, since it has no more load
- level2 get throttled
- worker3 get unthrottled
- level2 get unthrottled
- worker3 is added to list
- level3b is not added to list, since nr_running==0 and is decayed

[ Vincent Guittot: Rebased and updated to fix for the reported warning. ]

Fixes: a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
Reported-by: Sachin Sant <[email protected]>
Suggested-by: Vincent Guittot <[email protected]>
Signed-off-by: Rik van Riel <[email protected]>
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Acked-by: Odin Ugedal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/sched/fair.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bfaa6e1..a56f646 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3298,6 +3298,31 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)

#ifdef CONFIG_SMP
#ifdef CONFIG_FAIR_GROUP_SCHED
+/*
+ * Because list_add_leaf_cfs_rq always places a child cfs_rq on the list
+ * immediately before a parent cfs_rq, and cfs_rqs are removed from the list
+ * bottom-up, we only have to test whether the cfs_rq before us on the list
+ * is our child.
+ * If cfs_rq is not on the list, test whether a child needs its to be added to
+ * connect a branch to the tree * (see list_add_leaf_cfs_rq() for details).
+ */
+static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
+{
+ struct cfs_rq *prev_cfs_rq;
+ struct list_head *prev;
+
+ if (cfs_rq->on_list) {
+ prev = cfs_rq->leaf_cfs_rq_list.prev;
+ } else {
+ struct rq *rq = rq_of(cfs_rq);
+
+ prev = rq->tmp_alone_branch;
+ }
+
+ prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
+
+ return (prev_cfs_rq->tg->parent == cfs_rq->tg);
+}

static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
{
@@ -3334,6 +3359,9 @@ static inline void update_tg_load_avg(struct cfs_rq *cfs_rq)
{
long delta = cfs_rq->avg.load_avg - cfs_rq->tg_load_avg_contrib;

+ if (child_cfs_rq_on_list(cfs_rq))
+ return false;
+
/*
* No need to update load_avg for root_task_group as it is not used.
*/

2021-06-22 12:28:31

by tip-bot2 for Haifeng Xu

[permalink] [raw]
Subject: [tip: sched/urgent] sched/fair: Ensure that the CFS parent is added after unthrottling

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID: fdaba61ef8a268d4136d0a113d153f7a89eb9984
Gitweb: https://git.kernel.org/tip/fdaba61ef8a268d4136d0a113d153f7a89eb9984
Author: Rik van Riel <[email protected]>
AuthorDate: Mon, 21 Jun 2021 19:43:30 +02:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Tue, 22 Jun 2021 14:06:57 +02:00

sched/fair: Ensure that the CFS parent is added after unthrottling

Ensure that a CFS parent will be in the list whenever one of its children is also
in the list.

A warning on rq->tmp_alone_branch != &rq->leaf_cfs_rq_list has been
reported while running LTP test cfs_bandwidth01.

Odin Ugedal found the root cause:

$ tree /sys/fs/cgroup/ltp/ -d --charset=ascii
/sys/fs/cgroup/ltp/
|-- drain
`-- test-6851
`-- level2
|-- level3a
| |-- worker1
| `-- worker2
`-- level3b
`-- worker3

Timeline (ish):
- worker3 gets throttled
- level3b is decayed, since it has no more load
- level2 get throttled
- worker3 get unthrottled
- level2 get unthrottled
- worker3 is added to list
- level3b is not added to list, since nr_running==0 and is decayed

[ Vincent Guittot: Rebased and updated to fix for the reported warning. ]

Fixes: a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
Reported-by: Sachin Sant <[email protected]>
Suggested-by: Vincent Guittot <[email protected]>
Signed-off-by: Rik van Riel <[email protected]>
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Acked-by: Odin Ugedal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/sched/fair.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bfaa6e1..2366331 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3298,6 +3298,31 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)

#ifdef CONFIG_SMP
#ifdef CONFIG_FAIR_GROUP_SCHED
+/*
+ * Because list_add_leaf_cfs_rq always places a child cfs_rq on the list
+ * immediately before a parent cfs_rq, and cfs_rqs are removed from the list
+ * bottom-up, we only have to test whether the cfs_rq before us on the list
+ * is our child.
+ * If cfs_rq is not on the list, test whether a child needs its to be added to
+ * connect a branch to the tree * (see list_add_leaf_cfs_rq() for details).
+ */
+static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
+{
+ struct cfs_rq *prev_cfs_rq;
+ struct list_head *prev;
+
+ if (cfs_rq->on_list) {
+ prev = cfs_rq->leaf_cfs_rq_list.prev;
+ } else {
+ struct rq *rq = rq_of(cfs_rq);
+
+ prev = rq->tmp_alone_branch;
+ }
+
+ prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
+
+ return (prev_cfs_rq->tg->parent == cfs_rq->tg);
+}

static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
{
@@ -3313,6 +3338,9 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
if (cfs_rq->avg.runnable_sum)
return false;

+ if (child_cfs_rq_on_list(cfs_rq))
+ return false;
+
return true;
}