2022-08-31 07:23:14

by Cruz Zhao

Subject: [PATCH] sched/core: Fix the bug that sched_core_find() may return throttled task

When a cfs_rq is throttled, the cookie'd task in this cfs_rq wouldn't
dequeue from the core tree, and sched_core_find() may return this task,
which will result that the throttled task running on the cpu.

To resolve this problem, we pick the first cookie matched task and
unthrottled task.

Signed-off-by: Cruz Zhao <[email protected]>
---
kernel/sched/core.c | 6 ++++++
kernel/sched/fair.c | 7 +++++++
kernel/sched/sched.h | 1 +
3 files changed, 14 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b604223..a34acd0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -271,6 +271,12 @@ static struct task_struct *sched_core_find(struct rq *rq, unsigned long cookie)
 	struct rb_node *node;

 	node = rb_find_first((void *)cookie, &rq->core_tree, rb_sched_core_cmp);
+	while (node && task_throttled(__node_2_sc(node))) {
+		node = rb_next(node);
+		if (node && cookie != __node_2_sc(node)->core_cookie)
+			node = NULL;
+	}
+
 	/*
 	 * The idle task always matches any cookie!
 	 */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cf3300b..4878a25 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11563,6 +11563,13 @@ bool cfs_prio_less(struct task_struct *a, struct task_struct *b, bool in_fi)

 	return delta > 0;
 }
+
+inline int task_throttled(struct task_struct *p)
+{
+	struct cfs_rq *cfs_rq = cfs_rq_of(&p->se);
+
+	return cfs_rq_throttled(cfs_rq);
+}
 #else
 static inline void task_tick_core(struct rq *rq, struct task_struct *curr) {}
 #endif
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index f616e0c..c6e3955 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1285,6 +1285,7 @@ static inline bool sched_core_enqueued(struct task_struct *p)

 extern void sched_core_get(void);
 extern void sched_core_put(void);
+extern int task_throttled(struct task_struct *p);

 #else /* !CONFIG_SCHED_CORE */

--
1.8.3.1
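
The lookup described in the patch above (find the first node matching the cookie, then walk forward past throttled entries while the cookie still matches) can be illustrated outside the kernel. What follows is a minimal userspace sketch, not kernel code: a cookie-sorted array stands in for the rb-tree, a plain `throttled` flag stands in for the result of task_throttled(), and the names `toy_task`, `toy_find_first` and `toy_core_find` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task queued in the core tree. */
struct toy_task {
	unsigned long cookie;	/* core scheduling cookie */
	bool throttled;		/* assumed: true if the task's cfs_rq is throttled */
};

/*
 * Return the index of the leftmost entry whose cookie equals @cookie in
 * an array sorted by cookie, or -1 if no entry matches. This plays the
 * role that rb_find_first() plays in the patch.
 */
static long toy_find_first(const struct toy_task *t, size_t n,
			   unsigned long cookie)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (t[mid].cookie < cookie)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (lo < n && t[lo].cookie == cookie) ? (long)lo : -1;
}

/*
 * Starting from the leftmost match, skip throttled entries and stop as
 * soon as the cookie no longer matches. Returns the index of the first
 * entry that both matches the cookie and is not throttled, or -1.
 */
static long toy_core_find(const struct toy_task *t, size_t n,
			  unsigned long cookie)
{
	long i = toy_find_first(t, n, cookie);

	if (i < 0)
		return -1;
	for (; (size_t)i < n && t[i].cookie == cookie; i++) {
		if (!t[i].throttled)
			return i;
	}
	return -1;
}
```

With entries `{1, false}, {2, true}, {2, true}, {2, false}, {3, true}`, a lookup for cookie 2 skips the two throttled entries and returns index 3, while a lookup for cookie 3 returns -1 because its only match is throttled.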


2022-09-01 09:36:06

by Peter Zijlstra

Subject: Re: [PATCH] sched/core: Fix the bug that sched_core_find() may return throttled task

On Wed, Aug 31, 2022 at 02:49:18PM +0800, Cruz Zhao wrote:
> When a cfs_rq is throttled, the cookie'd task in this cfs_rq wouldn't
> dequeue from the core tree, and sched_core_find() may return this task,
> which will result that the throttled task running on the cpu.
>
> To resolve this problem, we pick the first cookie matched task and
> unthrottled task.

You mean: the first task that both matches the cookie and is not throttled.

Except I think you can have the same problem with the RT crud.

2022-09-01 10:46:58

by Cruz Zhao

Subject: Re: [PATCH] sched/core: Fix the bug that sched_core_find() may return throttled task



On 2022/9/1 5:02 PM, Peter Zijlstra wrote:
> On Wed, Aug 31, 2022 at 02:49:18PM +0800, Cruz Zhao wrote:
>> When a cfs_rq is throttled, the cookie'd task in this cfs_rq wouldn't
>> dequeue from the core tree, and sched_core_find() may return this task,
>> which will result that the throttled task running on the cpu.
>>
>> To resolve this problem, we pick the first cookie matched task and
>> unthrottled task.
>
> You mean: the first task that both matches the cookie and is not throttled.
>

Yeah, I mean "the first task that matches the cookie and is not throttled".

> Except I think you can have the same problem with the RT crud.

Sure, there's the same problem with the RT crud.

There's another problem: a task's position in the core_tree is fixed at
sched_core_enqueue() time, but the priority of a cfs task changes as its
vruntime changes. So sched_core_find() may not pick the cookie-matched
task with the highest priority.

I tried to combine the core_tree with cfs_rq (dl_rq, rt_rq should also
be considered) to solve this problem, but I haven't come up with a
simple and graceful solution yet.
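
The ordering concern above can be shown with a small userspace model. This is only a sketch under stated assumptions: it assumes a lower vruntime means higher priority among cfs tasks, it records a hypothetical `key_at_enqueue` snapshot to stand for the position a task received when it was inserted, and the names `toy_entry`, `toy_pick_by_snapshot` and `toy_pick_by_current` are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical model of a cookie'd cfs task. */
struct toy_entry {
	unsigned long cookie;
	unsigned long long key_at_enqueue;	/* vruntime seen at enqueue time */
	unsigned long long vruntime;		/* vruntime right now */
};

/*
 * Pick the matching entry with the smallest enqueue-time key. This
 * models a tree whose ordering is never refreshed after insertion.
 */
static long toy_pick_by_snapshot(const struct toy_entry *e, size_t n,
				 unsigned long cookie)
{
	long best = -1;

	for (size_t i = 0; i < n; i++) {
		if (e[i].cookie != cookie)
			continue;
		if (best < 0 || e[i].key_at_enqueue < e[best].key_at_enqueue)
			best = (long)i;
	}
	return best;
}

/* Pick the matching entry with the smallest current vruntime. */
static long toy_pick_by_current(const struct toy_entry *e, size_t n,
				unsigned long cookie)
{
	long best = -1;

	for (size_t i = 0; i < n; i++) {
		if (e[i].cookie != cookie)
			continue;
		if (best < 0 || e[i].vruntime < e[best].vruntime)
			best = (long)i;
	}
	return best;
}
```

If two tasks with cookie 7 are enqueued with vruntime 100 and 200, both functions pick the first one. Once the first task has run and its vruntime has grown to 350, the snapshot-based pick still returns it, while the pick based on current vruntime returns the second task.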