2019-10-31 21:53:16

by Josh Don

Subject: [PATCH] sched/fair: Do not set skip buddy up the sched hierarchy

From: Venkatesh Pallipadi <[email protected]>

Setting skip buddy all the way up the hierarchy does not play well
with intra-cgroup yield. One typical usecase of yield is when a
thread in a cgroup wants to yield CPU to another thread within the
same cgroup. For such a case, setting the skip buddy all the way up
the hierarchy is counter-productive, as that results in CPU being
yielded to a task in some other cgroup.

So, limit the skip effect only to the task requesting it.

Signed-off-by: Josh Don <[email protected]>
---
kernel/sched/fair.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 682a754ea3e1..52ab06585d7f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6647,8 +6647,15 @@ static void set_next_buddy(struct sched_entity *se)
 
 static void set_skip_buddy(struct sched_entity *se)
 {
-	for_each_sched_entity(se)
-		cfs_rq_of(se)->skip = se;
+	/*
+	 * One typical usecase of yield is when a thread in a cgroup
+	 * wants to yield CPU to another thread within the same cgroup.
+	 * For such a case, setting the skip buddy all the way up the
+	 * hierarchy is counter-productive, as that results in CPU being
+	 * yielded to a task in some other cgroup. So, only set skip
+	 * for the task requesting it.
+	 */
+	cfs_rq_of(se)->skip = se;
 }
 
 /*
--
2.23.0.700.g56cf767bdb-goog


2019-11-04 14:58:44

by Vincent Guittot

Subject: Re: [PATCH] sched/fair: Do not set skip buddy up the sched hierarchy

On Thu, 31 Oct 2019 at 19:45, Josh Don <[email protected]> wrote:
>
> From: Venkatesh Pallipadi <[email protected]>
>
> Setting skip buddy all the way up the hierarchy does not play well
> with intra-cgroup yield. One typical usecase of yield is when a
> thread in a cgroup wants to yield CPU to another thread within the
> same cgroup. For such a case, setting the skip buddy all the way up
> the hierarchy is counter-productive, as that results in CPU being
> yielded to a task in some other cgroup.
>
> So, limit the skip effect only to the task requesting it.
>
> Signed-off-by: Josh Don <[email protected]>
> ---
> kernel/sched/fair.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 682a754ea3e1..52ab06585d7f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6647,8 +6647,15 @@ static void set_next_buddy(struct sched_entity *se)
>
> static void set_skip_buddy(struct sched_entity *se)
> {
> -	for_each_sched_entity(se)
> -		cfs_rq_of(se)->skip = se;
> +	/*
> +	 * One typical usecase of yield is when a thread in a cgroup
> +	 * wants to yield CPU to another thread within the same cgroup.
> +	 * For such a case, setting the skip buddy all the way up the
> +	 * hierarchy is counter-productive, as that results in CPU being
> +	 * yielded to a task in some other cgroup. So, only set skip
> +	 * for the task requesting it.
> +	 */
> +	cfs_rq_of(se)->skip = se;
> }

You should also update __clear_buddies_skip so that it only clears the skip buddy at this level, instead of walking up the hierarchy.

>
> /*
> --
> 2.23.0.700.g56cf767bdb-goog
>

2019-11-06 22:16:24

by Josh Don

Subject: [PATCH v2] sched/fair: Do not set skip buddy up the sched hierarchy

From: Venkatesh Pallipadi <[email protected]>

Setting skip buddy all the way up the hierarchy does not play well
with intra-cgroup yield. One typical usecase of yield is when a
thread in a cgroup wants to yield CPU to another thread within the
same cgroup. For such a case, setting the skip buddy all the way up
the hierarchy is counter-productive, as that results in CPU being
yielded to a task in some other cgroup.

So, limit the skip effect only to the task requesting it.

Signed-off-by: Josh Don <[email protected]>
---
Changelog since v1:
- As an optimization, skip clearing the skip buddy up the hierarchy
- Due to the above, it makes sense to inline __clear_buddies_skip; while
we're at it, inline the other __clear_buddies* functions as well.

kernel/sched/fair.c | 27 +++++++++++++++------------
1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 682a754ea3e1..dbac30e3cc08 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4010,7 +4010,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	}
 }
 
-static void __clear_buddies_last(struct sched_entity *se)
+static inline void __clear_buddies_last(struct sched_entity *se)
 {
 	for_each_sched_entity(se) {
 		struct cfs_rq *cfs_rq = cfs_rq_of(se);
@@ -4021,7 +4021,7 @@ static void __clear_buddies_last(struct sched_entity *se)
 	}
 }
 
-static void __clear_buddies_next(struct sched_entity *se)
+static inline void __clear_buddies_next(struct sched_entity *se)
 {
 	for_each_sched_entity(se) {
 		struct cfs_rq *cfs_rq = cfs_rq_of(se);
@@ -4032,15 +4032,12 @@ static void __clear_buddies_next(struct sched_entity *se)
 	}
 }
 
-static void __clear_buddies_skip(struct sched_entity *se)
+static inline void __clear_buddies_skip(struct sched_entity *se)
 {
-	for_each_sched_entity(se) {
-		struct cfs_rq *cfs_rq = cfs_rq_of(se);
-		if (cfs_rq->skip != se)
-			break;
+	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
+	if (cfs_rq->skip == se)
 		cfs_rq->skip = NULL;
-	}
 }
 
 static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
@@ -4051,8 +4048,7 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	if (cfs_rq->next == se)
 		__clear_buddies_next(se);
 
-	if (cfs_rq->skip == se)
-		__clear_buddies_skip(se);
+	__clear_buddies_skip(se);
 }
 
 static __always_inline void return_cfs_rq_runtime(struct cfs_rq *cfs_rq);
@@ -6647,8 +6643,15 @@ static void set_next_buddy(struct sched_entity *se)
 
 static void set_skip_buddy(struct sched_entity *se)
 {
-	for_each_sched_entity(se)
-		cfs_rq_of(se)->skip = se;
+	/*
+	 * One typical usecase of yield is when a thread in a cgroup
+	 * wants to yield CPU to another thread within the same cgroup.
+	 * For such a case, setting the skip buddy all the way up the
+	 * hierarchy is counter-productive, as that results in CPU being
+	 * yielded to a task in some other cgroup. So, only set skip
+	 * for the task requesting it.
+	 */
+	cfs_rq_of(se)->skip = se;
 }
 
 /*
--
2.24.0.rc1.363.gb1bccd3e3d-goog