2019-12-04 20:07:41

by Josh Don

Subject: [PATCH v2] sched/fair: Do not set skip buddy up the sched hierarchy

From: Venkatesh Pallipadi <[email protected]>

Setting skip buddy all the way up the hierarchy does not play well
with intra-cgroup yield. One typical usecase of yield is when a
thread in a cgroup wants to yield CPU to another thread within the
same cgroup. For such a case, setting the skip buddy all the way up
the hierarchy is counter-productive, as that results in CPU being
yielded to a task in some other cgroup.

So, limit the skip effect only to the task requesting it.

Signed-off-by: Josh Don <[email protected]>
---
v2: Only clear skip buddy on the current cfs_rq

kernel/sched/fair.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 08a233e97a01..0b7a1958ad52 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4051,13 +4051,10 @@ static void __clear_buddies_next(struct sched_entity *se)

 static void __clear_buddies_skip(struct sched_entity *se)
 {
-	for_each_sched_entity(se) {
-		struct cfs_rq *cfs_rq = cfs_rq_of(se);
-		if (cfs_rq->skip != se)
-			break;
+	struct cfs_rq *cfs_rq = cfs_rq_of(se);

+	if (cfs_rq->skip == se)
 		cfs_rq->skip = NULL;
-	}
 }

 static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
@@ -6552,8 +6549,15 @@ static void set_next_buddy(struct sched_entity *se)

 static void set_skip_buddy(struct sched_entity *se)
 {
-	for_each_sched_entity(se)
-		cfs_rq_of(se)->skip = se;
+	/*
+	 * One typical usecase of yield is when a thread in a cgroup
+	 * wants to yield CPU to another thread within the same cgroup.
+	 * For such a case, setting the skip buddy all the way up the
+	 * hierarchy is counter-productive, as that results in CPU being
+	 * yielded to a task in some other cgroup. So, only set skip
+	 * for the task requesting it.
+	 */
+	cfs_rq_of(se)->skip = se;
 }

/*
--
2.24.0.393.g34dc348eaf-goog


2019-12-06 07:58:40

by Vincent Guittot

Subject: Re: [PATCH v2] sched/fair: Do not set skip buddy up the sched hierarchy

Hi Josh,

On Wed, 4 Dec 2019 at 21:06, Josh Don <[email protected]> wrote:
>
> From: Venkatesh Pallipadi <[email protected]>
>
> Setting skip buddy all the way up the hierarchy does not play well
> with intra-cgroup yield. One typical usecase of yield is when a
> thread in a cgroup wants to yield CPU to another thread within the
> same cgroup. For such a case, setting the skip buddy all the way up
> the hierarchy is counter-productive, as that results in CPU being
> yielded to a task in some other cgroup.
>
> So, limit the skip effect only to the task requesting it.
>
> Signed-off-by: Josh Don <[email protected]>

There is a mismatch between the author, Venkatesh Pallipadi, and the
Signed-off-by, Josh Don.
If Venkatesh is the original author and you have then made some
modifications, both of your Signed-off-by tags should be there.

Apart from that, the change makes sense to me.

> ---
> v2: Only clear skip buddy on the current cfs_rq
>
> kernel/sched/fair.c | 18 +++++++++++-------
> 1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 08a233e97a01..0b7a1958ad52 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4051,13 +4051,10 @@ static void __clear_buddies_next(struct sched_entity *se)
>
> static void __clear_buddies_skip(struct sched_entity *se)
> {
> - for_each_sched_entity(se) {
> - struct cfs_rq *cfs_rq = cfs_rq_of(se);
> - if (cfs_rq->skip != se)
> - break;
> + struct cfs_rq *cfs_rq = cfs_rq_of(se);
>
> + if (cfs_rq->skip == se)
> cfs_rq->skip = NULL;
> - }
> }
>
> static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
> @@ -6552,8 +6549,15 @@ static void set_next_buddy(struct sched_entity *se)
>
> static void set_skip_buddy(struct sched_entity *se)
> {
> - for_each_sched_entity(se)
> - cfs_rq_of(se)->skip = se;
> + /*
> + * One typical usecase of yield is when a thread in a cgroup
> + * wants to yield CPU to another thread within the same cgroup.
> + * For such a case, setting the skip buddy all the way up the
> + * hierarchy is counter-productive, as that results in CPU being
> + * yielded to a task in some other cgroup. So, only set skip
> + * for the task requesting it.
> + */
> + cfs_rq_of(se)->skip = se;
> }
>
> /*
> --
> 2.24.0.393.g34dc348eaf-goog
>

2019-12-06 22:14:11

by Josh Don

Subject: Re: [PATCH v2] sched/fair: Do not set skip buddy up the sched hierarchy

Hi Vincent,

Thanks for taking a look.

> There is a mismatch between the author, Venkatesh Pallipadi, and the
> Signed-off-by, Josh Don.
> If Venkatesh is the original author and you have then made some
> modifications, both of your Signed-off-by tags should be there.

Venkatesh no longer works at Google, so I don't have a way to get in
touch with him. Is my Signed-off-by insufficient in this case?



2019-12-09 09:21:01

by Dietmar Eggemann

Subject: Re: [PATCH v2] sched/fair: Do not set skip buddy up the sched hierarchy

On 06.12.19 23:13, Josh Don wrote:

[...]

> On Thu, Dec 5, 2019 at 11:57 PM Vincent Guittot
> <[email protected]> wrote:
>>
>> Hi Josh,
>>
>> On Wed, 4 Dec 2019 at 21:06, Josh Don <[email protected]> wrote:
>>>
>>> From: Venkatesh Pallipadi <[email protected]>
>>>
>>> Setting skip buddy all the way up the hierarchy does not play well
>>> with intra-cgroup yield. One typical usecase of yield is when a
>>> thread in a cgroup wants to yield CPU to another thread within the
>>> same cgroup. For such a case, setting the skip buddy all the way up

But with yield_task{_fair}() you have no way to control which other task
gets accelerated. The other task in the taskgroup (cgroup) could even be
on another CPU.

It's not like yield_to_task_fair(), which uses the next buddy to
accelerate another task p.

What's this typical usecase?

>>> the hierarchy is counter-productive, as that results in CPU being
>>> yielded to a task in some other cgroup.
>>>
>>> So, limit the skip effect only to the task requesting it.

[...]

2019-12-12 08:06:37

by Vincent Guittot

Subject: Re: [PATCH v2] sched/fair: Do not set skip buddy up the sched hierarchy

Hi Josh,

On Fri, 6 Dec 2019 at 23:13, Josh Don <[email protected]> wrote:
>
> Hi Vincent,
>
> Thanks for taking a look.
>
> > There is a mismatch between the author, Venkatesh Pallipadi, and the
> > Signed-off-by, Josh Don.
> > If Venkatesh is the original author and you have then made some
> > modifications, both of your Signed-off-by tags should be there.
>
> Venkatesh no longer works at Google, so I don't have a way to get in
> touch with him. Is my Signed-off-by insufficient in this case?

Maybe you can add a Co-developed-by tag to reflect your additional changes.
I guess that as long as you agree with the DCO, it's OK:
https://www.kernel.org/doc/html/v5.4/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin

Ingo, Peter, what do you think?



2019-12-12 22:21:50

by Josh Don

Subject: Re: [PATCH v2] sched/fair: Do not set skip buddy up the sched hierarchy

On Mon, Dec 9, 2019 at 1:19 AM Dietmar Eggemann
<[email protected]> wrote:
>
> On 06.12.19 23:13, Josh Don wrote:
>
> [...]
>
> > On Thu, Dec 5, 2019 at 11:57 PM Vincent Guittot
> > <[email protected]> wrote:
> >>
> >> Hi Josh,
> >>
> >> On Wed, 4 Dec 2019 at 21:06, Josh Don <[email protected]> wrote:
> >>>
> >>> From: Venkatesh Pallipadi <[email protected]>
> >>>
> >>> Setting skip buddy all the way up the hierarchy does not play well
> >>> with intra-cgroup yield. One typical usecase of yield is when a
> >>> thread in a cgroup wants to yield CPU to another thread within the
> >>> same cgroup. For such a case, setting the skip buddy all the way up
>
> But with yield_task{_fair}() you have no way to control which other task
> gets accelerated. The other task in the taskgroup (cgroup) could even be
> on another CPU.
>
> It's not like yield_to_task_fair(), which uses the next buddy to
> accelerate another task p.
>
> What's this typical usecase?

The semantics for yield_task under CFS are not well-defined. With our
CFS hierarchy, we cannot easily just push a yielded task to the end of
a runqueue. And, we don't want to play games with artificially
increasing vruntime, as this results in potentially high latency for a
yielded task to get back on CPU.

I'd interpret a task that calls yield as saying "I can run, but try to
run something else." I'd agree that this patch is imperfect in
achieving this, but I think it is better than the current
implementation (or at least, less broken). Currently, a side-effect
of calling yield is that all other tasks in the same hierarchy get
skipped as well. This is almost certainly not what the user
expects/wants. It is true that if a yielded task has no other tasks
in its cgroup on the same CPU, we will potentially end up just picking
the yielded task again. But this should be OK; a yielded task should
be able to continue making forward progress. Any yielded task that
calls yield again is likely implementing a busy loop, which is an
improper use of yield anyway.
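
To make the side effect concrete, here is a minimal user-space sketch,
not kernel code. All toy_* names are hypothetical. It assumes a picker
that takes the lowest-vruntime entity at each level and falls back to
the second-lowest when the first is the skip buddy; it ignores the
vruntime-gap check the real picker applies, and it keeps the yielding
task on its queue instead of treating it as curr. Two groups, A and B,
sit on the root queue; a1 and a2 are in A, b1 is in B, and a1 yields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENT 4

struct toy_rq;

/* A task (my_q == NULL) or a group entity that owns a child queue. */
struct toy_se {
	const char *name;
	unsigned long vruntime;
	struct toy_rq *rq;	/* queue this entity sits on */
	struct toy_rq *my_q;	/* queue owned by a group, NULL for a task */
	struct toy_se *parent;	/* group entity one level up, NULL at root */
};

struct toy_rq {
	struct toy_se *ents[TOY_MAX_ENT];
	int nr;
	struct toy_se *skip;
};

/* Lowest vruntime wins, unless it is the skip buddy and an alternative exists. */
static struct toy_se *toy_pick_one(struct toy_rq *rq)
{
	struct toy_se *best = NULL, *second = NULL;

	for (int i = 0; i < rq->nr; i++) {
		struct toy_se *se = rq->ents[i];

		if (!best || se->vruntime < best->vruntime) {
			second = best;
			best = se;
		} else if (!second || se->vruntime < second->vruntime) {
			second = se;
		}
	}
	if (best && best == rq->skip && second)
		return second;
	return best;
}

/* Walk down from the root queue until a task is reached. */
static struct toy_se *toy_pick_task(struct toy_rq *rq)
{
	struct toy_se *se = NULL;

	while (rq) {
		se = toy_pick_one(rq);
		if (!se)
			return NULL;
		rq = se->my_q;
	}
	return se;
}

/* Behaviour before the patch: mark skip at every level. */
static void toy_set_skip_hierarchy(struct toy_se *se)
{
	for (; se; se = se->parent)
		se->rq->skip = se;
}

/* Behaviour with the patch: mark skip only on the task's own queue. */
static void toy_set_skip_local(struct toy_se *se)
{
	se->rq->skip = se;
}

/* Build the two-group scenario, let a1 yield, and return the picked task. */
static const char *toy_yield_pick(int hierarchical)
{
	struct toy_rq root = { .nr = 0 }, qa = { .nr = 0 }, qb = { .nr = 0 };
	struct toy_se A = { "A", 10, &root, &qa, NULL };
	struct toy_se B = { "B", 20, &root, &qb, NULL };
	struct toy_se a1 = { "a1", 5, &qa, NULL, &A };
	struct toy_se a2 = { "a2", 7, &qa, NULL, &A };
	struct toy_se b1 = { "b1", 1, &qb, NULL, &B };

	root.ents[root.nr++] = &A;
	root.ents[root.nr++] = &B;
	qa.ents[qa.nr++] = &a1;
	qa.ents[qa.nr++] = &a2;
	qb.ents[qb.nr++] = &b1;

	if (hierarchical)
		toy_set_skip_hierarchy(&a1);
	else
		toy_set_skip_local(&a1);

	return toy_pick_task(&root)->name;
}
```

In this model toy_yield_pick(1) lands on b1, because group A itself is
skipped at the root, while toy_yield_pick(0) lands on a2, the sibling
in the same group.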

I also played around with the idea of setting the skip buddy up the
hierarchy up to the point where cfs_rq->nr_running > 1, but this is
racy with enqueue, and in general raises questions about whether an
enqueued task should try to clear skip buddies.
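
For reference, a rough sketch of that bounded walk, again as a
user-space toy rather than kernel code. The toy_* names are
hypothetical and only nr_running mirrors a real cfs_rq field. There is
no concurrency here, so the enqueue race is not modeled; the sketch
only shows where the walk would stop.

```c
#include <assert.h>
#include <stddef.h>

struct toy_rq;

struct toy_se {
	struct toy_rq *rq;	/* queue this entity sits on */
	struct toy_se *parent;	/* group entity one level up, NULL at root */
};

struct toy_rq {
	unsigned int nr_running;	/* runnable entities on this queue */
	struct toy_se *skip;
};

/*
 * Mark skip upward, stopping at the first level that has some other
 * runnable entity to pick. Returns the number of levels marked.
 */
static int toy_set_skip_until_contended(struct toy_se *se)
{
	int levels = 0;

	for (; se; se = se->parent) {
		se->rq->skip = se;
		levels++;
		if (se->rq->nr_running > 1)
			break;
	}
	return levels;
}

/*
 * One group on a root queue that holds two entities; the yielding task
 * has `siblings` other runnable tasks in its group.
 */
static int toy_levels_marked(unsigned int siblings)
{
	struct toy_rq root = { .nr_running = 2, .skip = NULL };
	struct toy_rq grp_q = { .nr_running = 1 + siblings, .skip = NULL };
	struct toy_se grp = { .rq = &root, .parent = NULL };
	struct toy_se task = { .rq = &grp_q, .parent = &grp };
	int levels = toy_set_skip_until_contended(&task);

	/* The task's own queue is always marked. */
	assert(grp_q.skip == &task);
	return levels;
}
```

With no sibling the walk marks two levels (the task's queue and the
root); with one sibling it stops after the task's own queue. A task
enqueued on grp_q right after the walk would leave a stale skip at the
root, which is the race mentioned above.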