2011-05-17 23:21:23

by John Stultz

Subject: [RFC][PATCH] sched: Fix min_vruntime calculation in dequeue_entity

From: Peter Zijlstra <[email protected]>

Peter sent this patch in response to a patch from Dima Zavin
<[email protected]>, which tried to address the following issue:

"After pulling the thread off the run-queue during a cgroup change,
the cfs_rq.min_vruntime gets recalculated. The dequeued thread's vruntime
then gets normalized to this new value. This can then lead to the thread
getting an unfair boost in the new group if the vruntime of the next
task in the old run-queue was way further ahead."

Peter suggested the following fix instead.

The full thread can be found here:
https://lkml.org/lkml/2010/11/20/34

While Dima never replied publicly, I bugged him a few weeks ago
and he said that this fix should address his original issue.

I just wanted to resend this patch so that the fix is not missed.

CC: Dima Zavin <[email protected]>
CC: Peter Zijlstra <[email protected]>
Signed-off-by: John Stultz <[email protected]>
---
kernel/sched_fair.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 6fa833a..fb321dc 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1072,8 +1072,6 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	se->on_rq = 0;
 	update_cfs_load(cfs_rq, 0);
 	account_entity_dequeue(cfs_rq, se);
-	update_min_vruntime(cfs_rq);
-	update_cfs_shares(cfs_rq);
 
 	/*
 	 * Normalize the entity after updating the min_vruntime because the
@@ -1082,6 +1080,9 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 */
 	if (!(flags & DEQUEUE_SLEEP))
 		se->vruntime -= cfs_rq->min_vruntime;
+
+	update_min_vruntime(cfs_rq);
+	update_cfs_shares(cfs_rq);
 }
 
 /*
--
1.7.3.2.146.gca209
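
To make the ordering issue in the patch above concrete, here is a minimal
toy model in plain C. It is only a sketch: struct toy_rq and the toy_*
helpers are hypothetical names invented for illustration, not the kernel's
struct cfs_rq or its functions, and the floor update is heavily simplified
(it just follows the leftmost remaining task and never moves backwards).

```c
#include <stdint.h>

/* Toy model only: NOT the kernel's struct cfs_rq or its helpers. */
struct toy_rq {
	uint64_t min_vruntime;	/* monotonic floor of the queue */
	uint64_t next_vruntime;	/* vruntime of the leftmost task still queued */
};

/* Simplified floor update: follow the leftmost remaining task, never go back. */
static void toy_update_min_vruntime(struct toy_rq *rq)
{
	if (rq->next_vruntime > rq->min_vruntime)
		rq->min_vruntime = rq->next_vruntime;
}

/* Pre-patch ordering: the floor is advanced before the entity is normalized. */
static int64_t toy_dequeue_old(struct toy_rq *rq, uint64_t vruntime)
{
	toy_update_min_vruntime(rq);
	return (int64_t)(vruntime - rq->min_vruntime);
}

/* Patched ordering: normalize against the floor the entity last ran under. */
static int64_t toy_dequeue_new(struct toy_rq *rq, uint64_t vruntime)
{
	int64_t rel = (int64_t)(vruntime - rq->min_vruntime);

	toy_update_min_vruntime(rq);
	return rel;
}

/* Enqueue on the destination queue: add its floor back to the relative value. */
static uint64_t toy_enqueue(const struct toy_rq *dst, int64_t rel)
{
	return dst->min_vruntime + (uint64_t)rel;
}
```

Take a source queue with a floor of 1000 whose next task sits at 5000, a
migrating entity at vruntime 1000, and a destination queue with a floor of
20000. With the pre-patch ordering the relative value comes out as -4000 and
the entity lands at 16000, well behind the destination floor, which is the
unfair boost described in the report. With the patched ordering the relative
value is 0 and the entity lands at 20000. In both cases the source floor
still ends up at 5000.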


2011-05-19 11:58:55

by Peter Zijlstra

Subject: Re: [RFC][PATCH] sched: Fix min_vruntime calculation in dequeue_entity

On Tue, 2011-05-17 at 16:21 -0700, John Stultz wrote:
> While Dima never replied publicly, I bugged him a few weeks ago
> and he said that this fix should address his original issue.

Should or does? :-)

I was really hoping for a tested-by tag..

2011-05-19 12:43:29

by Mike Galbraith

Subject: Re: [RFC][PATCH] sched: Fix min_vruntime calculation in dequeue_entity

On Thu, 2011-05-19 at 13:58 +0200, Peter Zijlstra wrote:
> On Tue, 2011-05-17 at 16:21 -0700, John Stultz wrote:
> > While Dima never replied publicly, I bugged him a few weeks ago
> > and he said that this fix should address his original issue.
>
> Should or does? :-)
>
> I was really hoping for a tested-by tag..

Recalls-having-tested-once-upon-a-time-by: Mike Galbraith <[email protected]> :)

2011-05-23 19:45:39

by Dima Zavin

Subject: Re: [RFC][PATCH] sched: Fix min_vruntime calculation in dequeue_entity

Peter,

My apologies for the very delayed response, mail filter fail.

Your patch looks equivalent to what we use today, so...

Acked-by: Dima Zavin <[email protected]>


On Thu, May 19, 2011 at 4:58 AM, Peter Zijlstra <[email protected]> wrote:
> On Tue, 2011-05-17 at 16:21 -0700, John Stultz wrote:
>> While Dima never replied publicly, I bugged him a few weeks ago
>> and he said that this fix should address his original issue.
>
> Should or does? :-)
>
> I was really hoping for a tested-by tag..
>

2011-05-28 16:35:50

by Peter Zijlstra

Subject: [tip:sched/urgent] sched: Fix ->min_vruntime calculation in dequeue_entity()

Commit-ID: 1e876231785d82443a5ac8b6c660e9f51bc5dede
Gitweb: http://git.kernel.org/tip/1e876231785d82443a5ac8b6c660e9f51bc5dede
Author: Peter Zijlstra <[email protected]>
AuthorDate: Tue, 17 May 2011 16:21:10 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Sat, 28 May 2011 17:02:56 +0200

sched: Fix ->min_vruntime calculation in dequeue_entity()

Dima Zavin <[email protected]> reported:

"After pulling the thread off the run-queue during a cgroup change,
the cfs_rq.min_vruntime gets recalculated. The dequeued thread's vruntime
then gets normalized to this new value. This can then lead to the thread
getting an unfair boost in the new group if the vruntime of the next
task in the old run-queue was way further ahead."

Reported-by: Dima Zavin <[email protected]>
Signed-off-by: John Stultz <[email protected]>
Recalls-having-tested-once-upon-a-time-by: Mike Galbraith <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched_fair.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index e32a9b7..433491c 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1076,8 +1076,6 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	se->on_rq = 0;
 	update_cfs_load(cfs_rq, 0);
 	account_entity_dequeue(cfs_rq, se);
-	update_min_vruntime(cfs_rq);
-	update_cfs_shares(cfs_rq);
 
 	/*
 	 * Normalize the entity after updating the min_vruntime because the
@@ -1086,6 +1084,9 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 */
 	if (!(flags & DEQUEUE_SLEEP))
 		se->vruntime -= cfs_rq->min_vruntime;
+
+	update_min_vruntime(cfs_rq);
+	update_cfs_shares(cfs_rq);
 }
 
 /*