Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933525Ab1C3Vt3 (ORCPT );
	Wed, 30 Mar 2011 17:49:29 -0400
Received: from mga01.intel.com ([192.55.52.88]:60385 "EHLO mga01.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S933462Ab1C3VH2 (ORCPT );
	Wed, 30 Mar 2011 17:07:28 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.63,270,1299484800"; d="scan'208";a="903734972"
From: Andi Kleen
References: <20110330203.501921634@firstfloor.org>
In-Reply-To: <20110330203.501921634@firstfloor.org>
To: a.p.zijlstra@chello.nl, ak@linux.intel.com, efault@gmx.de,
	mingo@elte.hu, gregkh@suse.de, linux-kernel@vger.kernel.org,
	stable@kernel.org, tim.bird@am.sony.com
Subject: [PATCH] [107/275] sched, cgroup: Fixup broken cgroup movement
Message-Id: <20110330210546.50AAE3E1A05@tassilo.jf.intel.com>
Date: Wed, 30 Mar 2011 14:05:46 -0700 (PDT)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3838
Lines: 107

2.6.35-longterm review patch.  If anyone has any objections, please let me know.

------------------
Commit: b2b5ce022acf5e9f52f7b78c5579994fdde191d4 upstream

Dima noticed that we fail to correct the ->vruntime of sleeping tasks
when we move them between cgroups.

Reported-by: Dima Zavin
Signed-off-by: Peter Zijlstra
Signed-off-by: Andi Kleen
Tested-by: Mike Galbraith
LKML-Reference: <1287150604.29097.1513.camel@twins>
Signed-off-by: Ingo Molnar
Signed-off-by: Mike Galbraith
Acked-by: Peter Zijlstra
Signed-off-by: Greg Kroah-Hartman

---
 include/linux/sched.h |    2 +-
 kernel/sched.c        |    8 ++++----
 kernel/sched_fair.c   |   25 +++++++++++++++++++------
 3 files changed, 24 insertions(+), 11 deletions(-)

Index: linux-2.6.35.y/include/linux/sched.h
===================================================================
--- linux-2.6.35.y.orig/include/linux/sched.h	2011-03-29 23:03:00.269293279 -0700
+++ linux-2.6.35.y/include/linux/sched.h	2011-03-29 23:54:57.628527856 -0700
@@ -1076,7 +1076,7 @@
 					 struct task_struct *task);
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-	void (*moved_group) (struct task_struct *p, int on_rq);
+	void (*task_move_group) (struct task_struct *p, int on_rq);
 #endif
 };
 
Index: linux-2.6.35.y/kernel/sched.c
===================================================================
--- linux-2.6.35.y.orig/kernel/sched.c	2011-03-29 23:03:00.391290158 -0700
+++ linux-2.6.35.y/kernel/sched.c	2011-03-29 23:54:57.628527856 -0700
@@ -8332,12 +8332,12 @@
 	if (unlikely(running))
 		tsk->sched_class->put_prev_task(rq, tsk);
 
-	set_task_rq(tsk, task_cpu(tsk));
-
 #ifdef CONFIG_FAIR_GROUP_SCHED
-	if (tsk->sched_class->moved_group)
-		tsk->sched_class->moved_group(tsk, on_rq);
+	if (tsk->sched_class->task_move_group)
+		tsk->sched_class->task_move_group(tsk, on_rq);
+	else
 #endif
+		set_task_rq(tsk, task_cpu(tsk));
 
 	if (unlikely(running))
 		tsk->sched_class->set_curr_task(rq);
Index: linux-2.6.35.y/kernel/sched_fair.c
===================================================================
--- linux-2.6.35.y.orig/kernel/sched_fair.c	2011-03-29 23:03:00.347291285 -0700
+++ linux-2.6.35.y/kernel/sched_fair.c	2011-03-29 23:54:57.628527856 -0700
@@ -3657,13 +3657,26 @@
 }
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-static void moved_group_fair(struct task_struct *p, int on_rq)
+static void task_move_group_fair(struct task_struct *p, int on_rq)
 {
-	struct cfs_rq *cfs_rq = task_cfs_rq(p);
-
-	update_curr(cfs_rq);
+	/*
+	 * If the task was not on the rq at the time of this cgroup movement
+	 * it must have been asleep, sleeping tasks keep their ->vruntime
+	 * absolute on their old rq until wakeup (needed for the fair sleeper
+	 * bonus in place_entity()).
+	 *
+	 * If it was on the rq, we've just 'preempted' it, which does convert
+	 * ->vruntime to a relative base.
+	 *
+	 * Make sure both cases convert their relative position when migrating
+	 * to another cgroup's rq. This does somewhat interfere with the
+	 * fair sleeper stuff for the first placement, but who cares.
+	 */
+	if (!on_rq)
+		p->se.vruntime -= cfs_rq_of(&p->se)->min_vruntime;
+	set_task_rq(p, task_cpu(p));
 	if (!on_rq)
-		place_entity(cfs_rq, &p->se, 1);
+		p->se.vruntime += cfs_rq_of(&p->se)->min_vruntime;
 }
 #endif
 
@@ -3715,7 +3728,7 @@
 	.get_rr_interval	= get_rr_interval_fair,
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-	.moved_group		= moved_group_fair,
+	.task_move_group	= task_move_group_fair,
 #endif
 };
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
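
For readers who want to see the arithmetic behind the sched_fair.c hunk in
isolation, here is a rough userspace sketch of the rebase that
task_move_group_fair() performs. It is not kernel code: struct toy_cfs_rq,
struct toy_entity and toy_move_group() are hypothetical stand-ins invented
for illustration, and re-pointing se->cfs_rq only approximates what
set_task_rq() does to the real task.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct cfs_rq. */
struct toy_cfs_rq {
	uint64_t min_vruntime;
};

/* Hypothetical stand-in for struct sched_entity. */
struct toy_entity {
	uint64_t vruntime;
	struct toy_cfs_rq *cfs_rq;	/* queue the entity currently belongs to */
};

/*
 * Mirror the three steps of the patched function: a sleeping entity
 * (on_rq == 0) holds an absolute vruntime, so make it relative to the
 * old queue, switch queues, then make it absolute on the new queue.
 * An entity that was on the rq is already relative and is left alone.
 */
static void toy_move_group(struct toy_entity *se, struct toy_cfs_rq *dst,
			   int on_rq)
{
	assert(se && se->cfs_rq && dst);

	if (!on_rq)
		se->vruntime -= se->cfs_rq->min_vruntime;
	se->cfs_rq = dst;		/* assumed analogue of set_task_rq() */
	if (!on_rq)
		se->vruntime += se->cfs_rq->min_vruntime;
}
```

With an old queue at min_vruntime 1000 and a sleeping entity at vruntime
1010, moving to a queue at min_vruntime 50 leaves the entity at 60, so it
keeps its lead of 10 over the queue minimum rather than appearing far ahead
of everything on the new queue.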