Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756877Ab3I2TrR (ORCPT ); Sun, 29 Sep 2013 15:47:17 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:59502 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755804Ab3I2T2k (ORCPT ); Sun, 29 Sep 2013 15:28:40 -0400 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Daisuke Nishimura , Peter Zijlstra , Ingo Molnar Subject: [ 10/71] sched/fair: Fix small race where child->se.parent,cfs_rq might point to invalid ones Date: Sun, 29 Sep 2013 12:27:22 -0700 Message-Id: <20130929192644.242170522@linuxfoundation.org> X-Mailer: git-send-email 1.8.4.6.g82e253f.dirty In-Reply-To: <20130929192643.539596256@linuxfoundation.org> References: <20130929192643.539596256@linuxfoundation.org> User-Agent: quilt/0.60-1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3684 Lines: 96 3.11-stable review patch. If anyone has any objections, please let me know. ------------------ From: Daisuke Nishimura commit 6c9a27f5da9609fca46cb2b183724531b48f71ad upstream. There is a small race between copy_process() and cgroup_attach_task() where child->se.parent,cfs_rq points to invalid (old) ones. parent doing fork() | someone moving the parent to another cgroup -------------------------------+--------------------------------------------- copy_process() + dup_task_struct() -> parent->se is copied to child->se. se.parent,cfs_rq of them point to old ones. cgroup_attach_task() + cgroup_task_migrate() -> parent->cgroup is updated. + cpu_cgroup_attach() + sched_move_task() + task_move_group_fair() +- set_task_rq() -> se.parent,cfs_rq of parent are updated. + cgroup_fork() -> parent->cgroup is copied to child->cgroup. (*1) + sched_fork() + task_fork_fair() -> se.parent,cfs_rq of child are accessed while they point to old ones. (*2) In the worst case, this bug can lead to "use-after-free" and cause a panic, because it's new cgroup's refcount that is incremented at (*1), so the old cgroup(and related data) can be freed before (*2). In fact, a panic caused by this bug was originally caught in RHEL6.4. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] sched_slice+0x6e/0xa0 [...] Call Trace: [] place_entity+0x75/0xa0 [] task_fork_fair+0xaa/0x160 [] sched_fork+0x6b/0x140 [] copy_process+0x5b2/0x1450 [] ? wake_up_new_task+0xd9/0x130 [] do_fork+0x94/0x460 [] ? sys_wait4+0xae/0x100 [] sys_clone+0x28/0x30 [] stub_clone+0x13/0x20 [] ? system_call_fastpath+0x16/0x1b Signed-off-by: Daisuke Nishimura Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/039601ceae06$733d3130$59b79390$@mxp.nes.nec.co.jp Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman --- kernel/sched/fair.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5818,11 +5818,15 @@ static void task_fork_fair(struct task_s cfs_rq = task_cfs_rq(current); curr = cfs_rq->curr; - if (unlikely(task_cpu(p) != this_cpu)) { - rcu_read_lock(); - __set_task_cpu(p, this_cpu); - rcu_read_unlock(); - } + /* + * Not only the cpu but also the task_group of the parent might have + * been changed after parent->se.parent,cfs_rq were copied to + * child->se.parent,cfs_rq. So call __set_task_cpu() to make those + * of child point to valid ones. + */ + rcu_read_lock(); + __set_task_cpu(p, this_cpu); + rcu_read_unlock(); update_curr(cfs_rq); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/