Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757658AbbBQMNI (ORCPT ); Tue, 17 Feb 2015 07:13:08 -0500
Received: from casper.infradead.org ([85.118.1.10]:59950
	"EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1752290AbbBQMNF (ORCPT );
	Tue, 17 Feb 2015 07:13:05 -0500
Date: Tue, 17 Feb 2015 13:12:58 +0100
From: Peter Zijlstra
To: Kirill Tkhai
Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Josh Poimboeuf,
	oleg@redhat.com, paulmck@linux.vnet.ibm.com
Subject: Re: [PATCH 2/2] [PATCH] sched: Add smp_rmb() in task rq locking cycles
Message-ID: <20150217121258.GM5029@twins.programming.kicks-ass.net>
References: <20150217104516.12144.85911.stgit@tkhai> <1424170021.5749.22.camel@tkhai>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1424170021.5749.22.camel@tkhai>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3296
Lines: 97

On Tue, Feb 17, 2015 at 01:47:01PM +0300, Kirill Tkhai wrote:
>
> We migrate a task using TASK_ON_RQ_MIGRATING state of on_rq:
>
> 	raw_spin_lock(&old_rq->lock);
> 	deactivate_task(old_rq, p, 0);
> 	p->on_rq = TASK_ON_RQ_MIGRATING;
> 	set_task_cpu(p, new_cpu);
> 	raw_spin_unlock(&rq->lock);
>
> I.e.:
>
> 	write TASK_ON_RQ_MIGRATING
> 	smp_wmb() (in __set_task_cpu)
> 	write new_cpu
>
> But {,__}task_rq_lock() don't use smp_rmb(), and they may see
> the cpu and TASK_ON_RQ_MIGRATING in opposite order. In this case
> {,__}task_rq_lock() lock new_rq before the task is actually queued
> on it.
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index fc12a1d..a42fb88 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -319,8 +319,12 @@ static struct rq *task_rq_lock(struct task_struct *p, unsigned long *flags)
>  		raw_spin_lock_irqsave(&p->pi_lock, *flags);
>  		rq = task_rq(p);
>  		raw_spin_lock(&rq->lock);
> -		if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
> -			return rq;
> +		if (likely(rq == task_rq(p))) {
> +			/* Pairs with smp_wmb() in __set_task_cpu() */

That comment really is insufficient; but aside from that:

If we observe the old cpu value we've just acquired the old rq->lock and
therefore we must observe the new cpu value and retry -- we don't care
about the migrate value in this case.

If we observe the new cpu value, we've acquired the new rq->lock and its
ACQUIRE will pair with the WMB to ensure we see the migrate value.

So I think the current code is correct; albeit it could use a comment.

> +			smp_rmb();
> +			if (likely(!task_on_rq_migrating(p)))
> +				return rq;
> +		}

---
Subject: sched: Clarify ordering between task_rq_lock() and move_queued_task()
From: Peter Zijlstra
Date: Tue Feb 17 13:07:38 CET 2015

There was a wee bit of confusion around the exact ordering here;
clarify things.

Cc: Oleg Nesterov
Cc: "Paul E. McKenney"
Reported-by: Kirill Tkhai
Signed-off-by: Peter Zijlstra (Intel)
---
 kernel/sched/core.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -341,6 +341,22 @@ static struct rq *task_rq_lock(struct ta
 		raw_spin_lock_irqsave(&p->pi_lock, *flags);
 		rq = task_rq(p);
 		raw_spin_lock(&rq->lock);
+		/*
+		 *    move_queued_task()                task_rq_lock()
+		 *
+		 *    ACQUIRE (rq->lock)
+		 *    [S] ->on_rq = MIGRATING           [L] rq = task_rq()
+		 *    WMB (__set_task_cpu())            ACQUIRE (rq->lock);
+		 *    [S] ->cpu = new_cpu               [L] task_rq()
+		 *                                      [L] ->on_rq
+		 *    RELEASE (rq->lock)
+		 *
+		 * If we observe the old cpu in task_rq_lock, the acquire of
+		 * the old rq->lock will fully serialize against the stores.
+		 *
+		 * If we observe the new cpu in task_rq_lock, the acquire will
+		 * pair with the WMB to ensure we must then also see migrating.
+		 */
 		if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
 			return rq;
 		raw_spin_unlock(&rq->lock);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
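
For anyone who wants to poke at the two cases in the comment above from
userspace, here is a rough sketch of the same store/load pattern with C11
atomics and pthreads. It is not kernel code and it proves nothing about
kernel semantics. Everything in it is made up for illustration: struct rq,
struct task, rqs[], migrator() and run_demo() are hypothetical stand-ins,
a pthread mutex plays rq->lock, and a release fence plays the smp_wmb() in
__set_task_cpu(). One deliberate difference: C11 gives no ordering between
a fence and the acquire of an unrelated mutex, so the sketch uses an
acquire load of ->cpu where the kernel argument leans on the rq->lock
ACQUIRE.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins for the kernel types; not kernel code. */
enum { ON_RQ_QUEUED = 1, ON_RQ_MIGRATING = 2 };

struct rq { pthread_mutex_t lock; };

struct task {
	atomic_int cpu;		/* index into rqs[] */
	atomic_int on_rq;
};

static struct rq rqs[2] = {
	{ PTHREAD_MUTEX_INITIALIZER },
	{ PTHREAD_MUTEX_INITIALIZER },
};

/* Writer side: the left column of the diagram. Single migrator assumed. */
static void move_queued_task(struct task *p, int new_cpu)
{
	struct rq *old_rq = &rqs[atomic_load_explicit(&p->cpu, memory_order_relaxed)];

	assert(new_cpu == 0 || new_cpu == 1);

	pthread_mutex_lock(&old_rq->lock);			/* ACQUIRE (rq->lock) */
	atomic_store_explicit(&p->on_rq, ON_RQ_MIGRATING, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);		/* stands in for WMB */
	atomic_store_explicit(&p->cpu, new_cpu, memory_order_relaxed);
	pthread_mutex_unlock(&old_rq->lock);			/* RELEASE (rq->lock) */

	/* Queue on the new rq; only now does MIGRATING go away. */
	pthread_mutex_lock(&rqs[new_cpu].lock);
	atomic_store_explicit(&p->on_rq, ON_RQ_QUEUED, memory_order_relaxed);
	pthread_mutex_unlock(&rqs[new_cpu].lock);
}

/* Reader side: the right column. Returns with rq->lock held. */
static struct rq *task_rq_lock(struct task *p)
{
	for (;;) {
		struct rq *rq = &rqs[atomic_load_explicit(&p->cpu, memory_order_relaxed)];

		pthread_mutex_lock(&rq->lock);
		/* Acquire load: C11 substitute for the lock ACQUIRE pairing. */
		if (rq == &rqs[atomic_load_explicit(&p->cpu, memory_order_acquire)] &&
		    atomic_load_explicit(&p->on_rq, memory_order_relaxed) != ON_RQ_MIGRATING)
			return rq;
		pthread_mutex_unlock(&rq->lock);		/* old cpu or migrating: retry */
	}
}

struct demo { struct task task; int migrations; };

static void *migrator(void *arg)
{
	struct demo *d = arg;

	/* Bounce the task between rq 0 and rq 1. */
	for (int i = 0; i < d->migrations; i++)
		move_queued_task(&d->task, (i + 1) & 1);
	return NULL;
}

/* Counts successful task_rq_lock() calls that saw an inconsistent task. */
static int run_demo(int migrations, int lookups)
{
	struct demo d = { .migrations = migrations };
	pthread_t tid;
	int bad = 0;

	atomic_init(&d.task.cpu, 0);
	atomic_init(&d.task.on_rq, ON_RQ_QUEUED);
	if (pthread_create(&tid, NULL, migrator, &d))
		return -1;

	for (int i = 0; i < lookups; i++) {
		struct rq *rq = task_rq_lock(&d.task);

		/* With the lock held the task must be queued on exactly this rq. */
		if (rq != &rqs[atomic_load(&d.task.cpu)] ||
		    atomic_load(&d.task.on_rq) != ON_RQ_QUEUED)
			bad++;
		pthread_mutex_unlock(&rq->lock);
	}
	pthread_join(tid, NULL);
	return bad;
}
```

Calling run_demo() with a few thousand migrations and lookups should
return 0. A clean run only fails to find a counterexample on the machine
at hand; it says nothing about weaker architectures.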