Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757006AbbH0WSj (ORCPT ); Thu, 27 Aug 2015 18:18:39 -0400 Received: from casper.infradead.org ([85.118.1.10]:52857 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756977AbbH0WSg (ORCPT ); Thu, 27 Aug 2015 18:18:36 -0400 Date: Fri, 28 Aug 2015 00:18:28 +0200 From: Peter Zijlstra To: Wanpeng Li Cc: Ingo Molnar , linux-kernel@vger.kernel.org Subject: Re: [PATCH] sched: fix tsk->pi_lock isn't held when do_set_cpus_allowed() Message-ID: <20150827221828.GZ16853@twins.programming.kicks-ass.net> References: <20150825100527.GO16853@twins.programming.kicks-ass.net> <20150825101032.GI18673@twins.programming.kicks-ass.net> <20150825103249.GJ18673@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2012-12-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2924 Lines: 71 On Tue, Aug 25, 2015 at 07:47:44PM +0800, Wanpeng Li wrote: > On 8/25/15 6:32 PM, Peter Zijlstra wrote: > >So Possibly, Maybe (I'm still to wrecked to say for sure), something > >like this would work: > > > > WARN_ON(debug_locks && (lockdep_is_held(&p->pi_lock) || > > (p->on_rq && lockdep_is_held(&rq->lock)))); > > > >Instead of those two separate lockdep asserts. > > > >Please consider carefully. So the normal rules for changing task_struct::cpus_allowed are holding both pi_lock and rq->lock, such that holding either stabilizes the mask. This is so that wakeup can happen without rq->lock and load-balance without pi_lock. >From this we already get the relaxation that we can omit acquiring rq->lock if the task is not on the rq, because in that case load-balancing will not apply to it. ** these are the rules currently tested in do_set_cpus_allowed() ** Now, since __set_cpus_allowed_ptr() uses task_rq_lock() which unconditionally acquires both locks, we could get away with holding just rq->lock when on_rq for modification because that'd still exclude __set_cpus_allowed_ptr(), it would also work against __kthread_bind_mask() because that assumes !on_rq. That said, this is all somewhat fragile. > Commit (5e16bbc2f: sched: Streamline the task migration locking a little) > won't hold the pi_lock in migrate_tasks() path any more, actually pi_lock > was still not held when call select_fallback_rq() and it was held in > __migrate_task() before the commit. Then commit (25834c73f93: sched: Fix a > race between __kthread_bind() and sched_setaffinity()) add a > lockdep_assert_held() in do_set_cpus_allowed(), the bug is triggered. How > about something like below: > > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -5186,6 +5186,15 @@ static void migrate_tasks(struct rq *dead_rq) > BUG_ON(!next); > next->sched_class->put_prev_task(rq, next); > > + raw_spin_unlock(&rq->lock); > + raw_spin_lock(&next->pi_lock); > + raw_spin_lock(&rq->lock); > + if (!(task_rq(next) == rq && task_on_rq_queued(next))) { > + raw_spin_unlock(&rq->lock); > + raw_spin_unlock(&next->pi_lock); > + continue; > + } Yeah, that's quite disgusting.. also you'll trip over the lockdep_pin if you were to actually run this. Now, I don't think dropping rq->lock is quite as disastrous as it usually is because !cpu_active at this point, which means load-balance will not interfere, but that too is somewhat fragile. So we end up with a choice of two fragile.. let me ponder that a wee bit more. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/