Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751907AbbH1BtZ (ORCPT ); Thu, 27 Aug 2015 21:49:25 -0400 Received: from blu004-omc1s21.hotmail.com ([65.55.116.32]:61163 "EHLO BLU004-OMC1S21.hotmail.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751801AbbH1BtX (ORCPT ); Thu, 27 Aug 2015 21:49:23 -0400 X-TMN: [k0cKlfO/k7zOnFYrWmtYKBRTM29s+Tr4] X-Originating-Email: [wanpeng.li@hotmail.com] Message-ID: Subject: Re: [PATCH] sched: fix tsk->pi_lock isn't held when do_set_cpus_allowed() To: Peter Zijlstra References: <20150825100527.GO16853@twins.programming.kicks-ass.net> <20150825101032.GI18673@twins.programming.kicks-ass.net> <20150825103249.GJ18673@twins.programming.kicks-ass.net> <20150827221828.GZ16853@twins.programming.kicks-ass.net> CC: Ingo Molnar , linux-kernel@vger.kernel.org From: Wanpeng Li Date: Fri, 28 Aug 2015 09:49:17 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <20150827221828.GZ16853@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 28 Aug 2015 01:49:21.0225 (UTC) FILETIME=[C23F3B90:01D0E133] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3157 Lines: 77 On 8/28/15 6:18 AM, Peter Zijlstra wrote: > On Tue, Aug 25, 2015 at 07:47:44PM +0800, Wanpeng Li wrote: >> On 8/25/15 6:32 PM, Peter Zijlstra wrote: >>> So Possibly, Maybe (I'm still to wrecked to say for sure), something >>> like this would work: >>> >>> WARN_ON(debug_locks && (lockdep_is_held(&p->pi_lock) || >>> (p->on_rq && lockdep_is_held(&rq->lock)))); >>> >>> Instead of those two separate lockdep asserts. >>> >>> Please consider carefully. > So the normal rules for changing task_struct::cpus_allowed are holding > both pi_lock and rq->lock, such that holding either stabilizes the mask. > > This is so that wakeup can happen without rq->lock and load-balance > without pi_lock. > > From this we already get the relaxation that we can omit acquiring > rq->lock if the task is not on the rq, because in that case > load-balancing will not apply to it. > > ** these are the rules currently tested in do_set_cpus_allowed() ** > > Now, since __set_cpus_allowed_ptr() uses task_rq_lock() which > unconditionally acquires both locks, we could get away with holding just > rq->lock when on_rq for modification because that'd still exclude > __set_cpus_allowed_ptr(), it would also work against > __kthread_bind_mask() because that assumes !on_rq. > > That said, this is all somewhat fragile. > >> Commit (5e16bbc2f: sched: Streamline the task migration locking a little) >> won't hold the pi_lock in migrate_tasks() path any more, actually pi_lock >> was still not held when call select_fallback_rq() and it was held in >> __migrate_task() before the commit. Then commit (25834c73f93: sched: Fix a >> race between __kthread_bind() and sched_setaffinity()) add a >> lockdep_assert_held() in do_set_cpus_allowed(), the bug is triggered. How >> about something like below: >> >> --- a/kernel/sched/core.c >> +++ b/kernel/sched/core.c >> @@ -5186,6 +5186,15 @@ static void migrate_tasks(struct rq *dead_rq) >> BUG_ON(!next); >> next->sched_class->put_prev_task(rq, next); >> >> + raw_spin_unlock(&rq->lock); >> + raw_spin_lock(&next->pi_lock); >> + raw_spin_lock(&rq->lock); >> + if (!(task_rq(next) == rq && task_on_rq_queued(next))) { >> + raw_spin_unlock(&rq->lock); >> + raw_spin_unlock(&next->pi_lock); >> + continue; >> + } > Yeah, that's quite disgusting.. also you'll trip over the lockdep_pin if > you were to actually run this. Indeed. I will handle lockdep_pin in these codes if you choice the second fragile. :-) Regards, Wanpeng Li > > Now, I don't think dropping rq->lock is quite as disastrous as it > usually is because !cpu_active at this point, which means load-balance > will not interfere, but that too is somewhat fragile. > > > So we end up with a choice of two fragile.. let me ponder that a wee > bit more. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/