Date: Wed, 7 Jan 2009 22:01:00 +0530
From: Vaidyanathan Srinivasan
To: Peter Zijlstra
Cc: Ingo Molnar, Linux Kernel, Balbir Singh, Andrew Morton, Mike Galbraith
Subject: Re: [BUG] 2.6.28-git LOCKDEP: Possible recursive rq->lock
Message-ID: <20090107163100.GO4574@dirshya.in.ibm.com>
Reply-To: svaidy@linux.vnet.ibm.com
In-Reply-To: <1231338537.11687.295.camel@twins>

* Peter Zijlstra [2009-01-07 15:28:57]:

> On Wed, 2009-01-07 at 19:50 +0530, Vaidyanathan Srinivasan wrote:
> > * Peter Zijlstra [2009-01-07 14:12:43]:
> >
> > > On Wed, 2009-01-07 at 17:59 +0530, Vaidyanathan Srinivasan wrote:
> > >
> > > > =============================================
> > > > [ INFO: possible recursive locking detected ]
> > > > 2.6.28-autotest-tip-sv #1
> > > > ---------------------------------------------
> > > > klogd/5062 is trying to acquire lock:
> > > >  (&rq->lock){++..}, at: [] task_rq_lock+0x45/0x7e
> > > >
> > > > but task is already holding lock:
> > > >  (&rq->lock){++..}, at: [] schedule+0x158/0xa31
> > > >
> > > > other info that might help us debug this:
> > > > 1 lock held by klogd/5062:
> > > >  #0:  (&rq->lock){++..}, at: [] schedule+0x158/0xa31
> > > >
> > > > stack backtrace:
> > > > Pid: 5062, comm: klogd Not tainted 2.6.28-autotest-tip-sv #1
> > > > Call Trace:
> > > >  [] __lock_acquire+0xeb9/0x16a4
> > > >  [] ? __lock_acquire+0x1688/0x16a4
> > > >  [] lock_acquire+0x85/0xa9
> > > >  [] ? task_rq_lock+0x45/0x7e
> > > >  [] _spin_lock+0x31/0x66
> > > >  [] ? task_rq_lock+0x45/0x7e
> > > >  [] task_rq_lock+0x45/0x7e
> > > >  [] try_to_wake_up+0x88/0x27a
> > > >  [] wake_up_process+0x10/0x12
> > > >  [] schedule+0x560/0xa31
> > >
> > > I'd be most curious to know where in schedule we are.
> >
> > ok, we are in sched.c:3777
> >
> > 	double_unlock_balance(this_rq, busiest);
> > 	if (active_balance)
> > >>>>>>>>>>>	wake_up_process(busiest->migration_thread);
> >
> > } else
> >
> > In active balance in newidle.  This implies sched_mc was 2 at that
> > time.  Let me trace this and debug further.
>
> How about something like this? Strictly speaking we'll not deadlock,
> because ttwu will not be able to place the migration task on our rq,
> but since the code can deal with both rqs getting unlocked, this
> seems the easiest way out.

Hi Peter,

I agree.  Unlocking this_rq is an easy way out.  Thanks for the
suggestion.  I have moved the unlock and lock within the if condition.
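For anyone else puzzled by the splat: it fires even though two different
runqueues are involved because lockdep classifies locks by their
initialisation site rather than by instance, so every per-CPU rq->lock
belongs to a single lock class.  A rough userspace sketch of the pattern
(an analogy only -- pthread spinlocks stand in for the kernel's, and all
struct/function names below are invented for illustration):

/*
 * Userspace analogy only -- not kernel code.  Models why taking a
 * second runqueue's lock while holding ours looks recursive to
 * lockdep: both locks would share one lock class.
 */
#include <pthread.h>

struct rq {
	pthread_spinlock_t lock;	/* stands in for rq->lock */
};

static struct rq this_rq_obj, busiest_obj;

/* Models try_to_wake_up() -> task_rq_lock(): takes the task's rq lock. */
static void wake_migration_thread(struct rq *task_rq)
{
	pthread_spin_lock(&task_rq->lock);	/* second lock, same "class" */
	/* ... place the woken task on task_rq ... */
	pthread_spin_unlock(&task_rq->lock);
}

/* Models the offending path in load_balance_newidle(). */
static void newidle_balance(struct rq *this_rq, struct rq *busiest)
{
	pthread_spin_lock(&this_rq->lock);	/* held since schedule() */
	/*
	 * This nests busiest->lock under this_rq->lock.  With real
	 * lockdep, both are one class, hence the warning above; the
	 * fix below simply drops this_rq->lock around the wakeup.
	 */
	wake_migration_thread(busiest);
	pthread_spin_unlock(&this_rq->lock);
}

int main(void)
{
	pthread_spin_init(&this_rq_obj.lock, PTHREAD_PROCESS_PRIVATE);
	pthread_spin_init(&busiest_obj.lock, PTHREAD_PROCESS_PRIVATE);
	newidle_balance(&this_rq_obj, &busiest_obj);
	return 0;
}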
--Vaidy

sched: bug fix -- do not call ttwu while holding rq->lock

When sched_mc=2, wake_up_process() is called on busiest_rq's migration
thread while holding the this_rq lock in load_balance_newidle().  Though
this will not deadlock, it triggers a lockdep warning, and the situation
is easily resolved by releasing the this_rq lock at this point in the
code.

Signed-off-by: Vaidyanathan Srinivasan

diff --git a/kernel/sched.c b/kernel/sched.c
index 71a054f..703a669 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3773,8 +3773,12 @@ redo:
 		}
 
 		double_unlock_balance(this_rq, busiest);
-		if (active_balance)
+		if (active_balance) {
+			/* Should not call ttwu while holding a rq->lock */
+			spin_unlock(&this_rq->lock);
 			wake_up_process(busiest->migration_thread);
+			spin_lock(&this_rq->lock);
+		}
 
 	} else
 		sd->nr_balance_failed = 0;
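On the safety of briefly dropping this_rq->lock around the wakeup (my
reading of the surrounding code, based on this thread rather than a
fresh audit): as Peter notes above, ttwu cannot place the migration
thread on this rq, and load_balance_newidle() already copes with
this_rq->lock being released, since double_lock_balance() may itself
drop and retake it to preserve lock ordering.  So nothing the function
relies on across the wake_up_process() call is protected only by
this_rq->lock.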