Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757492AbZAHSDY (ORCPT ); Thu, 8 Jan 2009 13:03:24 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1752038AbZAHSDN (ORCPT ); Thu, 8 Jan 2009 13:03:13 -0500
Received: from hrndva-omtalb.mail.rr.com ([71.74.56.123]:62379
	"EHLO hrndva-omtalb.mail.rr.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751068AbZAHSDM (ORCPT );
	Thu, 8 Jan 2009 13:03:12 -0500
Date: Thu, 8 Jan 2009 13:03:09 -0500 (EST)
From: Steven Rostedt
X-X-Sender: rostedt@gandalf.stny.rr.com
To: Linus Torvalds
cc: Chris Mason, Peter Zijlstra, Ingo Molnar, paulmck@linux.vnet.ibm.com,
	Gregory Haskins, Matthew Wilcox, Andi Kleen, Andrew Morton,
	Linux Kernel Mailing List, linux-fsdevel, linux-btrfs,
	Thomas Gleixner, Nick Piggin, Peter Morreale, Sven Dietrich
Subject: Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning
In-Reply-To:
Message-ID:
References: <1231347442.11687.344.camel@twins>
	<1231365115.11687.361.camel@twins>
	<1231366716.11687.377.camel@twins>
	<1231408718.11687.400.camel@twins>
	<20090108141808.GC11629@elte.hu>
	<1231426014.11687.456.camel@twins>
	<1231434515.14304.27.camel@think.oraclecorp.com>
User-Agent: Alpine 1.10 (DEB 962 2008-03-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2223
Lines: 71

On Thu, 8 Jan 2009, Linus Torvalds wrote:
>
> And I don't even believe that is the bug. I suspect the bug is simpler.
>
> I think the "need_resched()" needs to go in the outer loop, or at least
> happen in the "!owner" case. Because at least with preemption, what can
> happen otherwise is
>
>  - process A gets the lock, but gets preempted before it sets lock->owner.
>
>    End result: count = 0, owner = NULL.
>
>  - processes B/C goes into the spin loop, filling up all CPU's (assuming
>    dual-core here), and will now both loop forever if they hold the kernel
>    lock (or have some other preemption disabling thing over their down()).
>
> And all the while, process A would _happily_ set ->owner, and eventually
> release the mutex, but it never gets to run to do either of them so.
>
> In fact, you might not even need a process C: all you need is for B to be
> on the same runqueue as A, and having enough load on the other CPU's that
> A never gets migrated away. So "C" might be in user space.
>
> I dunno. There are probably variations on the above.

Ouch! I think you are on to something:

	for (;;) {
		struct thread_info *owner;

		old_val = atomic_cmpxchg(&lock->count, 1, 0);
		if (old_val == 1) {
			lock_acquired(&lock->dep_map, ip);
			mutex_set_owner(lock);
			return 0;
		}

		if (old_val < 0 && !list_empty(&lock->wait_list))
			break;

		/* See who owns it, and spin on him if anybody */
		owner = ACCESS_ONCE(lock->owner);

The owner was preempted before assigning lock->owner (as you stated).

		if (owner && !spin_on_owner(lock, owner))
			break;

We just spin :-(

I think adding the:

+		if (need_resched())
+			break;

would solve the problem.

Thanks,

-- Steve


		cpu_relax();
	}
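
To make the scenario concrete, here is a minimal userspace sketch of the
loop above. It is not the kernel code: struct toy_mutex, toy_spin(), the
check_resched switch and the max_iters bound are all made up for
illustration, and the need_resched argument is just a boolean standing in
for need_resched(). The iteration bound exists only so that the "spin
forever" case terminates in the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the mutex: count == 1 means unlocked, 0 means locked. */
struct toy_mutex {
	int count;
	const void *owner;	/* NULL until the holder records itself */
};

enum spin_result { SPIN_ACQUIRED, SPIN_GAVE_UP, SPIN_STUCK };

/*
 * Model of the spin loop (hypothetical helper, single-threaded).
 * need_resched  - stands in for the kernel's need_resched()
 * check_resched - whether the proposed check is compiled in
 * max_iters     - bound so that an endless spin shows up as SPIN_STUCK
 */
static enum spin_result toy_spin(struct toy_mutex *lock, const void *self,
				 bool need_resched, bool check_resched,
				 unsigned long max_iters)
{
	assert(lock != NULL);

	for (unsigned long i = 0; i < max_iters; i++) {
		/* Models atomic_cmpxchg(&lock->count, 1, 0) succeeding */
		if (lock->count == 1) {
			lock->count = 0;
			lock->owner = self;
			return SPIN_ACQUIRED;
		}

		/*
		 * Owner known: the real loop would spin on the owner here.
		 * The sketch just gives up, since only the !owner path
		 * is being modelled.
		 */
		if (lock->owner != NULL)
			return SPIN_GAVE_UP;

		/* Proposed addition: stop spinning if we should reschedule */
		if (check_resched && need_resched)
			return SPIN_GAVE_UP;

		/* cpu_relax() would go here */
	}

	return SPIN_STUCK;
}
```

With count = 0 and owner = NULL (the holder was preempted before it
recorded itself), calling toy_spin() with check_resched = false runs until
the iteration bound is hit, while check_resched = true bails out on the
first pass once need_resched is set, which would let the preempted holder
run.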