Subject: Re: [PATCH -v9][RFC] mutex: implement adaptive spinning
From: Chris Mason
To: Dmitry Adamushko
Cc: Peter Zijlstra, Linus Torvalds, Ingo Molnar, "Paul E. McKenney",
    Gregory Haskins, Matthew Wilcox, Andi Kleen, Andrew Morton,
    Linux Kernel Mailing List, linux-fsdevel, linux-btrfs,
    Thomas Gleixner, Nick Piggin, Peter Morreale, Sven Dietrich
Date: Wed, 14 Jan 2009 11:47:42 -0500

On Wed, 2009-01-14 at 12:18 +0100, Dmitry Adamushko wrote:
> 2009/1/14 Chris Mason:
> > On Tue, 2009-01-13 at 18:21 +0100, Peter Zijlstra wrote:
> >> On Tue, 2009-01-13 at 08:49 -0800, Linus Torvalds wrote:
> >> >
> >> > So do a v10, and ask people to test.
> >>
> >> ---
> >> Subject: mutex: implement adaptive spinning
> >> From: Peter Zijlstra
> >> Date: Mon Jan 12 14:01:47 CET 2009
> >>
> >> Change mutex contention behaviour such that it will sometimes busy wait on
> >> acquisition - moving its behaviour closer to that of spinlocks.
> >>
> >
> > I've spent a bunch of time on this one, and noticed earlier today that I
> > still had bits of CONFIG_FTRACE compiled in.  I wasn't actually tracing
> > anything, but it seems to have had a big performance hit.
> >
> > The bad news is the simple spin got much much faster, dbench 50 coming
> > in at 1282MB/s instead of 580MB/s.  (Other benchmarks give similar
> > results.)
> >
> > v10 is better than not spinning, but it's in the 5-10% range.  So, I've
> > been trying to find ways to close the gap, just to understand exactly
> > where it is different.
> >
> > If I take out:
> >
> >         /*
> >          * If there are pending waiters, join them.
> >          */
> >         if (!list_empty(&lock->wait_list))
> >                 break;
> >
> > v10 pops dbench 50 up to 1800MB/s.  The other tests soundly beat my
> > spinning and aren't less fair.  But clearly this isn't a good solution.
> >
> > I tried a few variations, like only checking the wait list once before
> > looping, which helps some.  Are there other suggestions on better tuning
> > options?
>
> (some thoughts/speculations)
>
> Perhaps for highly-contended mutexes the spinning implementation may
> quickly degrade [*] to the non-spinning one (i.e. the current
> sleep-wait mutex) and then just stay in this state until a moment in
> time when there are no waiters [**] -- i.e.
> list_empty(&lock->wait_list) == 1 and waiters can start spinning
> again.

It is actually ok if the highly contended mutexes don't degrade, as long
as they stay highly contended and the holder isn't likely to schedule.
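To make that concrete, the v10 spin loop is shaped roughly like this
(paraphrased from memory rather than quoted from the patch;
spin_on_owner() stands in for the patch's owner-spin helper, and task is
just the current task):

	for (;;) {
		struct thread_info *owner;

		/*
		 * If there are pending waiters, join them.  This is the
		 * check my hack removes: once a single task has blocked,
		 * every later contender queues behind it, and the lock
		 * stays in sleep-wait mode until the wait_list drains.
		 */
		if (!list_empty(&lock->wait_list))
			break;

		/* Spin only while the owner is actively running on a cpu. */
		owner = ACCESS_ONCE(lock->owner);
		if (owner && !spin_on_owner(lock, owner))
			break;

		/* The lock looks free; try to take it. */
		if (atomic_cmpxchg(&lock->count, 1, 0) == 1)
			return 0;	/* acquired */

		/*
		 * No visible owner: we may have preempted between the lock
		 * being taken and the owner field being set.  An rt task
		 * could live-lock here, so don't let it keep spinning.
		 */
		if (!owner && (need_resched() || rt_task(task)))
			break;

		cpu_relax();
	}

Deleting the list_empty() check just lets spinners keep racing the queued
waiters for the lock instead of queueing behind them, which is where the
extra dbench throughput (and the fairness question) comes from.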
> what may trigger [*]:
>
> (1) obviously, an owner scheduling out.
>
> Even if it happens rarely (otherwise, it's not a target scenario for
> our optimization), due to [**] it may take quite some time until
> waiters are able to spin again.
>
> let's say, waiters (almost) never block (and possibly, such cases
> would be better off just using a spinlock after some refactoring, if
> possible)
>
> (2) need_resched() is triggered for one of the waiters.
>
> (3) !owner && rt_task(p)
>
> quite unlikely, but possible (there are 2 race windows).
>
> Of course, the question is whether it really takes a noticeable amount
> of time to get out of the [**] state.
> I'd imagine it can be the case for highly-contended locks.
>
> If this is the case indeed, then which of 1, 2 and 3 gets triggered
> the most?

Sorry, I don't have stats on that.

> Have you tried removing the need_resched() checks? So we kind of
> emulate real spinlocks here.

Unfortunately, the need_resched() checks deal with a few of the ugly
corners.  They are more important without the waiter list check.
Basically, if we spun without the need_resched() checks, the process
that wants to unlock might not be able to schedule back in.

-chris
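P.S. For completeness, the owner-spin helper I mentioned above looks
something like this (again a hand-waved sketch, not the exact patch;
owner_running() stands in for the real owner-on-cpu test).  The
need_resched() bailout is the part that matters here:

	static int spin_on_owner(struct mutex *lock, struct thread_info *owner)
	{
		while (lock->owner == owner) {
			/*
			 * Without this bailout a spinner never yields the
			 * cpu, and the task that wants to call mutex_unlock()
			 * (or the waiter it must wake) may never get to
			 * schedule back in on that cpu.
			 */
			if (need_resched())
				return 0;

			/* Owner scheduled out: spinning is pointless now. */
			if (!owner_running(owner))
				return 0;

			cpu_relax();
		}
		return 1;	/* owner released the lock; retry the acquire */
	}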