Date: Tue, 06 Apr 2010 01:21:45 +0300
From: Avi Kivity
To: Darren Hart
CC: linux-kernel@vger.kernel.org, Thomas Gleixner, Peter Zijlstra,
    Ingo Molnar, Eric Dumazet, "Peter W. Morreale", Rik van Riel,
    Steven Rostedt, Gregory Haskins, Sven-Thorsten Dietrich,
    Chris Mason, John Cooper, Chris Wright
Subject: Re: [PATCH V2 0/6][RFC] futex: FUTEX_LOCK with optional adaptive spinning
Message-ID: <4BBA6279.20802@redhat.com>
References: <1270499039-23728-1-git-send-email-dvhltc@us.ibm.com>
            <4BBA5305.7010002@redhat.com> <4BBA5C00.4090703@us.ibm.com>
In-Reply-To: <4BBA5C00.4090703@us.ibm.com>

On 04/06/2010 12:54 AM, Darren Hart wrote:
> Avi Kivity wrote:
>> On 04/05/2010 11:23 PM, Darren Hart wrote:
>>> In-Reply-To:
>>>
>>> NOT FOR INCLUSION
>>>
>>> The following patch series implements a new experimental kernel side
>>> futex mutex via new FUTEX_LOCK and FUTEX_LOCK_ADAPTIVE futex op
>>> codes. The adaptive spin follows the kernel mutex model of allowing
>>> one spinner until the lock is released or the owner is descheduled.
>>> The patch currently allows the user to specify if they want no
>>> spinning, a single adaptive spinner, or multiple spinners (aggressive
>>> adaptive spinning, or aas... which I have mistyped as "ass" enough
>>> times to realize a better term is indeed required :-).
>>
>> An interesting (but perhaps difficult to achieve) optimization would
>> be to spin in userspace.
>
> I couldn't think of a lightweight way to determine when the owner has
> been scheduled out in userspace. Kernel assistance is required. You
> could do this on the schedule() side of things, but I figured I'd get
> some strong pushback if I tried to add a hook into descheduling that
> flipped a bit in the futex value stating the owner was about to
> deschedule(). Still, that might be something to explore.

In the futex value it's hopeless (since a thread can hold many locks),
but I don't think it's unreasonable to set a bit in the thread local
storage area. The futex format would then need to be extended to
contain a pointer to this bit.
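Roughly, I am picturing something like the sketch below for the
userspace side. Everything here is invented for illustration: the
struct layout, the per-thread "on cpu" flag, and the context-switch
hook that would have to maintain it; of the futex ops used, only
FUTEX_WAIT exists today (FUTEX_LOCK is the op proposed by this series).

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical: the futex word plus a pointer to a per-thread "on cpu"
 * flag that the kernel would set/clear when the owner is scheduled
 * in/out.  No such flag exists in any current kernel. */
struct spin_futex {
	int val;			/* 0 = unlocked, else owner TID */
	const int *owner_oncpu;		/* -> flag in the owner's TLS   */
};

#define cpu_relax() __asm__ __volatile__("rep; nop" ::: "memory")

static void spin_futex_lock(struct spin_futex *f, int my_tid)
{
	for (;;) {
		/* Fast path: uncontended acquisition, no syscall. */
		if (__sync_bool_compare_and_swap(&f->val, 0, my_tid))
			return;

		/* Spin only while the owner is actually on a CPU.
		 * Keeping owner_oncpu pointed at the current owner's
		 * flag is hand-waved here. */
		if (f->owner_oncpu && *f->owner_oncpu) {
			cpu_relax();
			continue;
		}

		/* Owner is (or may be) scheduled out: block in the
		 * kernel, e.g. via the proposed FUTEX_LOCK op, or
		 * plain FUTEX_WAIT as below. */
		int v = f->val;
		if (v != 0)
			syscall(SYS_futex, &f->val, FUTEX_WAIT, v,
				NULL, NULL, 0);
	}
}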
>> How many cores (or hardware threads) does this machine have?
>
> Sorry, I meant to include that. I tested on an 8 CPU (no hardware
> threads) 2.6 GHz Opteron 2218 (2 QuadCore CPUs) system.
>
>> At 10% duty cycle you have 25 waiters behind the lock on average. I
>> don't think this is realistic, and it means that spinning is invoked
>> only rarely.
>
> Perhaps some instrumentation is in order; it seems to get invoked
> enough to achieve some 20% increase in lock/unlock iterations. Perhaps
> another metric would be of more value, such as average wait time?

Why measure an unrealistic workload?

>> I'd be interested in seeing runs where the average number of waiters
>> is 0.2, 0.5, 1, and 2, corresponding to moderate-to-bad contention.
>> 25 average waiters on compute bound code means the application needs
>> to be rewritten; no amount of mutex tweaking will help it.
>
> Perhaps something like NR_CPUS threads would be of more interest?

That seems artificial.

> At 10% duty cycle that's about 0.8 waiters, and at 25% it's about 2,
> the upper limit you suggested. I could add a few more duty-cycle points
> and make 25% the max. I'll kick that off and post the results...
> probably tomorrow; 10M iterations takes a while, but makes the results
> relatively stable.

Thanks. But why not vary the number of threads as well?

>> Does the wakeup code select the spinning waiter, or just a random
>> waiter?
>
> The wakeup code selects the highest-priority task, in FIFO order, to
> wake up - however, under contention it is most likely going to go back
> to sleep, as another waiter will steal the lock out from under it. This
> locking strategy is unashamedly about as "unfair" as it gets.

Best to avoid the wakeup if we notice the lock was stolen.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
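P.S. In terms of the same invented spin_futex sketch from earlier in
this message, the "skip the wakeup if the lock was stolen" idea on the
unlock side would look roughly like this - simplified, and racy as
written, since a real implementation needs a waiters count or bit so
the wake isn't skipped while someone is actually asleep:

static void spin_futex_unlock(struct spin_futex *f)
{
	/* Drop the lock: val = 0 with release semantics. */
	__sync_lock_release(&f->val);

	/* If a spinner has already stolen the lock, a woken waiter
	 * would only see it held and go back to sleep, so skip the
	 * syscall entirely. */
	if (f->val != 0)
		return;

	syscall(SYS_futex, &f->val, FUTEX_WAKE, 1, NULL, NULL, 0);
}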