Date: Tue, 6 Jan 2009 09:08:33 -0800
From: Andrew Morton
To: Peter Zijlstra
Cc: Matthew Wilcox, Andi Kleen, Chris Mason, linux-kernel@vger.kernel.org, linux-fsdevel, linux-btrfs, Ingo Molnar, Thomas Gleixner, Steven Rostedt, Gregory Haskins, Nick Piggin, Linus Torvalds
Subject: Re: [PATCH][RFC]: mutex: adaptive spin
Message-Id: <20090106090833.78482f5d.akpm@linux-foundation.org>
In-Reply-To: <1231242031.11687.97.camel@twins>
References: <1230722935.4680.5.camel@think.oraclecorp.com> <20081231104533.abfb1cf9.akpm@linux-foundation.org> <1230765549.7538.8.camel@think.oraclecorp.com> <87r63ljzox.fsf@basil.nowhere.org> <20090103191706.GA2002@parisc-linux.org> <1231093310.27690.5.camel@twins> <20090104184103.GE2002@parisc-linux.org> <1231242031.11687.97.camel@twins>

On Tue, 06 Jan 2009 12:40:31 +0100 Peter Zijlstra wrote:

> Subject: mutex: adaptive spin
> From: Peter Zijlstra
> Date: Tue Jan 06 12:32:12 CET 2009
>
> Based on the code in -rt, provide adaptive spins on generic mutexes.
>

How dumb is it to send a lump of uncommented, changelog-free code as an
rfc, only to have Ingo reply, providing the changelog for you?  Sigh.
> --- linux-2.6.orig/kernel/mutex.c
> +++ linux-2.6/kernel/mutex.c
> @@ -46,6 +46,7 @@ __mutex_init(struct mutex *lock, const c
>  	atomic_set(&lock->count, 1);
>  	spin_lock_init(&lock->wait_lock);
>  	INIT_LIST_HEAD(&lock->wait_list);
> +	lock->owner = NULL;
>
>  	debug_mutex_init(lock, name, key);
>  }
> @@ -120,6 +121,28 @@ void __sched mutex_unlock(struct mutex *
>
>  EXPORT_SYMBOL(mutex_unlock);
>
> +#ifdef CONFIG_SMP
> +static int adaptive_wait(struct mutex_waiter *waiter,
> +			 struct task_struct *owner, long state)
> +{
> +	for (;;) {
> +		if (signal_pending_state(state, waiter->task))
> +			return 0;
> +		if (waiter->lock->owner != owner)
> +			return 0;
> +		if (!task_is_current(owner))
> +			return 1;
> +		cpu_relax();
> +	}
> +}

Each of the tests in this function should be carefully commented.  It's
really the core piece of the design.

> +#else
> +static int adaptive_wait(struct mutex_waiter *waiter,
> +			 struct task_struct *owner, long state)
> +{
> +	return 1;
> +}
> +#endif
> +
>  /*
>   * Lock a mutex (possibly interruptible), slowpath:
>   */
> @@ -127,7 +150,7 @@ static inline int __sched
>  __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		    unsigned long ip)
>  {
> -	struct task_struct *task = current;
> +	struct task_struct *owner, *task = current;
>  	struct mutex_waiter waiter;
>  	unsigned int old_val;
>  	unsigned long flags;
> @@ -141,6 +164,7 @@ __mutex_lock_common(struct mutex *lock,
>  	/* add waiting tasks to the end of the waitqueue (FIFO): */
>  	list_add_tail(&waiter.list, &lock->wait_list);
>  	waiter.task = task;
> +	waiter.lock = lock;
>
>  	old_val = atomic_xchg(&lock->count, -1);
>  	if (old_val == 1)
> @@ -175,11 +199,19 @@ __mutex_lock_common(struct mutex *lock,
>  			debug_mutex_free_waiter(&waiter);
>  			return -EINTR;
>  		}
> -		__set_task_state(task, state);
>
> -		/* didnt get the lock, go to sleep: */
> +		owner = lock->owner;

What prevents *owner from exiting right here?
> +		get_task_struct(owner);
>  		spin_unlock_mutex(&lock->wait_lock, flags);
> -		schedule();
> +
> +		if (adaptive_wait(&waiter, owner, state)) {
> +			put_task_struct(owner);
> +			__set_task_state(task, state);
> +			/* didnt get the lock, go to sleep: */
> +			schedule();
> +		} else
> +			put_task_struct(owner);
> +
>  		spin_lock_mutex(&lock->wait_lock, flags);
>  	}
>
> @@ -187,7 +219,7 @@ done:
>  	lock_acquired(&lock->dep_map, ip);
>  	/* got the lock - rejoice! */
>  	mutex_remove_waiter(lock, &waiter, task_thread_info(task));
> -	debug_mutex_set_owner(lock, task_thread_info(task));
> +	lock->owner = task;
>
>  	/* set it to 0 if there are no waiters left: */
>  	if (likely(list_empty(&lock->wait_list)))
> @@ -260,7 +292,7 @@ __mutex_unlock_common_slowpath(atomic_t
>  			wake_up_process(waiter->task);
>  	}
>
> -	debug_mutex_clear_owner(lock);
> +	lock->owner = NULL;
>
>  	spin_unlock_mutex(&lock->wait_lock, flags);
>  }
> @@ -352,7 +384,7 @@ static inline int __mutex_trylock_slowpa
>
>  	prev = atomic_xchg(&lock->count, -1);
>  	if (likely(prev == 1)) {
> -		debug_mutex_set_owner(lock, current_thread_info());
> +		lock->owner = current;
>  		mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
>  	}
>  	/* Set it back to 0 if there are no waiters: */
> Index: linux-2.6/kernel/sched.c
> ===================================================================
> --- linux-2.6.orig/kernel/sched.c
> +++ linux-2.6/kernel/sched.c
> @@ -697,6 +697,11 @@ int runqueue_is_locked(void)
>  	return ret;
>  }
>
> +int task_is_current(struct task_struct *p)
> +{
> +	return task_rq(p)->curr == p;
> +}

Please don't add kernel-wide infrastructure and leave it completely
undocumented.  Particularly functions which are as vague and dangerous as
this one.

What locking must the caller provide?  What are the semantics of the
return value?  What must the caller do to avoid oopses if *p is
concurrently exiting?  etc.

The overall design intent seems very smart to me, as long as the races can
be plugged, if they're indeed present.
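One way to answer those questions in the code itself would be a kerneldoc
comment on the new helper.  The wording below is only a sketch of the
caller contract the patch appears to assume (a get_task_struct() reference
to keep *p alive, and a racy, hint-only return value), not something the
patch states:

```c
/**
 * task_is_current - test whether @p is executing on a CPU right now
 * @p: the task to test
 *
 * The caller must guarantee that @p cannot be freed out from under us,
 * e.g. by holding a reference obtained with get_task_struct(); otherwise
 * the task_rq(p) dereference can oops against a concurrently exiting
 * task.
 *
 * Returns non-zero if @p is currently running.  The result is only a
 * hint: no runqueue lock is taken, so @p may be scheduled in or out
 * before the caller acts on it.  Callers must tolerate both false
 * positives and false negatives.
 */
```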