2005-12-22 23:05:26

by Ingo Molnar

[permalink] [raw]
Subject: [patch 0/8] mutex subsystem, -V6

this is verion -V6 of the generic mutex subsystem. It consists of the
following patches:

add-atomic-xchg.patch
mutex-generic-asm-implementations.patch
mutex-arch-mutex-h.patch
mutex-core.patch
mutex-docs.patch
mutex-debug.patch
mutex-debug-more.patch
xfs-mutex-namespace-collision-fix.patch

the patches are against Linus' latest GIT tree, and they should work
fine on every Linux architecture.

the delta since -V5:

53 files changed, 718 insertions(+), 454 deletions(-)

this release picks up Arjan's asm/mutex.h implementation, which adds
asm-generic/mutex-dec.h, asm-generic/mutex-xchg.h for architectures to
pick up. i386 and x86_64 use their own optimized version already, the
other architectures default to mutex-xchg.h. Architectures specify the
following functions:

-------------------------------------------------------------------
* __mutex_fastpath_lock - try to take the lock by moving the count
* from 1 to a 0 value
* @count: pointer of type atomic_t
* @fn: function to call if the original value was not 1
-------------------------------------------------------------------
* __mutex_fastpath_lock_retval - try to take the lock by moving the count
* from 1 to a 0 value
* @count: pointer of type atomic_t
* @fn: function to call if the original value was not 1
-------------------------------------------------------------------
* __mutex_fastpath_unlock - try to promote the mutex from 0 to 1
* @count: pointer of type atomic_t
* @fn: function to call if the original value was not 1
-------------------------------------------------------------------

and __mutex_slowpath_needs_to_unlock(), to specify whether the fastpath
has touched the count or not.

i have tested this on x86, and i have booted all 4 variants:
mutex-xchg.h, mutex-dec.h, asm-i386/mutex.h and the debug version. (in
MUTEX_DEBUG_FULL mode, i.e. with the mutexes in real use.)

Nico, Christoph, does this approach work for you? Nico, you might want
to try an ARM-specific mutex.h implementation.

Ingo


2005-12-22 23:55:13

by Nicolas Pitre

[permalink] [raw]
Subject: Re: [patch 0/8] mutex subsystem, -V6

On Fri, 23 Dec 2005, Ingo Molnar wrote:

> this release picks up Arjan's asm/mutex.h implementation, which adds
> asm-generic/mutex-dec.h, asm-generic/mutex-xchg.h for architectures to
> pick up. i386 and x86_64 use their own optimized version already, the
> other architectures default to mutex-xchg.h.

Great, this is in substance what I've been proposing for a while.

> Architectures specify the
> following functions:
>
> -------------------------------------------------------------------
> * __mutex_fastpath_unlock - try to promote the mutex from 0 to 1
> * @count: pointer of type atomic_t
> * @fn: function to call if the original value was not 1
> -------------------------------------------------------------------
The above should read "function to call if the original value was not 0".

> and __mutex_slowpath_needs_to_unlock(), to specify whether the fastpath
> has touched the count or not.

Note that, in most cases (but not necessarily all of them) the count is
always touched. It should rather be "has set the count to 1 or not" to
be precise..

> Nico, Christoph, does this approach work for you? Nico, you might want
> to try an ARM-specific mutex.h implementation.

Yes, I'm happy. And the ARM version will be sent your way soon.


Nicolas

2005-12-23 04:59:14

by Nicolas Pitre

[permalink] [raw]
Subject: Re: [patch 0/8] mutex subsystem, -V6

On Fri, 23 Dec 2005, Ingo Molnar wrote:

> this is verion -V6 of the generic mutex subsystem.

Here's a patch to fix a few minor things: some comments were wrong, some
were irrelevant, another was duplicated. Also remove a bogus "return 0".

Index: linux-2.6/include/asm-generic/mutex-dec.h
===================================================================
--- linux-2.6.orig/include/asm-generic/mutex-dec.h
+++ linux-2.6/include/asm-generic/mutex-dec.h
@@ -1,7 +1,7 @@
/*
* asm-generic/mutex-dec.h
*
- * Generic wrappers for the mutex fastpath based on an xchg() implementation
+ * Generic wrappers for the mutex fastpath based on atomic increment/decrement
*
*/
#ifndef _ASM_GENERIC_MUTEX_DEC_H
@@ -46,7 +46,7 @@ __mutex_fastpath_lock_retval(atomic_t *c
/**
* __mutex_fastpath_unlock - try to promote the mutex from 0 to 1
* @count: pointer of type atomic_t
- * @fn: function to call if the original value was not 1
+ * @fn: function to call if the original value was not 0
*
* try to promote the mutex from 0 to 1. if it wasn't 0, call <function>
* In the failure case, this function is allowed to either set the value to
Index: linux-2.6/include/asm-generic/mutex-xchg.h
===================================================================
--- linux-2.6.orig/include/asm-generic/mutex-xchg.h
+++ linux-2.6/include/asm-generic/mutex-xchg.h
@@ -51,7 +51,7 @@ __mutex_fastpath_lock_retval(atomic_t *c
/**
* __mutex_fastpath_unlock - try to promote the mutex from 0 to 1
* @count: pointer of type atomic_t
- * @fn: function to call if the original value was not 1
+ * @fn: function to call if the original value was not 0
*
* try to promote the mutex from 0 to 1. if it wasn't 0, call <function>
* In the failure case, this function is allowed to either set the value to
Index: linux-2.6/include/asm-i386/mutex.h
===================================================================
--- linux-2.6.orig/include/asm-i386/mutex.h
+++ linux-2.6/include/asm-i386/mutex.h
@@ -61,7 +61,7 @@ __mutex_fastpath_lock_retval(atomic_t *c
/**
* __mutex_fastpath_unlock - try to promote the mutex from 0 to 1
* @count: pointer of type atomic_t
- * @fn: function to call if the original value was not 1
+ * @fn: function to call if the original value was not 0
*
* try to promote the mutex from 0 to 1. if it wasn't 0, call <function>
* In the failure case, this function is allowed to either set the value to
Index: linux-2.6/kernel/mutex.c
===================================================================
--- linux-2.6.orig/kernel/mutex.c
+++ linux-2.6/kernel/mutex.c
@@ -344,10 +344,6 @@ static inline void __mutex_unlock_nonato
static __sched void FASTCALL(__mutex_lock_noinline(atomic_t *lock_count));

/*
- * Some architectures do not have fast dec_and_test atomic primitives,
- * for them we are providing an atomic_xchg() based mutex implementation,
- * if they enable CONFIG_MUTEX_XCHG_ALGORITHM.
- *
* The locking fastpath is the 1->0 transition from 'unlocked' into
* 'locked' state:
*/
@@ -356,11 +352,6 @@ static inline void __mutex_lock_atomic(s
__mutex_fastpath_lock(&lock->count, __mutex_lock_noinline);
}

-/*
- * We put the slowpath into a separate function. This reduces
- * register pressure in the fastpath, and also enables the
- * atomic_[inc/dec]_call_if_[negative|nonpositive]() primitives.
- */
static void fastcall __sched __mutex_lock_noinline(atomic_t *lock_count)
{
struct mutex *lock = container_of(lock_count, struct mutex, count);
@@ -380,7 +371,6 @@ static inline int __mutex_lock_interrupt
{
return __mutex_fastpath_lock_retval
(&lock->count, __mutex_lock_interruptible_noinline);
- return 0;
}

static int fastcall __sched

2005-12-23 05:04:50

by Nicolas Pitre

[permalink] [raw]
Subject: Re: [patch 0/8] mutex subsystem, -V6

On Thu, 22 Dec 2005, Nicolas Pitre wrote:

> > Nico, Christoph, does this approach work for you? Nico, you might want
> > to try an ARM-specific mutex.h implementation.
>
> Yes, I'm happy. And the ARM version will be sent your way soon.

Here it is:

Index: linux-2.6/include/asm-arm/mutex.h
===================================================================
--- linux-2.6.orig/include/asm-arm/mutex.h
+++ linux-2.6/include/asm-arm/mutex.h
@@ -1,8 +1,83 @@
/*
- * Pull in the generic wrappers for __mutex_fastpath_lock() and
- * __mutex_fastpath_unlock().
+ * include/asm-arm/mutex.h
*
- * TODO: implement optimized primitives instead
+ * ARM optimized mutex locking primitives
+ *
+ * Please look into asm-generic/mutex-xchg.h for a formal definition.
+ */
+
+#if __LINUX_ARM_ARCH__ >= 6
+
+/*
+ * Attempting to lock a mutex on ARMv6+ can be done with a bastardized
+ * atomic decrement (it is not a reliable atomic decrement but it satisfies
+ * the defined semantics for our purpose, while being smaller and faster
+ * than a real atomic decrement or atomic swap. The idea is to attempt
+ * decrementing the lock value only once. If once decremented it isn't zero,
+ * or if its store-back fails due to a dispute on the exclusive store, we
+ * simply bail out immediately through the slow path where the lock will be
+ * reattempted until it succeeds.
+ */
+#define __mutex_fastpath_lock(v, fail) \
+do { \
+ int __ex_flag, __res; \
+ __asm__ ( \
+ "ldrex %0, [%2]\n\t" \
+ "sub %0, %0, #1\n\t" \
+ "strex %1, %0, [%2]" \
+ : "=&r" (__res), "=&r" (__ex_flag) \
+ : "r" (&(v)->counter) \
+ : "cc","memory" ); \
+ __res |= __ex_flag; \
+ if (unlikely(__res != 0)) \
+ fail(v); \
+} while (0)
+
+#define __mutex_fastpath_lock_retval(v, fail) \
+({ \
+ int __ex_flag, __res; \
+ __asm__ ( \
+ "ldrex %0, [%2]\n\t" \
+ "sub %0, %0, #1\n\t" \
+ "strex %1, %0, [%2]" \
+ : "=&r" (__res), "=&r" (__ex_flag) \
+ : "r" (&(v)->counter) \
+ : "cc","memory" ); \
+ __res |= __ex_flag; \
+ if (unlikely(__res != 0)) \
+ __res = fail(v); \
+ __res; \
+})
+
+/*
+ * Same trick is used for the unlock fast path. However the original value,
+ * rather than the result, is used to test for success in order to have
+ * better generated assembly.
*/
+#define __mutex_fastpath_unlock(v, fail) \
+do { \
+ int __ex_flag, __res, __orig; \
+ __asm__ ( \
+ "ldrex %0, [%3]\n\t" \
+ "add %1, %0, #1\n\t" \
+ "strex %2, %1, [%3]" \
+ : "=&r" (__orig), "=&r" (__res), "=&r" (__ex_flag) \
+ : "r" (&(v)->counter) \
+ : "cc","memory" ); \
+ __orig |= __ex_flag; \
+ if (unlikely(__orig != 0)) \
+ fail(v); \
+} while (0)

+/*
+ * If the unlock was done on a contended lock, or if the unlock simply fails
+ * then the mutex remains locked.
+ */
+#define __mutex_slowpath_needs_to_unlock() (1)
+
+#else
+
+/* On pre-ARMv6 hardware the swp based implementation is the most efficient. */
#include <asm-generic/mutex-xchg.h>
+
+#endif

2005-12-23 05:12:58

by Nicolas Pitre

[permalink] [raw]
Subject: Re: [patch 0/8] mutex subsystem, -V6

On Fri, 23 Dec 2005, Ingo Molnar wrote:

> this is verion -V6 of the generic mutex subsystem.
>
> Nico, Christoph, does this approach work for you?

OK. My final request would be for architectures to have the choice
whether to have the fast path inlined or not. I have a patch that does
just that, but it would conflict with Linus' suggestion about the
debugging stuff if you intend to implement it.


Nicolas

2005-12-23 08:04:27

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch 0/8] mutex subsystem, -V6


* Nicolas Pitre <[email protected]> wrote:

> On Fri, 23 Dec 2005, Ingo Molnar wrote:
>
> > this is verion -V6 of the generic mutex subsystem.
>
> Here's a patch to fix a few minor things: some comments were wrong,
> some were irrelevant, another was duplicated. Also remove a bogus
> "return 0".

thanks, merged.

Ingo

2005-12-23 08:09:28

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch 0/8] mutex subsystem, -V6


* Nicolas Pitre <[email protected]> wrote:

> On Thu, 22 Dec 2005, Nicolas Pitre wrote:
>
> > > Nico, Christoph, does this approach work for you? Nico, you might want
> > > to try an ARM-specific mutex.h implementation.
> >
> > Yes, I'm happy. And the ARM version will be sent your way soon.
>
> Here it is:

cool! I've merged it - and the end-result looks really clean.

Ingo