2014-06-12 21:20:17

by Jason Low

Subject: [PATCH v3 0/3] mutex: Modifications to mutex

v2->v3:
- Merge patches 2 and 3 from the v2 patchset into one patch as suggested.
- Update Reviewed-by/Acked-by

v1->v2:
- Remove the MUTEX_SHOW_NO_WAITER() macro.
- Use !mutex_is_locked() instead of introducing a new MUTEX_IS_UNLOCKED() macro.
- Add more information to the changelog of the "Optimize mutex trylock slowpath"
patch.

This patchset contains documentation fixes, optimizations, and code
clarity improvements for mutexes.

Jason Low (3):
mutex: Correct documentation on mutex optimistic spinning
mutex: Try-acquire mutex only if unlocked and improve code clarity
mutex: Optimize mutex trylock slowpath

kernel/locking/mutex.c | 35 ++++++++++++++++++-----------------
1 files changed, 18 insertions(+), 17 deletions(-)


2014-06-12 21:20:20

by Jason Low

Subject: [PATCH v3 2/3] mutex: Try-acquire mutex only if unlocked and improve code clarity

Upon entering the slowpath in __mutex_lock_common(), we try once more to
acquire the mutex if (lock->count >= 0). However, what we actually want
here is to try to acquire if the mutex is unlocked (lock->count == 1).
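
For reference, lock->count follows the usual mutex counter convention,
sketched below (the negative case is also stated by the
MUTEX_SHOW_NO_WAITER() comment being removed in this patch):

/*
 * lock->count states:
 *    1  - unlocked
 *    0  - locked, no waiters
 *  < 0  - locked, and waiters may be sleeping
 */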

This patch changes it so that we try-acquire the mutex upon entering
the slowpath only if it is unlocked, further reducing unnecessary xchg()
operations.

Furthermore, this patch uses !mutex_is_locked(lock) for the initial
check of whether the lock is free, rather than directly calling
atomic_read() on lock->count, in order to improve readability.
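
For reference, mutex_is_locked() is itself a thin wrapper around the
same atomic_read() (as defined in include/linux/mutex.h at the time):

static inline int mutex_is_locked(struct mutex *lock)
{
        return atomic_read(&lock->count) != 1;
}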

Also, delete the MUTEX_SHOW_NO_WAITER() macro, because it doesn't
always do what its name suggests. Use atomic_read() directly instead of
the macro, and add comments on how the extra atomic_read() can be
helpful.

Acked-by: Waiman Long <[email protected]>
Acked-by: Davidlohr Bueso <[email protected]>
Signed-off-by: Jason Low <[email protected]>
---
kernel/locking/mutex.c | 21 ++++++++++-----------
1 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index dd26bf6..e4d997b 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -46,12 +46,6 @@
# include <asm/mutex.h>
#endif

-/*
- * A negative mutex count indicates that waiters are sleeping waiting for the
- * mutex.
- */
-#define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
-
void
__mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
{
@@ -438,7 +432,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
if (owner && !mutex_spin_on_owner(lock, owner))
break;

- if ((atomic_read(&lock->count) == 1) &&
+ /* Try to acquire the mutex if it is unlocked. */
+ if (!mutex_is_locked(lock) &&
(atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
lock_acquired(&lock->dep_map, ip);
if (use_ww_ctx) {
@@ -483,8 +478,11 @@ slowpath:
#endif
spin_lock_mutex(&lock->wait_lock, flags);

- /* once more, can we acquire the lock? */
- if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1))
+ /*
+ * Once more, try to acquire the lock. Only try-lock the mutex if
+ * it is unlocked to reduce unnecessary xchg() operations.
+ */
+ if (!mutex_is_locked(lock) && (atomic_xchg(&lock->count, 0) == 1))
goto skip_wait;

debug_mutex_lock_common(lock, &waiter);
@@ -504,9 +502,10 @@ slowpath:
* it's unlocked. Later on, if we sleep, this is the
* operation that gives us the lock. We xchg it to -1, so
* that when we release the lock, we properly wake up the
- * other waiters:
+ * other waiters. We only attempt the xchg if the count is
+ * non-negative in order to avoid unnecessary xchg operations:
*/
- if (MUTEX_SHOW_NO_WAITER(lock) &&
+ if (atomic_read(&lock->count) >= 0 &&
(atomic_xchg(&lock->count, -1) == 1))
break;

--
1.7.1

2014-06-12 21:20:31

by Jason Low

Subject: [PATCH v3 3/3] mutex: Optimize mutex trylock slowpath

The mutex_trylock() function calls into __mutex_trylock_fastpath() when
trying to obtain the mutex. On 32-bit x86, in the !__HAVE_ARCH_CMPXCHG
case, __mutex_trylock_fastpath() calls directly into
__mutex_trylock_slowpath(), regardless of whether or not the mutex is
locked.

In __mutex_trylock_slowpath(), we then acquire the wait_lock spinlock,
xchg() lock->count with -1, set lock->count back to 0 if there are no
waiters, and return true if the previous lock count was 1.

However, if the mutex is already locked, then there isn't much point
in attempting all of the above expensive operations. With this patch,
we attempt the above trylock operations only if the mutex is unlocked.
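
For context, a rough sketch of the resulting function with the
early-out applied (abridged from kernel/locking/mutex.c of this era;
the lockdep annotation is omitted):

static inline int __mutex_trylock_slowpath(atomic_t *lock_count)
{
        struct mutex *lock = container_of(lock_count, struct mutex, count);
        unsigned long flags;
        int prev;

        /* No need to trylock if the mutex is locked. */
        if (mutex_is_locked(lock))
                return 0;

        spin_lock_mutex(&lock->wait_lock, flags);

        prev = atomic_xchg(&lock->count, -1);
        if (likely(prev == 1))
                mutex_set_owner(lock);

        /* Set it back to 0 if there are no waiters: */
        if (likely(list_empty(&lock->wait_list)))
                atomic_set(&lock->count, 0);

        spin_unlock_mutex(&lock->wait_lock, flags);

        return prev == 1;
}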

Reviewed-by: Davidlohr Bueso <[email protected]>
Signed-off-by: Jason Low <[email protected]>
---
kernel/locking/mutex.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index e4d997b..11b103d 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -820,6 +820,10 @@ static inline int __mutex_trylock_slowpath(atomic_t *lock_count)
unsigned long flags;
int prev;

+ /* No need to trylock if the mutex is locked. */
+ if (mutex_is_locked(lock))
+ return 0;
+
spin_lock_mutex(&lock->wait_lock, flags);

prev = atomic_xchg(&lock->count, -1);
--
1.7.1

2014-06-12 21:20:58

by Jason Low

Subject: [PATCH v3 1/3] mutex: Correct documentation on mutex optimistic spinning

The mutex optimistic spinning documentation states that we spin for
acquisition when we find that there are no pending waiters. However, in
actuality, whether or not there are waiters for the mutex does not
determine whether we will spin for it.

This patch removes that statement and also adds a comment mentioning
that we spin for the mutex as long as we don't need to reschedule.
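
For reference, the reschedule check mentioned above is done in
mutex_spin_on_owner(); a simplified sketch of that loop (abridged from
kernel/locking/mutex.c of this era; owner_running() checks whether the
owner is still running on its CPU):

static noinline
int mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
{
        rcu_read_lock();
        while (owner_running(lock, owner)) {
                /* Stop spinning once we need to reschedule. */
                if (need_resched())
                        break;

                arch_mutex_cpu_relax();
        }
        rcu_read_unlock();

        /*
         * We break out of the loop on need_resched() or when the owner
         * changed; only report success if the lock is now unowned.
         */
        return lock->owner == NULL;
}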

Acked-by: Davidlohr Bueso <[email protected]>
Signed-off-by: Jason Low <[email protected]>
---
kernel/locking/mutex.c | 10 ++++------
1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index bc73d33..dd26bf6 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -388,12 +388,10 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
/*
* Optimistic spinning.
*
- * We try to spin for acquisition when we find that there are no
- * pending waiters and the lock owner is currently running on a
- * (different) CPU.
- *
- * The rationale is that if the lock owner is running, it is likely to
- * release the lock soon.
+ * We try to spin for acquisition when we find that the lock owner
+ * is currently running on a (different) CPU and while we don't
+ * need to reschedule. The rationale is that if the lock owner is
+ * running, it is likely to release the lock soon.
*
* Since this needs the lock owner, and this mutex implementation
* doesn't track the owner atomically in the lock field, we need to
--
1.7.1