2014-06-04 19:08:36

by Jason Low

[permalink] [raw]
Subject: [RFC PATCH 0/3] locking/mutex: Modifications to mutex

This patchset contains modifications to mutex.

Patch 1 further reduces unnecessary atomic xchg() operations upon entering
the mutex slowpath in mutex_lock().

Patch 2 corrects/updates the documentation on mutex optimistic spinning.

Patch 3 optimizes mutex trylock slowpath.

Jason Low (3):
mutex: Try to acquire mutex only if it is unlocked
mutex: Correct documentation on mutex optimistic spinning
mutex: Optimize mutex trylock slowpath

kernel/locking/mutex.c | 47 ++++++++++++++++++++++++++---------------------
1 files changed, 26 insertions(+), 21 deletions(-)


2014-06-04 19:08:42

by Jason Low

[permalink] [raw]
Subject: [RFC PATCH 3/3] locking/mutex: Optimize mutex trylock slowpath

In __mutex_trylock_slowpath(), we acquire the wait_lock spinlock,
xchg() lock->count with -1, then set lock->count back to 0 if there
are no waiters, and return true if the prev lock count was 1.

However, if we the mutex is already locked, then there may not be
much point in attempting the above operations. In this patch, we only
attempt the above operations if the mutex is unlocked.

The new MUTEX_IS_UNLOCKED() macro is also used for this.

Signed-off-by: Jason Low <[email protected]>
---
kernel/locking/mutex.c | 27 ++++++++++++++++-----------
1 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index fc55f72..c65680d 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -821,21 +821,26 @@ static inline int __mutex_trylock_slowpath(atomic_t *lock_count)
{
struct mutex *lock = container_of(lock_count, struct mutex, count);
unsigned long flags;
- int prev;
+ int prev = 0;

- spin_lock_mutex(&lock->wait_lock, flags);
+ /*
+ * Only need to trylock the mutex if it is unlocked.
+ */
+ if (MUTEX_IS_UNLOCKED(lock)) {
+ spin_lock_mutex(&lock->wait_lock, flags);

- prev = atomic_xchg(&lock->count, -1);
- if (likely(prev == 1)) {
- mutex_set_owner(lock);
- mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
- }
+ prev = atomic_xchg(&lock->count, -1);
+ if (likely(prev == 1)) {
+ mutex_set_owner(lock);
+ mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
+ }

- /* Set it back to 0 if there are no waiters: */
- if (likely(list_empty(&lock->wait_list)))
- atomic_set(&lock->count, 0);
+ /* Set it back to 0 if there are no waiters: */
+ if (likely(list_empty(&lock->wait_list)))
+ atomic_set(&lock->count, 0);

- spin_unlock_mutex(&lock->wait_lock, flags);
+ spin_unlock_mutex(&lock->wait_lock, flags);
+ }

return prev == 1;
}
--
1.7.1

2014-06-04 19:08:40

by Jason Low

[permalink] [raw]
Subject: [RFC PATCH 2/3] locking/mutex: Correct documentation on mutex optimistic spinning

The mutex optimistic spinning documentation states that we spin for
acquisition when we find that there are no pending waiters. However,
in actuality, whether or not there are waiters for the mutex doesn't
determine if we will spin for it.

This patch removes that statement and also adds a comment which
mentions that we spin for the mutex while we don't need to reschedule.

Signed-off-by: Jason Low <[email protected]>
---
kernel/locking/mutex.c | 10 ++++------
1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 0925968..fc55f72 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -389,12 +389,10 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
/*
* Optimistic spinning.
*
- * We try to spin for acquisition when we find that there are no
- * pending waiters and the lock owner is currently running on a
- * (different) CPU.
- *
- * The rationale is that if the lock owner is running, it is likely to
- * release the lock soon.
+ * We try to spin for acquisition when we find that the lock owner
+ * is currently running on a (different) CPU and while we don't
+ * need to reschedule. The rationale is that if the lock owner is
+ * running, it is likely to release the lock soon.
*
* Since this needs the lock owner, and this mutex implementation
* doesn't track the owner atomically in the lock field, we need to
--
1.7.1

2014-06-04 19:08:38

by Jason Low

[permalink] [raw]
Subject: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

Upon entering the slowpath in __mutex_lock_common(), we try once more
to acquire the mutex. We only try to acquire it if MUTEX_SHOW_NO_WAITER
(lock->count >= 0) is true in order to avoid using the atomic xchg()
operation whenever it is not necessary. However, we really only need
to try to acquire if the mutex is free (lock->count == 1).

This patch changes it so that we only try-acquire the mutex upon
entering the slowpath if it is unlocked, rather than if there are
no waiters. This helps further reduce unncessary atomic xchg()
operations. Furthermore, this patch introduces and uses a new
MUTEX_IS_UNLOCKED() macro to improve readbability.

Signed-off-by: Jason Low <[email protected]>
---
kernel/locking/mutex.c | 10 ++++++----
1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index bc73d33..0925968 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -48,9 +48,10 @@

/*
* A negative mutex count indicates that waiters are sleeping waiting for the
- * mutex.
+ * mutex, and a count of one indicates the mutex is unlocked.
*/
#define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
+#define MUTEX_IS_UNLOCKED(mutex) (atomic_read(&(mutex)->count) == 1)

void
__mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
@@ -440,7 +441,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
if (owner && !mutex_spin_on_owner(lock, owner))
break;

- if ((atomic_read(&lock->count) == 1) &&
+ /* Try to acquire the lock if it is free. */
+ if (MUTEX_IS_UNLOCKED(lock) &&
(atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
lock_acquired(&lock->dep_map, ip);
if (use_ww_ctx) {
@@ -485,8 +487,8 @@ slowpath:
#endif
spin_lock_mutex(&lock->wait_lock, flags);

- /* once more, can we acquire the lock? */
- if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1))
+ /* once more, try to acquire the lock if it is free: */
+ if (MUTEX_IS_UNLOCKED(lock) && (atomic_xchg(&lock->count, 0) == 1))
goto skip_wait;

debug_mutex_lock_common(lock, &waiter);
--
1.7.1

2014-06-04 19:43:41

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On Wed, Jun 04, 2014 at 12:08:29PM -0700, Jason Low wrote:
> Upon entering the slowpath in __mutex_lock_common(), we try once more
> to acquire the mutex. We only try to acquire it if MUTEX_SHOW_NO_WAITER
> (lock->count >= 0) is true in order to avoid using the atomic xchg()
> operation whenever it is not necessary. However, we really only need
> to try to acquire if the mutex is free (lock->count == 1).
>
> This patch changes it so that we only try-acquire the mutex upon
> entering the slowpath if it is unlocked, rather than if there are
> no waiters. This helps further reduce unncessary atomic xchg()
> operations. Furthermore, this patch introduces and uses a new
> MUTEX_IS_UNLOCKED() macro to improve readbability.
>
> Signed-off-by: Jason Low <[email protected]>
> ---
> kernel/locking/mutex.c | 10 ++++++----
> 1 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index bc73d33..0925968 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -48,9 +48,10 @@
>
> /*
> * A negative mutex count indicates that waiters are sleeping waiting for the
> - * mutex.
> + * mutex, and a count of one indicates the mutex is unlocked.
> */
> #define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
> +#define MUTEX_IS_UNLOCKED(mutex) (atomic_read(&(mutex)->count) == 1)

So I recently saw that MUTEX_SHOW_NO_WAITER thing and cried a little;
and now you're adding more of that same nonsense.

Please make them inline functions, also can we rename the SHOW_NO_WAITER
thing, because its not at all clear to me wtf it does; should it be
called: mutex_no_waiters() or somesuch?

2014-06-04 20:11:43

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [RFC PATCH 2/3] locking/mutex: Correct documentation on mutex optimistic spinning

On Wed, 2014-06-04 at 12:08 -0700, Jason Low wrote:
> The mutex optimistic spinning documentation states that we spin for
> acquisition when we find that there are no pending waiters. However,
> in actuality, whether or not there are waiters for the mutex doesn't
> determine if we will spin for it.
>
> This patch removes that statement and also adds a comment which
> mentions that we spin for the mutex while we don't need to reschedule.
>
> Signed-off-by: Jason Low <[email protected]>

Acked-by: Davidlohr Bueso <[email protected]>

I think this could go in with v4 of the mutex doc rewrite patch...
guessing for 3.17 now.

Thanks,
Davidlohr

2014-06-04 20:13:34

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] locking/mutex: Modifications to mutex

On Wed, 2014-06-04 at 12:08 -0700, Jason Low wrote:
> This patchset contains modifications to mutex.

Wondering why the RFC...

2014-06-04 20:28:40

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] locking/mutex: Optimize mutex trylock slowpath

On Wed, 2014-06-04 at 12:08 -0700, Jason Low wrote:
> In __mutex_trylock_slowpath(), we acquire the wait_lock spinlock,
> xchg() lock->count with -1, then set lock->count back to 0 if there
> are no waiters, and return true if the prev lock count was 1.
>
> However, if we the mutex is already locked, then there may not be
^^ leave that out.

> much point in attempting the above operations.

Isn't this redundant? I mean, if we enter the slowpath its because
__mutex_fastpath_trylock() already failed so we already know that the
lock is taken.

What kind of testing has this change been put through? Any advantages?
(ie: how many cycles are we saving here?), the trylock mechanism is
already pretty darn fast.

Thanks,
Davidlohr

2014-06-04 20:31:18

by Jason Low

[permalink] [raw]
Subject: Re: [RFC PATCH 2/3] locking/mutex: Correct documentation on mutex optimistic spinning

On Wed, 2014-06-04 at 13:11 -0700, Davidlohr Bueso wrote:
> On Wed, 2014-06-04 at 12:08 -0700, Jason Low wrote:
> > The mutex optimistic spinning documentation states that we spin for
> > acquisition when we find that there are no pending waiters. However,
> > in actuality, whether or not there are waiters for the mutex doesn't
> > determine if we will spin for it.
> >
> > This patch removes that statement and also adds a comment which
> > mentions that we spin for the mutex while we don't need to reschedule.
> >
> > Signed-off-by: Jason Low <[email protected]>
>
> Acked-by: Davidlohr Bueso <[email protected]>
>
> I think this could go in with v4 of the mutex doc rewrite patch...
> guessing for 3.17 now.

Yeah, this patch could be for 3.17.

Thanks,
Jason

2014-06-04 20:57:09

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On Wed, 2014-06-04 at 21:43 +0200, Peter Zijlstra wrote:
> On Wed, Jun 04, 2014 at 12:08:29PM -0700, Jason Low wrote:
> > Upon entering the slowpath in __mutex_lock_common(), we try once more
> > to acquire the mutex. We only try to acquire it if MUTEX_SHOW_NO_WAITER
> > (lock->count >= 0) is true in order to avoid using the atomic xchg()
> > operation whenever it is not necessary. However, we really only need
> > to try to acquire if the mutex is free (lock->count == 1).
> >
> > This patch changes it so that we only try-acquire the mutex upon
> > entering the slowpath if it is unlocked, rather than if there are
> > no waiters. This helps further reduce unncessary atomic xchg()
> > operations. Furthermore, this patch introduces and uses a new
> > MUTEX_IS_UNLOCKED() macro to improve readbability.
> >
> > Signed-off-by: Jason Low <[email protected]>
> > ---
> > kernel/locking/mutex.c | 10 ++++++----
> > 1 files changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> > index bc73d33..0925968 100644
> > --- a/kernel/locking/mutex.c
> > +++ b/kernel/locking/mutex.c
> > @@ -48,9 +48,10 @@
> >
> > /*
> > * A negative mutex count indicates that waiters are sleeping waiting for the
> > - * mutex.
> > + * mutex, and a count of one indicates the mutex is unlocked.
> > */
> > #define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
> > +#define MUTEX_IS_UNLOCKED(mutex) (atomic_read(&(mutex)->count) == 1)
>
> So I recently saw that MUTEX_SHOW_NO_WAITER thing and cried a little;
> and now you're adding more of that same nonsense.
>
> Please make them inline functions, also can we rename the SHOW_NO_WAITER
> thing, because its not at all clear to me wtf it does; should it be
> called: mutex_no_waiters() or somesuch?

Agreed.

In addition, how about the following helpers instead:
- mutex_is_unlocked() : count > 0
- mutex_has_waiters() : count < 0, or list_empty(->wait_list)

If combined with the negation, I think that pretty much covers the whole
life-cycle of a mutex. The mutex.c file is full of places where we can
replace crude atomic_reads with these.

Thanks,
Davidlohr

2014-06-04 20:58:26

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On Wed, 2014-06-04 at 13:57 -0700, Davidlohr Bueso wrote:
> On Wed, 2014-06-04 at 21:43 +0200, Peter Zijlstra wrote:
> > On Wed, Jun 04, 2014 at 12:08:29PM -0700, Jason Low wrote:
> > > Upon entering the slowpath in __mutex_lock_common(), we try once more
> > > to acquire the mutex. We only try to acquire it if MUTEX_SHOW_NO_WAITER
> > > (lock->count >= 0) is true in order to avoid using the atomic xchg()
> > > operation whenever it is not necessary. However, we really only need
> > > to try to acquire if the mutex is free (lock->count == 1).
> > >
> > > This patch changes it so that we only try-acquire the mutex upon
> > > entering the slowpath if it is unlocked, rather than if there are
> > > no waiters. This helps further reduce unncessary atomic xchg()
> > > operations. Furthermore, this patch introduces and uses a new
> > > MUTEX_IS_UNLOCKED() macro to improve readbability.
> > >
> > > Signed-off-by: Jason Low <[email protected]>
> > > ---
> > > kernel/locking/mutex.c | 10 ++++++----
> > > 1 files changed, 6 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> > > index bc73d33..0925968 100644
> > > --- a/kernel/locking/mutex.c
> > > +++ b/kernel/locking/mutex.c
> > > @@ -48,9 +48,10 @@
> > >
> > > /*
> > > * A negative mutex count indicates that waiters are sleeping waiting for the
> > > - * mutex.
> > > + * mutex, and a count of one indicates the mutex is unlocked.
> > > */
> > > #define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
> > > +#define MUTEX_IS_UNLOCKED(mutex) (atomic_read(&(mutex)->count) == 1)
> >
> > So I recently saw that MUTEX_SHOW_NO_WAITER thing and cried a little;
> > and now you're adding more of that same nonsense.
> >
> > Please make them inline functions, also can we rename the SHOW_NO_WAITER
> > thing, because its not at all clear to me wtf it does; should it be
> > called: mutex_no_waiters() or somesuch?
>
> Agreed.
>
> In addition, how about the following helpers instead:
> - mutex_is_unlocked() : count > 0
> - mutex_has_waiters() : count < 0, or list_empty(->wait_list)
^ err, that's !list_empty()

2014-06-04 21:26:17

by Jason Low

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On Wed, 2014-06-04 at 21:43 +0200, Peter Zijlstra wrote:
> On Wed, Jun 04, 2014 at 12:08:29PM -0700, Jason Low wrote:
> > Upon entering the slowpath in __mutex_lock_common(), we try once more
> > to acquire the mutex. We only try to acquire it if MUTEX_SHOW_NO_WAITER
> > (lock->count >= 0) is true in order to avoid using the atomic xchg()
> > operation whenever it is not necessary. However, we really only need
> > to try to acquire if the mutex is free (lock->count == 1).
> >
> > This patch changes it so that we only try-acquire the mutex upon
> > entering the slowpath if it is unlocked, rather than if there are
> > no waiters. This helps further reduce unncessary atomic xchg()
> > operations. Furthermore, this patch introduces and uses a new
> > MUTEX_IS_UNLOCKED() macro to improve readbability.
> >
> > Signed-off-by: Jason Low <[email protected]>
> > ---
> > kernel/locking/mutex.c | 10 ++++++----
> > 1 files changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> > index bc73d33..0925968 100644
> > --- a/kernel/locking/mutex.c
> > +++ b/kernel/locking/mutex.c
> > @@ -48,9 +48,10 @@
> >
> > /*
> > * A negative mutex count indicates that waiters are sleeping waiting for the
> > - * mutex.
> > + * mutex, and a count of one indicates the mutex is unlocked.
> > */
> > #define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
> > +#define MUTEX_IS_UNLOCKED(mutex) (atomic_read(&(mutex)->count) == 1)
>
> So I recently saw that MUTEX_SHOW_NO_WAITER thing and cried a little;
> and now you're adding more of that same nonsense.
>
> Please make them inline functions, also can we rename the SHOW_NO_WAITER
> thing, because its not at all clear to me wtf it does; should it be
> called: mutex_no_waiters() or somesuch?

Okay, I can make them inline functions. I mainly added the macro to keep
it consistent with the MUTEX_SHOW_NO_WAITER() check, but we can surely
make this more clear. mutex_no_waiters() sounds fine, or perhaps
something like mutex_has_no_waiters()?

2014-06-04 21:47:16

by Jason Low

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] locking/mutex: Optimize mutex trylock slowpath

On Wed, 2014-06-04 at 13:28 -0700, Davidlohr Bueso wrote:
> On Wed, 2014-06-04 at 12:08 -0700, Jason Low wrote:
> > In __mutex_trylock_slowpath(), we acquire the wait_lock spinlock,
> > xchg() lock->count with -1, then set lock->count back to 0 if there
> > are no waiters, and return true if the prev lock count was 1.
> >
> > However, if we the mutex is already locked, then there may not be
> ^^ leave that out.
>
> > much point in attempting the above operations.
>
> Isn't this redundant? I mean, if we enter the slowpath its because
> __mutex_fastpath_trylock() already failed so we already know that the
> lock is taken.

This function is really just used as an alternative method of trylock
for !__HAVE_ARCH_CMPXCHG. In that case, the fastpath can call directly
into the slowpath function, without checking for if the lock is taken.

> What kind of testing has this change been put through? Any advantages?
> (ie: how many cycles are we saving here?), the trylock mechanism is
> already pretty darn fast.

While I did run tests with this patch, this particular patch shouldn't
show benefits on my machine as it should be using the more efficient
atomic_cmpxchg. The advantage is in !__HAVE_ARCH_CMPXCHG, where we would
avoid taking a spinlock and 2 atomic operations when the mutex is
already taken.

Thanks.

2014-06-04 21:53:51

by Jason Low

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On Wed, 2014-06-04 at 13:57 -0700, Davidlohr Bueso wrote:
> On Wed, 2014-06-04 at 21:43 +0200, Peter Zijlstra wrote:
> > On Wed, Jun 04, 2014 at 12:08:29PM -0700, Jason Low wrote:
> > > Upon entering the slowpath in __mutex_lock_common(), we try once more
> > > to acquire the mutex. We only try to acquire it if MUTEX_SHOW_NO_WAITER
> > > (lock->count >= 0) is true in order to avoid using the atomic xchg()
> > > operation whenever it is not necessary. However, we really only need
> > > to try to acquire if the mutex is free (lock->count == 1).
> > >
> > > This patch changes it so that we only try-acquire the mutex upon
> > > entering the slowpath if it is unlocked, rather than if there are
> > > no waiters. This helps further reduce unncessary atomic xchg()
> > > operations. Furthermore, this patch introduces and uses a new
> > > MUTEX_IS_UNLOCKED() macro to improve readbability.
> > >
> > > Signed-off-by: Jason Low <[email protected]>
> > > ---
> > > kernel/locking/mutex.c | 10 ++++++----
> > > 1 files changed, 6 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> > > index bc73d33..0925968 100644
> > > --- a/kernel/locking/mutex.c
> > > +++ b/kernel/locking/mutex.c
> > > @@ -48,9 +48,10 @@
> > >
> > > /*
> > > * A negative mutex count indicates that waiters are sleeping waiting for the
> > > - * mutex.
> > > + * mutex, and a count of one indicates the mutex is unlocked.
> > > */
> > > #define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
> > > +#define MUTEX_IS_UNLOCKED(mutex) (atomic_read(&(mutex)->count) == 1)
> >
> > So I recently saw that MUTEX_SHOW_NO_WAITER thing and cried a little;
> > and now you're adding more of that same nonsense.
> >
> > Please make them inline functions, also can we rename the SHOW_NO_WAITER
> > thing, because its not at all clear to me wtf it does; should it be
> > called: mutex_no_waiters() or somesuch?
>
> Agreed.
>
> In addition, how about the following helpers instead:
> - mutex_is_unlocked() : count > 0
> - mutex_has_waiters() : count < 0, or list_empty(->wait_list)

Sounds good. Likewise, for "mutex_is_unlocked()" I've noticed that there
is a mutex_is_locked() function provided in the linux/mutex.h file.
Perhaps we can just reuse that function and use !mutex_is_locked() for
places where we want to check if unlocked?

2014-06-04 21:54:54

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On Wed, 4 Jun 2014, Jason Low wrote:
> On Wed, 2014-06-04 at 21:43 +0200, Peter Zijlstra wrote:
> > > /*
> > > * A negative mutex count indicates that waiters are sleeping waiting for the
> > > - * mutex.
> > > + * mutex, and a count of one indicates the mutex is unlocked.
> > > */
> > > #define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
> > > +#define MUTEX_IS_UNLOCKED(mutex) (atomic_read(&(mutex)->count) == 1)
> >
> > So I recently saw that MUTEX_SHOW_NO_WAITER thing and cried a little;
> > and now you're adding more of that same nonsense.
> >
> > Please make them inline functions, also can we rename the SHOW_NO_WAITER
> > thing, because its not at all clear to me wtf it does; should it be
> > called: mutex_no_waiters() or somesuch?
>
> Okay, I can make them inline functions. I mainly added the macro to keep
> it consistent with the MUTEX_SHOW_NO_WAITER() check, but we can surely

Consistency with a digusting and nonsensical macro is not really a
good argument.

> make this more clear. mutex_no_waiters() sounds fine, or perhaps
> something like mutex_has_no_waiters()?

Uuurg. So we end up with

if (!mutex_has_no_waiters(m))

if we check for waiters?

Can we please go with the most intuitive thing:

mutex_has_waiters()

and have the callsites prepend the '!' in case they want to check
there is no waiter?

For heavens sake, we do not name macros/inlines in a way which fits
the intended use case. We name them so they make sense.

Your change log blurbs about readability. I have no idea what your
understandig of readability is, but neither MUTEX_SHOW_NO_WAITERS nor
mutex_has_no_waiters qualify for me. Ditto for MUTEX_IS_UNLOCKED.

Care to look at the other lock implementations:

rt_mutex_has_waiters()
spin_is_locked()
....

Why would it make sense to come up with reverse conventions for mutex?

Thanks,

tglx

2014-06-04 22:13:36

by Jason Low

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On Wed, 2014-06-04 at 23:54 +0200, Thomas Gleixner wrote:
> On Wed, 4 Jun 2014, Jason Low wrote:
> > On Wed, 2014-06-04 at 21:43 +0200, Peter Zijlstra wrote:
> > > > /*
> > > > * A negative mutex count indicates that waiters are sleeping waiting for the
> > > > - * mutex.
> > > > + * mutex, and a count of one indicates the mutex is unlocked.
> > > > */
> > > > #define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
> > > > +#define MUTEX_IS_UNLOCKED(mutex) (atomic_read(&(mutex)->count) == 1)
> > >
> > > So I recently saw that MUTEX_SHOW_NO_WAITER thing and cried a little;
> > > and now you're adding more of that same nonsense.
> > >
> > > Please make them inline functions, also can we rename the SHOW_NO_WAITER
> > > thing, because its not at all clear to me wtf it does; should it be
> > > called: mutex_no_waiters() or somesuch?
> >
> > Okay, I can make them inline functions. I mainly added the macro to keep
> > it consistent with the MUTEX_SHOW_NO_WAITER() check, but we can surely
>
> Consistency with a digusting and nonsensical macro is not really a
> good argument.

I agree :)

> > make this more clear. mutex_no_waiters() sounds fine, or perhaps
> > something like mutex_has_no_waiters()?
>
> Uuurg. So we end up with
>
> if (!mutex_has_no_waiters(m))
>
> if we check for waiters?
>
> Can we please go with the most intuitive thing:
>
> mutex_has_waiters()
>
> and have the callsites prepend the '!' in case they want to check
> there is no waiter?

Yes, !mutex_has_waiters() sounds like the better option to check for no
waiters. Same with using the already provided mutex_is_locked()
function.

> For heavens sake, we do not name macros/inlines in a way which fits
> the intended use case. We name them so they make sense.
>
> Your change log blurbs about readability. I have no idea what your
> understandig of readability is, but neither MUTEX_SHOW_NO_WAITERS nor
> mutex_has_no_waiters qualify for me. Ditto for MUTEX_IS_UNLOCKED.
>
> Care to look at the other lock implementations:
>
> rt_mutex_has_waiters()
> spin_is_locked()
> ....
>
> Why would it make sense to come up with reverse conventions for mutex?
>
> Thanks,
>
> tglx

Thanks,
Jason

2014-06-05 01:10:38

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] locking/mutex: Optimize mutex trylock slowpath

On Wed, 2014-06-04 at 14:47 -0700, Jason Low wrote:
> On Wed, 2014-06-04 at 13:28 -0700, Davidlohr Bueso wrote:
> > On Wed, 2014-06-04 at 12:08 -0700, Jason Low wrote:
> > > In __mutex_trylock_slowpath(), we acquire the wait_lock spinlock,
> > > xchg() lock->count with -1, then set lock->count back to 0 if there
> > > are no waiters, and return true if the prev lock count was 1.
> > >
> > > However, if we the mutex is already locked, then there may not be
> > ^^ leave that out.
> >
> > > much point in attempting the above operations.
> >
> > Isn't this redundant? I mean, if we enter the slowpath its because
> > __mutex_fastpath_trylock() already failed so we already know that the
> > lock is taken.
>
> This function is really just used as an alternative method of trylock
> for !__HAVE_ARCH_CMPXCHG. In that case, the fastpath can call directly
> into the slowpath function, without checking for if the lock is taken.

Ah, ok I hadn't seen that we do this in 32bit x86 and was wondering why
the heck we fallback on a slowpath for something like trylock, which
should return right away no matter what. I'd suggest explicitly
mentioning this in the changelog. Otherwise makes sense now.

2014-06-05 03:09:05

by Jason Low

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] locking/mutex: Optimize mutex trylock slowpath

On Wed, 2014-06-04 at 18:10 -0700, Davidlohr Bueso wrote:
> On Wed, 2014-06-04 at 14:47 -0700, Jason Low wrote:
> > On Wed, 2014-06-04 at 13:28 -0700, Davidlohr Bueso wrote:
> > > On Wed, 2014-06-04 at 12:08 -0700, Jason Low wrote:
> > > > In __mutex_trylock_slowpath(), we acquire the wait_lock spinlock,
> > > > xchg() lock->count with -1, then set lock->count back to 0 if there
> > > > are no waiters, and return true if the prev lock count was 1.
> > > >
> > > > However, if we the mutex is already locked, then there may not be
> > > ^^ leave that out.
> > >
> > > > much point in attempting the above operations.
> > >
> > > Isn't this redundant? I mean, if we enter the slowpath its because
> > > __mutex_fastpath_trylock() already failed so we already know that the
> > > lock is taken.
> >
> > This function is really just used as an alternative method of trylock
> > for !__HAVE_ARCH_CMPXCHG. In that case, the fastpath can call directly
> > into the slowpath function, without checking for if the lock is taken.
>
> Ah, ok I hadn't seen that we do this in 32bit x86 and was wondering why
> the heck we fallback on a slowpath for something like trylock, which
> should return right away no matter what. I'd suggest explicitly
> mentioning this in the changelog. Otherwise makes sense now.

Yup, I was also initially wondering why we have a slowpath for trylock,
which prompted me to take a look into this function. I'll add some more
information about this to the changelog.

Thanks,
Jason

2014-06-05 03:24:35

by Waiman Long

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On 06/04/2014 05:26 PM, Jason Low wrote:
> On Wed, 2014-06-04 at 21:43 +0200, Peter Zijlstra wrote:
>> On Wed, Jun 04, 2014 at 12:08:29PM -0700, Jason Low wrote:
>>> Upon entering the slowpath in __mutex_lock_common(), we try once more
>>> to acquire the mutex. We only try to acquire it if MUTEX_SHOW_NO_WAITER
>>> (lock->count>= 0) is true in order to avoid using the atomic xchg()
>>> operation whenever it is not necessary. However, we really only need
>>> to try to acquire if the mutex is free (lock->count == 1).
>>>
>>> This patch changes it so that we only try-acquire the mutex upon
>>> entering the slowpath if it is unlocked, rather than if there are
>>> no waiters. This helps further reduce unncessary atomic xchg()
>>> operations. Furthermore, this patch introduces and uses a new
>>> MUTEX_IS_UNLOCKED() macro to improve readbability.
>>>
>>> Signed-off-by: Jason Low<[email protected]>
>>> ---
>>> kernel/locking/mutex.c | 10 ++++++----
>>> 1 files changed, 6 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
>>> index bc73d33..0925968 100644
>>> --- a/kernel/locking/mutex.c
>>> +++ b/kernel/locking/mutex.c
>>> @@ -48,9 +48,10 @@
>>>
>>> /*
>>> * A negative mutex count indicates that waiters are sleeping waiting for the
>>> - * mutex.
>>> + * mutex, and a count of one indicates the mutex is unlocked.
>>> */
>>> #define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count)>= 0)
>>> +#define MUTEX_IS_UNLOCKED(mutex) (atomic_read(&(mutex)->count) == 1)
>> So I recently saw that MUTEX_SHOW_NO_WAITER thing and cried a little;
>> and now you're adding more of that same nonsense.
>>
>> Please make them inline functions, also can we rename the SHOW_NO_WAITER
>> thing, because its not at all clear to me wtf it does; should it be
>> called: mutex_no_waiters() or somesuch?
> Okay, I can make them inline functions. I mainly added the macro to keep
> it consistent with the MUTEX_SHOW_NO_WAITER() check, but we can surely
> make this more clear. mutex_no_waiters() sounds fine, or perhaps
> something like mutex_has_no_waiters()?
>

You can remove the MUTEX_SHOW_NO_WAITER macro as all the call sites are
to be replaced. I didn't check directly for unlocked count because of
fairness concern in my original patch, but I think checking directly for
unlocked count should be fine too.

-Longman

2014-06-05 19:22:14

by Jason Low

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On Wed, 2014-06-04 at 23:24 -0400, Waiman Long wrote:
> On 06/04/2014 05:26 PM, Jason Low wrote:
> > On Wed, 2014-06-04 at 21:43 +0200, Peter Zijlstra wrote:
> >> Please make them inline functions, also can we rename the SHOW_NO_WAITER
> >> thing, because its not at all clear to me wtf it does; should it be
> >> called: mutex_no_waiters() or somesuch?
> > Okay, I can make them inline functions. I mainly added the macro to keep
> > it consistent with the MUTEX_SHOW_NO_WAITER() check, but we can surely
> > make this more clear. mutex_no_waiters() sounds fine, or perhaps
> > something like mutex_has_no_waiters()?
> >
>
> You can remove the MUTEX_SHOW_NO_WAITER macro as all the call sites are
> to be replaced.

Sure.

> I didn't check directly for unlocked count because of
> fairness concern in my original patch, but I think checking directly for
> unlocked count should be fine too.

Can you elaborate on the "fairness concern"? In the current code, we're
already directly checking for unlocked count in
atomic_read(&lock->count) == 1 if that's what you're referring to.

Thanks,
Jason

2014-06-09 17:38:09

by Jason Low

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On Wed, 2014-06-04 at 13:58 -0700, Davidlohr Bueso wrote:
> On Wed, 2014-06-04 at 13:57 -0700, Davidlohr Bueso wrote:

> > In addition, how about the following helpers instead:
> > - mutex_is_unlocked() : count > 0
> > - mutex_has_waiters() : count < 0, or list_empty(->wait_list)
> ^ err, that's !list_empty()

Between checking for (count < 0) or checking for !list_empty(wait_list)
for waiters:

Now that I think about it, I would expect a mutex_has_waiters() function
to return !list_empty(wait_list) as that really tells whether or not
there are waiters. For example, in highly contended cases, there can
still be waiters on the mutex if count is 1.

Likewise, in places where we currently use "MUTEX_SHOW_NO_WAITER", we
need to check for (count < 0) to ensure lock->count is a negative value
before the thread sleeps on the mutex.

One option would be to still remove MUTEX_SHOW_NO_WAITER(), directly use
atomic_read() in place of the macro, and just comment on why we have an
extra atomic_read() that may "appear redundant". Another option could be
to provide a function that checks for "potential waiters" on the mutex.

Any thoughts?

2014-06-11 21:00:19

by Waiman Long

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked


On 6/9/2014 1:38 PM, Jason Low wrote:
> On Wed, 2014-06-04 at 13:58 -0700, Davidlohr Bueso wrote:
>> On Wed, 2014-06-04 at 13:57 -0700, Davidlohr Bueso wrote:
>>> In addition, how about the following helpers instead:
>>> - mutex_is_unlocked() : count > 0
>>> - mutex_has_waiters() : count < 0, or list_empty(->wait_list)
>> ^ err, that's !list_empty()
> Between checking for (count < 0) or checking for !list_empty(wait_list)
> for waiters:
>
> Now that I think about it, I would expect a mutex_has_waiters() function
> to return !list_empty(wait_list) as that really tells whether or not
> there are waiters. For example, in highly contended cases, there can
> still be waiters on the mutex if count is 1.
>
> Likewise, in places where we currently use "MUTEX_SHOW_NO_WAITER", we
> need to check for (count < 0) to ensure lock->count is a negative value
> before the thread sleeps on the mutex.
>
> One option would be to still remove MUTEX_SHOW_NO_WAITER(), directly use
> atomic_read() in place of the macro, and just comment on why we have an
> extra atomic_read() that may "appear redundant". Another option could be
> to provide a function that checks for "potential waiters" on the mutex.
>
> Any thoughts?
>

For the first MUTEX_SHOW_NO_WAITER() call site, you can replace it with
a check for (count > 0). The second call site within the for loop,
however, is a bit more tricky. It has to serve 2 purposes:

1. Opportunistically get the lock
2. Set the count value to -1 to indicate someone is waiting on the lock,
that is why an xchg() operation has to be done even if its value is 0.

I do agree that the naming isn't that good. Maybe it can be changed to
something like

static inline int mutex_value_has_waiters(mutex *lock) { return
lock->count < 0; }

-Longman

2014-06-11 21:48:08

by Jason Low

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked

On Wed, 2014-06-11 at 17:00 -0400, Long, Wai Man wrote:
> On 6/9/2014 1:38 PM, Jason Low wrote:
> > On Wed, 2014-06-04 at 13:58 -0700, Davidlohr Bueso wrote:
> >> On Wed, 2014-06-04 at 13:57 -0700, Davidlohr Bueso wrote:
> >>> In addition, how about the following helpers instead:
> >>> - mutex_is_unlocked() : count > 0
> >>> - mutex_has_waiters() : count < 0, or list_empty(->wait_list)
> >> ^ err, that's !list_empty()
> > Between checking for (count < 0) or checking for !list_empty(wait_list)
> > for waiters:
> >
> > Now that I think about it, I would expect a mutex_has_waiters() function
> > to return !list_empty(wait_list) as that really tells whether or not
> > there are waiters. For example, in highly contended cases, there can
> > still be waiters on the mutex if count is 1.
> >
> > Likewise, in places where we currently use "MUTEX_SHOW_NO_WAITER", we
> > need to check for (count < 0) to ensure lock->count is a negative value
> > before the thread sleeps on the mutex.
> >
> > One option would be to still remove MUTEX_SHOW_NO_WAITER(), directly use
> > atomic_read() in place of the macro, and just comment on why we have an
> > extra atomic_read() that may "appear redundant". Another option could be
> > to provide a function that checks for "potential waiters" on the mutex.
> >
> > Any thoughts?
> >
>
> For the first MUTEX_SHOW_NO_WAITER() call site, you can replace it with
> a check for (count > 0).

Yup, in my v2 patch, the first call site becomes !mutex_is_locked(lock)
which is really a check for (count == 1).

> The second call site within the for loop,
> however, is a bit more tricky. It has to serve 2 purposes:
>
> 1. Opportunistically get the lock
> 2. Set the count value to -1 to indicate someone is waiting on the lock,
> that is why an xchg() operation has to be done even if its value is 0.
>
> I do agree that the naming isn't that good. Maybe it can be changed to
> something like
>
> static inline int mutex_value_has_waiters(mutex *lock) { return
> lock->count < 0; }

So I can imagine that a mutex_value_has_waiters() function might still
not be a great name, since the mutex can have waiters in the case that
the value lock->count >= 0.

In the second call site, do you think we should just do a direct
atomic_read(lock->count) >= 0 and comment that we only do the xchg if
the count is non-negative to avoid unnecessary xchg? That what I did in
my v2 patch.

Thanks,
Jason

2014-06-12 01:25:09

by Waiman Long

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] locking/mutex: Try to acquire mutex only if it is unlocked


On 6/11/2014 5:48 PM, Jason Low wrote:
> On Wed, 2014-06-11 at 17:00 -0400, Long, Wai Man wrote:
>> On 6/9/2014 1:38 PM, Jason Low wrote:
>>> On Wed, 2014-06-04 at 13:58 -0700, Davidlohr Bueso wrote:
>>>> On Wed, 2014-06-04 at 13:57 -0700, Davidlohr Bueso wrote:
>>>>> In addition, how about the following helpers instead:
>>>>> - mutex_is_unlocked() : count > 0
>>>>> - mutex_has_waiters() : count < 0, or list_empty(->wait_list)
>>>> ^ err, that's !list_empty()
>>> Between checking for (count < 0) or checking for !list_empty(wait_list)
>>> for waiters:
>>>
>>> Now that I think about it, I would expect a mutex_has_waiters() function
>>> to return !list_empty(wait_list) as that really tells whether or not
>>> there are waiters. For example, in highly contended cases, there can
>>> still be waiters on the mutex if count is 1.
>>>
>>> Likewise, in places where we currently use "MUTEX_SHOW_NO_WAITER", we
>>> need to check for (count < 0) to ensure lock->count is a negative value
>>> before the thread sleeps on the mutex.
>>>
>>> One option would be to still remove MUTEX_SHOW_NO_WAITER(), directly use
>>> atomic_read() in place of the macro, and just comment on why we have an
>>> extra atomic_read() that may "appear redundant". Another option could be
>>> to provide a function that checks for "potential waiters" on the mutex.
>>>
>>> Any thoughts?
>>>
>> For the first MUTEX_SHOW_NO_WAITER() call site, you can replace it with
>> a check for (count > 0).
> Yup, in my v2 patch, the first call site becomes !mutex_is_locked(lock)
> which is really a check for (count == 1).

Yes, your v2 patch looks fine to me.

>> The second call site within the for loop,
>> however, is a bit more tricky. It has to serve 2 purposes:
>>
>> 1. Opportunistically get the lock
>> 2. Set the count value to -1 to indicate someone is waiting on the lock,
>> that is why an xchg() operation has to be done even if its value is 0.
>>
>> I do agree that the naming isn't that good. Maybe it can be changed to
>> something like
>>
>> static inline int mutex_value_has_waiters(mutex *lock) { return
>> lock->count < 0; }
> So I can imagine that a mutex_value_has_waiters() function might still
> not be a great name, since the mutex can have waiters in the case that
> the value lock->count >= 0.
>
> In the second call site, do you think we should just do a direct
> atomic_read(lock->count) >= 0 and comment that we only do the xchg if
> the count is non-negative to avoid unnecessary xchg? That what I did in
> my v2 patch.

I think that is a good idea to avoid any controversy in naming.

-Longman