Subject: [PATCH REPOST] locking/local_lock: Make the empty local_lock_*() function a macro.

It has been said that local_lock() does not add any overhead compared to
preempt_disable() in a !LOCKDEP configuration. A micro benchmark showed
an unexpected result which can be reduced to the fact that local_lock()
was not entirely optimized away.
In the !LOCKDEP configuration, local_lock_acquire() is an empty static
inline function. On x86, the this_cpu_ptr() argument of that function is
fully evaluated, leading to additional mov+add instructions which are
neither needed nor used.

Replace the static inline function with a macro. The typecheck() macro
ensures that the argument is of proper type while the resulting
disassembly shows no traces of this_cpu_ptr().

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Reviewed-by: Waiman Long <[email protected]>
---
Repost of
https://lkml.kernel.org/r/[email protected]

include/linux/local_lock_internal.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -44,9 +44,9 @@ static inline void local_lock_debug_init
}
#else /* CONFIG_DEBUG_LOCK_ALLOC */
# define LOCAL_LOCK_DEBUG_INIT(lockname)
-static inline void local_lock_acquire(local_lock_t *l) { }
-static inline void local_lock_release(local_lock_t *l) { }
-static inline void local_lock_debug_init(local_lock_t *l) { }
+# define local_lock_acquire(__ll) do { typecheck(local_lock_t *, __ll); } while (0)
+# define local_lock_release(__ll) do { typecheck(local_lock_t *, __ll); } while (0)
+# define local_lock_debug_init(__ll) do { typecheck(local_lock_t *, __ll); } while (0)
#endif /* !CONFIG_DEBUG_LOCK_ALLOC */

#define INIT_LOCAL_LOCK(lockname) { LOCAL_LOCK_DEBUG_INIT(lockname) }


2022-02-09 23:20:49

by Davidlohr Bueso

Subject: Re: [PATCH REPOST] locking/local_lock: Make the empty local_lock_*() function a macro.

On Tue, 08 Feb 2022, Sebastian Andrzej Siewior wrote:

>It has been said that local_lock() does not add any overhead compared to
>preempt_disable() in a !LOCKDEP configuration. A micro benchmark showed
>an unexpected result which can be reduced to the fact that local_lock()
>was not entirely optimized away.
>In the !LOCKDEP configuration, local_lock_acquire() is an empty static
>inline function. On x86, the this_cpu_ptr() argument of that function is
>fully evaluated, leading to additional mov+add instructions which are
>neither needed nor used.
>
>Replace the static inline function with a macro. The typecheck() macro
>ensures that the argument is of proper type while the resulting
>disassembly shows no traces of this_cpu_ptr().
>
>Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
>Reviewed-by: Waiman Long <[email protected]>

Reviewed-by: Davidlohr Bueso <[email protected]>

Subject: [tip: locking/core] locking/local_lock: Make the empty local_lock_*() function a macro.

The following commit has been merged into the locking/core branch of tip:

Commit-ID: 9983a9d577db415c41099a20a5637ab25dd3c240
Gitweb: https://git.kernel.org/tip/9983a9d577db415c41099a20a5637ab25dd3c240
Author: Sebastian Andrzej Siewior <[email protected]>
AuthorDate: Tue, 08 Feb 2022 18:08:02 +01:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Fri, 11 Feb 2022 12:13:56 +01:00

locking/local_lock: Make the empty local_lock_*() function a macro.

It has been said that local_lock() does not add any overhead compared to
preempt_disable() in a !LOCKDEP configuration. A micro benchmark showed
an unexpected result which can be reduced to the fact that local_lock()
was not entirely optimized away.
In the !LOCKDEP configuration, local_lock_acquire() is an empty static
inline function. On x86, the this_cpu_ptr() argument of that function is
fully evaluated, leading to additional mov+add instructions which are
neither needed nor used.

Replace the static inline function with a macro. The typecheck() macro
ensures that the argument is of proper type while the resulting
disassembly shows no traces of this_cpu_ptr().

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Waiman Long <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
include/linux/local_lock_internal.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h
index 975e33b..6d635e8 100644
--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -44,9 +44,9 @@ static inline void local_lock_debug_init(local_lock_t *l)
}
#else /* CONFIG_DEBUG_LOCK_ALLOC */
# define LOCAL_LOCK_DEBUG_INIT(lockname)
-static inline void local_lock_acquire(local_lock_t *l) { }
-static inline void local_lock_release(local_lock_t *l) { }
-static inline void local_lock_debug_init(local_lock_t *l) { }
+# define local_lock_acquire(__ll) do { typecheck(local_lock_t *, __ll); } while (0)
+# define local_lock_release(__ll) do { typecheck(local_lock_t *, __ll); } while (0)
+# define local_lock_debug_init(__ll) do { typecheck(local_lock_t *, __ll); } while (0)
#endif /* !CONFIG_DEBUG_LOCK_ALLOC */

#define INIT_LOCAL_LOCK(lockname) { LOCAL_LOCK_DEBUG_INIT(lockname) }