2015-02-27 20:09:22

by Pranith Kumar

Subject: [RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

ARM64 documentation recommends keeping exclusive loads and stores as close
together as possible. Any instruction which does not depend on the loaded
value should be moved outside the exclusive sequence.

In the current implementation of cmpxchg(), there is a mov instruction which
can be hoisted above the load-exclusive instruction without any change in
functionality. This patch makes that change.

Signed-off-by: Pranith Kumar <[email protected]>
---
arch/arm64/include/asm/cmpxchg.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h
index cb95930..8057735 100644
--- a/arch/arm64/include/asm/cmpxchg.h
+++ b/arch/arm64/include/asm/cmpxchg.h
@@ -89,8 +89,8 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
case 1:
do {
asm volatile("// __cmpxchg1\n"
- " ldxrb %w1, %2\n"
" mov %w0, #0\n"
+ " ldxrb %w1, %2\n"
" cmp %w1, %w3\n"
" b.ne 1f\n"
" stxrb %w0, %w4, %2\n"
@@ -104,8 +104,8 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
case 2:
do {
asm volatile("// __cmpxchg2\n"
- " ldxrh %w1, %2\n"
" mov %w0, #0\n"
+ " ldxrh %w1, %2\n"
" cmp %w1, %w3\n"
" b.ne 1f\n"
" stxrh %w0, %w4, %2\n"
@@ -119,8 +119,8 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
case 4:
do {
asm volatile("// __cmpxchg4\n"
- " ldxr %w1, %2\n"
" mov %w0, #0\n"
+ " ldxr %w1, %2\n"
" cmp %w1, %w3\n"
" b.ne 1f\n"
" stxr %w0, %w4, %2\n"
@@ -134,8 +134,8 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
case 8:
do {
asm volatile("// __cmpxchg8\n"
- " ldxr %1, %2\n"
" mov %w0, #0\n"
+ " ldxr %1, %2\n"
" cmp %1, %3\n"
" b.ne 1f\n"
" stxr %w0, %4, %2\n"
@@ -166,11 +166,11 @@ static inline int __cmpxchg_double(volatile void *ptr1, volatile void *ptr2,
VM_BUG_ON((unsigned long *)ptr2 - (unsigned long *)ptr1 != 1);
do {
asm volatile("// __cmpxchg_double8\n"
+ " mov %w0, #0\n"
" ldxp %0, %1, %2\n"
" eor %0, %0, %3\n"
" eor %1, %1, %4\n"
" orr %1, %0, %1\n"
- " mov %w0, #0\n"
" cbnz %1, 1f\n"
" stxp %w0, %5, %6, %2\n"
"1:\n"
--
1.9.1
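
For context, here is a standalone rendering of the 4-byte case with the hunk
above applied. The function name and the operand constraints below are
reconstructed for illustration and may not match the kernel file exactly. In
this single-word case the stxr status register %w0 ("res") and the loaded
value %w1 ("oldval") are distinct operands, so hoisting the mov clobbers
nothing and only shortens the ldxr/stxr window by one instruction:

static inline unsigned int cmpxchg4_sketch(volatile unsigned int *ptr,
                                           unsigned int old, unsigned int new)
{
        unsigned long res;
        unsigned int oldval;

        do {
                asm volatile("// __cmpxchg4, mov hoisted as in the patch\n"
                "       mov     %w0, #0\n"      /* clear stxr status/retry flag */
                "       ldxr    %w1, %2\n"      /* exclusive load of *ptr */
                "       cmp     %w1, %w3\n"
                "       b.ne    1f\n"           /* mismatch: skip store, res stays 0 */
                "       stxr    %w0, %w4, %2\n" /* res = 0 if the store succeeded */
                "1:\n"
                        : "=&r" (res), "=&r" (oldval), "+Q" (*ptr)
                        : "Ir" (old), "r" (new)
                        : "cc");
        } while (res);          /* retry only if the exclusive store failed */

        return oldval;
}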


2015-02-27 20:15:31

by Will Deacon

Subject: Re: [RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

On Fri, Feb 27, 2015 at 08:09:17PM +0000, Pranith Kumar wrote:
> ARM64 documentation recommends keeping exclusive loads and stores as close
> together as possible. Any instruction which does not depend on the loaded
> value should be moved outside the exclusive sequence.
>
> In the current implementation of cmpxchg(), there is a mov instruction which
> can be hoisted above the load-exclusive instruction without any change in
> functionality. This patch makes that change.
>
> Signed-off-by: Pranith Kumar <[email protected]>
> ---
> arch/arm64/include/asm/cmpxchg.h | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)

[...]

> @@ -166,11 +166,11 @@ static inline int __cmpxchg_double(volatile void *ptr1, volatile void *ptr2,
> VM_BUG_ON((unsigned long *)ptr2 - (unsigned long *)ptr1 != 1);
> do {
> asm volatile("// __cmpxchg_double8\n"
> + " mov %w0, #0\n"
> " ldxp %0, %1, %2\n"

Seriously, you might want to test this before you mindlessly make changes to
low-level synchronisation code. Not only is the change completely unnecessary
but it is actively harmful.

Have a good weekend,

Will

2015-02-27 20:17:58

by Pranith Kumar

Subject: Re: [RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

On Fri, Feb 27, 2015 at 3:15 PM, Will Deacon <[email protected]> wrote:
> On Fri, Feb 27, 2015 at 08:09:17PM +0000, Pranith Kumar wrote:
>> ARM64 documentation recommends keeping exclusive loads and stores as close
>> together as possible. Any instruction which does not depend on the loaded
>> value should be moved outside the exclusive sequence.
>>
>> In the current implementation of cmpxchg(), there is a mov instruction which
>> can be hoisted above the load-exclusive instruction without any change in
>> functionality. This patch makes that change.
>>
>> Signed-off-by: Pranith Kumar <[email protected]>
>> ---
>> arch/arm64/include/asm/cmpxchg.h | 10 +++++-----
>> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> [...]
>
>> @@ -166,11 +166,11 @@ static inline int __cmpxchg_double(volatile void *ptr1, volatile void *ptr2,
>> VM_BUG_ON((unsigned long *)ptr2 - (unsigned long *)ptr1 != 1);
>> do {
>> asm volatile("// __cmpxchg_double8\n"
>> + " mov %w0, #0\n"
>> " ldxp %0, %1, %2\n"
>
> Seriously, you might want to test this before you mindlessly make changes to
> low-level synchronisation code. Not only is the change completely unnecessary
> but it is actively harmful.
>

Oops, I apologize for this. I should have looked more closely. It is
wrong to do this in cmpxchg_double(). What about the other cases?


--
Pranith

2015-02-27 20:30:11

by Pranith Kumar

Subject: Re: [RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

On Fri, Feb 27, 2015 at 3:17 PM, Pranith Kumar <[email protected]> wrote:
> On Fri, Feb 27, 2015 at 3:15 PM, Will Deacon <[email protected]> wrote:
>>> @@ -166,11 +166,11 @@ static inline int __cmpxchg_double(volatile void *ptr1, volatile void *ptr2,
>>> VM_BUG_ON((unsigned long *)ptr2 - (unsigned long *)ptr1 != 1);
>>> do {
>>> asm volatile("// __cmpxchg_double8\n"
>>> + " mov %w0, #0\n"
>>> " ldxp %0, %1, %2\n"
>>
>> Seriously, you might want to test this before you mindlessly make changes to
>> low-level synchronisation code. Not only is the change completely unnecessary
>> but it is actively harmful.
>>
>
> Oops, I apologize for this. I should have looked more closely. It is
> wrong to do this in cmpxchg_double(). What about the other cases?
>

Hi Will,

I tried looking more closely at what might be the problem here. I am
waiting on a HiKey arm64 board, and I agree that I should not send in
changes without running and testing them first.

Could you please explain (for educational purposes) why you think this
change is harmful?


Thanks,
--
Pranith
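
For readers following the thread, here is a standalone sketch of the
double-word case in its original order, reconstructed from the hunk quoted
above; the function name, operand names and the return convention are
illustrative rather than an exact copy of the kernel file. Unlike the
single-word cases, %0 does double duty here: ldxp writes the first loaded
word into it, and stxp later reuses %w0 as its status flag. The mov therefore
has to sit after the ldxp/eor and before the cbnz, so that a mismatch (which
branches past the stxp) still leaves zero in %0 and lets the retry loop
terminate. With the mov hoisted above the ldxp, the load clobbers the zero, a
mismatch leaves "loaded1 ^ old1" in %0, and the loop can spin indefinitely:

static inline int cmpxchg_double8_sketch(volatile unsigned long *ptr,
                                         unsigned long old1, unsigned long old2,
                                         unsigned long new1, unsigned long new2)
{
        unsigned long loop, lost;

        do {
                asm volatile("// __cmpxchg_double8, original ordering\n"
                "       ldxp    %0, %1, %2\n"   /* %0 = word0, %1 = word1 */
                "       eor     %0, %0, %3\n"   /* %0 = word0 ^ old1 */
                "       eor     %1, %1, %4\n"   /* %1 = word1 ^ old2 */
                "       orr     %1, %0, %1\n"   /* %1 == 0 iff both words match */
                "       mov     %w0, #0\n"      /* zero the retry flag for the mismatch path */
                "       cbnz    %1, 1f\n"       /* mismatch: skip the store */
                "       stxp    %w0, %5, %6, %2\n" /* match: %w0 = stxp status */
                "1:\n"
                        : "=&r" (loop), "=&r" (lost), "+Q" (*ptr)
                        : "r" (old1), "r" (old2), "r" (new1), "r" (new2));
        } while (loop);         /* retry only if the exclusive store failed */

        return !lost;           /* 1 on successful exchange, 0 on mismatch */
}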