2023-08-09 05:03:39

by Leonardo Bras

Subject: [RFC PATCH v4 0/5] Rework & improve riscv cmpxchg.h and atomic.h

While studying riscv's cmpxchg.h file, I got really interested in
understanding how RISCV asm implemented the different versions of
{cmp,}xchg.

Once I understood the pattern, it made sense to remove the duplication
and create macros that make it easier to see exactly what changes
between the versions: instruction suffixes & barriers.
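
As a rough illustration of the idea only (this is not code from the
patches), the sketch below stamps out several xchg variants from one
macro body, so that the only thing differing between them is the
ordering argument. It uses portable C11 atomics rather than RISC-V
inline asm; the names DEFINE_XCHG32 and xchg32_* are hypothetical, and
the C11 memory orders merely stand in for the instruction suffixes and
barriers mentioned above.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical generator macro: a single function body, instantiated once
 * per ordering. Only the 'order' argument changes between the variants.
 */
#define DEFINE_XCHG32(name, order)                                  \
	static inline uint32_t name(_Atomic uint32_t *p, uint32_t val) \
	{                                                               \
		/* Store val, return the previous contents of *p. */        \
		return atomic_exchange_explicit(p, val, order);             \
	}

DEFINE_XCHG32(xchg32_relaxed, memory_order_relaxed)
DEFINE_XCHG32(xchg32_acquire, memory_order_acquire)
DEFINE_XCHG32(xchg32_release, memory_order_release)
DEFINE_XCHG32(xchg32_full,    memory_order_seq_cst)
```

Each generated function returns the value that was stored before the
exchange, so a caller can tell what it replaced.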

I also did the same kind of work on atomic.h.

After that, I noticed that both cmpxchg and xchg only accept variables
of size 4 and 8, whereas x86 and arm64 handle sizes 1, 2, 4 and 8.

Now that the deduplication is done, it is quite straightforward to
implement them for variable sizes 1 and 2, so I did it. Guo Ren has
already pointed me to some possible users :)
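
To make the size 1 case concrete, here is a minimal sketch of a masked
byte cmpxchg built on a 32-bit atomic operation over the enclosing
aligned word. It is an assumption-laden illustration, not the macro
from patch 4: it uses a C11 compare-and-swap loop instead of lr.w/sc.w,
the helper name cmpxchg_u8_masked is hypothetical, and it assumes a
little-endian layout and a compiler that tolerates the pointer cast
(as GCC and Clang do in practice).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: compare-and-exchange one byte using only a 32-bit
 * atomic on the aligned word that contains it. Returns the byte value
 * observed at *p; the swap happened iff the return value equals 'old'.
 * Assumes little-endian byte order.
 */
static uint8_t cmpxchg_u8_masked(uint8_t *p, uint8_t old, uint8_t new)
{
	uintptr_t addr = (uintptr_t)p;
	/* Round down to the 32-bit aligned word containing *p. */
	_Atomic uint32_t *word = (_Atomic uint32_t *)(addr & ~(uintptr_t)3);
	/* Bit offset of the target byte inside that word. */
	unsigned int shift = (unsigned int)(addr & 3) * 8;
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t cur;

	/* The wide access must always see a 32-bit aligned address. */
	assert(((uintptr_t)word & 3) == 0);

	cur = atomic_load(word);
	for (;;) {
		uint8_t seen = (uint8_t)((cur & mask) >> shift);
		uint32_t next;

		if (seen != old)
			return seen;	/* mismatch: leave memory untouched */

		/* Replace only the target byte, keep the neighbours. */
		next = (cur & ~mask) | ((uint32_t)new << shift);
		if (atomic_compare_exchange_weak(word, &cur, next))
			return old;
		/* A failed CAS refreshed 'cur'; loop and re-check. */
	}
}
```

The loop retries when a neighbouring byte in the same word changes
concurrently, which is why the comparison is redone on every pass
instead of being done once up front.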

I compared the generated asm for a test.c that uses every changed
function, and could not detect any difference for patches 1 + 2 + 3
compared with upstream.

Patches 4 & 5 were compile-tested, merged with guoren/qspinlock_v11 and
booted just fine with qemu -machine virt -append "qspinlock".

Thanks!
Leo

Changes since squashed cmpxchg RFCv3:
- Fixed bug on cmpxchg macro for var size 1 & 2: now working
- Macros for var size 1 & 2's lr.w and sc.w now are guaranteed to receive
input of a 32-bit aligned address
- Renamed internal macros from _mask to _masked for patches 4 & 5
- __rc variable on macros for var size 1 & 2 changed from register to ulong
https://lore.kernel.org/all/[email protected]/

Changes since squashed cmpxchg RFCv2:
- Removed rc parameter from the new macro: it can be internal to the macro
- 2 new patches: cmpxchg size 1 and 2, xchg size 1 and 2
https://lore.kernel.org/all/[email protected]/

Changes since squashed cmpxchg RFCv1:
- Unified with atomic.h patchset
- Rebased on top of torvalds/master (thanks Andrea Parri!)
- Removed helper macros that were not being used elsewhere in the kernel.
https://lore.kernel.org/all/[email protected]/
https://lore.kernel.org/all/[email protected]/

Changes since (cmpxchg) RFCv3:
- Squashed the 6 original patches into 2: one for cmpxchg and one for xchg
https://lore.kernel.org/all/[email protected]/

Changes since (cmpxchg) RFCv2:
- Fixed macros that depend on having a local variable with a magic name
- Previous cast to (long) is now only applied on 4-byte cmpxchg
https://lore.kernel.org/all/[email protected]/

Changes since (cmpxchg) RFCv1:
- Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
https://lore.kernel.org/all/[email protected]/


Leonardo Bras (5):
riscv/cmpxchg: Deduplicate xchg() asm functions
riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
riscv/atomic.h : Deduplicate arch_atomic.*
riscv/cmpxchg: Implement cmpxchg for variables of size 1 and 2
riscv/cmpxchg: Implement xchg for variables of size 1 and 2

arch/riscv/include/asm/atomic.h | 164 ++++++-------
arch/riscv/include/asm/cmpxchg.h | 394 ++++++++++---------------------
2 files changed, 195 insertions(+), 363 deletions(-)

--
2.41.0



2023-08-09 16:11:29

by Leonardo Bras

Subject: Re: [RFC PATCH v4 0/5] Rework & improve riscv cmpxchg.h and atomic.h

On Tue, Aug 8, 2023 at 11:13 PM Leonardo Bras <[email protected]> wrote:
>
> While studying riscv's cmpxchg.h file, I got really interested in
> understanding how RISCV asm implemented the different versions of
> {cmp,}xchg.
>
> Once I understood the pattern, it made sense to remove the duplication
> and create macros that make it easier to see exactly what changes
> between the versions: instruction suffixes & barriers.
>
> I also did the same kind of work on atomic.h.
>
> After that, I noted both cmpxchg and xchg only accept variables of
> size 4 and 8, compared to x86 and arm64 which do 1,2,4,8.
>
> Now that deduplication is done, it is quite direct to implement them
> for variable sizes 1 and 2, so I did it. Then Guo Ren already presented
> me some possible users :)
>
> I did compare the generated asm on a test.c that contained usage for every
> changed function, and could not detect any change on patches 1 + 2 + 3
> compared with upstream.
>
> Patches 4 & 5 were compile-tested, merged with guoren/qspinlock_v11 and
> booted just fine with qemu -machine virt -append "qspinlock".

This is the tree that I used:
https://gitlab.com/LeoBras/linux/-/commits/guo_qspinlock_v11

>
> Thanks!
> Leo
>
> Changes since squashed cmpxchg RFCv3:
> - Fixed bug on cmpxchg macro for var size 1 & 2: now working
> - Macros for var size 1 & 2's lr.w and sc.w now are guaranteed to receive
> input of a 32-bit aligned address
> - Renamed internal macros from _mask to _masked for patches 4 & 5
> - __rc variable on macros for var size 1 & 2 changed from register to ulong
> https://lore.kernel.org/all/[email protected]/
>
> Changes since squashed cmpxchg RFCv2:
> - Removed rc parameter from the new macro: it can be internal to the macro
> - 2 new patches: cmpxchg size 1 and 2, xchg size 1 and 2
> https://lore.kernel.org/all/[email protected]/
>
> Changes since squashed cmpxchg RFCv1:
> - Unified with atomic.c patchset
> - Rebased on top of torvalds/master (thanks Andrea Parri!)
> - Removed helper macros that were not being used elsewhere in the kernel.
> https://lore.kernel.org/all/[email protected]/
> https://lore.kernel.org/all/[email protected]/
>
> Changes since (cmpxchg) RFCv3:
> - Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
> https://lore.kernel.org/all/[email protected]/
>
> Changes since (cmpxchg) RFCv2:
> - Fixed macros that depend on having a local variable with a magic name
> - Previous cast to (long) is now only applied on 4-bytes cmpxchg
> https://lore.kernel.org/all/[email protected]/
>
> Changes since (cmpxchg) RFCv1:
> - Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
> https://lore.kernel.org/all/[email protected]/
>
>
> Leonardo Bras (5):
> riscv/cmpxchg: Deduplicate xchg() asm functions
> riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
> riscv/atomic.h : Deduplicate arch_atomic.*
> riscv/cmpxchg: Implement cmpxchg for variables of size 1 and 2
> riscv/cmpxchg: Implement xchg for variables of size 1 and 2
>
> arch/riscv/include/asm/atomic.h | 164 ++++++-------
> arch/riscv/include/asm/cmpxchg.h | 394 ++++++++++---------------------
> 2 files changed, 195 insertions(+), 363 deletions(-)
>
> --
> 2.41.0
>