2022-07-12 15:02:05

by Uros Bizjak

Subject: [PATCH] llist: Use try_cmpxchg in llist_add_batch and llist_del_first

Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
llist_add_batch and llist_del_first. x86 CMPXCHG instruction returns
success in ZF flag, so this change saves a compare after cmpxchg.

Also, try_cmpxchg implicitly assigns old *ptr value to "old" when
cmpxchg fails, enabling further code simplifications.

No functional change intended.

Signed-off-by: Uros Bizjak <[email protected]>
---
lib/llist.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/lib/llist.c b/lib/llist.c
index 611ce4881a87..7d78b736e8af 100644
--- a/lib/llist.c
+++ b/lib/llist.c
@@ -30,7 +30,7 @@ bool llist_add_batch(struct llist_node *new_first, struct llist_node *new_last,

 	do {
 		new_last->next = first = READ_ONCE(head->first);
-	} while (cmpxchg(&head->first, first, new_first) != first);
+	} while (!try_cmpxchg(&head->first, &first, new_first));
 
 	return !first;
 }
@@ -52,18 +52,14 @@ EXPORT_SYMBOL_GPL(llist_add_batch);
  */
 struct llist_node *llist_del_first(struct llist_head *head)
 {
-	struct llist_node *entry, *old_entry, *next;
+	struct llist_node *entry, *next;
 
 	entry = smp_load_acquire(&head->first);
-	for (;;) {
+	do {
 		if (entry == NULL)
 			return NULL;
-		old_entry = entry;
 		next = READ_ONCE(entry->next);
-		entry = cmpxchg(&head->first, old_entry, next);
-		if (entry == old_entry)
-			break;
-	}
+	} while (!try_cmpxchg(&head->first, &entry, next));
 
 	return entry;
 }
--
2.35.3
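
For readers unfamiliar with the idiom, the loop shape the patch introduces can be sketched in userspace with C11 atomics, where atomic_compare_exchange_strong() behaves much like try_cmpxchg: it returns a boolean and, on failure, stores the value it found into the "expected" argument. This is only an illustration under that assumption, not the kernel code. The names struct node, struct head, add_batch and del_first are made up for the sketch, and the default sequentially consistent ordering is used instead of the kernel's barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct llist_node / llist_head. */
struct node {
	struct node *next;
};

struct head {
	_Atomic(struct node *) first;
};

/* Push the chain new_first..new_last; returns true if the list was empty. */
static bool add_batch(struct node *new_first, struct node *new_last,
		      struct head *head)
{
	struct node *first;

	assert(new_first && new_last);

	do {
		new_last->next = first = atomic_load(&head->first);
		/* On failure the CAS also stores the current head in 'first'. */
	} while (!atomic_compare_exchange_strong(&head->first, &first,
						 new_first));

	return !first;
}

/* Pop the first node, or return NULL if the list is empty. */
static struct node *del_first(struct head *head)
{
	struct node *entry = atomic_load(&head->first);
	struct node *next;

	do {
		if (entry == NULL)
			return NULL;
		next = entry->next;
		/* On failure 'entry' is refreshed, so no old_entry copy is needed. */
	} while (!atomic_compare_exchange_strong(&head->first, &entry, next));

	return entry;
}
```

Because the failed compare-and-swap already refreshes the expected value, del_first needs neither a separate old_entry variable nor an explicit comparison of the returned value, which is the simplification the patch description refers to.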


2022-08-15 01:58:30

by Andrew Morton

Subject: Re: [PATCH] llist: Use try_cmpxchg in llist_add_batch and llist_del_first

On Tue, 12 Jul 2022 16:49:17 +0200 Uros Bizjak <[email protected]> wrote:

> Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
> llist_add_batch and llist_del_first. x86 CMPXCHG instruction returns
> success in ZF flag, so this change saves a compare after cmpxchg.
>
> Also, try_cmpxchg implicitly assigns old *ptr value to "old" when
> cmpxchg fails, enabling further code simplifications.
>
> No functional change intended.

Well this is strange. Your innocuous little patch:

> --- a/lib/llist.c
> +++ b/lib/llist.c
> @@ -30,7 +30,7 @@ bool llist_add_batch(struct llist_node *new_first, struct llist_node *new_last,
>
>  	do {
>  		new_last->next = first = READ_ONCE(head->first);
> -	} while (cmpxchg(&head->first, first, new_first) != first);
> +	} while (!try_cmpxchg(&head->first, &first, new_first));
>
>  	return !first;
>  }
> @@ -52,18 +52,14 @@ EXPORT_SYMBOL_GPL(llist_add_batch);
>   */
>  struct llist_node *llist_del_first(struct llist_head *head)
>  {
> -	struct llist_node *entry, *old_entry, *next;
> +	struct llist_node *entry, *next;
>
>  	entry = smp_load_acquire(&head->first);
> -	for (;;) {
> +	do {
>  		if (entry == NULL)
>  			return NULL;
> -		old_entry = entry;
>  		next = READ_ONCE(entry->next);
> -		entry = cmpxchg(&head->first, old_entry, next);
> -		if (entry == old_entry)
> -			break;
> -	}
> +	} while (!try_cmpxchg(&head->first, &entry, next));
>
>  	return entry;
>  }

Does this:

x1:/usr/src/25> size lib/llist.o-before lib/llist.o-after
   text    data     bss     dec     hex filename
    541      24       0     565     235 lib/llist.o-before
    940      24       0     964     3c4 lib/llist.o-after

with x86_64 allmodconfig, gcc-11.1.0.

No change with allnoconfig, some bloat with defconfig.

I was too lazy to figure out why this happened, but it'd be great if
someone could investigate. Something has gone wrong somewhere.

x1:/usr/src/25> scripts/bloat-o-meter lib/llist.o-before lib/llist.o-after
add/remove: 0/0 grow/shrink: 2/0 up/down: 351/0 (351)
Function                                     old     new   delta
llist_add_batch                              106     286    +180
llist_del_first                              106     277    +171
Total: Before=310, After=661, chg +113.23%

in the two functions you touched.

2022-08-15 23:23:13

by Uros Bizjak

Subject: Re: [PATCH] llist: Use try_cmpxchg in llist_add_batch and llist_del_first

On Mon, Aug 15, 2022 at 3:48 AM Andrew Morton <[email protected]> wrote:
>
> On Tue, 12 Jul 2022 16:49:17 +0200 Uros Bizjak <[email protected]> wrote:
>
> > Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
> > llist_add_batch and llist_del_first. x86 CMPXCHG instruction returns
> > success in ZF flag, so this change saves a compare after cmpxchg.
> >
> > Also, try_cmpxchg implicitly assigns old *ptr value to "old" when
> > cmpxchg fails, enabling further code simplifications.
> >
> > No functional change intended.
>
> Well this is strange. Your innocuous little patch:

[...]

> Does this:
>
> x1:/usr/src/25> size lib/llist.o-before lib/llist.o-after
>    text    data     bss     dec     hex filename
>     541      24       0     565     235 lib/llist.o-before
>     940      24       0     964     3c4 lib/llist.o-after
>
> with x86_64 allmodconfig, gcc-11.1.0.
>
> No change with allnoconfig, some bloat with defconfig.
>
> I was too lazy to figure out why this happened, but it'd be great if
> someone could investigate. Something has gone wrong somewhere.

Sanitizer is detecting a comparison with a constant and emits:

 132:   f0 48 0f b1 2b          lock cmpxchg %rbp,(%rbx)
 137:   41 0f 94 c6             sete   %r14b
 13b:   31 ff                   xor    %edi,%edi
 13d:   44 89 f6                mov    %r14d,%esi
 140:   e8 00 00 00 00          call   145 <llist_add_batch+0xc5>
                        141: R_X86_64_PLT32     __sanitizer_cov_trace_const_cmp1-0x4

Since a new call is inserted, the compiler has to save all
call-clobbered variables around the call. This triggers another call
to __kasan_check_write. Finally, stack checking is emitted for the
patched functions.

Without sanitizer (make defconfig), the code is as expected, with a
couple of bytes saved due to unneeded mov/cmp.

Uros.

2022-08-15 23:24:20

by Uros Bizjak

Subject: Re: [PATCH] llist: Use try_cmpxchg in llist_add_batch and llist_del_first

On Mon, Aug 15, 2022 at 9:20 PM Uros Bizjak <[email protected]> wrote:
>
> On Mon, Aug 15, 2022 at 3:48 AM Andrew Morton <[email protected]> wrote:
> >
> > On Tue, 12 Jul 2022 16:49:17 +0200 Uros Bizjak <[email protected]> wrote:
> >
> > > Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
> > > llist_add_batch and llist_del_first. x86 CMPXCHG instruction returns
> > > success in ZF flag, so this change saves a compare after cmpxchg.
> > >
> > > Also, try_cmpxchg implicitly assigns old *ptr value to "old" when
> > > cmpxchg fails, enabling further code simplifications.
> > >
> > > No functional change intended.
> >
> > Well this is strange. Your innocuous little patch:
>
> [...]
>
> > Does this:
> >
> > x1:/usr/src/25> size lib/llist.o-before lib/llist.o-after
> >    text    data     bss     dec     hex filename
> >     541      24       0     565     235 lib/llist.o-before
> >     940      24       0     964     3c4 lib/llist.o-after
> >
> > with x86_64 allmodconfig, gcc-11.1.0.
> >
> > No change with allnoconfig, some bloat with defconfig.
> >
> > I was too lazy to figure out why this happened, but it'd be great if
> > someone could investigate. Something has gone wrong somewhere.
>
> Sanitizer is detecting a comparison with a constant and emits:
>
>  132:   f0 48 0f b1 2b          lock cmpxchg %rbp,(%rbx)
>  137:   41 0f 94 c6             sete   %r14b
>  13b:   31 ff                   xor    %edi,%edi
>  13d:   44 89 f6                mov    %r14d,%esi
>  140:   e8 00 00 00 00          call   145 <llist_add_batch+0xc5>
>                         141: R_X86_64_PLT32     __sanitizer_cov_trace_const_cmp1-0x4
>
> Since a new call is inserted, the compiler has to save all
> call-clobbered variables around the call, this triggers another call
> to __kasan_check_write. Finally, stack checking is emitted for patched

Actually, this second __kasan_check_write is for the write in case of
cmpxchg failure.

Uros.
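
The point about the second __kasan_check_write can be illustrated with a small single-threaded model. It assumes that an instrumented cmpxchg only needs a write check on *ptr, while an instrumented try_cmpxchg needs one on *ptr and one on the "old" location, because the failure path stores to it. Everything here (check_write, write_checks, model_cmpxchg, model_try_cmpxchg) is a hypothetical stand-in written for this sketch. It is not atomic and is not the kernel's instrumentation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for __kasan_check_write(): only counts calls. */
static int write_checks;

static void check_write(const volatile void *p, size_t size)
{
	(void)p;
	(void)size;
	write_checks++;
}

/* Model of an instrumented cmpxchg(): only *ptr may be written. */
static long model_cmpxchg(long *ptr, long old, long new)
{
	long cur;

	assert(ptr);
	check_write(ptr, sizeof(*ptr));
	cur = *ptr;
	if (cur == old)
		*ptr = new;
	return cur;
}

/* Model of an instrumented try_cmpxchg(): *ptr and *oldp may be written. */
static bool model_try_cmpxchg(long *ptr, long *oldp, long new)
{
	long cur;

	assert(ptr && oldp);
	check_write(ptr, sizeof(*ptr));
	check_write(oldp, sizeof(*oldp));	/* the additional check */
	cur = *ptr;
	if (cur != *oldp) {
		*oldp = cur;	/* failure path: hand back the current value */
		return false;
	}
	*ptr = new;
	return true;
}
```

In this model each model_try_cmpxchg call performs two write checks against one for model_cmpxchg, which is consistent with the extra check being tied to the store on the failure path rather than to register saving around the inserted call.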