Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751941AbdCXRYD (ORCPT ); Fri, 24 Mar 2017 13:24:03 -0400 Received: from bombadil.infradead.org ([65.50.211.133]:41745 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751115AbdCXRXz (ORCPT ); Fri, 24 Mar 2017 13:23:55 -0400 Date: Fri, 24 Mar 2017 18:23:42 +0100 From: Peter Zijlstra To: Andy Lutomirski Cc: Dmitry Vyukov , Andrew Morton , Andy Lutomirski , Borislav Petkov , Brian Gerst , Denys Vlasenko , "H. Peter Anvin" , Josh Poimboeuf , Linus Torvalds , Paul McKenney , Thomas Gleixner , Ingo Molnar , LKML Subject: Re: locking/atomic: Introduce atomic_try_cmpxchg() Message-ID: <20170324172342.radlrhk2z6mwmdgk@hirez.programming.kicks-ass.net> References: <20170324142140.vpyzl755oj6rb5qv@hirez.programming.kicks-ass.net> <20170324164108.ibcxxqbhvx6ao54r@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: NeoMutt/20170113 (1.7.2) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5277 Lines: 138 On Fri, Mar 24, 2017 at 09:54:46AM -0700, Andy Lutomirski wrote: > > So the first snipped I tested regressed like so: > > > > > > 0000000000000000 : 0000000000000000 : > > 0: 8b 07 mov (%rdi),%eax 0: 8b 17 mov (%rdi),%edx > > 2: 83 f8 ff cmp $0xffffffff,%eax 2: 83 fa ff cmp $0xffffffff,%edx > > 5: 74 13 je 1a 5: 74 1a je 21 > > 7: 85 c0 test %eax,%eax 7: 85 d2 test %edx,%edx > > 9: 74 0d je 18 9: 74 13 je 1e > > b: 8d 50 01 lea 0x1(%rax),%edx b: 8d 4a 01 lea 0x1(%rdx),%ecx > > e: f0 0f b1 17 lock cmpxchg %edx,(%rdi) e: 89 d0 mov %edx,%eax > > 12: 75 ee jne 2 10: f0 0f b1 0f lock cmpxchg %ecx,(%rdi) > > 14: ff c2 inc %edx 14: 74 04 je 1a > > 16: 75 02 jne 1a 16: 89 c2 mov %eax,%edx > > 18: 0f 0b ud2 18: eb e8 jmp 2 > > 1a: c3 retq 1a: ff c1 inc %ecx > > 1c: 75 03 jne 21 > > 1e: 0f 0b ud2 > > 20: c3 retq > > 21: c3 retq > > Can you re-send the better asm you got earlier? On the left? > If I pretend to be a dumb compiler, I wonder if you'd get better results with: > > if (!success) { > *_old = __old; > return false; > } else { > return true; > } > > or however you jam that into a statement expression. That way you > aren't relying on the compiler to merge the branches. I tried a few variants, but nothing really made it better. Find the tiny.c file below; I'm using: gcc (Debian 6.3.0-5) 6.3.0 20170124 it has both an inline and an stmt-expr try_cmpxchg variant to play with; the 'expected' output is at the bottom (same as above left). Note that clang doesn't compile this stuff due to missing features. --- /* gcc -Os -std=gnu99 -fno-strict-overflow -falign-jumps=1 -falign-loops=1 -c tiny.c; objdump -dr tiny.o */ typedef _Bool bool; static inline bool try_cmpxchg(unsigned int *ptr, unsigned int *val, unsigned int new) { unsigned int old = *val; bool success; asm volatile("lock cmpxchgl %[new], %[ptr]" : "=@ccz" (success), [ptr] "+m" (*ptr), [old] "+a" (old) : [new] "r" (new) : "memory"); // if (!success) *val = old; return success; } #define __try_cmpxchg(_ptr, _pold, _new) \ ({ \ bool success; \ __typeof__(_ptr) _old = (_pold); \ __typeof__(*(_ptr)) __old = *_old; \ __typeof__(*(_ptr)) __new = (_new); \ volatile unsigned int * __ptr = (volatile unsigned int *)(_ptr);\ \ asm volatile("lock cmpxchgl %[new], %[ptr]" \ : "=@ccz" (success), \ [ptr] "+m" (*__ptr), \ [old] "+a" (__old) \ : [new] "r" (__new) \ : "memory"); \ /* if (!success) */ \ *_old = __old; \ success; \ }) #define EXCEPTION_VALUE(val, handler) asm volatile ("ud2 # %0" : : "r" (val)) #define UINT_MAX (~0U) #define likely(x) __builtin_expect(!!(x), 1) #define unlikely(x) __builtin_expect(!!(x), 0) static inline void refcount_inc(unsigned int *r) { unsigned int new, val = *(unsigned int volatile *)r; do { if (unlikely(val == UINT_MAX)) /* saturated */ return; if (unlikely(!val)) /* use-after-free */ goto exception; /* cannot overflow because we already checked UINT_MAX */ new = val + 1; } while (unlikely(!try_cmpxchg(r, &val, new))); if (unlikely(new == UINT_MAX)) exception: EXCEPTION_VALUE(val, __refcount_warn); } void T_refcount_inc(unsigned int *r) { refcount_inc(r); } /* 0: 8b 07 mov (%rdi),%eax 2: 83 f8 ff cmp $0xffffffff,%eax 5: 74 13 je 1a 7: 85 c0 test %eax,%eax 9: 74 0d je 18 b: 8d 50 01 lea 0x1(%rax),%edx e: f0 0f b1 17 lock cmpxchg %edx,(%rdi) 12: 75 ee jne 2 14: ff c2 inc %edx 16: 75 02 jne 1a 18: 0f 0b ud2 1a: c3 retq */