From: Andy Lutomirski
Date: Fri, 24 Mar 2017 11:45:46 -0700
Subject: Re: locking/atomic: Introduce atomic_try_cmpxchg()
To: Peter Zijlstra
Cc: Dmitry Vyukov, Andrew Morton, Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko, "H. Peter Anvin", Josh Poimboeuf, Linus Torvalds, Paul McKenney, Thomas Gleixner, Ingo Molnar, LKML
In-Reply-To: <20170324172342.radlrhk2z6mwmdgk@hirez.programming.kicks-ass.net>
References: <20170324142140.vpyzl755oj6rb5qv@hirez.programming.kicks-ass.net> <20170324164108.ibcxxqbhvx6ao54r@hirez.programming.kicks-ass.net> <20170324172342.radlrhk2z6mwmdgk@hirez.programming.kicks-ass.net>

On Fri, Mar 24, 2017 at 10:23 AM, Peter Zijlstra wrote:
> On Fri, Mar 24, 2017 at 09:54:46AM -0700, Andy Lutomirski wrote:
>> > So the first snipped I tested regressed like so:
>> >
>> >
>> > 0000000000000000 :                          0000000000000000 :
>> >  0: 8b 07       mov    (%rdi),%eax           0: 8b 17       mov    (%rdi),%edx
>> >  2: 83 f8 ff    cmp    $0xffffffff,%eax      2: 83 fa ff    cmp    $0xffffffff,%edx
>> >  5: 74 13       je     1a                    5: 74 1a       je     21
>> >  7: 85 c0       test   %eax,%eax             7: 85 d2       test   %edx,%edx
>> >  9: 74 0d       je     18                    9: 74 13       je     1e
>> >  b: 8d 50 01    lea    0x1(%rax),%edx        b: 8d 4a 01    lea    0x1(%rdx),%ecx
>> >  e: f0 0f b1 17 lock cmpxchg %edx,(%rdi)     e: 89 d0       mov    %edx,%eax
>> > 12: 75 ee       jne    2                    10: f0 0f b1 0f lock cmpxchg %ecx,(%rdi)
>> > 14: ff c2       inc    %edx                 14: 74 04       je     1a
>> > 16: 75 02       jne    1a                   16: 89 c2       mov    %eax,%edx
>> > 18: 0f 0b       ud2                         18: eb e8       jmp    2
>> > 1a: c3          retq                        1a: ff c1       inc    %ecx
>> >                                             1c: 75 03       jne    21
>> >                                             1e: 0f 0b       ud2
>> >                                             20: c3          retq
>> >                                             21: c3          retq
>>
>> Can you re-send the better asm you got earlier?
>
> On the left?

Apparently I'm just blind this morning.

After playing with it a bit, I found some of the problem: you're
passing val into EXCEPTION_VALUE, which keeps it live. If I get rid
of that, the generated code is great.

I haven't found a way to convince GCC that, in the success case, eax
isn't clobbered. I wrote this:

static inline bool try_cmpxchg(unsigned int *ptr, unsigned int *val,
			       unsigned int new)
{
	unsigned int old = *val;
	bool success;

	asm volatile("lock cmpxchgl %[new], %[ptr]"
		     : "=@ccz" (success),
		       [ptr] "+m" (*ptr),
		       [old] "+a" (old)
		     : [new] "r" (new)
		     : "memory");

	if (!success) {
		*val = old;
	} else {
		if (*val != old) {
			*val = old;
			__builtin_unreachable();
		} else {
			/*
			 * Damnit, GCC, I want you to realize that this
			 * is happening but to avoid emitting the store.
			 */
			*val = old; /* <-- here */
		}
	}

	return success;
}

The "here" line is the problematic code that breaks certain use
cases, and it obviously needn't have any effect in the generated code,
but I'm having trouble getting GCC to generate good code without it.

Is there some hack like

	if __builtin_is_unescaped(*val)
		*val = old;

that would work?

--Andy
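
For readers following the thread, the calling convention under discussion can be sketched in userspace C. This is only a rough illustration under stated assumptions, not the kernel implementation: the names try_cmpxchg_sketch() and inc_not_zero_sketch() are hypothetical, and the sketch swaps the inline asm for the GCC/Clang __atomic_compare_exchange_n builtin. It shows the contract the message relies on: on success the caller's expected value is left alone, and on failure it is refreshed with what was actually in memory, so a retry loop needs no separate reload.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for try_cmpxchg(), using the GCC/Clang
 * __atomic_compare_exchange_n builtin rather than inline asm.
 *
 * Success: *ptr becomes new_val, *expected is not modified.
 * Failure: *ptr is unchanged, *expected receives the value found in *ptr.
 */
static inline bool try_cmpxchg_sketch(unsigned int *ptr,
				      unsigned int *expected,
				      unsigned int new_val)
{
	return __atomic_compare_exchange_n(ptr, expected, new_val,
					   false, /* strong compare-exchange */
					   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/*
 * Hypothetical caller: increment *counter unless it is zero.
 * Returns true if the increment was performed.
 */
static inline bool inc_not_zero_sketch(unsigned int *counter)
{
	unsigned int val;

	assert(counter);

	val = __atomic_load_n(counter, __ATOMIC_RELAXED);
	do {
		if (val == 0)
			return false;
		/* On failure val already holds the fresh value; no reload. */
	} while (!try_cmpxchg_sketch(counter, &val, val + 1));

	return true;
}
```

A caller would typically invoke inc_not_zero_sketch(&counter) and act on the boolean result. Whether a compiler generates a loop as tight as the left-hand listing for this version is not something the sketch claims.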