From: Andy Lutomirski
Date: Fri, 24 Mar 2017 09:54:46 -0700
Subject: Re: locking/atomic: Introduce atomic_try_cmpxchg()
To: Peter Zijlstra
Cc: Dmitry Vyukov, Andrew Morton, Andy Lutomirski, Borislav Petkov,
    Brian Gerst, Denys Vlasenko, "H. Peter Anvin", Josh Poimboeuf,
    Linus Torvalds, Paul McKenney, Thomas Gleixner, Ingo Molnar, LKML

On Fri, Mar 24, 2017 at 9:41 AM, Peter Zijlstra wrote:
> On Fri, Mar 24, 2017 at 03:21:40PM +0100, Peter Zijlstra wrote:
>> On Fri, Mar 24, 2017 at 01:44:00PM +0100, Dmitry Vyukov wrote:
>
>> > So I would suggest to change it to a safer and less surprising
>> > alternative:
>> >
>> > diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
>> > index fb961db51a2a..81fb985f51f4 100644
>> > --- a/arch/x86/include/asm/cmpxchg.h
>> > +++ b/arch/x86/include/asm/cmpxchg.h
>> > @@ -212,7 +212,8 @@ extern void __add_wrong_size(void)
>> >  	default:					\
>> >  		__cmpxchg_wrong_size();			\
>> >  	}						\
>> > -	*_old = __old;					\
>> > +	if (!success)					\
>> > +		*_old = __old;				\
>> >  	success;					\
>> >  })
>>
>> I've no immediate objection, I'll double check what, if anything, it
>> does for code gen.
>
> So the first snippet I tested regressed like so:
>
> 0000000000000000 :                                 0000000000000000 :
>    0: 8b 07         mov    (%rdi),%eax                0: 8b 17         mov    (%rdi),%edx
>    2: 83 f8 ff      cmp    $0xffffffff,%eax           2: 83 fa ff      cmp    $0xffffffff,%edx
>    5: 74 13         je     1a                         5: 74 1a         je     21
>    7: 85 c0         test   %eax,%eax                  7: 85 d2         test   %edx,%edx
>    9: 74 0d         je     18                         9: 74 13         je     1e
>    b: 8d 50 01      lea    0x1(%rax),%edx             b: 8d 4a 01      lea    0x1(%rdx),%ecx
>    e: f0 0f b1 17   lock cmpxchg %edx,(%rdi)          e: 89 d0         mov    %edx,%eax
>   12: 75 ee         jne    2                         10: f0 0f b1 0f   lock cmpxchg %ecx,(%rdi)
>   14: ff c2         inc    %edx                      14: 74 04         je     1a
>   16: 75 02         jne    1a                        16: 89 c2         mov    %eax,%edx
>   18: 0f 0b         ud2                              18: eb e8         jmp    2
>   1a: c3            retq                             1a: ff c1         inc    %ecx
>                                                      1c: 75 03         jne    21
>                                                      1e: 0f 0b         ud2
>                                                      20: c3            retq
>                                                      21: c3            retq

Can you re-send the better asm you got earlier?

If I pretend to be a dumb compiler, I wonder if you'd get better
results with:

        if (!success) {
                *_old = __old;
                return false;
        } else {
                return true;
        }

or however you jam that into a statement expression.  That way you
aren't relying on the compiler to merge the branches.

>
> Which is rather unfortunate...

--
Andy Lutomirski
AMA Capital Management, LLC
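
To make Andy's suggestion concrete: a GCC statement expression cannot
"return", so one hypothetical way to keep the two branches explicit
inside a __try_cmpxchg-style macro is to set a result variable in each
arm.  The sketch below is illustrative only, not the kernel's macro;
it uses the generic __sync_val_compare_and_swap() builtin in place of
the arch-specific locked cmpxchg, and the macro name is made up:

        #include <stdbool.h>

        /*
         * Sketch only: keep the success and failure paths as two
         * explicit branches, so the compiler is not asked to merge
         * them on its own.  __sync_val_compare_and_swap() returns the
         * value observed in *ptr before the operation.
         */
        #define try_cmpxchg_sketch(ptr, pold, new)                      \
        ({                                                              \
                __typeof__(*(ptr)) __old = *(pold);                     \
                __typeof__(*(ptr)) __seen =                             \
                        __sync_val_compare_and_swap((ptr), __old, (new)); \
                bool __success;                                         \
                if (__seen == __old) {                                  \
                        __success = true;                               \
                } else {                                                \
                        /* failure: report the value actually seen */   \
                        *(pold) = __seen;                               \
                        __success = false;                              \
                }                                                       \
                __success;                                              \
        })

A caller that loses the race can then retry with the updated *pold
without re-reading the location itself.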
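
For reference, the failure-only write-back semantics being debated are
also what the __atomic_compare_exchange_n() builtin provides, and the
disassembly above appears to come from an increment-unless-zero style
loop.  A minimal, self-contained sketch of such a caller follows; the
function name is hypothetical and the real kernel code differs (for
one thing, it traps rather than merely returning on the zero case):

        #include <stdbool.h>

        /* Increment *ptr unless it is zero; return true on success. */
        static bool inc_not_zero_sketch(int *ptr)
        {
                int old = __atomic_load_n(ptr, __ATOMIC_RELAXED);

                do {
                        if (old == 0)
                                return false;
                        /*
                         * On failure the builtin writes the value it
                         * observed back into &old, so the loop retries
                         * without an extra load.
                         */
                } while (!__atomic_compare_exchange_n(ptr, &old, old + 1,
                                                      false,
                                                      __ATOMIC_SEQ_CST,
                                                      __ATOMIC_RELAXED));

                return true;
        }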