Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S936220AbdCXVZA (ORCPT ); Fri, 24 Mar 2017 17:25:00 -0400
Received: from merlin.infradead.org ([205.233.59.134]:34708 "EHLO
	merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753675AbdCXVYl (ORCPT );
	Fri, 24 Mar 2017 17:24:41 -0400
Date: Fri, 24 Mar 2017 22:23:29 +0100
From: Peter Zijlstra
To: Linus Torvalds
Cc: Andy Lutomirski , Dmitry Vyukov , Andrew Morton , Andy Lutomirski ,
	Borislav Petkov , Brian Gerst , Denys Vlasenko , "H. Peter Anvin" ,
	Josh Poimboeuf , Paul McKenney , Thomas Gleixner , Ingo Molnar , LKML
Subject: Re: locking/atomic: Introduce atomic_try_cmpxchg()
Message-ID: <20170324212329.GC5680@worktop>
References: <20170324142140.vpyzl755oj6rb5qv@hirez.programming.kicks-ass.net>
	<20170324164108.ibcxxqbhvx6ao54r@hirez.programming.kicks-ass.net>
	<20170324172342.radlrhk2z6mwmdgk@hirez.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.22.1 (2013-10-16)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1813
Lines: 45

On Fri, Mar 24, 2017 at 12:17:28PM -0700, Linus Torvalds wrote:
> On Fri, Mar 24, 2017 at 11:45 AM, Andy Lutomirski wrote:
> >
> > Is there some hack like if __builtin_is_unescaped(*val) *val = old;
> > that would work?
>
> See my recent email suggesting a completely different interface, which
> avoids this problem.
>
> My interface generates:
>
> 0000000000000000 :
>    0:	8b 07                	mov    (%rdi),%eax
>    2:	83 f8 ff             	cmp    $0xffffffff,%eax
>    5:	74 12                	je     19
>    7:	85 c0                	test   %eax,%eax
>    9:	74 0a                	je     15
>    b:	8d 50 01             	lea    0x1(%rax),%edx
>    e:	f0 0f b1 17          	lock cmpxchg %edx,(%rdi)
>   12:	75 ee                	jne    2
>   14:	c3                   	retq
>   15:	31 c0                	xor    %eax,%eax
>   17:	0f 0b                	ud2
>   19:	c3                   	retq
>
> for PeterZ's test-case, which seems optimal.

Right; now my GCC emits more or less the same code (it's a slightly
different compiler): instead of 12: jne, it does 12: je ; 14: jmp 2.
But maybe that's the likely() you added later.

Also, see how at 7 we test if eax is 0 and then at 9 jump to 15, where
we make eax 0. Pretty daft code-gen.

In any case, you lost one branch into ud2; your "success: return"
should be "success: if (new == UINT_MAX)", such that when we newly
saturate the count we also raise an exception.

With that, the code is still larger than it used to be. I'll have a
play around.

I do like this interface better, but getting GCC to generate sensible
code seems 'interesting'.

I'll try and redo the patches that landed in tip and see what it does
for total vmlinux size sometime tomorrow.
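
For reference, here is a rough userspace sketch of the increment path
being discussed, including the extra check for a newly saturated count.
It is only an illustration under stated assumptions: it uses C11
<stdatomic.h> instead of the kernel's atomic_try_cmpxchg(), the names
sketch_refcount_inc(), sketch_warn() and sketch_warn_count are made up
for this example, and a counter stands in for the ud2 trap. It is not
the code from the patches.

```c
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the ud2/WARN path: count hits instead of trapping. */
static unsigned int sketch_warn_count;

static void sketch_warn(void)
{
	sketch_warn_count++;
}

/*
 * Hypothetical userspace analogue of the refcount increment test-case.
 * atomic_compare_exchange_weak_explicit() plays the role of a
 * try_cmpxchg: on failure it writes the current value back into 'old',
 * so the loop does not need a separate reload.
 */
static void sketch_refcount_inc(atomic_uint *r)
{
	unsigned int old = atomic_load_explicit(r, memory_order_relaxed);
	unsigned int next;

	do {
		if (old == UINT_MAX)	/* already saturated: leave it alone */
			return;
		if (old == 0) {		/* increment from zero: complain */
			sketch_warn();
			return;
		}
		next = old + 1;
	} while (!atomic_compare_exchange_weak_explicit(r, &old, next,
							memory_order_relaxed,
							memory_order_relaxed));

	if (next == UINT_MAX)		/* newly saturated: complain as well */
		sketch_warn();
}
```

The last test is the branch into ud2 that the quoted listing lost: it
fires once, on the increment that takes the count to UINT_MAX, and
later calls return early from the saturation check at the top.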