Date: Tue, 3 Sep 2013 14:34:38 -0700
Subject: Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount
From: Linus Torvalds
To: Ingo Molnar
Cc: Al Viro, Sedat Dilek, Waiman Long, Benjamin Herrenschmidt, Jeff Layton,
    Miklos Szeredi, Ingo Molnar, Thomas Gleixner, linux-fsdevel,
    Linux Kernel Mailing List, Peter Zijlstra, Steven Rostedt, Andi Kleen,
    "Chandramouleeswaran, Aswin", "Norton, Scott J", Peter Zijlstra,
    Arnaldo Carvalho de Melo

On Tue, Sep 3, 2013 at 2:13 PM, Linus Torvalds wrote:
>
> This is the one that actually compiles. Whether it *works* is still a
> total mystery.

It generates ok code, and it booted, so it seems to work at least for
my config.

However, it seems to make no performance-difference what-so-ever, and
lg_local_lock is still using about 7% cpu per the profiles.

The code generation is slightly better, but the profile looks the same:

         │    ffffffff81078e70 :
    0.62 │      push   %rbp
    0.28 │      mov    %rsp,%rbp
    0.22 │      add    %gs:0xcd48,%rdi
    0.27 │      mov    $0x100,%eax
   97.22 │      lock   xadd %ax,(%rdi)
    0.01 │      movzbl %ah,%edx
         │      cmp    %al,%dl
    0.56 │    ↓ je     29
         │      xchg   %ax,%ax
    0.00 │20:   pause
    0.00 │      movzbl (%rdi),%eax
         │      cmp    %dl,%al
         │    ↑ jne    20
         │29:   pop    %rbp
    0.81 │    ← retq

but it still obviously doesn't do the "lock xadd %ax,%gs:(%rdi)"
(without the preceding 'add') that would be the optimal code.

I'll try to hack that up too, but it's looking like it really is just
the "lock xadd", not the memory dependency chain..

             Linus
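
[Note for readers following the disassembly: the instruction sequence
in the profile has the shape of a ticket-lock acquire. The "lock xadd"
with 0x100 takes a ticket by bumping the high byte of a 16-bit word and
returns the old value; the "pause" loop then waits until the low byte
(the ticket now being served) matches. Below is a rough user-space
sketch of that pattern using C11 atomics. It is an illustration only
and not the kernel's implementation: the names ticket_lock_t,
ticket_lock and ticket_unlock are hypothetical, the byte layout is
inferred from the assembly above, and the unlock path uses a
compare-and-swap loop purely to keep the sketch portable.]

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical type for illustration. Layout assumed from the
 * disassembly: low byte = ticket being served ("head"),
 * high byte = next ticket to hand out ("tail").
 */
typedef struct {
    _Atomic uint16_t head_tail;
} ticket_lock_t;

static inline void ticket_lock(ticket_lock_t *lock)
{
    /* Mirrors "mov $0x100,%eax; lock xadd %ax,(%rdi)": take a ticket. */
    uint16_t old = atomic_fetch_add_explicit(&lock->head_tail, 0x100,
                                             memory_order_acquire);
    uint8_t ticket = (uint8_t)(old >> 8);  /* movzbl %ah,%edx */
    uint8_t head = (uint8_t)old;           /* %al */

    /* Mirrors the "pause; movzbl (%rdi),%eax; cmp; jne" spin loop. */
    while (head != ticket) {
        /* Real code would issue a CPU relax hint (pause) here. */
        head = (uint8_t)atomic_load_explicit(&lock->head_tail,
                                             memory_order_acquire);
    }
}

static inline void ticket_unlock(ticket_lock_t *lock)
{
    /*
     * Advance only the head byte, so that a wrap from 0xff to 0x00
     * does not carry into the tail byte. Done with a CAS loop here;
     * this is a simplification chosen for the sketch.
     */
    uint16_t old = atomic_load_explicit(&lock->head_tail,
                                        memory_order_relaxed);
    uint16_t new_val;

    do {
        new_val = (uint16_t)((old & 0xff00u) | ((old + 1u) & 0x00ffu));
    } while (!atomic_compare_exchange_weak_explicit(&lock->head_tail,
                                                    &old, new_val,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}
```

[In the uncontended case the acquire is a single atomic read-modify-write
followed by one compare, which is consistent with the profile above
attributing nearly all of the function's samples to the "lock xadd"
line rather than to the spin loop.]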