Date: Thu, 29 Aug 2013 12:25:11 -0700
Subject: Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount
From: Linus Torvalds
To: Ingo Molnar
Cc: Benjamin Herrenschmidt, Waiman Long, Alexander Viro, Jeff Layton, Miklos Szeredi, Ingo Molnar, Thomas Gleixner, linux-fsdevel, Linux Kernel Mailing List, Peter Zijlstra, Steven Rostedt, Andi Kleen, "Chandramouleeswaran, Aswin", "Norton, Scott J"
References: <1375758759-29629-1-git-send-email-Waiman.Long@hp.com> <1375758759-29629-2-git-send-email-Waiman.Long@hp.com> <1377751465.4028.20.camel@pasglop> <20130829070012.GC27322@gmail.com>

On Thu, Aug 29, 2013 at 9:43 AM, Linus Torvalds wrote:
>
> We'll see. The real problem is that I'm not sure if I can even see the
> scalability issue on any machine I actually personally want to use
> (read: silent). On my current system I can only get up to 15%
> _raw_spin_lock by just stat'ing the same file over and over and over
> again from lots of threads.

Hmm. I can see it, but it turns out that for normal pathname walking,
one of the main stumbling blocks is the RCU case of complete_walk(),
which cannot be done with the lockless lockref model.

Why? It needs to check the sequence count too and cannot touch the
refcount unless it matches under the spinlock.
We could use lockref_get_non_zero(), but for the final path component
(which this is) the zero refcount is actually a common case.

Waiman worked around this by having some rather complex code to retry
and wait for the dentry lock to be released in his lockref code. But
that has a lot of tuning implications, and I wanted to see what it is
*without* that kind of tuning. And that's when you hit the "lockless
case fails all the time because the lock is actually held" case.

I'm going to play around with changing the semantics of
"lockref_get_non_zero()" to match the "lockless_put_or_lock()":
instead of failing when the count is zero, it gets the lock. That
won't generally get any contention, because if the count is zero,
there generally isn't anybody else playing with that dentry.

             Linus
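
For anyone following along, the "get the lock instead of failing on a
zero count" semantics described above can be modelled in plain userspace
C11. This is only a rough sketch under loud assumptions, not the kernel
code: the struct layout (bit 0 as the lock, the remaining bits as the
count) and every name in it (lockref_sketch, sketch_lock, sketch_unlock,
sketch_get_or_lock) are made up for illustration, and it ignores
sequence counts, memory-ordering tuning and count overflow.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: bit 0 is the lock, the upper bits are the refcount. */
struct lockref_sketch {
	_Atomic uint64_t lock_count;
};

#define SKETCH_LOCKED  ((uint64_t)1)
#define SKETCH_ONE_REF ((uint64_t)2)

/* Spin until the lock bit can be set by this caller. */
static void sketch_lock(struct lockref_sketch *lr)
{
	uint64_t old = atomic_load(&lr->lock_count);

	for (;;) {
		if (old & SKETCH_LOCKED) {
			old = atomic_load(&lr->lock_count);
			continue;
		}
		/* A failed CAS reloads 'old', so just go around again. */
		if (atomic_compare_exchange_weak(&lr->lock_count, &old,
						 old | SKETCH_LOCKED))
			return;
	}
}

static void sketch_unlock(struct lockref_sketch *lr)
{
	assert(atomic_load(&lr->lock_count) & SKETCH_LOCKED);
	atomic_fetch_and(&lr->lock_count, ~SKETCH_LOCKED);
}

/*
 * Returns true if a reference was taken (lock not held on return).
 * Returns false if the count was zero, in which case the caller now
 * HOLDS the lock and must release it with sketch_unlock().
 */
static bool sketch_get_or_lock(struct lockref_sketch *lr)
{
	uint64_t old = atomic_load(&lr->lock_count);

	/* Lockless fast path: only while unlocked and the count is nonzero. */
	while (!(old & SKETCH_LOCKED) && (old >> 1) != 0) {
		if (atomic_compare_exchange_weak(&lr->lock_count, &old,
						 old + SKETCH_ONE_REF))
			return true;
	}

	/* Slow path: take the lock, then re-check the count under it. */
	sketch_lock(lr);
	if ((atomic_load(&lr->lock_count) >> 1) == 0)
		return false;	/* zero count: return with the lock held */
	atomic_fetch_add(&lr->lock_count, SKETCH_ONE_REF);
	sketch_unlock(lr);
	return true;
}
```

The point of the shape is that a zero count never makes the caller
fail and retry; it just lands in the locked path, where the remaining
checks can be done under the lock. The caller that gets "false" back is
responsible for dropping the lock itself.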