Subject: Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount
From: Linus Torvalds
To: Al Viro
Cc: Sedat Dilek, Waiman Long, Ingo Molnar, Benjamin Herrenschmidt, Jeff Layton, Miklos Szeredi, Thomas Gleixner, linux-fsdevel, Linux Kernel Mailing List, Peter Zijlstra, Steven Rostedt, Andi Kleen, "Chandramouleeswaran, Aswin", "Norton, Scott J"
Date: Sun, 1 Sep 2013 15:16:24 -0700
In-Reply-To: <20130901212355.GU13318@ZenIV.linux.org.uk>

On Sun, Sep 1, 2013 at 2:23 PM, Al Viro wrote:
>
> How much of that is due to br_write_lock() taken in mntput_no_expire()
> for no good reason? IOW, could you try the shmem.c patch I sent yesterday
> and see how much effect it has? Basically, we grab it exclusive on each
> final fput() of a struct file created by shmem_file_setup(), which is
> _not_ a rare event. And the only reason for that is not having shm_mnt
> marked long-living, even though its refcount never hits 0...

Does not seem to matter. Still 66% mntput_no_expire, 31% path_init.
And that lg_local_lock() takes 5-6% of CPU, pretty much all of which
is that single xadd instruction that implements the spinlock.

This is on /tmp, which is tmpfs. But I don't see how any of that could
matter.
"mntput()" does an unconditional call to mntput_no_expire(), and mntput_no_expire() does that br_read_lock() unconditionally too. Note that I'm talking about that "cheap" *read* lock being expensive. It's the local one, not the global one. So it's not what Waiman saw with the global lock. This is a local per-cpu thing. That read-lock is supposed to be very cheap - it's just a per-cpu spinlock. But it ends up being very expensive for some reason. I'm not quite sure why - I don't see any lg_global_lock() calls at all, so... I wonder if there is some false sharing going on. But I don't see that either, this is the percpu offset map afaik: 000000000000f560 d files_lglock_lock 000000000000f564 d nr_dentry 000000000000f568 d last_ino 000000000000f56c d nr_unused 000000000000f570 d nr_inodes 000000000000f574 d vfsmount_lock_lock 000000000000f580 d bh_accounting and I don't see anything there that would get cross-cpu accesses, so there shouldn't be any cacheline bouncing. That's the whole point of percpu variables, after all. Odd. What am I missing? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/