Date: Mon, 13 Jul 2015 19:54:28 -0500
From: Ben Myers 
To: Al Viro 
Cc: Linus Torvalds , "J. Bruce Fields" ,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC] freeing unlinked file indefinitely delayed
Message-ID: <20150714005428.GH4015@sgi.com>
References: <20150708014237.GC17109@ZenIV.linux.org.uk>
	<20150708154143.GG4015@sgi.com>
	<20150712150035.GJ17109@ZenIV.linux.org.uk>
	<20150713181751.GZ4568@sgi.com>
	<20150713195645.GR17109@ZenIV.linux.org.uk>
In-Reply-To: <20150713195645.GR17109@ZenIV.linux.org.uk>

Hey Al,

On Mon, Jul 13, 2015 at 08:56:46PM +0100, Al Viro wrote:
> On Mon, Jul 13, 2015 at 01:17:51PM -0500, Ben Myers wrote:
> > > For one thing, this patch does *not* check for i_nlink at all.
> >
> > I agree that no checking of i_nlink has the advantage of brevity.
> > Anyone who is using dentry.d_fsdata with an open_by_handle workload
> > (if there are any) will be affected.
>
> Translate, please.  What does d_fsdata have to do with anything above?

If my understanding of your patch is correct, any disconnected dentries
will be killed immediately at dput instead of being put on the dentry
LRU.  I have some code that I think may rely on the old behavior with
regard to DCACHE_DISCONNECTED.
That is, the disconnected dentry sits on the LRU, there are lots of
open_by_handle calls, and nothing keeps a reference on the inode between
open_by_handle/close except that dentry's reference.  Maybe there are
better ways to do that and I just need to look for a workaround.

I wonder whether others will be affected by this change in behavior.
In particular, filesystems that need to keep internal state in
d_fsdata across open_by_handle calls won't be able to use d_fsdata for
that workload anymore.

> > > For another, there's no such thing as 'filesystems internal lock'
> > > for i_nlink protection - that's handled by i_mutex...  And what
> > > does iget() have to do with any of that?
> >
> > i_mutex is good enough only for local filesystems.
> > Network/clustered/distributed filesystems need to take an internal
> > lock to provide exclusion for this .unlink against a .link on
> > another host.  That's where I'm coming from with iget().
> >
> > Maybe plumbing i_op.unlink with another argument to return i_nlink
> > is something to consider?  A helper for the few filesystems that
> > need to do this might be good enough in the near term.
>
> ????
>
> a) iget() has been gone since way back
> b) it never had been called by VFS - it's a filesystem's responsibility
> c) again, what the hell does iget() or its replacements have to do
>    with dentry eviction?  It does *NOT* affect dentry refcount.
>    Never had.

I misspoke with regard to iget.  It is the locking for i_nlink that I
am concerned about.

> d) checks for _inode_ retention in icache are done by filesystem
>    code, which is certainly free to use its locks.  Incidentally, for
>    normal filesystems no locks are needed at all - everything that
>    changes ->i_nlink is holding a reference to the in-core inode, so
>    in a situation when its refcount is zero and ->i_lock is held
>    (providing an exclusion with icache lookups), ->i_nlink is
>    guaranteed to be stable.

Thanks for that explanation, that's what I needed.
For filesystems which have a distributed lock manager involved on
multiple hosts, i_nlink is not stable on an individual host even with
i_lock held.  You would also need to hold whatever finer-grained lock
the filesystem is protecting i_nlink with.  So the VFS _can't_ know
when i_nlink has gone to zero; only the filesystem can.  That's what I
was trying to get at.

> e) why would VFS possibly want to know if there are links remaining
>    after successful ->unlink()?

The VFS wouldn't.  The argument is that it would be nice to keep the
old behavior and still cache disconnected dentries, leaving it to the
filesystems which require it to destroy the anonymous dentries in
->unlink once they know that i_nlink has gone to zero, having taken
whatever internal locks they need to adequately protect i_nlink from
other hosts.  That's where a helper to prune disconnected dentries
would come in handy.

> I'm sorry, but you are not making any sense...

I'm sorry for the confusion; it took me a while to digest this.  The
change in caching behavior could affect more than NFS.  Any filesystem
which expects disconnected dentries to remain cached is affected.

Thanks,
	Ben
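Roughly what I have in mind, as a sketch only (cfs_nlink_lock/unlock
and cfs_do_unlink are hypothetical stand-ins for whatever cluster lock
and unlink path a real filesystem would have; d_prune_aliases() is the
existing VFS helper that drops unused aliases of an inode):

```c
/* Hypothetical ->unlink for a clustered filesystem: take the
 * cluster-wide lock that actually protects i_nlink, do the unlink,
 * and only once the link count is known to have hit zero prune any
 * cached anonymous/disconnected aliases ourselves. */
static int cfs_unlink(struct inode *dir, struct dentry *dentry)
{
	struct inode *inode = d_inode(dentry);
	int err;

	cfs_nlink_lock(inode);		/* DLM lock, not just i_mutex */
	err = cfs_do_unlink(dir, dentry);
	if (!err && inode->i_nlink == 0)
		d_prune_aliases(inode);	/* drop unused disconnected dentries */
	cfs_nlink_unlock(inode);

	return err;
}
```

That way the default VFS behavior could stay as it was, and only
filesystems that actually know when i_nlink reaches zero would evict
the disconnected dentries early.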