Date: Sat, 16 May 2015 06:46:26 +0100
From: Al Viro
To: NeilBrown
Cc: Linus Torvalds, Andreas Dilger, Dave Chinner, Linux Kernel Mailing List, linux-fsdevel, Christoph Hellwig
Subject: Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks
Message-ID: <20150516054626.GS7232@ZenIV.linux.org.uk>
In-Reply-To: <20150516144527.20b89194@notabene.brown>

On Sat, May 16, 2015 at 02:45:27PM +1000, NeilBrown wrote:

> Yes, I've looked lately :-)
> I think that all of RCU-walk, and probably some of REF-walk should happen
> before the filesystem gets to see anything.
> But once you hit a non-positive dentry or the parent of the target name, I'd
> rather hand over to the FS.

... and be ready to get it back when the sucker runs into a symlink.  Unless
you want to handle _those_ in NFS somehow (including an absolute one starting
with /sys/, etc.).

> NFSv4 has the ability to look up multiple components in a single LOOKUP call.
> VFS doesn't give it a chance to try because it wants to go step-by-step, and
> wants each entry in the cache to have an inode etc.

Do tell, how do we deal with .. afterwards if we leave the intermediate ones
without inodes?  We _could_ feed multi-component requests to filesystems
(and NFSv4 isn't the first one to handle that - 9p had been there a lot
earlier), but then you get to
	* populate all of them with inodes
	* be damn careful to avoid multiple dentries for the same directory inode

Look, creating those suckers isn't the worst part; you need to be ready for
e.g. mount(2) or pathname resolution playing with the ones you'd created.
It's not an fs-private data structure; pathname resolution might very well
span many filesystem types.

Worse, you get to deal with several multi-component requests jumping into fs
at the same place.  With responses arriving a bit afterwards, and guess what?
Those requests happen to share bits and pieces of prefixes.  Oh, and one of
them is a rename.

Dealing with just the final components isn't a problem; you'll need to deal
with the directory tree in all its fscking glory.  In a way that wouldn't be
in too incestuous a relationship with the pathwalking logics in VFS and, by
that proxy, such in all other fs types.

In particular, "unknown" for intermediate nodes is a recipe for a really nasty
mess.  If the path can rejoin the known universe several components later...

Dealing with multi-component lookups isn't impossible and might be a good
idea, but only if all intermediates are populated.  What information does
the NFSv4 multi-component lookup give you?  The 9p one gives an array of FIDs,
one per component, and that is best used as multi-component revalidate on
hot dcache...
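
A toy illustration of the ".." point in the message above. This is only a sketch, written in Python rather than kernel C, and it models nothing about locking, mounts or races. ToyDentry, populate and dotdot are made-up names for this example, not VFS interfaces. It shows what goes wrong when a multi-component reply leaves an intermediate entry without an inode: stepping back up from the final component has nothing valid to land on. It also reuses an existing child by name, so a second request sharing a prefix does not create a duplicate entry for the same directory.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # compare by identity; the tree is cyclic via parent
class ToyDentry:
    """Toy stand-in for a dcache entry; NOT the kernel's struct dentry."""
    name: str
    parent: Optional["ToyDentry"] = None
    ino: Optional[int] = None  # None models "no inode attached"
    children: Dict[str, "ToyDentry"] = field(default_factory=dict)


def populate(base: ToyDentry, names: List[str],
             inos: List[Optional[int]]) -> ToyDentry:
    """Attach one entry per component under base and return the last one.

    inos[i] is None where the (hypothetical) multi-component reply
    supplied nothing for that component.
    """
    if len(names) != len(inos):
        raise ValueError("need exactly one inode slot per component")
    d = base
    for name, ino in zip(names, inos):
        child = d.children.get(name)
        if child is None:
            # First time we see this name here: create the entry.
            child = ToyDentry(name, parent=d, ino=ino)
            d.children[name] = child
        elif child.ino is None:
            # Entry already exists; fill it in rather than duplicating it.
            child.ino = ino
        d = child
    return d


def dotdot(d: ToyDentry) -> ToyDentry:
    """Step to the parent; fails if the parent was left without an inode."""
    p = d.parent
    if p is None:
        return d  # ".." at the root stays at the root
    if p.ino is None:
        raise LookupError(f"'..' from {d.name!r}: parent {p.name!r} has no inode")
    return p


if __name__ == "__main__":
    root = ToyDentry("/", ino=1)
    leaf = populate(root, ["a", "b", "c"], [10, 11, 12])
    print(dotdot(leaf).name)
    hole = populate(root, ["x", "y", "z"], [20, None, 22])
    try:
        dotdot(hole)
    except LookupError as err:
        print("stuck:", err)
```

With every intermediate populated, ".." from the last component works as usual. With a hole in the middle, the walk is stuck one step up, which is the "unknown for intermediate nodes" case the message warns about.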