To: Jamie Lokier
Cc: Jörn Engel, Davide Libenzi, "Patrick J. LoPresti", Linux Kernel Mailing List
Subject: Re: [PATCH] cowlinks v2
From: ebiederm@xmission.com (Eric W. Biederman)
Date: 27 Mar 2004 16:45:53 -0700

Jamie Lokier writes:

> Eric W. Biederman wrote:
> > There are two possible implementation strategies for implementing
> > cow files.  You can either start as Jörn did with hardlinks, or you
> > can start with symlinks.
>
> Symlinks have a _big_ problem: move one, or move its target, and it
> no longer links to the same file.  That's nothing like the
> transparency we need cowlinks to have.

With absolute paths you can move the symlink.  But the other side is
still problematic.

> There's a third implementation strategy.
> Since we're talking in all
> cases about adding a new feature to the underlying filesystem, why not
> implement separate inodes pointing to an underlying shared inode which
> holds the data?  (I think it was mentioned earlier in this thread).
>
> The implementation is very similar to symbolic links, but instead of
> having symlink inodes, you have cowlink inodes which point directly to
> another inode and count as references to it.
>
> That provides POSIX semantics and has none of the caveats you and I
> have mentioned for hard links and symbolic links.
>
> Implementation: Creating a cowlink to a non-cowlinked file creates a
> new shared inode by cloning the file's inode, converts the original
> inode to a cowlink-pointer inode, and creates a new cowlink-pointer
> inode.

And to state the obvious, you must make certain your fs has a lot of
inodes because this dramatically changes the inode to fs block ratio.

> This provides 100% semantic equivalence to copying: all operations
> including hard and symbolic links on the resulting cowlink files act
> as if the cowlink operation was a copy, but faster and using less
> space.

I like it.  It's not too hard and it yields a very nice result.

Syscall-wise we would need:

int cowlink(const char *oldpath, const char *newpath);
int cstat(const char *filename, struct stat *buf);

Which of course are non-POSIX :)

> > As my memory has it the proto implementation I saw using a stackable
> > fs and cow symlinks was about 1000 lines.  And it was that large
> > because it was complete (i.e. it did the copies including copying
> > directories.)
>
> I can see a stackable fs being useful for live CD distributions, where
> tmpfs or disk hold the modifications stacked over the original
> filesystem.

The practical use was getting root on a network filesystem to scale,
and be manageable.  But the problem is essentially the same.
Although to guard against network congestion interacting badly with
paging we were actually thinking about implementing copy on read.

> But mounting an fs for each version of a source tree and keeping track
> of all those mounts: that would be incredibly clumsy to use.
>
> You could implement a stackable fs which is mounted once and provides
> cowlink operations within the fs.  That would still be a bit clumsy,
> but not so much as to make it unusable for source tree management.

And that is what the prototype we were playing with was doing.  You
could not see the original files at all unless you mounted the base
filesystem somewhere outside the cowfs mount.

It was actually quite close to your special cow inode idea except
that it stored data in symlinks instead of inodes, to the extent of
freeing files if all of the cow links to them were removed and the
files were stored on the same filesystem as the cow links.

The addictive thing about the prototype implementation was that you
could do ``ln --cow / /some/other/directory'' and you would have an
atomic snapshot of your filesystem.  Definitely not a feature for the
first implementation but certainly something to dream about.

Eric
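
A minimal sketch of the shared-inode scheme from the quoted
"Implementation:" paragraph, written as a toy in-memory model in Python
rather than as filesystem code.  The names SharedInode, FileInode and
the Python function cowlink() are hypothetical and come from no patch
in this thread.  Breaking the link on the first write, and adopting the
shared data without a copy when only one reference remains, are
assumptions about how the copy-on-write step might behave.

```python
class SharedInode:
    """Hypothetical shared inode: holds the data, counts cowlink references."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 0


class FileInode:
    """A regular inode, or a cowlink-pointer inode while `shared` is set."""

    def __init__(self, data=b""):
        self.data = bytearray(data)
        self.shared = None

    def read(self):
        source = self.shared.data if self.shared is not None else self.data
        return bytes(source)

    def write(self, offset, payload):
        # Assumption: the first write through a cowlink breaks the sharing.
        if self.shared is not None:
            self._break_link()
        if offset > len(self.data):
            # Zero-fill the gap, as a sparse write would appear to a reader.
            self.data.extend(bytes(offset - len(self.data)))
        self.data[offset:offset + len(payload)] = payload

    def _break_link(self):
        shared = self.shared
        if shared.refs == 1:
            # Last reference: take over the data instead of copying it.
            self.data = shared.data
        else:
            self.data = bytearray(shared.data)
        shared.refs -= 1
        self.shared = None


def cowlink(old):
    """Return a new inode that shares old's data until either is written."""
    if old.shared is None:
        # Clone the file's inode into a new shared inode, then convert
        # the original into a cowlink-pointer inode.
        shared = SharedInode(old.data)
        shared.refs = 1
        old.data = bytearray()
        old.shared = shared
    new = FileInode()
    new.shared = old.shared
    new.shared.refs += 1
    return new
```

In this model, cowlinking a plain file yields three inodes (two
pointers and one shared inode) where there used to be one, which is the
change in inode to fs block ratio noted above.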