Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S262766AbUDBLyO (ORCPT ); Fri, 2 Apr 2004 06:54:14 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S263018AbUDBLyN (ORCPT ); Fri, 2 Apr 2004 06:54:13 -0500 Received: from c3p0.cc.swin.edu.au ([136.186.1.30]:46095 "EHLO swin.edu.au") by vger.kernel.org with ESMTP id S262766AbUDBLyL (ORCPT ); Fri, 2 Apr 2004 06:54:11 -0500 To: linux-kernel@vger.kernel.org From: Tim Connors Subject: Re: [PATCH] cowlinks v2 In-reply-to: References: <20040321125730.GB21844@wohnheim.fh-wedel.de> <20040321181430.GB29440@wohnheim.fh-wedel.de> <20040325174942.GC11236@mail.shareable.org> <20040325194303.GE11236@mail.shareable.org> X-Face: "0\RuOFb6AcQ}B_F/^%;;AmS%>"qr*^0t%eriBMe_x]B7&@b8_\i Date: Fri, 2 Apr 2004 21:54:09 +1000 Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2984 Lines: 65 ebiederm@xmission.com (Eric W. Biederman) said on 25 Mar 2004 15:16:48 -0700: > Linus Torvalds writes: > > > On Thu, 25 Mar 2004, Jamie Lokier wrote: > > > > > > That is not useful for me or the other people who want to use this to > > > duplicate large source trees and run "diff" between trees. > > > > > > "diff" depends on being able to check if files in the two trees are > > > identical -- by checking whether the inode number and device (and > > > maybe other stat data) are identical. This allows "diff -ur" between > > > two cloned trees the size of linux to be quite fast. Without that > > > optimisation, it's very slow indeed. > > > > I think the correct thing to do is to just admit that cowlinks aren't > > POSIX, and instead see the inode number as a way to see whether the link > > has been broken or not. Ie just accept the inode number potentially > > changing. > > > > That would make "diff" (adn most other uses) ok with this, and anythign > > that isn't, just couldn't be used with cowlinked files. > > Really? > > tar and cp still need to be taught about them. And if they are not taught > they will do the wrong thing and hard link the files removing the > copy on write semantics. Which would do ugly thing when you restore > from your backup. > > I don't have a problem with the inode number changing when you write to > a file, because I can't think of much that would care either way. > But having the inode number of an open file change sounds like a very > difficult problem to track. If the inode doesn't change upon a write, the system is still POSIX, so nothing breaks (the problem is, cowlinks would be very useful, and POSIX is very useful - making them incompatible would be a major bummer) If you use sendfile() or cowcp() or similar to implement cowlinks, then only rsync/tar/cp need be taught about its usage. Someone mentioned that diff will not be optimal with different inodes, because diff compared inodes. So add some diffstat() that keeps track of some magical inode like thingy that stays the same for cowed files. Then teach diff to use diffstat(). Now, you have a system where if your program doesn't explicitly use cowcp() and diffstat(), everything still works perfectly (we want to still be able to use old coreutils on a machine that has a new kernel applied, right? And vice-versa?). All that happens is that your program may or may not be sub-optimal - you might have to compare entire files instead of just inodes. Or you may have to copy entire files instead of just cowing them. But it all still Just Works. My AU$0.02. -- TimC -- http://astronomy.swin.edu.au/staff/tconnors/ I bet the human brain is a kludge. -- Marvin Minsky - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/