Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1757567AbYBWOCT (ORCPT );
        Sat, 23 Feb 2008 09:02:19 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1754026AbYBWOCE (ORCPT );
        Sat, 23 Feb 2008 09:02:04 -0500
Received: from pih-relay08.plus.net ([212.159.14.134]:44490 "EHLO
        pih-relay08.plus.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752473AbYBWOCB (ORCPT );
        Sat, 23 Feb 2008 09:02:01 -0500
Date: Sat, 23 Feb 2008 14:01:53 +0000
From: Charles Bailey
To: "J.C. Pizarro"
Cc: LKML , git@vger.kernel.org
Subject: Re: Question about your git habits
Message-ID: <20080223140153.GB5811@hashpling.org>
References: <20080223014445.GK27894@ZenIV.linux.org.uk>
        <7vfxvk4f07.fsf@gitster.siamese.dyndns.org>
        <20080223020913.GL27894@ZenIV.linux.org.uk>
        <998d0e4a0802221823h3ba53097gf64fcc2ea826302b@mail.gmail.com>
        <998d0e4a0802221847m431aa136xa217333b0517b962@mail.gmail.com>
        <20080223113952.GA4936@hashpling.org>
        <998d0e4a0802230508w12f236baiaf2d9ab5f364670a@mail.gmail.com>
        <20080223131749.GA5811@hashpling.org>
        <998d0e4a0802230536w74e93ec3s40c77d52b183a419@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <998d0e4a0802230536w74e93ec3s40c77d52b183a419@mail.gmail.com>
User-Agent: Mutt/1.4.2.1i
X-Plusnet-Relay: aae52d1c7b637927eeb4b1f7c3ac7fe0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2861
Lines: 56

On Sat, Feb 23, 2008 at 02:36:59PM +0100, J.C. Pizarro wrote:
> On 2008/2/23, Charles Bailey wrote:
> >
> > It shouldn't matter how aggressively the repositories are packed or what
> > the binary differences are between the pack files are. git clone
> > should (with the --reference option) generate a new pack for you with
> > only the missing objects.
> > If these objects are ~52 MiB then a lot has
> > been committed to the repository, but you're not going to be able to
> > get around a big download any other way.
>
> You're wrong, nothing has to be commited ~52 MiB to the repository.
>
> I'm not saying "commit", i'm saying
>
> "Assume A & B binary git repos and delta_B-A another binary file, i
> request built
> B' = A + delta_B-A where is verified SHA1(B') = SHA1(B) for avoiding
> corrupting".
>
> Assume B is the higher repacked version of "A + minor commits of the day"
> as if B was optimizing 24 hours more the minimum spanning tree. Wow!!!
>

I'm not sure that I understand where you are going with this.

Originally, you stated that if you clone a 775 MiB repository on day
one, and then clone it again on day two when it is 777 MiB, you
currently have to download 775 + 777 MiB of data, whereas you could
download a 52 MiB binary diff.

I have no idea where that value of 52 MiB comes from, and I've no idea
how many objects were committed between day one and day two. If we're
going to talk about details, then you need to provide more details
about your scenario.

Having said that, here is my original point in some more detail.

git repositories are not binary blobs, they are object databases.
Better than this, they are databases of immutable objects. This means
that to bring one database up to date with another, you only need to
add the objects that it is missing. If the two databases are actually
the same database at two points a short time apart, then almost all
the objects are going to be common and the difference will be a small
set of objects. Using git:// this set of objects can be efficiently
transferred as a pack file.

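To make the "missing objects" idea concrete, here is a rough sketch
in Python. It is only a toy model under stated assumptions: the names
make_store and missing_objects are invented for this illustration, the
store is a plain dict, and real git decides what to send by walking the
commit graph and then delta-compresses the result into a pack. The one
part taken from git itself is how a blob is named: SHA-1 over a
"blob <size>\0" header followed by the content.

```python
import hashlib


def object_id(data: bytes) -> str:
    """Name a blob the way git does: SHA-1 of "blob <size>" + NUL + content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def make_store(blobs):
    """Toy object database (hypothetical helper): object ID -> content."""
    return {object_id(b): b for b in blobs}


def missing_objects(have, want):
    """Objects present in `want` but absent from `have` (hypothetical helper).

    Because objects are immutable and named by their content, comparing
    IDs is enough; no byte-level diff of the two stores is needed.
    """
    return {oid: want[oid] for oid in want.keys() - have.keys()}


# The same repository on two consecutive days: one file changed, one added.
day_one = make_store([b"file a, v1\n", b"file b, v1\n", b"file c, v1\n"])
day_two = make_store(
    [b"file a, v1\n", b"file b, v2\n", b"file c, v1\n", b"file d, v1\n"]
)

delta = missing_objects(day_one, day_two)
print(len(day_two), "objects on day two,", len(delta), "need transferring")
# prints: 4 objects on day two, 2 need transferring
```

Adding the two objects in delta to the day-one store gives a store
that contains every day-two object, however either side happened to be
packed on disk.
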
You may have a corner case scenario where the following isn't true,
but in my experience an incremental pack file will be a more compact
representation of this difference than a binary difference of two
aggressively repacked git repositories as generated by a generic
binary difference engine.

I'm sorry if I've misunderstood your last point. If I have, perhaps
you could expand on the exact issue that you are having, as I'm not
sure that I've really answered your last message.