LinuxLists.cc - linux-2.5.4-pre1

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

> I second that. Maybe however we can have it both ways -- I have no
> experience with bk, but can't this same info be made available elsewhere
> like a public web interface or some such thing?

I've put up read-only clones on

http://linux.bkbits.net

you can go there and get the changelogs in web form. I just figured out
what a bad choice 8088 was for a port and we'll be moving stuff over to
8080 since that seems to go through more firewalls.

hpa is working on getting these up on some of the kernel.org sites; he's
stalled because of some stuff he needs from me. We'll get that
straightened out, and then the authoritative source will be bk.kernel.org or
master.kernel.org, I'm not quite sure which. Peter will tell you. But we'll
keep up to date with Linus' BK tree as long as he is playing with BK
and you can follow along there.

Oh, and for what it is worth, I agree that having the changelogs as part
of the history rocks. Go to the http://linux.bkbits.net:8088/linux-2.5
link and click on user statistics - because Linus hacked up a nice email
to patch importer script, all the patches look like they were checked
in by the person who sent them. And it propagates down to the annotated
listings.

Here's hoping bkbits.net has gone belly up before I wake up tomorrow :-)
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2002-02-06 15:20:07

by Florian Weimer

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

"Jeramy B. Smith" <[email protected]> writes:

> Firstly, IANAFSN (I am not a Free Software Nazi) but there is this
> new GPL decentralized version control program called arch that is small
> and fits in well with the Unix way of using other small utils.

Aegis has been in existence for years (about a decade) and it closely
mirrors the Linux development process. ;-)

--
Florian Weimer [email protected]
University of Stuttgart http://CERT.Uni-Stuttgart.DE/people/fw/
RUS-CERT +49-711-685-5973/fax +49-711-685-5898

2002-02-06 15:22:07

by Florian Weimer

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Linus Torvalds <[email protected]> writes:

> The long-range plan, and the real payoff, comes if main developers start
> using bk too, which should make syncing a lot easier. That will take some
> time, I suspect.

Do you think that at some point, using BitKeeper will become mandatory
for subsystem maintainers? ("mandatory" in the sense that
non-BitKeeper input is dealt with in a less timely fashion, for
example.)

--
Florian Weimer [email protected]
University of Stuttgart http://CERT.Uni-Stuttgart.DE/people/fw/
RUS-CERT +49-711-685-5973/fax +49-711-685-5898

2002-02-06 15:32:37

by Rik van Riel

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, 6 Feb 2002, Florian Weimer wrote:
> Linus Torvalds <[email protected]> writes:
>
> > The long-range plan, and the real payoff, comes if main developers start
> > using bk too, which should make syncing a lot easier. That will take some
> > time, I suspect.
>
> Do you think that at some point, using BitKeeper will become mandatory
> for subsystem maintainers? ("mandatory" in the sense that
> non-BitKeeper input is dealt with in a less timely fashion, for
> example.)

They're pretty much equally easy to deal with, except that the
bitkeeper patches will always apply and will get better changelog
entries ;)

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-02-06 16:54:50

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, Feb 06, 2002 at 04:17:29PM +0100, Florian Weimer wrote:
> Linus Torvalds <[email protected]> writes:
>
> > The long-range plan, and the real payoff, comes if main developers start
> > using bk too, which should make syncing a lot easier. That will take some
> > time, I suspect.
>
> Do you think that at some point, using BitKeeper will become mandatory
> for subsystem maintainers? ("mandatory" in the sense that
> non-BitKeeper input is dealt with in a less timely fashion, for
> example.)

If BK makes things dramatically easier for Linus, then there may be a
natural tendency for him to look at BK patches first.

On the other hand, he's only been using it for a week and he isn't saying
it is the best thing since sliced bread. So it's a bit premature to
predict whether he will be using it in a month or not. We hope so,
and we'll keep working to make you happy with it, but Linus is a harsh
judge - if BK doesn't help out, he'll kick it out the door.

And finally, almost all of the back and forth over the
last week was about how to make BK better at accepting and generating
traditional patches. You will *always* be able to send BK traditional
patches, whether Linus uses BK or not. That was true before he used it
and his use of BK has done nothing but make it better. For example,
we're working out a plain text format for comments in the patch headers
so that you can comment individual changes on a per file basis in the
patch.

So the summary is that you'll always be able to do regular patches, even
if Linus continues to use BK. There may come a time where the value of
using BK - to you - is quite high. If not, we're back to diff&patch or
some other way.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2002-02-06 17:31:03

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hi,

On Tue, 5 Feb 2002, Linus Torvalds wrote:

> However, some of it pays off already. Basically, I'm aiming to be able to
> accept patches directly from email, with the comments in the email going
> into the revision control history.

Um, what's so special about it that a shell script couldn't do as well?

bye, Roman

2002-02-06 17:34:33

by Linus Torvalds

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, 6 Feb 2002, Roman Zippel wrote:
>
> On Tue, 5 Feb 2002, Linus Torvalds wrote:
>
> > However, some of it pays off already. Basically, I'm aiming to be able to
> > accept patches directly from email, with the comments in the email going
> > into the revision control history.
>
> Um, what's so special about it, what a shell script couldn't do as well?

About this particular change-set? Nothing. In fact, most of it is
generated from a shell script before it goes into the BK archive.

The advantage is mainly that (a) you can generate this changeset listing
yourself, and not limit it to the stuff I merged and (b) the developers I
work with can start sending me their bitkeeper merges _as_ bitkeeper
merges, and we start having the advantage of various tools to help
resolve conflicts.

Linus

2002-02-06 19:36:04

by Christoph Hellwig

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

In article <[email protected]> you wrote:
>> I second that. Maybe however we can have it both ways -- I have no
>> experience with bk, but can't this same info be made available elsewhere
>> like a public web interface or some such thing?
>
> I've put up read-only clones on
>
> http://linux.bkbits.net
>
> you can go there and get the changelogs in web form. I just figured out
> what a bad choice 8088 was for a port and we'll be moving stuff over to
> 8080 since that seems to go through more firewalls.

Btw, is there a generic way to move repos cloned from Ted's (now
orphaned?) 2.4 tree to Linus' official one?

Christoph

--
Whip me. Beat me. Make me maintain AIX.

2002-02-06 19:46:07

by Tom Rini

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, Feb 06, 2002 at 08:35:21PM +0100, Christoph Hellwig wrote:
> In article <[email protected]> you wrote:
> >> I second that. Maybe however we can have it both ways -- I have no
> >> experience with bk, but can't this same info be made available elsewhere
> >> like a public web interface or some such thing?
> >
> > I've put up read-only clones on
> >
> > http://linux.bkbits.net
> >
> > you can go there and get the changelogs in web form. I just figured out
> > what a bad choice 8088 was for a port and we'll be moving stuff over to
> > 8080 since that seems to go through more firewalls.
>
> Btw, is there a generic way to move repos cloned from Ted's (now
> orphaned?) 2.4 tree to Linus' official one?

Working under the assumption that Linus started his own tree first and
didn't grab Ted's, no.

--
Tom Rini (TR1265)
http://gate.crashing.org/~trini/

2002-02-06 19:59:18

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hi,

Linus Torvalds wrote:

> > > However, some of it pays off already. Basically, I'm aiming to be able to
> > > accept patches directly from email, with the comments in the email going
> > > into the revision control history.
> >
> > Um, what's so special about it, what a shell script couldn't do as well?
>
> About this particular change-set? Nothing. In fact, most of it is
> generated from a shell script before it goes into the BK archive.

Sorry, I meant the part about accepting patches directly from email.
Pine supports piping a mail to a script, this script could try to apply
the patch and extract the text in front of the patch, but it could of
course also recognize a bk patch and feed it to bk.
The important thing is to avoid two classes of patches, bk patches and
patches, which would create extra work for you. It would be no problem
to use tags, which can be easily extracted by the above script; just tell
us what they should look like.

bye, Roman

2002-02-06 20:35:46

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

> Btw, is there a generic way to move repos cloned from Ted's (now
> orphaned?) 2.4 tree to Linus' official one?

You can export the changes you want as a patch and if you ask, we'll send
you a script which also exports your checkin comments in Linus' nifty
new email->BK converter. The format that we like (I believe, Wayne/Linus
will correct errors) is:

### Change the comments to ChangeSet below
These are the changeset comments, i.e, the email message for the patch.

### Change the comments to include/asm/whatever.h below
The comments for include/asm/whatever.h

In other words

printf("### Change the comments to %s below\n", filename);

And then it can be imported directly.

To create an extra script to do this for a bk export -tpatch is straightforward.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
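
For illustration only, here is a minimal Python sketch of the header format proposed above. Only the "### Change the comments to %s below" line comes from the message; the function name format_patch_comments and the dictionary layout are hypothetical, and the format itself was still under discussion in this thread.

```python
def format_patch_comments(changeset_comment, file_comments):
    """Build the comment block that precedes the diffs in a patch email.

    changeset_comment: text for the ChangeSet as a whole.
    file_comments: mapping of file path -> comment for that file
                   (hypothetical layout, chosen for this sketch).
    """
    sections = [("ChangeSet", changeset_comment)]
    sections.extend(file_comments.items())

    lines = []
    for name, comment in sections:
        # Header line taken from the format proposed in the thread.
        lines.append("### Change the comments to %s below" % name)
        lines.append(comment.strip())
        lines.append("")  # blank line between sections
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_patch_comments(
        "These are the changeset comments.",
        {"include/asm/whatever.h": "The comments for include/asm/whatever.h"},
    ))
```

The output of such a helper would be placed in front of an ordinary diff, so the mail stays readable to people who do not use BK.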

2002-02-06 20:45:16

by Wayne Scott

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

From: Tom Rini <[email protected]>
> On Wed, Feb 06, 2002 at 08:35:21PM +0100, Christoph Hellwig wrote:
> > Btw, is there a generic way to move repos cloned from Ted's (now
> > orphaned?) 2.4 tree to Linus' official one?
>
> Working under the assuming that Linus started his own tree first and
> didn't grab Ted's, no.

Right. And yes, Linus' tree was started from scratch. He started with
2.4.0 and imported all prepatches. The 2.5 tree was created as a
clone of the 2.4 tree at the appropriate place.

So Chris is right csets in Ted's tree won't directly apply to Linus'
tree. Sorry.

I think 'bk export -tpatch' and 'bk import -tpatch' is called for.
You might find the new 'bk comments' command (new in 2.1.4) useful to
fixup the comments after 'bk import'.

If you have a large amount of state to transfer, let me know and
maybe we can rig up something better.

-Wayne

2002-02-06 22:18:38

by Rob Landley

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wednesday 06 February 2002 10:17 am, Florian Weimer wrote:
> Linus Torvalds <[email protected]> writes:
> > The long-range plan, and the real payoff, comes if main developers start
> > using bk too, which should make syncing a lot easier. That will take some
> > time, I suspect.
>
> Do you think that at some point, using BitKeeper will become mandatory
> for subsystem maintainers? ("mandatory" in the sense that
> non-BitKeeper input is dealt with in a less timely fashion, for
> example.)

The hierarchy seems to be two levels deep now. Linus doesn't accept patches
from all 300 maintainers anyway, he has a group of a dozen or so lieutenants.
(Andre Hedrick has to send code to Jens Axboe, for example, before Linus
will take it.)

Being a lieutenant would have to require bitkeeper before simply being a
maintainer would. I doubt it would ever work its way to simple developers
submitting to maintainers.

(This is, of course, just my take on things...)

Rob

2002-02-06 22:25:36

by Mike Fedyk

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

I'd like to add my little nit. This sounds better to me:

On Wed, Feb 06, 2002 at 12:35:27PM -0800, Larry McVoy wrote:
> ### Change the comments to ChangeSet below
> These are the changeset comments, i.e, the email message for the patch.
>

### Comments for change to ChangeSet below

> ### Change the comments to include/asm/whatever.h below
> The comments for include/asm/whatever.h
>
> In other words
>
> printf("### Change the comments to %s below\n", filename);

printf("### Comments for change to %s below\n", filename);

Mike

2002-02-06 22:50:24

by Pavel Machek

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hi!

> However, some of it pays off already. Basically, I'm aiming to be able to
> accept patches directly from email, with the comments in the email
> going

Hey, this looks very good! At this level of verbosity, it might be
nice to also list modified files, but this is really good.
Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-02-06 23:06:31

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, Feb 06, 2002 at 08:38:18PM +0100, Pavel Machek wrote:
> Hey, this looks very good! At this level of verbosity, it might be
> nice to also list modified files, but this is really good.

He generated that listing with "bk changes", you want "bk changes -v", which
I included below, but doesn't have any more useful info (yet) because the
comments are all auto generated from the email.

We will stick up a web page someplace that says "send Linus comments
like this if you want individual file comments to be different", and is
totally BK agnostic, i.e., you can send them with a regular diff -Nur
style patch and the import tools will do the right thing. Then the
verbose listing will start to be useful.

--lm

ChangeSet
1.237 02/02/06 10:57:18 [email protected] +1 -0
[PATCH] reiserfs fix for inodes with wrong item versions (2.5)

This is hopefully last bugfix for a bug introduced by struct inode splitting.
Because of setting i_flags to some value and then cleaning the i_flags
contents later, on-disk items received wrong item version ob v3.6 filesystems

fs/reiserfs/inode.c
1.34 02/02/06 10:57:17 [email protected] +7 -7
reiserfs fix for inodes with wrong item versions (2.5)
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2002-02-06 23:38:08

by Linus Torvalds

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, 6 Feb 2002, Roman Zippel wrote:
>
> Sorry, I meant the part about accepting patches directly from email.
> Pine supports piping a mail to a script, this script could try to apply
> the patch and extract the text in front of the patch, but it could of
> course also recognize a bk patch and feed it to bk.

That's not the problem I have.

The problem I have with piping patches directly to bk is that I don't like
to switch back-and-forth between reading email and applying (and fixing
up) patches. Even if the patch applies cleanly (which most of them tend to
do) I still usually need to do at least some minimal editing of the commit
message etc (removing stuff like "Hi Linus" etc).

So my scripts are all done to automate this, and to allow me to just save
the patches off for later, and then apply them in chunks when I'm ready to
switch over from email to tree update. So that's why I script the thing,
and want to apply patches from emails rather than by piping them.

Some of these issues don't exist with true BK patches, and I'm trying to
set up a separate chain to apply those directly (and not from the email at
all - the email would contain only a description and a BK repository
source). That will be very convenient for multiple patches, but at the
same time that will require more trust in the source, so I'll probably
keep the "patches as diffs in emails" for the occasional work, and the
direct BK link for the people I work closest with.

Linus

2002-02-06 23:54:38

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, Feb 06, 2002 at 03:36:01PM -0800, Linus Torvalds wrote:
> do) I still usually need to do at least some minimal editing of the commit
> message etc (removing stuff like "Hi Linus" etc).

And I think once we finalize the generic patch comment format, you will
be able to scan the email, see it looks good, and dump it to apply and
move on. Then people can send you mail like what is below and it's
painless. Aside from the coffee/tea issues.

Hi Linus,

How's the wife and kids, mine are fine, here's a patch that makes coffee
from /dev/coffee bits, see the changelog. Next week we will send you
the patch which removes /dev/emacs and replaces it with /dev/vi, the
one true editor. I trust you will have no issues with these patches.

Thank you,

Joe Hacker.

### Comments for ChangeSet
This is the coffee patch. It is the one true coffee patch and it should
put an end to the coffee versus tea debate. There is no /dev/tea, there
is only a /dev/coffee, in spite of our best efforts to implement /dev/tea,
we could not fix the problem of multiple Oopses on SMP machines when we
did a "cat /dev/tea > /dev/cup". "cat /dev/coffee > /dev/cup" always
works, so we think this is proof positive that coffee is better than tea.

### Comments for drivers/char/coffee.c
I like coffee, I don't like tea,
I'm as happy as a little wired bee.

### Comments for drivers/char/tea.c
Didn't work. It's a sign from above.

<diffs>

:-)

Yup, a little punchy back here at BitMover, but trying to maintain a sense
of humor.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
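
As a rough sketch of how an importer might read a mail like the joke one above: the Python below drops the greeting, collects each "### Comments for ..." section, and treats everything from the first diff header onward as the patch. The function name split_patch_mail, the rule that the diff starts at a line beginning with "--- " or "diff ", and the returned structure are all assumptions made for this example; the thread does not describe how the real import tools work, and the comment format had not been finalized.

```python
import re

# Section header as used in the example mail above.
HEADER = re.compile(r"^### Comments for (.+)$")


def split_patch_mail(body):
    """Split a mail body into (comments, diff_text).

    comments maps "ChangeSet" or a file path to its comment text.
    Anything before the first "### Comments for" header (greetings and
    so on) is dropped. The diff is assumed to start at the first line
    beginning with "--- " or "diff ".
    """
    comments = {}
    current = None
    diff_lines = []
    in_diff = False

    for line in body.splitlines():
        if in_diff:
            diff_lines.append(line)
            continue
        if line.startswith(("--- ", "diff ")):
            in_diff = True
            diff_lines.append(line)
            continue
        match = HEADER.match(line)
        if match:
            current = match.group(1).strip()
            comments[current] = []
        elif current is not None:
            comments[current].append(line)
        # Lines before the first header are chatter and are ignored.

    cleaned = {name: "\n".join(text).strip() for name, text in comments.items()}
    return cleaned, "\n".join(diff_lines)
```

One known weakness of this sketch: a comment line that itself begins with "--- " would be mistaken for the start of the diff.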

2002-02-07 08:07:44

by Stelian Pop

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, Feb 06, 2002 at 03:36:01PM -0800, Linus Torvalds wrote:

[...Talking about BitKeeper adoption...]
> That will be very convenient for multiple patches, but at the
> same time that will require more trust in the source, so I'll probably
> keep the "patches as diffs in emails" for the occasional work, and the
> direct BK link for the people I work closest with.

What about people who send you occasional patches, and happen to
be using Bitkeeper too ?

Do you still prefer only regular patches from them or would you
accept something generated with a:
bk send -d [email protected]
(which prepends the bk changeset with the equivalent in unified diff
format, so you can have the best of both worlds) ?

In the latter case Documentation/SubmittingPatches should be updated
with the proper BitKeeper syntax to use etc.

Stelian.
--
Stelian Pop <[email protected]>
|---------------- Free Software Engineer -----------------|
| Alcôve - http://www.alcove.com - Tel: +33 1 49 22 68 00 |
|------------- Alcôve, liberating software ---------------|

2002-02-07 10:51:41

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hi,

On Wed, 6 Feb 2002, Linus Torvalds wrote:

> The problem I have with piping patches directly to bk is that I don't like
> to switch back-and-forth between reading email and applying (and fixing
> up) patches. Even if the patch applies cleanly (which most of them tend to
> do) I still usually need to do at least some minimal editing of the commit
> message etc (removing stuff like "Hi Linus" etc).

I don't know how much your scripts already do, so below is just a
suggestion how to do some of the preprocessing of patches already during
email reading (the bk magic has to be added by someone else).

bye, Roman

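#! /bin/bash
# Read a mail on stdin, save the text in front of the diff as the log
# message and the whole mail as the patch, then open both for review.

rm -f /tmp/test-patch /tmp/test-log

IFS=""    # keep leading whitespace in each line
log=y     # y while we are still in front of the diff
while read -r line; do
    case "$line" in
    ---\ *)
        # first "--- file" header: the diff starts here
        log=n
        ;;
    esac
    test $log = y && echo "$line" >> /tmp/test-log
    echo "$line" >> /tmp/test-patch
done

(
    # stdin is the mail, so talk to the terminal directly
    oldtty=`stty -g`
    reset -Q
    vim -o /tmp/test-log /tmp/test-patch
    echo -n "ok?"
    read
    # do more
    stty $oldtty
) < /dev/tty >& /dev/tty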
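
For readers who would rather follow the splitting step outside the shell, the same idea can be sketched in Python. This is only an illustration of the loop in the script above, not part of anyone's actual tooling; the function name split_log_and_patch is made up for the example.

```python
def split_log_and_patch(mail_text):
    """Return (log, patch) the way the shell loop above does.

    log:   every line before the first line that starts with "--- ".
    patch: the complete input, unchanged; the leading text stays in
           front of the diff, as it does in /tmp/test-patch.
    """
    log_lines = []
    for line in mail_text.splitlines():
        if line.startswith("--- "):
            break  # the diff starts here, stop collecting log text
        log_lines.append(line)
    return "\n".join(log_lines), mail_text
```

Like the shell version, it keeps greetings and signatures in the log text, which is why the script opens an editor afterwards.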

2002-02-07 16:37:26

by Linus Torvalds

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Thu, 7 Feb 2002, Stelian Pop wrote:
>
> What about people who send you occasionnal patches, and happen to
> be using Bitkeeper too ?

For those people, "bk send -d [email protected]" is fine. It ends up
being close enough to a regular patch, and I'm hoping that Larry will
change the syntax slightly so that it won't be so ugly.

Linus

2002-02-07 16:51:06

by Jan Harkes

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Tue, Feb 05, 2002 at 11:33:34PM -0800, Jeramy B. Smith wrote:
> Firstly, IANAFSN (I am not a Free Software Nazi) but there is this
> new GPL decentralized version control program called arch that is small
> and fits in well with the Unix way of using other small utils.

Tom, I cc'd you on this,

Yes it's very interesting and has several good ideas behind it, but it's
not ready yet.

$ du /opt/arch-1.0pre3
12100 /opt/arch-1.0pre3/bin
788 /opt/arch-1.0pre3/include
2588 /opt/arch-1.0pre3/lib
1640 /opt/arch-1.0pre3/libexec
17120 /opt/arch-1.0pre3

Hmm, I wouldn't call over 17MB small. It also has several name conflicts
with existing binaries, amongst others /bin/arch and /bin/readlink. This
breaks a lot when arch's binary directory is not first in the PATH
environment variable.

Then it has these {arch} names all over the place, about as bad as CVS
and SCCS, but it breaks tab-completion with GNU bash/readline too,
wouldn't .arch (or .{arch}) be a less invasive naming scheme? Its
changesets are definitely not close to being 'patch' compatible.

> When you get weird Bk errors because Larry changes the Open Logging stuff
> for the umpteenth time which forces you to upgrade to keep using Bk,
> just remember we told you so.

Have you tried to work your way through the arch sources yet? Just
trying to figure out where 'sb' is compiled, what it does and where it
is used took me a very long time.

Most of arch's helper libraries/programs (hackerlab/xxx-utils) already
have in my opinion perfectly reasonable existing solutions, i.e. there
is something called the POSIX standard, ftp/http file access by using
wget/curl/ncftpget. And why it needs to have its own ftp-server built
in (which is what it looks like), I have no clue about that one.

Jan

2002-02-07 17:27:04

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Thu, Feb 07, 2002 at 08:36:20AM -0800, Linus Torvalds wrote:
> > What about people who send you occasionnal patches, and happen to
> > be using Bitkeeper too ?
>
> For those people, "bk send -d [email protected]" is fine. It ends up

No! This will send the entire repository. Do a "bk help send", you probably
want "bk send -d -r+ [email protected]" to send the most recent cset.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2002-02-07 19:47:03

by Stelian Pop

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Thu, Feb 07, 2002 at 09:26:40AM -0800, Larry McVoy wrote:

> On Thu, Feb 07, 2002 at 08:36:20AM -0800, Linus Torvalds wrote:
> >
> > For those people, "bk send -d [email protected]" is fine. It ends up
>
> No! This will send the entire repository.

It will probably happen... :-)

> Do a "bk help send", you probably
> want "bk send -d -r+ [email protected]" to send the most recent cset.

I'd really like 'bk send' to drop me into a shell or mailer and
ask for confirmation before sending the mail (and possibly add,
for example, a cc: line to l-k).

What I found easier to use is to 'bk send - > /tmp/foo' then send
foo using my regular mailer... But I lose the advantage of
checking the last sent ChangeSet (in Bitkeeper/etc/[email protected]).

Stelian.
--
Stelian Pop <[email protected]>
|---------------- Free Software Engineer -----------------|
| Alcôve - http://www.alcove.com - Tel: +33 1 49 22 68 00 |
|------------- Alcôve, liberating software ---------------|

2002-02-07 20:00:48

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

From: Jan Harkes <[email protected]>

> On Tue, Feb 05, 2002 at 11:33:34PM -0800, Jeramy B. Smith wrote:
> > Firstly, IANAFSN (I am not a Free Software Nazi) but there is this
> > new GPL decentralized version control program called arch that is small
> > and fits in well with the Unix way of using other small utils.

> Tom, I cc'd you on this,
>
> Yes it's very interesting and has several good ideas behind it, but it's
> not ready yet.

Arch will certainly need to be tuned for kernel hackers and perhaps
customized. Some systems are still affected by portability bugs (arch
has been out for barely a month). Yes: because it is new, it isn't as
"off the shelf" a solution as bk. Nevertheless, arch is self hosting,
rich in features, and has properties that I think are ideal for
projects such as the kernel. So if there is to be a shift among
kernel developers to coordinating with a source code management tool,
one question is whether the effort should be directed toward deploying
bk, or towards helping to optimize arch for their use.

> $ du /opt/arch-1.0pre3
> 12100 /opt/arch-1.0pre3/bin
> 788 /opt/arch-1.0pre3/include
> 2588 /opt/arch-1.0pre3/lib
> 1640 /opt/arch-1.0pre3/libexec
> 17120 /opt/arch-1.0pre3
>
> Hmm, I wouldn't call over 17MB small.

Most of that size is for generic utilities and a generic library that
I package up with arch distributions since they aren't already widely
installed. The revision control system itself is, as reported, about
40K lines of code. The significance of that distinction is that arch
itself is small enough to grok, maintain, extend, etc.

> It also has several name conflicts with existing binaries, amongst
> others /bin/arch and /bin/readlink. This breaks a lot when arch's
> binary directory is not first in the PATH environment variable.

There are instructions in the distribution for installing arch in a
way that avoids those conflicts (docs/examples/README.000.first-steps).

> Then it has these {arch} names all over the place, about as bad as CVS
> and SCCS, but it breaks tab-completion with GNU bash/readline too,
> wouldn't .arch (or .{arch}) be a less invasive naming scheme?

It has {arch} files in the root of each project tree -- not in every
directory. The arch distribution contains eight project trees (there
are eight separately developed components in the distribution).
Should there be an alternative name for that file? Perhaps.

The name has curly braces so that it sorts reasonably (i.e. away from
ordinary source files). It is not a dot file so that you can
recognize at a glance when you are looking at the root of a project
tree.

> Its changesets are definitely not close to being 'patch' compatible.

That's not quite true. arch patch sets consist of unified diffs plus
additional material to cleanly describe file and directory renames,
added and removed files, changes to binary files, changes to symbolic
links, and changes to file permissions. The arch command `dopatch'
applies the context diffs using `patch'. arch contains reporting
tools that generate either plain text or HTML reports to help simplify
reviewing patch sets.

When you do need simple diff-format patch sets, the arch feature
called "revision libraries" makes it very easy to quickly create them
between arbitrary revisions.

> Have you tried to work your way through the arch sources yet? Just
> trying to figure out where 'sb' is compiled, what it does and where it
> is used took me a very long time.

You must have forgotten to try:

find . -name "sb.c"

The "=README" file in the parent directory of the `sb' source code
explains what the program does. There are also installation auditing
files generated in the build directory, but I admit that as an obscure
way to find `sb'.

> Most of arch's helper libraries/programs (hackerlab/xxx-utils) already
> have in my opinion perfectly reasonable existing solutions, i.e. there
> is something called the POSIX standard, ftp/http file access by using
> wget/curl/ncftpget. And why it needs to have its own ftp-server built
> in (which is what it looks like), I have no clue about that one.

arch does not have its own ftp-server. It does have an ftp client.

arch doesn't use wget/curl/ncftpget because it needed a simpler
client with different performance characteristics and a different
interface for programming. I wanted very much to save work by using
those tools, but they weren't suitable.

Should the shell utilities distributed with arch use only POSIX libc
and avoid libhackerlab? Arguably so, and I considered doing it that
way while writing them. However, libhackerlab has a bunch of nice
features that made it easier to write those utilities and get them
working quickly.

-t

2002-02-07 21:19:16

by Daniel Phillips

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On February 8, 2002 12:06 am, Tom Lord wrote:
> The name has curly braces so that it sorts reasonably (i.e. away from
> ordinary source files). It is not a dot file so that you can
> recognize at a glance when you are looking at the root of a project
> tree.

I'd pass on that to get rid of the extra 'noise'. If for some reason I'm not
sure - and I consider that unlikely to ever occur - I can always ls .* or
find -name ".*arch*" if I'm really confused.

I like my directories to look clean, *especially* the source root.

--
Daniel

2002-02-07 21:23:58

by Paul Komkoff

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Replying to Tom Lord:
> Arch will certainly need to be tuned for kernel hackers and perhaps
> customized. Some systems are still effected by portability bugs (arch
> has been out for barely a month) Yes: because it is new, it isn't as
> "off the shelf" a solution as bk. Nevertheless, arch is self hosting,
> rich in features, and has properties that I think are ideal for
> projects such as the kernel. So if there is to be a shift among
> kernel developers to coordinating with a source code management tool,
> one question is whether the effort should be directed toward deploying
> bk, or towards helping to optimize arch for their use.

Today I played some interesting games with source code control systems, since
I haven't got a sexual partner to play with. But I'm not satisfied with
any of them :(((

First, cvs. Marcelo uses cvs :))) I use cvs in some places ... just to use
it. It lacks renaming, it lacks ... many features :(

BitKeeper ... maybe Larry should consider increasing the single user repository
limit from 1000 files to at least 12000? The kernel would fit well in it - I
don't want my experiments to be published on the site. It would confuse people
browsing it - believe me :)

Also bitkeeper ... I don't use X, and without the damn renametool I cannot do
proper renaming :(((

aegis cruiser is a good unit in Red Alert 2, and I fear aegis. Installed it,
and then just removed it ...

... and finally, arch. translate it deutch-to-english, please

What I need definitely - to have several branches in scs - one for
marcelo's, one for ac, one for mjc and several for my experiments

say, marcelo release new patch -pre2 against base-0, but I already have
base-0 and patch-1 (pre1), WHAT exactly I must do to add patch-2 to marcelo
branch ?

when I derive my branch from for example patch-1 what I must do to attempt
to update my branch and (prompt to resolve merge conflicts) and then
continue working on my patch ?

And finally - I got the patch against some revision. Maybe I know its order
(this is -pre8 it's after -pre7 but patch is against base), maybe not, maybe
unknown at all. Where is the handle to pull ?

--
Paul P 'Stingray' Komkoff 'Greatest' Jr // (icq)23200764 // (irc)Spacebar
PPKJ1-RIPE // (smtp)[email protected] // (http)stingr.net // (pgp)0xA4B4ECA4

2002-02-07 21:26:17

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

> Arch will certainly need to be tuned for kernel hackers and perhaps
> customized.

{Note: I'm not going to be drawn into a BK is better or worse than arch
discussion. It's not fair to Tom, and it would really need to be a BK now
vs arch in 5 years discussion to be remotely apples to apples. So here
are my thoughts and then I'll leave this to the rest of you to discuss.}

An interesting experiment would be to take every kernel revision,
including all the pre-patches, and import it into arch and report the
resulting size of the repository and the time to generate each version
of the tree from that repository. I suspect that this will demonstrate
the most serious issue that I have with the arch design.

In essence arch isn't that different from RCS in that arch is
fundamentally a diff&patch based system with all of the limitations that
implies. There are very good reasons that BK, ClearCase, Aide De Camp,
and other high end systems, are *not* diff&patch based revision histories,
they are weave based. Doing a weave based system is substantially
harder but has some nice attributes. Simple things like extracting any
version of the file takes constant time, regardless of which version.
Annotated listings work correctly in the face of multiple branches and
multiple merges. The revision history files are typically smaller,
and are much smaller than arch's context diff based patches.

But most importantly, BK at least, has great merge tools. At the end of
the day, what most people spend their time on is merging. Everything else
is just accounting and how the system does that is interesting to the
designers and no one else. What users care about is how much time they
spend merging. It's technically impossible to get arch or CVS or RCS or
any diff&patch based system to give you the same level of merge support.

On the other hand, I like parts of arch. I have to like the distributed
repository nature of it, that's a clear reimplementation of BitKeeper
and Teamware. I've been waiting for someone to do that for 10 years.
If arch were weave based, had good merge tools, was started 6 years ago,
and had a commercial company backing it, BitKeeper probably wouldn't
exist, we'd be using arch and working on Linux clusters.

Looking forward, I wonder about money issues, as politically incorrect
as that may be. We spent millions developing BitKeeper with no end in sight.
Tom has done a lot of work, but he has to eat as well. I doubt very
much that arch will ever generate enough revenue to pay for its ongoing
development. Companies simply won't pay for a product in this space
if they can get it for free. And the problem with that is there are an
endless number of corner cases which need to be handled, aren't fun, and
aren't going to happen for free. That means arch has a natural growth
path, it will evolve to a certain point much like CVS has, and then stop.
It will be a useful point, but it won't be remotely close to covering
the same problem space that ClearCase or BK or any other professional
SCM system does.

Before you yell at me, remember that source management is not the same
as the kernel. Everyone has to use the kernel, the computer doesn't work
without it. Take that set of people and remove everyone who doesn't use
source management. Out of a potential space of billions of people, you
are left with a market of about .5 - 2 million, world wide. And there
are 300 SCM systems fighting over that space. The top 3 have 50% of the
space. So 297 systems fighting over maybe a million users. That's 3300
users per system. OK, so if each of those people were willing to pay
$500 for arch support/whatever, that's a total of 1.6 million dollars.
Which isn't remotely close enough to get the job done. And don't forget
that those people have to volunteer that money, it can't be pried out
of them.
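
The back-of-the-envelope figures above can be reproduced with a short calculation. This is only a sketch of the arithmetic as stated in the message; the function name and parameters are illustrative, and the inputs (a 2 million user market, 300 systems, the top 3 holding half, $500 per user) are the message's own estimates, not measured data.

```python
def funding_estimate(total_users, top_share, n_systems, n_top, fee_per_user):
    """Return (users per remaining system, revenue per system if every user pays)."""
    remaining_users = total_users * (1 - top_share)
    users_per_system = remaining_users / (n_systems - n_top)
    return users_per_system, users_per_system * fee_per_user


# Upper end of the quoted market size: 2 million users, top 3 of 300 hold 50%.
users, revenue = funding_estimate(2_000_000, 0.5, 300, 3, 500)
print(f"{users:.0f} users per system, ${revenue:,.0f} if each pays $500")
```

With these inputs the result lands close to the rounded figures in the message (about 3300 users and about 1.6 million dollars).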

It's just math. Projects that aren't universally used have a much harder
time getting funding than projects that everyone uses. It doesn't matter
what the value is to you, it matters what the costs are to the developers.
This is why microsoft is so rich, they build apps that the masses use.
It's also why clearcase+clearquest is $8000/seat. It's not because
that's what the value is, it's because that's what the cost has to be
or Rational starts to die.

So before you start talking about support contracts, and grants, and
whatever, realize that the pool of people interested in paying those
dollars is very small. Are _you_ going to send Tom $500?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2002-02-07 21:29:07

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Fri, Feb 08, 2002 at 12:23:24AM +0300, Paul P Komkoff Jr wrote:
> Also bitkeeper ... I don't use X. and without damn renametool I cannot do
> proper renaming :(((

You can. We just have to tell you how.

cd `bk bin`
vi import # this is a shell script
Look for a call to "bk patch".
You will see some funny arguments, those are there to generate a list
of "creates" and "deletes" which are passed to renametool. Instead
of calling renametool, write your own way of doing text based rename
matchup and stick it in there.
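
A minimal sketch of what a text-based rename matchup could look like, assuming the "creates" and "deletes" lists have already been read into dictionaries of path to file content. The function name, the 0.6 similarity threshold, and the greedy pairing are illustrative assumptions; none of this is taken from BitKeeper's import script or renametool.

```python
from difflib import SequenceMatcher


def match_renames(deleted, created, threshold=0.6):
    """Pair deleted files with created files whose contents look alike.

    deleted, created: dict mapping path -> text content.
    Returns a list of (old_path, new_path) pairs.
    """
    candidates = []
    for old_path, old_text in deleted.items():
        for new_path, new_text in created.items():
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= threshold:
                candidates.append((score, old_path, new_path))

    # Greedy matching: take the most similar pairs first, each file once.
    candidates.sort(reverse=True)
    renames, used_old, used_new = [], set(), set()
    for _score, old_path, new_path in candidates:
        if old_path in used_old or new_path in used_new:
            continue
        renames.append((old_path, new_path))
        used_old.add(old_path)
        used_new.add(new_path)
    return renames


deleted = {"drivers/foo.c": "int foo(void)\n{\n\treturn 1;\n}\n"}
created = {
    "drivers/char/foo.c": "int foo(void)\n{\n\treturn 2;\n}\n",
    "README": "hello\n",
}
print(match_renames(deleted, created))
```

Anything left unmatched would stay a plain create or delete. Comparing every pair is quadratic, so a real tool would likely narrow candidates first, for example by basename.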

Next problem, please :-)
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2002-02-07 21:56:07

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Noted. Yours would seem to be the more popular opinion. I don't feel
strongly against changing it.

-t

On February 8, 2002 12:06 am, Tom Lord wrote:
> The name has curly braces so that it sorts reasonably (i.e. away from
> ordinary source files). It is not a dot file so that you can
> recognize at a glance when you are looking at the root of a project
> tree.

I'd pass on that to get rid of the extra 'noise'. If for some reason I'm not
sure - and I consider that unlikely to ever occur - I can always ls .* or
find -name ".*arch*" if I'm really confused.

I like my directories to look clean, *especially* the source root.

--
Daniel

2002-02-07 23:25:05

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

[email protected] writes:

> An interesting experiment would be to take every kernel revision,
> including all the pre-patches, and import it into arch and report the
> resulting size of the repository and the time to generate each version
> of the tree from that repository.

I agree. This would also be an opportunity for tuning and debugging
and a giant leap towards deployment.

> I suspect that this will demonstrate the most serious issue
> that I have with the arch design.

From what you go on to describe, I think your perceptions of the arch
design are slightly, but critically out-of-date.

> In essence arch isn't that different from RCS in that arch is
> fundamentally a diff&patch based system with all of the
> limitations that implies. There are very good reasons that BK,
> ClearCase, Aide De Camp, and other high end systems, are *not*
> diff&patch based revision histories, they are weave based.

Arch is in two layers: The lower layer is a repository management
layer that is, as you say, diff&patch based. The upper layer is a
work-space management layer that is based on keeping a library of
complete source trees for many revisions, with unmodified files shared
between those trees via hard links.
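
A rough sketch of the hard-link idea described above: each revision gets its own complete tree in a library directory, and any file identical to the previous revision is hard-linked rather than copied. The function name and directory layout are hypothetical, not arch's actual commands or on-disk format.

```python
import filecmp
import os
import shutil


def add_revision(library, prev_name, new_name, working_tree):
    """Store working_tree as library/new_name, sharing unchanged files
    with library/prev_name through hard links."""
    prev_root = os.path.join(library, prev_name) if prev_name else None
    new_root = os.path.join(library, new_name)
    for dirpath, _dirs, files in os.walk(working_tree):
        rel = os.path.relpath(dirpath, working_tree)
        target_dir = os.path.normpath(os.path.join(new_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            old = os.path.join(prev_root, rel, name) if prev_root else None
            if old and os.path.isfile(old) and filecmp.cmp(old, src, shallow=False):
                os.link(old, dst)       # unchanged: share the same inode
            else:
                shutil.copy2(src, dst)  # new or modified: store a real copy
    return new_root
```

Because unchanged files share an inode, trees in such a library have to be treated as read-only: editing a file in place would silently change every revision that links to it.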

The lower layer provides very compact archival of revisions,
repository transactions, the global name-space of revisions, and
distributed repositories. The upper layer provides convenience and
speed. The second layer is by no means the most space-efficient
approach: I'm sure that arch will lose if compared by that metric.
However, it is extremely convenient and well within the capacity of
cheap, modern storage.

Weave-based systems are a single layer approach with intermediate
characteristics. They make a different set of space/time trade-offs
-- one that, as I see it, comes from a time (not very long ago) when
storage was much more expensive. A weave-based system can provide
most of the speed of arch's second layer, but unless it is presented
as a file system, it lacks the convenience of being able to run
ordinary tools like `find', `grep', and `diff' on your past revisions.
With arch, you can use all of those standard tools and you can get a
copy of a past revision just as fast as your system can recursively
copy a tree.

> But most importantly, BK at least, has great merge tools. At
> the end of the day, what most people spend their time on is
> merging. Everything else is just accounting and how the
> system does that is interesting to the designers and no one
> else. What users care about is how much time they spend
> merging. It's technically impossible to get arch or CVS or
> RCS or any diff&patch based system to give you the same level
> of merge support.

I think this is just wrong. Aside from the fancy merge operators
built-in to arch, arch's second layer makes the choice of merging
technologies largely orthogonal to the revision control system.

> Are _you_ going to send Tom $500?

If only it were that easy. It isn't, is it? :-)

-t

2002-02-08 00:30:49

by Andreas Dilger

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Feb 07, 2002 20:46 +0100, Stelian Pop wrote:
> On Thu, Feb 07, 2002 at 09:26:40AM -0800, Larry McVoy wrote:
> > Do a "bk help send", you probably
> > want "bk send -d -r+ [email protected]" to send the most recent cset.
>
> I'd really like 'bk send' to drop me to a shell/mailer like and
> ask for confirmation before sending the mail (and eventually add
> for example the cc: line to l-k).

I'd agree. I previously had a dialog with one of the BK developers
about making the email nicer (better subject, etc), but having it
dump the output to a file and fire up $EDITOR is probably a lot
better. Maybe with a "bk send -e ..." or so.
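
The "dump to a file and fire up $EDITOR" flow could be sketched roughly as follows. This is not a BitKeeper feature; the function name is hypothetical, and the payload is assumed to be whatever text would otherwise have been mailed directly.

```python
import os
import shlex
import subprocess
import tempfile


def review_before_send(payload, editor_cmd=None):
    """Write payload to a temp file, open it in an editor, return the edited text."""
    if editor_cmd is None:
        # $EDITOR may carry arguments (e.g. "emacs -nw"), so split it.
        editor_cmd = shlex.split(os.environ.get("EDITOR", "vi"))
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(payload)
        subprocess.run(editor_cmd + [path], check=True)
        with open(path) as handle:
            return handle.read()
    finally:
        os.unlink(path)
```

The caller would then hand the returned text to its mailer, which leaves room to add a patch description or a Cc: line before anything is sent.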

> What I found easier to use is to 'bk send - > /tmp/foo' then send
> foo using my regular mailer... But I lose the advantages of
> checking the last sent ChangeSet (in Bitkeeper/etc/[email protected]).

Yes, I ended up doing the same "bk send - > cset.X.bk" instead of
sending directly, because I wanted to add in a patch description or
other text to the beginning of the email.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2002-02-08 05:30:56

by Troy Benjegerdes

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Thu, Feb 07, 2002 at 09:26:40AM -0800, Larry McVoy wrote:
> On Thu, Feb 07, 2002 at 08:36:20AM -0800, Linus Torvalds wrote:
> > > What about people who send you occasionnal patches, and happen to
> > > be using Bitkeeper too ?
> >
> > For those people, "bk send -d [email protected]" is fine. It ends up
>
> No! This will send the entire repository. Do a "bk help send", you probably
> want "bk send -d -r+ [email protected]" to send the most recent cset.

Larry, I think this means you should change the default behavior of 'bk
send' to ask the user what the hell they are smoking if the patch to
be sent is larger than say, oh 200k.
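
The kind of guard being asked for is small. A sketch, assuming the size of the outgoing patch is already known; the function name and the prompt text are made up for illustration, and the 200k default is the threshold suggested above.

```python
def confirm_large_send(patch_size, ask, limit=200 * 1024):
    """Return True if the patch may be sent.

    patch_size: size of the outgoing patch in bytes.
    ask: callable that shows a prompt and returns the user's reply
         (for example the built-in input).
    """
    if patch_size <= limit:
        return True
    reply = ask(f"Patch is {patch_size} bytes (limit {limit}). Send anyway? [y/N] ")
    return reply.strip().lower() == "y"
```

In an interactive tool this would be called as `confirm_large_send(size, input)` just before mailing.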

I got clobbered by this a couple of times trying to get someone to 'bk
send' me a patch.

I got burned enough times to just decide never to use it again.

Either remove 'bk send' or at least warn us before we shoot off a foot.

Ideally, this should ask what changesets you want to send, and what
public tree to look at to see *what* makes sense to send.

--
Troy Benjegerdes | master of mispeeling | 'da hozer' | [email protected]
-----"If this message isn't misspelled, I didn't write it" -- Me -----
"Why do musicians compose symphonies and poets write poems? They do it
because life wouldn't have any meaning for them if they didn't. That's
why I draw cartoons. It's my life." -- Charles Schulz

2002-02-08 06:06:39

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

> Ideally, this should ask what changesets you want to send, and what
> public tree to look at to see *what* makes sense to send.

In BK 2.1.4 we added a

bk send -u<url> email

which does the sync with the URL and sends only what you have that the
URL doesn't have. But you have to be running 2.1.4 on both ends.
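
Conceptually, "send only what you have that the URL doesn't have" is an ordered set difference over changeset identifiers. The sketch below assumes each changeset has a unique key and that both lists are already available; how BK actually performs the sync is not described in this thread, and the keys shown are invented.

```python
def changesets_to_send(local, remote):
    """Return local changeset keys missing from remote, in local order."""
    remote_keys = set(remote)
    return [key for key in local if key not in remote_keys]


local = ["cset-1", "cset-2", "cset-3", "cset-4"]
remote = ["cset-1", "cset-2"]
print(changesets_to_send(local, remote))
```

If the result is empty there is nothing to mail, and if it equals the whole local history the user is probably about to send the entire repository.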
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2002-02-08 06:14:50

by Troy Benjegerdes

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Thu, Feb 07, 2002 at 10:06:19PM -0800, Larry McVoy wrote:
> > Ideally, this should ask what changesets you want to send, and what
> > public tree to look at to see *what* makes sense to send.
>
> In BK 2.1.4 we added a
>
> bk send -u<url> email
>
> which does the sync with the URL and sends only what you have that the
> URL doesn't have. But you have to be running 2.1.4 on both ends.

Perfect.

Does 2.1.4 have a "is the user on crack and trying to send the whole
tree" check?

--
Troy Benjegerdes | master of mispeeling | 'da hozer' | [email protected]
-----"If this message isn't misspelled, I didn't write it" -- Me -----
"Why do musicians compose symphonies and poets write poems? They do it
because life wouldn't have any meaning for them if they didn't. That's
why I draw cartoons. It's my life." -- Charles Schulz

2002-02-08 06:51:33

by Andreas Dilger

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Feb 07, 2002 22:06 -0800, Larry McVoy wrote:
> > Ideally, this should ask what changesets you want to send, and what
> > public tree to look at to see *what* makes sense to send.
>
> In BK 2.1.4 we added a
>
> bk send -u<url> email
>
> which does the sync with the URL and sends only what you have that the
> URL doesn't have. But you have to be running 2.1.4 on both ends.

In one way, it doesn't make sense to "bk send" a CSET that is already
in the parent repository, so by default <url> should probably be the
parent. The "proper" mode of operation would be to "bk pull" on the
other end if they want to get a copy of the whole repository, I think.

If you can't contact the repository to check, "bk send" would only send
a subset of CSETs unless told otherwise. Maybe at most all CSETs generated
locally which do not have CSETs from the parent repository following them,
or maybe non-local CSETs following them.

Unfortunately, I don't know how hard it is to determine "CSETs from the
parent repository". It is also hard to guess what to do when you _are_
the parent repository.

In general I don't think you ever want to send a whole repository by
email, and this is probably a user error.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2002-02-08 21:14:58

by Pavel Machek

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hi!

> Before you yell at me, remember that source management is not the same
> as the kernel. Everyone has to use the kernel, the computer doesn't work
> without it. Take that set of people and remove everyone who doesn't use
> source management. Out of a potential space of billions of people, you
> are left with a market of about .5 - 2 million, world wide. And there
> are 300 SCM systems fighting over that space. The top 3 have 50% of
> the

I know less than 10 such systems, not 300. And now arch is one of
them.

> space. So 297 systems fighting over maybe a million users. That's 3300
> users per system. OK, so if each of those people were willing to pay
> $500 for arch support/whatever, that's a total of 1.6 million dollars.
> Which isn't remotely close enough to get the job done. And don't forget
> that those people have to volunteer that money, it can't be pried out
> of them.
>
> It's just math. Projects that aren't universally used have a much harder
> time getting funding than projects that everyone uses. It doesn't matter
> what the value is to you, it matters what the costs are to the developers.
> This is why microsoft is so rich, they build apps that the masses use.
> It's also why clearcase+clearquest is $8000/seat. It's not because
> that's what the value is, it's because that's what the cost has to be
> or Rational starts to die.
>
> So before you start talking about support contracts, and grants, and
> whatever, realize that the pool of people interested in paying those
> dollars is very small. Are _you_ going to send Tom $500?

What was the point of this mail?

Are you concerned that arch is free software and bk is not?

arch _has_ chance; Tom is probably not going to make million dollars,
but if someone has a choice between using non-free software and arch,
maybe he will just spend his time improving arch.

Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-02-08 21:35:46

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Fri, Feb 08, 2002 at 04:33:08PM +0100, Pavel Machek wrote:
> I know less than 10 such systems, not 300. And now arch is one of
> them.

http://www.cmtoday.com/yp/ is a starting point to get a baseline for
all of them. It's by no means a complete list, but it's a lot bigger
than 10.

> What was the point of this mail?
>
> Are you concerned that arch is free software and bk is not?
>
> arch _has_ chance; Tom is probably not going to make million dollars,
> but if someone has a choice between using non-free software and arch,
> maybe he will just spend his time improving arch.

The point is that you don't make systems like this by having people "spend
their time improving arch" for free. If that worked, then CVS would
be the ultimate SCM system. CVS has been here for close to 2 decades I
think, right? Wasn't the first release 1986? OK, 15 years. So 15 years
ago, CVS was where arch is today in terms of maturity. There is no reason
that you couldn't take CVS and make it work. It's just work to do so.
We've had 15 years to have that happen and it didn't. What makes you
think it will be different this time around?

The problem with this space, and it is constantly annoying to me, is
that people think it is easy. "I can just write some scripts and wrap
them around RCS or SCCS or diff&patch, I can do a better job than those
losers over at Rational (or Perforce or AccuRev or BitMover or Collab.net
or ...)".

And guess what? In very short order, you can have something that
does something. Even something useful. Then it starts to get hard.
Then it starts to get harder. If you really want to solve the problem,
it's really really hard, harder than, say, multithreading a kernel.
I can hear you screaming BS, no way is it harder than what I do, which
is exactly what I find annoying - I've done what you do, you haven't
done what I do, so why is it that you know that I'm wrong? Don't know,
but you're sure I'm wrong.

So the point is that it is a hard problem space, it takes a lot of time,
thought, and quality programming to get it right, and it's not fun. The
easy part is a blast. I have some fun most days. But the majority of
my day is not fun. And if you were solving the same problems it wouldn't
be any more fun for you. Only the fun part is fun, and that's the small
part of the problem space. So you have a not-so-fun space, a fairly
small market of people to pay for the not-so-fun parts, and a lot of
competition.

Result? Arch is cool, it's got a good model for distribution, it needs
a huge amount of work to make it a reasonable answer, and no one is going
to pay for that work. So you take what you have now, and realize that if
it works for you now or is close, then arch is the answer you want, you
can't beat the price. If it isn't close to what you want, then I question
the chances it has of getting there.

That's all. This whole mail could have been "if it works for you, great.
If it doesn't and you can fix it, great. If you can't, don't hold your
breath".
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2002-02-11 08:21:41

by Josh MacDonald

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Quoting Larry McVoy ([email protected]):
> > Arch will certainly need to be tuned for kernel hackers and perhaps
> > customized.
>
> {Note: I'm not going to be drawn into a BK is better or worse than arch
> discussion. It's not fair to Tom, and it would really need to be a BK now
> vs arch in 5 years discussion to be remotely apples to apples. So here
> are my thoughts and then I'll leave this to the rest of you to discuss.}
>
> An interesting experiment would be to take every kernel revision,
> including all the pre-patches, and import it into arch and report the
> resulting size of the repository and the time to generate each version
> of the tree from that repository. I suspect that this will demonstrate
> the most serious issue that I have with the arch design.
>
> In essence arch isn't that different from RCS in that arch is
> fundamentally a diff&patch based system with all of the limitations that
> implies. There are very good reasons that BK, ClearCase, Aide De Camp,
> and other high end systems, are *not* diff&patch based revision histories,
> they are weave based. Doing a weave based system is substantially
> harder but has some nice attributes. Simple things like extracting any
> version of the file takes constant time, regardless of which version.
> Annotated listings work correctly in the face of multiple branches and
> multiple merges. The revision history files are typically smaller,
> and are much smaller than arch's context diff based patches.

Larry, I don't think you're saying what you mean to say, and I doubt
that what you mean to say is even correct. In fact, I've got a
master's thesis to prove you wrong. I wish you would spend more time
thinking and less time flaming or actually _do some research_. This
is not the first time you've misrepresented another version control
system in public with poor factual basis.

On the subject of "weave" vs. "diff&patch" methods of storage. When
you describe the weave method operating in constant time, I just have
to wonder. It sounds like you have a magic algorithm that can process
N bytes in less than O(N) time. Unless you have a limit on file size,
you can't really expect any storage system to operate in less than
linear time (as a function of version size).

But as I understand your weave method, it's not even linear as a
function of version size, it's a function of _archive size_. The
archive is the sum of the versions woven together, and that means your
operation is really O(N+A) = O(A).
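
For readers who have not seen one, here is a toy weave, loosely modeled on SCCS-style interleaved deltas. It assumes a strictly linear history (version v includes every delta numbered v or lower) and uses a made-up tuple notation for the control records, so it is an illustration of the idea rather than BK's or SCCS's real file format.

```python
# Control records: ("I", v) / ("/I", v) bracket lines inserted by version v,
# ("D", v) / ("/D", v) bracket lines deleted by version v, ("L", text) is a line.
WEAVE = [
    ("I", 1),
    ("L", "a"),
    ("D", 2), ("L", "b"), ("/D", 2),   # version 2 deleted "b"
    ("I", 2), ("L", "x"), ("/I", 2),   # version 2 inserted "x"
    ("L", "c"),
    ("I", 3), ("L", "d"), ("/I", 3),   # version 3 appended "d"
    ("/I", 1),
]


def walk(weave, version):
    """One pass over the whole weave, yielding (inserting_version, line)."""
    inserts, deletes = [], []
    for op, arg in weave:
        if op == "I":
            inserts.append(arg)
        elif op == "/I":
            inserts.pop()
        elif op == "D":
            deletes.append(arg)
        elif op == "/D":
            deletes.remove(arg)
        else:
            inserted = all(v <= version for v in inserts)
            deleted = any(v <= version for v in deletes)
            if inserted and not deleted:
                yield inserts[-1], arg


def extract(weave, version):
    return [text for _who, text in walk(weave, version)]


def annotate(weave, version):
    return list(walk(weave, version))


print(extract(WEAVE, 1), extract(WEAVE, 2), extract(WEAVE, 3))
print(annotate(WEAVE, 3))
```

The sketch makes both sides of the argument visible: every version costs the same single pass, and that pass reads the entire weave, so the cost tracks archive size rather than version size. The annotated listing falls out of the same pass.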

Let's suppose that by "diff&patch" you really mean RCS--generalization
doesn't work here. Your argument supposes that the cost of an RCS
operation is dominated by the application of an unbounded chain of
deltas. The cost of that sub-operation is linear as a function of the
chain length C. RCS has to process a version of length N, an archive
of size A, and a chain of length C giving a total time complexity of
O(N+C+A) = O(C+A).

And you would like us to believe that the weave method is faster
because of that extra processing cost (i.e., O(A) < O(C+A)). I doubt
it. The cost of these operations is likely dominated by the number of
blocks transferred to or from the disk, period. Processing each delta
is a trivial operation--but both systems (yours and RCS) slow down as
the archive size grows because you have to pass through the entire
archive. That's not constant time, that's not even linear time,
that's _unbounded growth_.

Now, putting the calculations aside, I've got a real system to prove
that you're wrong about the inadequacy of "diff&patch". First of all,
neither the "weave" style nor the RCS "diff -n" method properly captures
data that moves or is copied from one part of a file to another. No
diff format does. Remember, "diffs" are meant to be human readable,
and "deltas" are meant for efficient storage and/or transport.

I based the Xdelta2 storage system on some very fine research by
Randal Burns and Darrell Long. Remember the formula O(N+C+A). Neither
the C term (chain length) nor the A term (disk blocks) should be
allowed to grow without bound.

Bounding the chain length is easy, it just means that instead of
storing 1000 deltas in a chain you store 50 fully-expanded versions
and 50 delta chains of max length 20. Of course this means a little
extra storage, but really how much? Observe that storing 50 out of
1000 versions is only 5%, which is pretty good as delta-compression
ratios go. A typical delta is usually larger than 5% of the file
size. I won't bore you all by carrying out the math, it can easily be
found in either my report or his. The point is that bounding the
chain length by introducing full copies every once in a while does not
dramatically hurt your compression ratio.
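
A small sketch of chain bounding, assuming versions are lists of lines and using Python's difflib to compute line-level deltas. The class and its delta encoding are illustrative only; Xdelta2's actual storage format is different.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Encode new as copy/add instructions against old (both lists of lines)."""
    delta = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, old, new).get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            delta.append(("add", new[j1:j2]))
        # "delete" needs no instruction: the old lines are simply not copied.
    return delta


def apply_delta(old, delta):
    out = []
    for ins in delta:
        if ins[0] == "copy":
            out.extend(old[ins[1]:ins[2]])
        else:
            out.extend(ins[1])
    return out


class BoundedChainStore:
    """Stores a full copy every max_chain versions and deltas in between."""

    def __init__(self, max_chain=20):
        self.max_chain = max_chain
        self.entries = []   # ("full", lines) or ("delta", instructions)
        self._last = None

    def add(self, lines):
        if len(self.entries) % self.max_chain == 0:
            self.entries.append(("full", list(lines)))
        else:
            self.entries.append(("delta", make_delta(self._last, lines)))
        self._last = list(lines)

    def get(self, index):
        base = index - index % self.max_chain   # nearest full copy at or before index
        lines = self.entries[base][1]
        for _kind, delta in self.entries[base + 1:index + 1]:
            lines = apply_delta(lines, delta)
        return list(lines)

    def full_copies(self):
        return sum(1 for kind, _ in self.entries if kind == "full")
```

With max_chain=20 and 1000 versions this keeps 50 full copies, and reconstructing any version applies at most 19 deltas to the nearest full copy.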

Bounding the archive size is also easy, the set of versions and deltas
just cannot be stored as a single, conglomerate archive. If you're
relying on the rename() system call for atomic updates in your system,
your performance sucks already. Ideally you should only write as much
data as the delta you are inserting. Let me show you:

http://www.cs.berkeley.edu/~jmacd/xdfs-vs-rcs.eps

It is better to store full versions and deltas individually. The
graph compares insertion-speed for RCS and (the research prototype of)
Xdelta2 as a function of version count. RCS grows linearly with
version number (i.e., archive size). Xdelta2 has truly bounded
insertion time (i.e., no growth). The graph proves it.

The story gets even better (from Randal's research). A chain length
of 20 could require 20 disk seeks (or worse, tape seeks in the
incremental backup system) to reconstruct a version. You can make a
time-space tradeoff to fix the number of disk seeks at a constant
two--one full version and a single delta. This gets rid of delta
chains altogether, but the degradation in compression is _only_ a
constant factor. For example, you might compress down to 10% using
delta-chains of length 20, and with "forward version jumping", as he
calls it, your archive might compress to 20% instead. Twice as much
storage, 1/10 the number of disk seeks. It's a good tradeoff when
speed is your primary concern.
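
The seek counts behind that tradeoff can be written down directly. The sketch assumes one stored object per read, a full copy every `interval` versions, and, for version jumping, that every other version is stored as a single delta against the most recent full copy. The function names are illustrative.

```python
def reads_with_delta_chain(index, interval):
    """Full copy plus every delta between it and the requested version."""
    return 1 + index % interval


def reads_with_version_jumping(index, interval):
    """Full copy alone, or full copy plus one delta computed against it."""
    return 1 if index % interval == 0 else 2


worst = 19  # last version before the next full copy when interval is 20
print(reads_with_delta_chain(worst, 20), reads_with_version_jumping(worst, 20))
```

In the worst case that is 20 reads against 2, which is the 1/10 ratio quoted above.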

Note: implementing the system as I have described it requires some
kind of transaction support, since rename() is a performance-killer,
and it also requires efficient handling of small files. That is why I
use Berkeley DB. And that is also why I can claim, using the above
graph as proof, that adding support for file system transactions
stands to _improve application performance_.

There's another thing that this method can accomplish that I think the
"weave" method cannot. When I started out designing this storage
system, what I really wanted to do was something like CVSup, where you
transfer (efficiently computed) deltas around between repositories to
keep mirrors up-to-date. But the CVSup technique really only works in
a centralized RCS system, and I wanted decentralized version control.
The server, in such a system, should be able to generate a delta to
satisfy a client request regardless of what subset of versions the
client has or any ordering between versions on the server side. Say
the server has 20 versions of a file, the client connects and says it
wants version 17 and it has version 7. The server needs to
efficiently generate a delta from 7 to 17, and it may have
stored those versions in any particular way. When I say efficiently,
I don't mean "compute version 7, compute version 17, then compute a
delta". The trivial implementation has at least O(N) time complexity
but you can do much better, assuming that deltas are small compared to
the file size. Your computation should be proportional to the
computed delta's size, not the full versions. There is an "extract
delta" operation in Xdelta2 that can compute a delta between any two
versions--it works by merging and inverting delta chains. It supports
both forward jumping chains (for best speed) and reverse chains (for
best compression).
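
The merging half of that idea can be illustrated with copy/add deltas: two deltas are composed into one without ever materializing the intermediate version. This is a simplified sketch with a made-up instruction encoding, not Xdelta2's code, and it leaves out delta inversion entirely.

```python
# A delta is a list of ("copy", source_offset, length) and ("add", literal_bytes).

def apply_delta(source, delta):
    out = bytearray()
    for ins in delta:
        if ins[0] == "copy":
            out += source[ins[1]:ins[1] + ins[2]]
        else:
            out += ins[1]
    return bytes(out)


def _length(ins):
    return ins[2] if ins[0] == "copy" else len(ins[1])


def _slice(ins, start, length):
    """Part of one instruction covering [start, start + length) of its output."""
    if ins[0] == "copy":
        return ("copy", ins[1] + start, length)
    return ("add", ins[1][start:start + length])


def compose(first, second):
    """Return one delta equivalent to applying first, then second."""
    # Output offset at which each instruction of `first` begins.
    starts, pos = [], 0
    for ins in first:
        starts.append(pos)
        pos += _length(ins)

    result = []
    for ins in second:
        if ins[0] == "add":
            result.append(ins)
            continue
        off, end = ins[1], ins[1] + ins[2]
        # Rewrite the copy in terms of whatever `first` produced in that range.
        for start, inner in zip(starts, first):
            lo, hi = max(off, start), min(end, start + _length(inner))
            if lo < hi:
                result.append(_slice(inner, lo - start, hi - lo))
    return result


v7 = b"hello world"
d1 = [("copy", 0, 6), ("add", b"there")]               # v7 -> b"hello there"
d2 = [("copy", 6, 5), ("add", b" "), ("copy", 0, 5)]   # -> b"there hello"
merged = compose(d1, d2)
print(merged, apply_delta(v7, merged))
```

The inner loop scans all of `first` for every copy, so this version is quadratic in the number of instructions; a binary search over `starts` would bring it closer to the "proportional to the delta size" goal described above.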

And as long as I'm advertising, Mihut Ionescu and I have implemented a
delta-compressing HTTP proxy system called Xproxy demonstrating the
use of the "extract delta" operation. I'm currently finishing up a
VCDIFF-encoding implementation of Xdelta, with which we will implement
the new RFCs for delta-encoded HTTP transfer.

So from my point of view, the "weave" method looks pretty inadequate.

Master's thesis, "File System Support for Delta Compression":
http://prdownloads.sourceforge.net/xdelta/xdfs.pdf
Xdelta code: http://prdownloads.sourceforge.net/xdelta/xdelta-2.0-beta9.tar.gz
Xproxy paper: http://prdownloads.sourceforge.net/xdelta/xproxy.pdf

-josh

--
PRCS version control system http://sourceforge.net/projects/prcs
Xdelta storage & transport http://sourceforge.net/projects/xdelta
Need a concurrent skip list? http://sourceforge.net/projects/skiplist

2002-02-11 15:00:30

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Mon, Feb 11, 2002 at 12:20:57AM -0800, Josh MacDonald wrote:
> > In essence arch isn't that different from RCS in that arch is
> > fundamentally a diff&patch based system with all of the limitations that
> > implies. There are very good reasons that BK, ClearCase, Aide De Camp,
> > and other high end systems, are *not* diff&patch based revision histories,
> > they are weave based. Doing a weave based system is substantially
> > harder but has some nice attributes. Simple things like extracting any
> > version of the file takes constant time, regardless of which version.
> > Annotated listings work correctly in the face of multiple branches and
> > multiple merges. The revision history files are typically smaller,
> > and are much smaller than arch's context diff based patches.
>
> Larry, I don't think you're saying what you mean to say, and I doubt
> that what you mean to say is even correct. In fact, I've got a
> master's thesis to prove you wrong. I wish you would spend more time
> thinking and less time flaming or actually _do some research_.

Right, of course your master's thesis is much better than the fact that
I've designed and shipped 2 SCM systems, not to mention the 7 years of
research and development, which included reading lots of papers, even
the xdelta papers.

> But as I understand your weave method, it's not even linear as a
> function of version size, it's a function of _archive size_. The
> archive is the sum of the versions woven together, and that means your
> operation is really O(N+A) = O(A).

The statement was "extracting *any* version of the file takes constant
time, regardless of which version". It's a correct statement and
attempts to twist it into something else won't work. And no diff&patch
based system can hope to achieve the same performance.

> Let's suppose that by "diff&patch" you really mean RCS--generalization
> doesn't work here.

Actually, generalization works just fine. By diff&patch I mean any system
which incrementally builds up the result in multiple passes, where the number
of passes == the number of deltas needed to build the final result.

> Your argument supposes that the cost of an RCS
> operation is dominated by the application of an unbounded chain of
> deltas. The cost of that sub-operation is linear as a function of the
> chain length C. RCS has to process a version of length N, an archive
> of size A, and a chain of length C giving a total time complexity of
> O(N+C+A) = O(C+A).
>
> And you would like us to believe that the weave method is faster
> because of that extra processing cost (i.e., O(A) < O(C+A)). I doubt
> it.

Hmm, aren't you the guy who started out this post accusing me (with
no grounds, I might add) of not doing my homework? And how are we
to take this "I doubt it" statement? Like you didn't go do your
homework, perhaps?

> [etc]
> The cost of these operations is likely dominated by the number of
> blocks transferred to or from the disk, period.

I notice that in your entire discussion, you failed to address my point
about annotated listings. Perhaps you could explain to us how you get an
annotated listing out of arch or xdelta and the performance implications
of doing so.

> So from my point of view, the "weave" method looks pretty inadequate.

I'm sure it does, you are promoting xdelta, so anything else must be bad.
I am a bit curious, do you even know what a weave is? Can you explain
how one works? Surely, after all that flaming, you do know how they
work, right?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
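
[Editor's sketch] The two storage models being argued about can be illustrated with a toy example. This is a simplified model, not BitKeeper's or SCCS's actual on-disk format: the names WeaveLine, annotate, extract and apply_chain are hypothetical, and each weave line simply records which delta inserted it and which deltas deleted it. Under those assumptions, any version (and its annotated listing) falls out of one pass over the weave, whereas the diff&patch model makes one pass per delta in the chain.

```python
from dataclasses import dataclass, field


@dataclass
class WeaveLine:
    text: str
    inserted_by: int                              # delta that added this line
    deleted_by: set = field(default_factory=set)  # deltas that removed it


def annotate(weave, included):
    """Single pass over the weave: keep lines whose inserting delta is
    included and that no included delta has deleted."""
    return [
        (line.inserted_by, line.text)
        for line in weave
        if line.inserted_by in included and not (line.deleted_by & included)
    ]


def extract(weave, included):
    """Plain text of a version: the annotated listing minus the delta ids."""
    return [text for _, text in annotate(weave, included)]


def apply_chain(base, deltas):
    """Diff&patch style: one pass over the working copy per delta."""
    lines = list(base)
    for delta in deltas:
        for op, index, *text in delta:
            if op == "del":
                del lines[index]
            else:  # "ins"
                lines.insert(index, text[0])
    return lines


# Three versions of one file: 1 = a,b,c; 2 replaces b with B; 3 appends d.
WEAVE = [
    WeaveLine("a", 1),
    WeaveLine("b", 1, {2}),
    WeaveLine("B", 2),
    WeaveLine("c", 1),
    WeaveLine("d", 3),
]
BASE = ["a", "b", "c"]
DELTA_2 = [("del", 1), ("ins", 1, "B")]
DELTA_3 = [("ins", 3, "d")]

if __name__ == "__main__":
    print(extract(WEAVE, {1, 2, 3}))
    print(annotate(WEAVE, {1, 2}))
    print(apply_chain(BASE, [DELTA_2, DELTA_3]))
```

In this toy model the weave pass touches every line of the archive whichever version is requested, which is consistent with both readings in the thread: the cost does not depend on the version chosen, and it is linear in the archive size.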

2002-02-11 22:14:24

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Mon, Feb 11, 2002 at 12:20:57AM -0800, Josh MacDonald wrote:
> Bounding the chain length is easy, it just means that instead of
> storing 1000 deltas in a chain you store 50 fully-expanded versions
> and 50 delta chains of max length 20. Of course this means a little
> extra storage, but really how much? Observe that storing 50 out of
> 1000 versions is only 5%, which is pretty good as delta-compression
> ratios go. A typical delta is usually larger than 5% of the file
> size. I won't bore you all by carrying out the math, it can easily be
> found in either my report or his. The point is that bounding the
> chain length by introducing full copies every once in a while does not
> dramatically hurt your compression ratio.

How about some numbers which contrast your claims? Here are the diff
sizes for all the changes in Linus' BK tree. Note that he is importing
patches which may actually be bigger than your typical checkin, but
no matter, the point stands even if they represent exactly one checkin
per patch.

In this tree, at least, a typical delta is less than .63% of the file
size. And if you are measuring against the revision history size,
then your numbers are even more off.

deltas     delta size (% of file size)
  2198  >= 20.0000000%
  1647  >= 10.0000000%
  2240  >=  5.0000000%
  2508  >=  2.5000000%
  2879  >=  1.2500000%
  2983  >=  0.6250000%
  2962  >=  0.3125000%
  2564  >=  0.1562500%
  1581  >=  0.0781250%
   919  >=  0.0390625%
   400  >=  0.0195312%
   162  >=  0.0097656%
    50  >=  0.0048828%
    12  >=  0.0024414%
   105  >=  0.0012207%
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
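
[Editor's sketch] The numbers in this exchange can be checked with a few lines of arithmetic. Two assumptions are made here that the message does not state: each row of the table is read as a bucket (the count of deltas whose size is at least the given percentage and below the row above it), and Josh's scheme is read as 50 full copies plus 950 deltas. The function names are hypothetical.

```python
# (count, lower bound of the bucket as % of file size), copied from the table
HISTOGRAM = [
    (2198, 20.0), (1647, 10.0), (2240, 5.0), (2508, 2.5), (2879, 1.25),
    (2983, 0.625), (2962, 0.3125), (2564, 0.15625), (1581, 0.078125),
    (919, 0.0390625), (400, 0.0195312), (162, 0.0097656), (50, 0.0048828),
    (12, 0.0024414), (105, 0.0012207),
]


def share_at_least(histogram, threshold_pct):
    """Fraction of deltas whose bucket starts at or above threshold_pct."""
    total = sum(count for count, _ in histogram)
    big = sum(count for count, low in histogram if low >= threshold_pct)
    return big / total


def median_bucket(histogram):
    """Lower bound of the bucket that contains the median delta.
    Rows are ordered from the largest bucket to the smallest."""
    total = sum(count for count, _ in histogram)
    running = 0
    for count, low in histogram:
        running += count
        if running * 2 >= total:
            return low


def storage(versions, delta_fraction, chain_len=None):
    """Archive size in units of one full file. With chain_len set, a full
    copy is stored once every chain_len versions; otherwise only once."""
    full_copies = 1 if chain_len is None else versions // chain_len
    return full_copies + (versions - full_copies) * delta_fraction


if __name__ == "__main__":
    print("share of deltas >= 5%:", round(share_at_least(HISTOGRAM, 5.0), 3))
    print("median bucket starts at (%):", median_bucket(HISTOGRAM))
    for fraction in (0.05, 0.00625):
        unbounded = storage(1000, fraction)
        bounded = storage(1000, fraction, chain_len=20)
        print(fraction, unbounded, bounded, round(bounded / unbounded, 2))
```

Under this reading the median delta sits in the 0.625% to 1.25% bucket, well below the 5% figure. With 5% deltas the periodic full copies roughly double the archive, while with deltas around 0.6% they dominate it, which is the crux of the disagreement.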

2002-02-12 02:09:35

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

This is degenerating into silliness (which is good, imho :-). Josh's
complexity analysis was right and so was Larry's because they are
using phrases like "constant time" with different referents and I'm
sure we all know that (if we bother to read the very long messages).
For those following along, it's an interesting display of the
different rhetorical norms of academia vs. business.

It may be theoretically interesting to minimize the space taken up by
revisions, but I think it is more economically sensible to screw that
and instead, maximize convenience and interactive speed with
features like revision libraries (as in arch). This ain't the early
90's any more.

Look, I think arch is a lot closer to being ready for deployment in
kernel work than Larry's messages suggest and a better overall
solution than bitkeeper. But whatever -- Linus appears to be fully
committed to bitkeeper and uninterested in contemplating alternatives.
One can presume he'll either eventually lose hard or drag the rest of
the kernel hackers along.

-t

arch is at http://www.regexps.com

2002-02-12 05:59:05

by Theodore Ts'o

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Mon, Feb 11, 2002 at 09:17:43PM -0800, Tom Lord wrote:
>
> It may be theoretically interesting to minimize the space taken up by
> revisions, but I think it is more economically sensible to screw that
> and instead, maximize convenience and interactive speed with
> features like revision libraries (as in arch). This ain't the early
> 90's any more.

For What It's Worth, on a laptop environment (where I work quite a
bit) and for something the size of the Linux kernel, and where things
change at the speed of the Linux kernel, in fact space efficiency
matters a lot.

In fact, the one thing for which I was quite unhappy with BK until
Larry implemented bk lclone (aka bk clone -l) was the amount of space
having multiple copies of the same repository took up, since BK really
requires multiple sandboxes for parallel development. It's not a big
deal with something the size of e2fsprogs, but for something the size
of the BK linux tree, Size Really Matters.

- Ted

2002-02-12 06:20:29

by Bernd Eckenfels

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

In article <[email protected]> you wrote:
> For What It's Worth, on a laptop environment (where I work quite a
> bit) and for something the size of the Linux kernel, and where things
> change at the speed of the Linux kernel, in fact space efficiency
> matters a lot.

Having the option to have very old revisions as deltas and the current heads
of the most important revisions as full images would satisfy both
requirements. I don't think you work very often with old pre-patches to the
kernel, do you, Ted?

Greetings
Bernd

2002-02-12 09:09:49

by Pavel Machek

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hi!

> > But as I understand your weave method, it's not even linear as a
> > function of version size, it's a function of _archive size_. The
> > archive is the sum of the versions woven together, and that means your
> > operation is really O(N+A) = O(A).
>
> The statement was "extracting *any* version of the file takes constant
> time, regardless of which version". It's a correct statement and

So, you are saying that you can extract *any* version of any file
within a second?

Certainly not. [Take sufficiently big file...]

If you are saying that the speed of getting any file does not depend on
which version you want, that is a pretty different statement.
Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-02-12 11:02:19

by Josh MacDonald

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Quoting Larry McVoy ([email protected]):
> On Mon, Feb 11, 2002 at 12:20:57AM -0800, Josh MacDonald wrote:
> > > In essence arch isn't that different from RCS in that arch is
> > > fundamentally a diff&patch based system with all of the limitations that
> > > implies. There are very good reasons that BK, ClearCase, Aide De Camp,
> > > and other high end systems, are *not* diff&patch based revision histories,
> > > they are weave based. Doing a weave based system is substantially
> > > harder but has some nice attributes. Simple things like extracting any
> > > version of the file takes constant time, regardless of which version.
> > > Annotated listings work correctly in the face of multiple branches and
> > > multiple merges. The revision history files are typically smaller,
> > > and are much smaller than arch's context diff based patches.
> >
> > Larry, I don't think you're saying what you mean to say, and I doubt
> > that what you mean to say is even correct. In fact, I've got a
> > master's thesis to prove you wrong. I wish you would spend more time
> > thinking and less time flaming or actually _do some research_.
>
> Right, of course your master's thesis is much better than the fact that
> I've designed and shipped 2 SCM systems, not to mention the 7 years of
> research and development, which included reading lots of papers, even
> the xdelta papers.

I'm impressed.

> > But as I understand your weave method, it's not even linear as a
> > function of version size, it's a function of _archive size_. The
> > archive is the sum of the versions woven together, and that means your
> > operation is really O(N+A) = O(A).
>
> The statement was "extracting *any* version of the file takes constant
> time, regardless of which version". It's a correct statement and
> attempts to twist it into something else won't work. And no diff&patch
> based system can hope to achieve the same performance.

Didn't you read what I wrote?

> > Let's suppose that by "diff&patch" you really mean RCS--generalization
> > doesn't work here.
>
> Actually, generalization works just fine. By diff&patch I mean any system
> which incrementally builds up the result in a multipass, where the number
> of passes == the number of deltas needed to build the final result.

No, I guess you didn't read what I wrote.

> > Your argument supposes that the cost of an RCS
> > operation is dominated by the application of an unbounded chain of
> > deltas. The cost of that sub-operation is linear as a function of the
> > chain length C. RCS has to process a version of length N, an archive
> > of size A, and a chain of length C giving a total time complexity of
> > O(N+C+A) = O(C+A).
> >
> > And you would like us to believe that the weave method is faster
> > because of that extra processing cost (i.e., O(A) < O(C+A)). I doubt
> > it.
>
> Hmm, aren't you the guy who started out this post accusing me (with
> no grounds, I might add) of not doing my homework? And how are we
> to take this "I doubt it" statement? Like you didn't go do your
> homework, perhaps?

You haven't denied the claim either. I said you were wrong regarding
your complexity analysis. You're ignoring the cost of disk
accesses which have nothing to do with "number of deltas needed to
build a final result". Do you ever say something with less than
100% confidence? I should hope so, given how often you're wrong.

> > [etc]
> > The cost of these operations is likely dominated by the number of
> > blocks transferred to or from the disk, period.
>
> I notice that in your entire discussion, you failed to address my point
> about annotated listings. Perhaps you could explain to us how you get an
> annotated listing out of arch or xdelta and the performance implications
> of doing so.

Separate problems, separate solutions.

> > So from my point of view, the "weave" method looks pretty inadequate.
>
> I'm sure it does, you are promoting xdelta, so anything else must be bad.
> I am a bit curious, do you even know what a weave is? Can you explain
> how one works? Surely, after all that flaming, you do know how they
> work, right?

Of course. But disk transfer cost is the same whether you're in RCS
or SCCS, and rename is still a very expensive way to update your files.

-josh

--
PRCS version control system http://sourceforge.net/projects/prcs
Xdelta storage & transport http://sourceforge.net/projects/xdelta
Need a concurrent skip list? http://sourceforge.net/projects/skiplist

2002-02-12 11:16:11

by Jeff Garzik

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Ok, ok... I'm sure you both have hard-ons the size of Texas for SCM,
but let's take it to #offtopic, shall we?

--
Jeff Garzik | "I went through my candy like hot oatmeal
Building 1024 | through an internally-buttered weasel."
MandrakeSoft | - goats.com

2002-02-12 17:21:20

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

I think arch can help you manage the limited space on a laptop disk
quite well, too. You'll have trouble if that's the _only_ disk you
have, but otherwise:

1. Make a nice big expansive development environment
on a larger machine, with lots of revisions cached in the
revision library, mirrors of your favorite archives, etc.

2. On your laptop, store only the repository you'll need for
day to day work, plus a very sparsely populated revision
library -- it might even be empty depending on the kind of
work you're doing. The repository needs only a single
baseline (a compressed tar file) and compressed deltas for
each revision it contains. It doesn't even have to be your
main repository -- it can be an otherwise empty repository
containing only a branch from your main repository plus
those revisions you create from your laptop.

3. Make a simple shell script, "prepare-detached", that
updates the contents of your laptop in anticipation of work
on particular branches or with particular historic
revisions, copying bits and pieces from your nice big
environment. Make a shell script "return-home" that moves
a branch from your laptop to your stationary archive.

Having a huge revision library is a win if what you're doing is
fielding patches from many contributors, against many baselines,
wanting to try out various combinations of baseline and patch, and
wanting to do lots of archeology to trace the history of various
changes. If, on the other hand, what you're doing is going off
somewhere to work on coding a particular change, you don't need a big
revision library.

-t

2002-02-12 22:54:35

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

> 2. On your laptop, store only the repository you'll need for
> day to day work, plus a very sparsely populated revision
> library
>
> Having a huge revision library is a win if what you're doing is
> fielding patches [etc]

I think that the point is that when you put stuff on your laptop, you'd
dearly love not to realize that you forgot something you need when you are
either not connected or are connected only via a modem. If you can store
the kernel history in 80-90MB and you have all the versions you'll ever
want, that's a win compared to storing a few versions and then realizing
the one you want isn't there.

I also think that the term "huge revision library" doesn't make sense
to all systems. Some systems can fit that "huge library" in less space
than the checked out files, so why limit yourself?

Note that this is explicitly not a BK thing, it's a general thing.
I want whatever system I use to limit my choices as little as possible.
No system is perfect, it's more of an optimization over the possible
limitations. In this particular respect, I can say that I've found it
very useful to carry around all the history when traveling, it means
there is no difference between working at home or on the road, other
than performance of my crappy laptop.

And it's not like this makes arch bad, this is one place where it isn't as
good as some other choices. But arch has other areas where it is better,
it is less pedantic than most systems about what it will try and apply.
It's the uber patch library if you ask me, and that has real value.
Why the patchbot people haven't picked up on that is beyond me, they're
off trying to write something "simple", which I think you'll agree is
strange; there is nothing simple about this problem space.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2002-02-13 00:49:10

by Daniel Phillips

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On February 12, 2002 11:54 pm, Larry McVoy wrote:
> And it's not like this makes arch bad, this is one place where it isn't as
> good as some other choices. But arch has other areas where it is better,
> it is less pedantic than most systems about what it will try and apply.
> It's the uber patch library if you ask me, and that has real value.
> Why the patchbot people haven't picked up on that is beyond me, they're
> off trying to write something "simple", which I think you'll agree is
> strange; there is nothing simple about this problem space.

The patchbot people, at least one of them, is busy working on a totally
unrelated problem ;-)

I'm keeping an eye on this. The patchbot version 1.0 will in fact be simple
and useful at the same time or I'd better seriously consider retiring.

--
Daniel

2002-02-13 06:33:02

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Larry writes:

I think that the point is that when you put stuff on your laptop, you'd
dearly love not to realize that you forgot something you need when you are
either not connected or are connected only via a modem. If you can store
the kernel history in 80-90MB and you have all the versions you'll ever
want, that's a win compared to storing a few versions and then realizing
the one you want isn't there.

The base cost of storing revisions in arch is the size of a compressed
tar file of the diffs, plus the size of the directory containing those
diffs plus the size of the log message. It is therefore likely that
one can store many, many revisions of the kernel on one's laptop, if
that's what one wants to do. If one has space left over, that can be
used for a revision library (complete trees of revisions, sharing
unmodified files).

I also think that the term "huge revision library" doesn't make sense
to all systems. Some systems can fit that "huge library" in less space
than the checked out files, so why limit yourself?

Arch *does* fit that "huge library" in less space than the checked out
files. I thought I'd made that perfectly clear already.

And it's not like this makes arch bad, this is one place where it isn't as
good as some other choices.

But you haven't described arch accurately, so I don't think your
comparative judgement is something anyone ought to dwell on.

It's the uber patch library if you ask me

We agree.

there is nothing simple about this problem space.

We agree again. It isn't the most difficult branch of mathematics ever
discovered, but it isn't trivial, either.

While I'm not too sure about comparing anyone's genitalia to the state
of Texas, as an earlier poster suggested, I am sure that patch logic
and revision control are fascinating and deeply relevant to
distributed development. They are topics that kernel hackers ought to
think about carefully.

-t

2002-02-13 10:36:23

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hi,

On Tue, 12 Feb 2002, Larry McVoy wrote:

> Why the patchbot people haven't picked up on that is beyond me, they're
> off trying to write something "simple", which I think you'll agree is
> strange; there is nothing simple about this problem space.

Because they try to solve a completely different problem? Again, the
patchbot is _no_ source management system.

bye, Roman

2002-02-18 18:28:54

by Eric W. Biederman

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Josh MacDonald <[email protected]> writes:

> Quoting Larry McVoy ([email protected]):
> >
> > I'm sure it does, you are promoting xdelta, so anything else must be bad.
> > I am a bit curious, do you even know what a weave is? Can you explain
> > how one works? Surely, after all that flaming, you do know how they
> > work, right?
>
> Of course. But disk transfer cost is the same whether you're in RCS
> or SCCS, and rename is still a very expensive way to update your files.

Hmm. Up to a certain point something like 256K disk I/O for an entire
file can be done with a single seek. So that is constant time. Not in
all cases but in the normal case for source code files it is. And beyond
that you get an extra seek for the inode and the directory. The only
part that rename removes is touching the directory. And given that
there is no requirement that rename be synchronous I don't see rename
waiting for the directory change to complete. I can see problems
currently with large directories, but those problems are orthogonal to
the issue of just rename being slow.

So for source code sized files why is rename expensive?

I guess I can see some merit if you are processing thousands of
operations a second where the sheer volume of data makes things
expensive but I don't see that being the common case.

Eric
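
[Editor's sketch] The "rename" under discussion is the usual write-then-rename way of updating an archive file: the new contents go to a temporary file in the same directory, which is then renamed over the old one. A minimal sketch of that pattern follows, assuming a POSIX-like filesystem; atomic_write is a hypothetical helper name, and nothing here is taken from RCS, SCCS or BitKeeper source.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at path with data so that readers see either the
    old contents or the new contents, never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the data to disk before renaming
        os.replace(tmp_path, path)   # rename(2) on POSIX systems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    atomic_write("example.txt", b"first version\n")
    atomic_write("example.txt", b"second version\n")
    with open("example.txt", "rb") as f:
        print(f.read())
```

The cost Eric describes is visible in the sketch: the whole file is rewritten on every update, plus the metadata update for the rename itself. Whether that is "expensive" depends on file size and update rate, which is the open question in the message above.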

2002-03-10 08:37:13

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Larry McVoy wrote:

>
>
>Before you yell at me, remember that source management is not the same
>as the kernel. Everyone has to use the kernel, the computer doesn't work
>without it. Take that set of people and remove everyone who doesn't use
>source management. Out of a potential space of billions of people, you
>are left with a market of about .5 - 2 million, world wide. And there
>are 300 SCM systems fighting over that space. The top 3 have 50% of the
>space. So 297 systems fighting over maybe a million users. That's 3300
>users per system. OK, so if each of those people were willing to pay
>$500 for arch support/whatever, that's a total of 1.6 million dollars.
>Which isn't remotely close enough to get the job done. And don't forget
>that those people have to volunteer that money, it can't be pried out
>of them.
>
>It's just math. Projects that aren't universally used have a much harder
>time getting funding than projects that everyone uses. It doesn't matter
>what the value is to you, it matters what the costs are to the developers.
>This is why microsoft is so rich, they build apps that the masses use.
>It's also why clearcase+clearquest is $8000/seat. It's not because
>that's what the value is, it's because that's what the cost has to be
>or Rational starts to die.
>
>So before you start talking about support contracts, and grants, and
>whatever, realize that the pool of people interested in paying those
>dollars is very small. Are _you_ going to send Tom $500?
>
This is why version control has to go into a standard filesystem as a
standard filesystem feature that every user uses. When version control
costs $8k per seat, only a tiny few can afford it. I sure can't, and I
am a professional programmer. Version control has to become just
another expected filesystem feature, and one that is so transparent to
users that Mom uses it without fear. Reiser4 will have transactions
built-in, and the natural technical progression from there is version
control and branching. I want to see the number of version control
users equal to 90% of the number of ReiserFS users in three years (not
Linux 2.6 but Linux 2.8 or later). If that happens, then open source
can economically support the core version control features, and $8k
pay-for-use plugin suites and utilities from folks like Larry can supply
those expensive to write tweaks that wealthy software management types
want.

I think that if version control becomes as simple as turning on a plugin
for a directory or file, and then adding a little to the end of a
filename to see and list the old versions, Mom can use it.

Besides, version control is useful for distributed filesystem designs
(high-performance distributed parallel writes work better with version
control in use.)

That said, Larry, a big thanks for making Bitkeeper free for use, an
even bigger thanks for getting Linus to use it, and I hope you get at
least a fraction of the money that you deserve from selling Bitkeeper.
Now I just have to finish shoving it down my developers' throats....:-/
Version control system adoption is always a lot of effort for
management, and CVS was an enormous hassle to get them to use also....

Hans

2002-03-10 19:42:33

by Itai Nahshon

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Sunday 10 March 2002 10:36, Hans Reiser wrote:
> I think that if version control becomes as simple as turning on a plugin
> for a directory or file, and then adding a little to the end of a
> filename to see and list the old versions, Mom can use it.

IIRC that was a feature in systems from DEC even before
VMS (I'm talking about the late 70's). eg. file.txt;2 was revision 2
of file.txt.

I don't know if this feature was in the file-system or in the text editor
that I have used.

The basic features were not even close to what you get from RCS or
SCCS.

>
> Besides, version control is useful for distributed filesystem designs
> (high-performance distributed parallel writes work better with version
> control in use.)

That's a different topic, and it depends on the system's design. A distributed
filesystem may use some form of a file's version to control the caching
(or locking) of data. In that case just any monotonic value will do.
All the version control systems that I know use file granularity version
numbers (or tags), while for distributed file systems you may want to use
anything between single block and full directory granularity - depending
on the typical access patterns.

There are some recent discussions in the Linux Kernel mailing list
about adding "undelete" and ACL features. Well.. I think of "undelete"
as the most primitive form of a version control (you keep one version back).
Add support for extended attributes (where you can store some extra
metadata) and the rest can be done in the VFS layer. Still far away
from a full featured SCM...

-- Itai Nahshon
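
[Editor's sketch] The point that "any monotonic value will do" for cache control can be shown with a toy in-memory model. Server and CachingClient are hypothetical names; a real distributed filesystem would add locking, leases and a finer granularity than whole files, none of which is modelled here.

```python
class Server:
    """Holds the authoritative copy and a counter bumped on every write."""

    def __init__(self):
        self._files = {}  # name -> (version, data)

    def write(self, name, data):
        version = self._files.get(name, (0, None))[0] + 1
        self._files[name] = (version, data)
        return version

    def version(self, name):
        return self._files[name][0]

    def read(self, name):
        return self._files[name]


class CachingClient:
    """Refetches data only when the server's version counter has moved."""

    def __init__(self, server):
        self.server = server
        self.cache = {}    # name -> (version, data)
        self.fetches = 0   # number of full data transfers

    def read(self, name):
        current = self.server.version(name)  # cheap check
        cached = self.cache.get(name)
        if cached is None or cached[0] != current:
            self.cache[name] = self.server.read(name)
            self.fetches += 1
        return self.cache[name][1]


if __name__ == "__main__":
    server = Server()
    client = CachingClient(server)
    server.write("file.txt", b"v1")
    print(client.read("file.txt"), client.read("file.txt"), client.fetches)
    server.write("file.txt", b"v2")
    print(client.read("file.txt"), client.fetches)
```

The counter carries no history at all, which is the difference from version control proper: it only tells the client that its copy is stale, not what changed.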

2002-03-10 20:19:54

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Itai Nahshon wrote:

>On Sunday 10 March 2002 10:36, Hans Reiser wrote:
>
>>I think that if version control becomes as simple as turning on a plugin
>>for a directory or file, and then adding a little to the end of a
>>filename to see and list the old versions, Mom can use it.
>>
>
>IIRC that was a feature in systems from DEC even before
>VMS (I'm talking about the late 70's). eg. file.txt;2 was revision 2
>of file.txt.
>

Was it easy? Did people like it? Any lessons/successes?

Hans

2002-03-10 21:14:16

by Rob Turk

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

"Hans Reiser" <[email protected]> wrote in message
news:[email protected]...
> Itai Nahshon wrote:
>
> >On Sunday 10 March 2002 10:36, Hans Reiser wrote:
> >
> >>I think that if version control becomes as simple as turning on a plugin
> >>for a directory or file, and then adding a little to the end of a
> >>filename to see and list the old versions, Mom can use it.
> >>
> >
> >IIRC that was a feature in systems from DEC even before
> >VMS (I'm talking about the late 70's). eg. file.txt;2 was revision 2
> >of file.txt.
> >
>
> Was it easy? Did people like it? Any lessons/successes?
>
> Hans
>

It was fabulous at that time. The first time you create a file, it gets ";1"
appended to its filename. When you edit it, it gets saved under the same name,
this time appended by ";2". Edit it again... well, you get the picture.
Cleaning up was as simple as "$ PURGE /KEEP=3" to keep the last three versions.

For these days with sometimes hundreds of files, it might become confusing when
'ls' shows all versions of all files, but back then it worked well.

Rob

2002-03-10 21:19:07

by Alan

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

> It was fabulous at that time. The first time you create a file, it gets ";1"
> appended to its filename. When you edit it, it gets saved under the same name,
> this time appended by ";2". Edit it again... well, you get the picture.
> Cleaning up was as simple as "$ PURGE /KEEP=3" to keep the last three versions.
>
> For these days with sometimes hundreds of files, it might become confusing when
> 'ls' shows all versions of all files, but back then it worked well.

It's trickier than that - because all your other semantics have to align,
it's akin to the undelete problem (in fact it's identical). Do you version on
a rewrite, on a truncate, only on an O_CREAT ?

In terms of where to stick versions, one popular unix solution seems to be
to put them in a .something directory (eg the netapp filer)

2002-03-10 21:23:56

by Rik van Riel

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Sun, 10 Mar 2002, Alan Cox wrote:
> > It was fabulous at that time. The first time you create a file, it gets ";1"
> > appended to its filename. When you edit it, it gets saved under the same name,
> > this time appended by ";2". Edit it again... well, you get the picture.
> > Cleaning up was as simple as "$ PURGE /KEEP=3" to keep the last three versions.
> >
> > For these days with sometimes hundreds of files, it might become confusing when
> > 'ls' shows all versions of all files, but back then it worked well.
>
> It's trickier than that - because all your other semantics have to align,
> it's akin to the undelete problem (in fact it's identical). Do you version
> on a rewrite, on a truncate, only on an O_CREAT ?

That's a nice question. I would dread the scenario where a
new version was created for each append ;))

Rik
--
<insert bitkeeper endorsement here>

http://www.surriel.com/ http://distro.conectiva.com/

2002-03-10 21:28:36

by Alexander Viro

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Sun, 10 Mar 2002, Alan Cox wrote:

> It's trickier than that - because all your other semantics have to align,
> it's akin to the undelete problem (in fact it's identical). Do you version on
> a rewrite, on a truncate, only on an O_CREAT ?

Even better, what do you do upon link(2)? Or rename(2) over one of the
versions...

VMS is not UNIX. And a union of these two will be hell - incompatible models,
let alone features. "Well, I don't use <list of Unix features>" is not an
answer - other people have different sets of things they don't use and you
can be sure that everything you don't care about is an absolute must-have
for somebody else.

2002-03-10 21:38:31

by Richard Gooch

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hans Reiser writes:
> Itai Nahshon wrote:
>
> >On Sunday 10 March 2002 10:36, Hans Reiser wrote:
> >
> >>I think that if version control becomes as simple as turning on a plugin
> >>for a directory or file, and then adding a little to the end of a
> >>filename to see and list the old versions, Mom can use it.
> >>
> >
> >IIRC that was a feature in systems from DEC even before
> >VMS (I'm talking about the late 70's). eg. file.txt;2 was revision 2
> >of file.txt.
> >
>
> Was it easy? Did people like it? Any lessons/successes?

Mostly I found it an inconvenience. When playing with big files (say
around 1 MiB), you had to remember to purge periodically. I can't
recall being grateful that this feature existed.

Certainly, when I switched to Unix, I didn't miss file versioning.
I question how useful it really is, except in certain specialised
applications (like SCM, where we have tools to fit the job).

Regards,

Richard....
Permanent: [email protected]
Current: [email protected]

2002-03-11 05:49:17

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Richard Gooch wrote:

>Hans Reiser writes:
>
>>Itai Nahshon wrote:
>>
>>>On Sunday 10 March 2002 10:36, Hans Reiser wrote:
>>>
>>>>I think that if version control becomes as simple as turning on a plugin
>>>>for a directory or file, and then adding a little to the end of a
>>>>filename to see and list the old versions, Mom can use it.
>>>>
>>>IIRC that was a feature in systems from DEC even before
>>>VMS (I'm talking about the late 70's). eg. file.txt;2 was revision 2
>>>of file.txt.
>>>
>>Was it easy? Did people like it? Any lessons/successes?
>>
>
>Mostly I found it an inconvenience. When playing with big files (say
>around 1 MiB), you had to remember to purge periodically. I can't
>recall being grateful that this feature existed.
>
>Certainly, when I switched to Unix, I didn't miss file versioning.
>I question how useful it really is, except in certain specialised
>applications (like SCM, where we have tools to fit the job).
>
> Regards,
>
> Richard....
>Permanent: [email protected]
>Current: [email protected]
>
>
So the problem was that it was not optional?

Hans

2002-03-11 05:52:47

by Alexander Viro

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Mon, 11 Mar 2002, Hans Reiser wrote:

> So the problem was that it was not optional?

The problem is that it doesn't play well with other things.

2002-03-11 06:16:06

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Alexander Viro wrote:

>
>On Mon, 11 Mar 2002, Hans Reiser wrote:
>
>>So the problem was that it was not optional?
>>
>
>The problem is that it doesn't play well with other things.
>
Your statement is information free so far, but could be the intro to an
informative statement....;-)

Hans

2002-03-11 06:37:51

by Alexander Viro

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Mon, 11 Mar 2002, Hans Reiser wrote:

> >The problem is that it doesn't play well with other things.
> >
> Your statement is information free so far, but could be the intro to an
> informative statement....;-)

See postings upthread. Versioning doesn't play well with link(2), with
overwriting rename(2), etc. - the problem is not that much in implementation
but in finding at least somewhat reasonable semantics for that.

DEC OSes have a different filesystem I/O model. There, versions are more or
less natural. With Unix they will clash with a lot of things expected
by every damn application out there.
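
To make the link(2) and rename(2) clash concrete, here is a minimal Python
sketch that uses only os.link and os.replace in a scratch directory. The file
names and the function name demo_link_and_rename are made up for illustration,
and no versioning filesystem is involved: it only shows the stock Unix
behaviour that any versioning scheme would have to give a meaning to.

```python
import os
import tempfile


def demo_link_and_rename():
    """Show two Unix behaviours a per-name version scheme must define."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        c = os.path.join(d, "c.txt")

        with open(a, "w") as f:
            f.write("one\n")

        # link(2): a second name for the same inode.
        os.link(a, b)
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino

        # Appending through one name changes what the other name reads,
        # so a "new version of b.txt" is also a new version of a.txt.
        with open(b, "a") as f:
            f.write("two\n")
        with open(a) as f:
            seen_via_a = f.read()

        # rename(2) over an existing name: the old c.txt is replaced
        # without any trace unless the filesystem chooses to keep one.
        with open(c, "w") as f:
            f.write("old c\n")
        os.replace(a, c)
        with open(c) as f:
            seen_via_c = f.read()

        return same_inode, seen_via_a, seen_via_c, os.path.exists(a)
```

After the link, a.txt and b.txt are one inode, so a write is a new version of
both names or of neither. After the rename, the old content of c.txt is gone,
and it is not obvious whether it should survive as an old version of c.txt.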

2002-03-11 06:42:51

by Richard Gooch

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hans Reiser writes:
> Alexander Viro wrote:
>
> >
> >On Mon, 11 Mar 2002, Hans Reiser wrote:
> >
> >>So the problem was that it was not optional?

At the least.

> >The problem is that it doesn't play well with other things.
> >
> Your statement is information free so far, but could be the intro to an
> informative statement....;-)

I found the Unix structure and API much easier to deal with than
VMS. File versioning was just another complication that I sometimes
had to deal with (it was a *long* time ago, so don't ask for
details:-).

Funny thing about VMS. It was a much richer programming environment
(the OS had a lot of functions you could call), but I found that it
was easier to get stuff done with Unix, even if there wasn't some
fancy function to help you out. Unix gets in the way less, whereas
with VMS I found myself battling the API more to force it to do what I
wanted.

Regards,

Richard....
Permanent: [email protected]
Current: [email protected]

2002-03-11 08:22:45

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Rik van Riel wrote:

>On Sun, 10 Mar 2002, Alan Cox wrote:
>
>>>It was fabulous at that time. The first time you create a file, it gets ";1"
>>>appended to it's filename. When you edit it, it gets saved under the same name,
>>>this time appended by ";2". Edit it again... whell, you get the picture.
>>>Cleaning up was as simple as "$ PURGE /KEEP=3" to keep the last three versions.
>>>
>>>For these days with sometimes hundreds of files, it might become confusing when
>>>'ls' shows all versions of all files, but back then it worked well.
>>>
>>Its trickier than that - because all your other semantics have to align,
>>its akin to the undelete problem (in fact its identical). Do you version
>>on a rewrite, on a truncate, only on an O_CREAT ?
>>
>
>That's a nice question. I would dread the scenario where a
>new version was created for each append ;))
>
>Rik
>
I think that file close is the right place for it.

Again, only for those files/plugins that have VERSION_ON_FILE_CLOSE
enabled.....

With regard to unlink, I think I don't see the problem. Unlink makes
the default version non-existent.

You need a default version, something like filenameA/..default with
filenameA resolving to filenameA/..default. Listing the default
version of a directory only lists the current default versions of files.
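
A rough user-space sketch of versioning on close, in Python, under several
assumptions: each file is represented by a directory of numbered versions,
"..default" is a symlink to the newest one, and the version_on_close argument
stands in for the VERSION_ON_FILE_CLOSE setting. The helper names
(open_versioned, list_versions, read_default) are hypothetical, and a real
implementation would live in the filesystem rather than in a wrapper like this.

```python
import contextlib
import os
import tempfile

DEFAULT = "..default"   # name borrowed from the message above


def list_versions(store):
    """Return the version numbers present in the per-file directory."""
    return sorted(int(n) for n in os.listdir(store) if n.isdigit())


@contextlib.contextmanager
def open_versioned(store, version_on_close=True):
    """Write to a scratch file; it becomes a version only when closed."""
    os.makedirs(store, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=store, prefix="..tmp")
    f = os.fdopen(fd, "w")
    try:
        yield f
    except BaseException:
        # The writer failed: drop the scratch file, keep existing versions.
        f.close()
        os.unlink(tmp)
        raise
    f.close()

    versions = list_versions(store)
    if version_on_close or not versions:
        number = versions[-1] + 1 if versions else 1
    else:
        number = versions[-1]           # overwrite the newest version in place
    os.replace(tmp, os.path.join(store, str(number)))

    # Repoint the default version by renaming a fresh symlink over the old one.
    link = os.path.join(store, DEFAULT)
    tmp_link = link + ".new"
    os.symlink(str(number), tmp_link)
    os.replace(tmp_link, link)


def read_default(store):
    """Read whatever version "..default" currently points at."""
    with open(os.path.join(store, DEFAULT)) as f:
        return f.read()
```

Any number of writes inside one open produce a single new version, which
avoids the one-version-per-append scenario. Each version in this sketch starts
empty rather than as a copy of the previous one.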

Hans

2002-03-11 09:47:00

by Harald Arnesen

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Alan Cox <[email protected]> writes:

>> It was fabulous at that time. The first time you create a file, it
>> gets ";1" appended to it's filename. When you edit it, it gets saved
>> under the same name, this time appended by ";2". Edit it again...
>> whell, you get the picture. Cleaning up was as simple as "$ PURGE
>> /KEEP=3" to keep the last three versions.

> Its trickier than that - because all your other semantics have to align,
> its akin to the undelete problem (in fact its identical). Do you version on
> a rewrite, on a truncate, only on an O_CREAT ?

The Sintran OS for the Norsk Data minicomputers had something similar. A
new version was created every time a file was opened for writing.

It had its disadvantages. A typical machine where I worked at the time
had one 60MB disk. However, you could set the number of copies on a
per-file-basis, so big databases wouldn't have to be duplicated.
--
Hilsen Harald.

2002-03-11 10:46:43

by Mark H. Wood

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Sun, 10 Mar 2002, Itai Nahshon wrote:
> On Sunday 10 March 2002 10:36, Hans Reiser wrote:
> > I think that if version control becomes as simple as turning on a plugin
> > for a directory or file, and then adding a little to the end of a
> > filename to see and list the old versions, Mom can use it.
>
> IIRC that was a feature in systems from DEC even before
> VMS (I'm talking about the late 70's). eg. file.txt;2 was revision 2
> of file.txt.
>
> I don't know if this feature was in the file-system or in the text editor
> that I have used.

It's part of the TOPS-20 filesystem. If you try to create a file which
already exists, you get a new version of the file with length zero. Each
file has a version limit in its directory entry, and when the limit is
exceeded the oldest version is automagically deleted. The version limit
is copied from the highest existing version to the new version, and the
limit on the highest version determines whether old versions are dropped.
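
The rules in the paragraph above can be modelled in a few lines of Python.
This is only an in-memory sketch of the behaviour as described here, not of
the actual TOPS-20 on-disk format; the class name VersionedName, the method
names, and the convention that a limit of 0 means "no limit" are assumptions
made for the example.

```python
class VersionedName:
    """In-memory model of one file name with numbered versions."""

    def __init__(self, default_limit=0):
        self.default_limit = default_limit   # 0 means "no limit" in this sketch
        self.versions = {}                   # version number -> (content, limit)

    def create(self):
        """Creating an existing name yields a new, empty, higher version."""
        if self.versions:
            highest = max(self.versions)
            number = highest + 1
            limit = self.versions[highest][1]   # limit copied from highest version
        else:
            number, limit = 1, self.default_limit
        self.versions[number] = (b"", limit)

        # The limit on the (new) highest version decides what gets dropped.
        if limit:
            for old in sorted(self.versions)[:-limit]:
                del self.versions[old]
        return number

    def set_limit(self, number, limit):
        """Change the version limit stored with one particular version."""
        content, _ = self.versions[number]
        self.versions[number] = (content, limit)
```

With a limit of 3, creating the name five times leaves versions 3, 4 and 5;
lowering the limit on version 5 to 1 and creating once more leaves only
version 6.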

VMS does something similar, although ODS-2 tries to be clever by packing
all of the versions' index-file pointers together after a single copy of
the version-less name in the directory block. Originally the two used
different punctuation to set off the version number, but when Digital
killed the PDP10 line VMS was adjusted to accept the TOPS-20 form as well,
as a sop to LCG customers who were being steered into an unfamiliar
product line. IIRC TOPS-20 names were name.extension.version, while VMS
native names are name.extension;version .

RSX-11 (VMS' ancestor) may have had versions too. I've only used the
hacked RSX20F variety used as the console monitor for KL10 systems, but I
seem to recall versioning there. Or maybe I'm recalling the RSX-11 flavor
(POS) which ran the Pro300 console on the VAX 8800.

> The basic features were not even close to what you get from RCS or
> SCCS.

Indeed. The only essential relationship between two versions of a file is
that their names resemble each other. The content is entirely distinct.
It was usually used to prevent the "oops, I shouldn't have saved that"
syndrome.

--
Mark H. Wood, Lead System Programmer [email protected]
; 11-Mar-2002 MHW Support the 2080

2002-03-11 11:05:05

by Mark H. Wood

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Sun, 10 Mar 2002, Alexander Viro wrote:
> On Sun, 10 Mar 2002, Alan Cox wrote:
>
> > Its trickier than that - because all your other semantics have to align,
> > its akin to the undelete problem (in fact its identical). Do you version on
> > a rewrite, on a truncate, only on an O_CREAT ?
>
> Even better, what do you do upon link(2)? Or rename(2) over one of the
> versions...

TOPS-20 never had the concept of link(2), and VMS has an anemic version
that nobody ever used so far as I know. (There are a few VMS programs
which use the ability to create a file with *no directory links at all*,
which occasionally leaves some interesting puzzles for the sysadmin after
a crash.)

Renaming would create a new version, I think, so it might push the oldest
version off the end. Explicitly renaming as existing version N should
have no side-effects (other than deletion of the original content of
version N).

> VMS is not UNIX. And union of these two will be hell - incompatible models,
> let alone features. "Well, I don't use <list of Unix features>" is not an
> answer - other people have different sets of things they don't use and you
> can be sure that every thing you don't care about is absolute must-have
> for somebody else.

True. Studying other OSes for useful ideas is sensible, but swiping those
ideas wholesale only works if the two are fairly closely aligned in the
affected area. Sometimes the idea needs major rework, and sometimes the
graft produces a monster.

--
Mark H. Wood, Lead System Programmer [email protected]
Our lives are forever changed. But *that* is exactly as it always was.

2002-03-11 11:33:06

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Mark H. Wood wrote:

>On Sun, 10 Mar 2002, Itai Nahshon wrote:
>
>>On Sunday 10 March 2002 10:36, Hans Reiser wrote:
>>
>>>I think that if version control becomes as simple as turning on a plugin
>>>for a directory or file, and then adding a little to the end of a
>>>filename to see and list the old versions, Mom can use it.
>>>
>>IIRC that was a feature in systems from DEC even before
>>VMS (I'm talking about the late 70's). eg. file.txt;2 was revision 2
>>of file.txt.
>>
>>I don't know if this feature was in the file-system or in the text editor
>>that I have used.
>>
>
>It's part of the TOPS-20 filesystem. If you try to create a file which
>already exists, you get a new version of the file with length zero. Each
>file has a version limit in its directory entry, and when the limit is
>exceeded the oldest version is automagically deleted. The version limit
>is copied from the highest existing version to the new version, and the
>limit on the highest version determines whether old versions are dropped.
>
>
If it isn't optional (on per file and/or per directory basis) for users,
it would be quite annoying.

Hans

2002-03-11 13:14:48

by Victor Yodaiken

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Mon, Mar 11, 2002 at 09:15:38AM +0300, Hans Reiser wrote:
> >The problem is that it doesn't play well with other things.
> >
> Your statement is information free so far, but could be the intro to an
> informative statement....;-)

e.g. link, copy, remove.

Look at the Plan9 backup plan. That was much better thought out.

>
> Hans
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
---------------------------------------------------------
Victor Yodaiken
Finite State Machine Labs: The RTLinux Company.
http://www.fsmlabs.com http://www.rtlinux.com

2002-03-11 14:06:59

by Luigi Genoni

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Revision control was a nightmare with the Aleph DB on dear old VMS.
I had to clean out older versions of the db every month because, of course,
I could not have more than 32K versions on the fs...
Well, Aleph was a nightmare itself, actually...

On Sun, 10 Mar 2002, Hans Reiser wrote:

> Itai Nahshon wrote:
>
> >On Sunday 10 March 2002 10:36, Hans Reiser wrote:
> >
> >>I think that if version control becomes as simple as turning on a plugin
> >>for a directory or file, and then adding a little to the end of a
> >>filename to see and list the old versions, Mom can use it.
> >>
> >
> >IIRC that was a feature in systems from DEC even before
> >VMS (I'm talking about the late 70's). eg. file.txt;2 was revision 2
> >of file.txt.
> >
>
> Was it easy? Did people like it? Any lessons/successes?
>
> Hans
>

2002-03-11 15:32:44

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Monday 11 March 2002 04:32 am, Hans Reiser wrote:
> Mark H. Wood wrote:
> >On Sun, 10 Mar 2002, Itai Nahshon wrote:
> >>On Sunday 10 March 2002 10:36, Hans Reiser wrote:
> >>>I think that if version control becomes as simple as turning on a plugin
> >>>for a directory or file, and then adding a little to the end of a
> >>>filename to see and list the old versions, Mom can use it.
> >>
> >>IIRC that was a feature in systems from DEC even before
> >>VMS (I'm talking about the late 70's). eg. file.txt;2 was revision 2
> >>of file.txt.
> >>
> >>I don't know if this feature was in the file-system or in the text editor
> >>that I have used.
> >
> >It's part of the TOPS-20 filesystem. If you try to create a file which
> >already exists, you get a new version of the file with length zero. Each
> >file has a version limit in its directory entry, and when the limit is
> >exceeded the oldest version is automagically deleted. The version limit
> >is copied from the highest existing version to the new version, and the
> >limit on the highest version determines whether old versions are dropped.
>
> If it isn't optional (on per file and/or per directory basis) for users,
> it would be quite annoying.
>
> Hans

Quoting from "VMS General User's Manual", section 2.1.1 Filenames, Types,
and Versions, "You can control the number of versions of a file by specifying
the /VERSION_LIMIT qualifier to the DCL commands CREATE/DIRECTORY, SET DIRECTORY,
and SET FILE."

It has been a while (about 12 years), but IIRC, you could set /VERSION_LIMIT=1 and
effectively get rid of the annoying versions. But some people, the Aunt Tillie
types, were always tripping over their shoelaces and unintentionally deleting files.
For those people, the version feature probably seemed a blessing rather than a
curse.

Steven

2002-03-11 15:51:57

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

[email protected] wrote:

>On Mon, Mar 11, 2002 at 09:15:38AM +0300, Hans Reiser wrote:
>
>>>The problem is that it doesn't play well with other things.
>>>
>>Your statement is information free so far, but could be the intro to an
>>informative statement....;-)
>>
>
>e.g. link, copy, remove.
>
In what respect do these not play well?

>
>
>
>Look at the Plan9 backup plan. That was much better thought out.
>
Also not enough said to be informative, but could be an introduction to
an informative statement.

2002-03-11 16:08:58

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Steven Cole wrote:

>
>Quoting from "VMS General User's Manual", section 2.1.1 Filenames, Types,
>and Versions, "You can control the number of versions of a file by specifying
>the /VERSION_LIMIT qualifier to the DCL commands CREATE/DIRECTORY, SET DIRECTORY,
>and SET FILE."
>
>It has been a while (about 12 years), but IIRC, you could set /VERSION_LIMIT=1 and
>effectively get rid of the annoying versions. But some people, the Aunt Tillie
>types, were always tripping over their shoelaces and unintentially deleting files.
>For those people, the version feature probably seemed a blessing rather than a
>curse.
>
>Steven
>
>
So with every command to create a directory you had to add an extra
parameter specifying that you didn't want extra versions or else you got
them?

Hans

2002-03-11 16:09:58

by Victor Yodaiken

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Mon, Mar 11, 2002 at 06:51:16PM +0300, Hans Reiser wrote:
> [email protected] wrote:
>
> >On Mon, Mar 11, 2002 at 09:15:38AM +0300, Hans Reiser wrote:
> >
> >>>The problem is that it doesn't play well with other things.
> >>>
> >>Your statement is information free so far, but could be the intro to an
> >>informative statement....;-)
> >>
> >
> >e.g. link, copy, remove.
> >
> In what respect do these not play well?
>
> >
> >
> >
> >Look at the Plan9 backup plan. That was much better thought out.
> >
> Also not enough said to be informative., but could be an introduction to
> an informative statement.

If you can't spend 10 seconds looking it up on Google, you must not
be interested.

>
>

--
---------------------------------------------------------
Victor Yodaiken
Finite State Machine Labs: The RTLinux Company.
http://www.fsmlabs.com http://www.rtlinux.com

2002-03-11 16:27:51

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Monday 11 March 2002 09:08 am, Hans Reiser wrote:
> Steven Cole wrote:
> >Quoting from "VMS General User's Manual", section 2.1.1 Filenames, Types,
> >and Versions, "You can control the number of versions of a file by
> > specifying the /VERSION_LIMIT qualifier to the DCL commands
> > CREATE/DIRECTORY, SET DIRECTORY, and SET FILE."
> >
> >It has been a while (about 12 years), but IIRC, you could set
> > /VERSION_LIMIT=1 and effectively get rid of the annoying versions. But
> > some people, the Aunt Tillie types, were always tripping over their
> > shoelaces and unintentially deleting files. For those people, the version
> > feature probably seemed a blessing rather than a curse.
> >
> >Steven
>
> So with every command to create a directory you had to add an extra
> parameter specifying that you didn't want extra versions or else you got
> them?
>
> Hans

That is not my recollection. What I remember is that our system administrator
set up people's accounts so that the default behaviour was as desired by
the individual. This has gotten me curious, so I went out to a storage container
and dug out an old VAX 4000/60 which hasn't run since about 1992. If it works,
I'll be able to answer with more than vague memories. At least for VMS 5.1, which
is just a bit out of date as the current version is 7.3 or so. Now, if I can just
remember the SYSTEM password. ;-)

Perhaps others whose VMS experience is more recent than mine can answer this question.
More generally, if the infrastructure for keeping file versions around is going
to be generated for other reasons, having the option to have file versions could
be useful for some people. I certainly remember people who loved that feature,
but I wasn't one of them.

Steven

2002-03-11 16:57:01

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

[email protected] wrote:

>>>
>>>
>>>Look at the Plan9 backup plan. That was much better thought out.
>>>
>>Also not enough said to be informative., but could be an introduction to
>>an informative statement.
>>
>
>If you can't spend 10 seconds looking it up on Google, you must not
>be interested.
>
>>
>
Look at the way some of Windows and Macintosh were implemented. Much
better thought out.

2002-03-11 17:08:51

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Steven Cole wrote:

>On Monday 11 March 2002 09:08 am, Hans Reiser wrote:
>
>
>
>Perhaps others whose VMS experience is more recent than mine can answer this question.
>More generally, if the infrastructure for keeping file versions around is going
>to be generated for other reasons, having the option to have file versions could
>be useful for some people. I certainly remember people who loved that feature,
>but I wasn't one of them.
>
>Steven
>
>

I don't use CVS for most papers, proposals, etc., that I write. If the
version control was turned on with something like a chattr, I would use
it, and emacs could get rid of that damned ~ file that clutters my ls
output. An ls of my home directory should list only the current versions
of my files, without the old versions (the disk space I don't care about,
it is the clutter that annoys).

Hans

2002-03-11 17:22:31

by Nikita Danilov

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hans Reiser writes:
> Steven Cole wrote:
>
> >On Monday 11 March 2002 09:08 am, Hans Reiser wrote:
> >
> >
> >
> >Perhaps others whose VMS experience is more recent than mine can answer this question.
> >More generally, if the infrastructure for keeping file versions around is going
> >to be generated for other reasons, having the option to have file versions could
> >be useful for some people. I certainly remember people who loved that feature,
> >but I wasn't one of them.
> >
> >Steven
> >
> >
>
> I don't use CVS for most papers, proposals, etc,, that I write. If the
> version control was turned on with something like a chattr, I would use
> it, and emacs could get rid of that damned ~ file that clutters my ls

ls -B

> commands of what should be a listing of only my current version of my
> home directory without old versions of files being listed (the disk
> space I don't care about, it is the clutter that annoys.)
>
> Hans
>

Nikita.

> -

2002-03-11 18:22:56

by Robert Pfister

[permalink] [raw]

Subject: VMS File versions (was RE: linux-2.5.4-pre1 - bitkeeper testing)

>That is not my recollection. What I remember is that our system admistrator
>set up people's accounts so that the default behaviour was as desired by
>the individual. This has gotten me curious, so I went out to a storage container
>and dug out an old VAX 4000/60 which hasn't run since about 1992. If it works,
>I'll be able to answer with more than vague memories. At least for VMS 5.1, which
>is just a bit out of date as the current version is 7.3 or so. Now, if can just
>remember the SYSTEM password. ;-)

My recollection is that you could set the version limit on a directory, and
this would propagate to all the files underneath, unless you explicitly
changed it.

>Perhaps others whose VMS experience is more recent than mine can answer this question.
>More generally, if the infrastructure for keeping file versions around is going
>to be generated for other reasons, having the option to have file versions could
>be useful for some people. I certainly remember people who loved that feature,
>but I wasn't one of them.

I liked file versions when I was doing intense development, and it took a
lot of concentration to keep it all straight. It confused the heck out of
many people, and most sysadmins ran batch jobs to "purge" every night -- so
you could only count on versions being available for a limited time.

Is anyone working on version support for a Linux filesystem, as well as all
the utilities that would need to change?

Robb

2002-03-11 18:43:36

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Monday 11 March 2002 09:25 am, Steven Cole wrote:
> On Monday 11 March 2002 09:08 am, Hans Reiser wrote:
> > Steven Cole wrote:
> > >Quoting from "VMS General User's Manual", section 2.1.1 Filenames,
> > > Types, and Versions, "You can control the number of versions of a file
> > > by specifying the /VERSION_LIMIT qualifier to the DCL commands
> > > CREATE/DIRECTORY, SET DIRECTORY, and SET FILE."
> > >
> > >It has been a while (about 12 years), but IIRC, you could set
> > > /VERSION_LIMIT=1 and effectively get rid of the annoying versions. But
> > > some people, the Aunt Tillie types, were always tripping over their
> > > shoelaces and unintentially deleting files. For those people, the
> > > version feature probably seemed a blessing rather than a curse.
> > >
> > >Steven
> >
> > So with every command to create a directory you had to add an extra
> > parameter specifying that you didn't want extra versions or else you got
> > them?
> >
> > Hans
>
> That is not my recollection. What I remember is that our system
> admistrator set up people's accounts so that the default behaviour was as
> desired by the individual. This has gotten me curious, so I went out to a
> storage container and dug out an old VAX 4000/60 which hasn't run since
> about 1992. If it works, I'll be able to answer with more than vague
> memories. At least for VMS 5.1, which is just a bit out of date as the
> current version is 7.3 or so. Now, if can just remember the SYSTEM
> password. ;-)
>

Apologies to all who don't care about VMS and file version numbers..
OK, no more vague memories. I got my old VAX 4000 powered up, and three
amazing things happened:

1) The VAX booted even though it had been gathering dust for 10 years.
2) I remembered the SYSTEM password, and my password too!
3) VMS 5.5-2 was Y2K ready in 1992, taking today's date with no problem.

I fiddled around a bit with VMS, and it looks like the following command set things
up for me so that I only have one version for any new files I create:

SET DIRECTORY/VERSION_LIMIT=1 SYS$SYSDEVICE:[USERS.STEVEN]

This change was persistent across logins. Hope this helps.

Steven

2002-03-11 19:16:08

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Steven Cole wrote:

>
>
>I fiddled around a bit with VMS, and it looks like the following command set things
>up for me so that I only have one version for any new files I create:
>
>SET DIRECTORY/VERSION_LIMIT=1 SYS$SYSDEVICE:[USERS.STEVEN]
>
>This change was persistant across logins. Hope this helps.
>
>Steven
>
>
This affects all directories and all files for user steven, or just one
directory?

Hans

2002-03-11 21:36:41

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Monday 11 March 2002 12:15 pm, Hans Reiser wrote:
> Steven Cole wrote:
> >I fiddled around a bit with VMS, and it looks like the following command
> > set things up for me so that I only have one version for any new files I
> > create:
> >
> >SET DIRECTORY/VERSION_LIMIT=1 SYS$SYSDEVICE:[USERS.STEVEN]
> >
> >This change was persistant across logins. Hope this helps.
> >
> >Steven
>
> This affects all directories and all files for user steven, or just one
> directory?

The above example affected all subsequently created files and subsequently
created directories under user steven, such as DKA300:[USERS.STEVEN.TESTTHIS].
Previously created directories retain their previous version_limit setting, which
I checked in DKA300:[USERS.STEVEN.HELLOWORLD]. Previously created files also
retain their previous version_limit setting.

I also set the version_limit for the whole disk (as SYSTEM) with
SET DIRECTORY/VERSION_LIMIT=1 DKA300:[000000], but again this only affected
subsequently created files and directories along with the files they contain.

I have not figured out how to set the version_limit retroactively; perhaps it is
not possible with a simple command. Obviously, you could do this with a DCL
script if you really wanted to.
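
The inheritance behaviour observed above (the limit is copied at creation
time and never pushed down afterwards) can be sketched as a small Python
model. This is an illustration of the observed semantics only, not of how VMS
stores the setting; the Entry class, the set_limit_recursively helper, and the
use of 0 for "no limit" are assumptions made for the example.

```python
class Entry:
    """A file or directory carrying its own version_limit (0 = unlimited here)."""

    def __init__(self, name, version_limit, is_dir):
        self.name = name
        self.version_limit = version_limit
        self.is_dir = is_dir
        self.children = {}          # only used when is_dir is True

    def create(self, name, is_dir=False):
        # The limit is copied from the parent once, at creation time.
        child = Entry(name, self.version_limit, is_dir)
        self.children[name] = child
        return child


def set_limit_recursively(entry, limit):
    """Stand-in for the DCL script: push a limit down to existing entries."""
    entry.version_limit = limit
    for child in entry.children.values():
        set_limit_recursively(child, limit)
```

Changing version_limit on a directory in this model affects only entries
created afterwards; an explicit walk such as set_limit_recursively is needed
to reach what already exists.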

Steven

2002-03-11 21:55:02

by Richard B. Johnson

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Mon, 11 Mar 2002, Steven Cole wrote:

> On Monday 11 March 2002 12:15 pm, Hans Reiser wrote:
> > Steven Cole wrote:
> > >I fiddled around a bit with VMS, and it looks like the following command
> > > set things up for me so that I only have one version for any new files I
> > > create:
[SNIPPED]

>
> I have not figured out how to set the version_limit retroactively; perhaps it is
> not possible with a simple command. Obviously, you could do this with a DCL
> script if you really wanted to.
>
> Steven
> -

$ SET PROC/PRIV=ALL
$ SET DEF DISK:[000000]
$ PURGE
$ RENAME *.* ;1

Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

Bill Gates? Who?

2002-03-11 22:01:02

by Richard B. Johnson

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Mon, 11 Mar 2002, Richard B. Johnson wrote:

> On Mon, 11 Mar 2002, Steven Cole wrote:
>
> > On Monday 11 March 2002 12:15 pm, Hans Reiser wrote:
> > > Steven Cole wrote:
> > > >I fiddled around a bit with VMS, and it looks like the following command
> > > > set things up for me so that I only have one version for any new files I
> > > > create:
> [SNIPPED]
>
> >
> > I have not figured out how to set the version_limit retroactively; perhaps it is
> > not possible with a simple command. Obviously, you could do this with a DCL
> > script if you really wanted to.
> >
> > Steven
> > -
>
> $ SET PROC/PRIV=ALL
> $ SET DEF DISK:[000000]
> $ PURGE
> $ RENAME *.* ;1
>

Oops...been a long time.... need:

$ RENAME [*...]*.* ;1

To follow the whole tree.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

Bill Gates? Who?

2002-03-11 22:22:42

[permalink] [raw]

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Monday 11 March 2002 02:54 pm, Richard B. Johnson wrote:
> On Mon, 11 Mar 2002, Steven Cole wrote:
> > On Monday 11 March 2002 12:15 pm, Hans Reiser wrote:
> > > Steven Cole wrote:
> > > >I fiddled around a bit with VMS, and it looks like the following
> > > > command set things up for me so that I only have one version for any
> > > > new files I create:
>
> [SNIPPED]
>
> > I have not figured out how to set the version_limit retroactively;
> > perhaps it is not possible with a simple command. Obviously, you could
> > do this with a DCL script if you really wanted to.
> >
> > Steven
> > -
>
> $ SET PROC/PRIV=ALL
> $ SET DEF DISK:[000000]
> $ PURGE
> $ RENAME *.* ;1
>
>
> Cheers,
> Dick Johnson

Sure, that cleans up everything and sets all the version numbers back to ;1,
but what I was pointing out is that previously created directories and previously
created files retain whatever version_limit setting they were created with. After
running your four lines, the disk is cleaner, but you'll still get multiple versions
even if you don't want multiple versions for those previously created directories
and files. I know, I just tried it with VMS 5.5-2. But this is all rather moot,
since the real topic at hand is not what VMS does or didn't do in the past, but rather
what we _might_ want certain linux filesystems to do (and not do) in the future.

Steven

2002-03-11 22:55:46

by James Antill

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hans Reiser <[email protected]> writes:

> Alexander Viro wrote:
>
> >
> >On Mon, 11 Mar 2002, Hans Reiser wrote:
> >
> >>So the problem was that it was not optional?
> >>
> >
> >The problem is that it doesn't play well with other things.
>
> Your statement is information free so far, but could be the intro to
> an informative statement....;-)

Think about what people want to do with SCM, think about how the
filesystem can help.
Just having a special flag to open() that enables versioning on close()
is useless to 99% of people IMO.

For something like that to be worth it, it'd need to support rename(),
symlink(), link(), unlink() and _importantly_ chmod()/chown() (you
don't want previous versions of a file becoming readable just because
you chmod() the final version).

Off the top of my head the places where I might find a use for it...

1. tar -x ... hack ... tell fs to generate diff
(can be done via cp -al now, but possibly easier with fs support)

2. Version control my mail box (the good readers have the mark deleted
and then purge, which removes some of the need). Version control
might be clumsy (say you delete 3 mails, and want only one back), and
doesn't work with a mailer that does any caching on the mailbox.

3. Putting an entire website under it.

...and all but 3. are probably better done a different way (a shared
library might be nice ... and would also be fs/kernel independent) and
I'd imagine it's overkill for 3. as the biggest problem is saving over
the wrong filename so undelete is enough.

--
# James Antill -- [email protected]
:0:
* ^From: .*james@and\.org
/dev/null

2002-03-12 00:15:10

by Robert Pfister

Subject: RE: linux-2.5.4-pre1 - bitkeeper testing

Steven Cole writes:

>Sure, that cleans up everything and sets all the version numbers back to ;1,
>but what I was pointing out is that previously created directories and
>previously created files retain whatever version_limit setting they were
>created with. After running your four lines, the disk is cleaner, but you'll
>still get multiple versions even if you don't want multiple versions for
>those previously created directories and files. I know, I just tried it with
>VMS 5.5-2.

There are two different commands in VMS:

$ set directory /limit=1 {directory name}

this sets the default behavior from that point down for new files

$ set file/limit=1 {list of files}

which sets the limit explicitly on files, and overrides the default for that
directory. You can specify [...]*.* to recurse through and set everything,
sort of like a bash loop of "$ for j in `find .` ; do xxx $j ; done"

>But this is all rather moot, since the real topic at hand is not what VMS
>does or didn't do in the past, but rather what we _might_ want certain linux
>filesystems to do (and not do) in the future.

With VMS, the default behavior is on, and it is a pain to turn off.

Under VMS, the versioning behavior is inherited from the directory in which
you are creating or modifying a file; if that directory has no attributes, it
defers to its parent. (Default file protections work in this manner as
well.)

Alternatively, storing a versioning attribute at every directory, with
"blank" meaning no versioning might be a better fit. It would certainly
make a mixed filesystem environment easier to handle.

Robb

2002-03-12 01:28:53

by Mark H. Wood

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Okay, I looked this up in the _VMS I/O User's Reference Manual: Part I_
for VMS v5.0 (latest I have in the programmer series -- they tore out our
single VMS system so that they could install two dozen hefty Windows boxes
to almost replace it).

I can find nothing explicit about creating directories (something that
virtually no one did at the $QIO level) so I'm *assuming* that one just
executes SYS$QIO IO$_CREATE|IO$M_CREATE using a filespec of the form
"somename.DIR;1", passing a file attribute descriptor of ATR$C_UCHAR with
FCH$M_DIRECTORY set, and a record attribute block with FAT$W_VERSIONS set
to the desired version limit default for the directory. Files entered
into that directory will get their limit set to this default unless they
are created with FIB$W_VERLIMIT nonzero. It looks as though this
defaulting mechanism is applied to subdirectories when they are created --
the system management essentials volume (I don't have the title handy)
makes special note that putting a version limit on the MFD (think of it as
the directory that the superblock points at) will cause that limit to
trickle down to all inferior directories.

Anyway it looks like this is how it works (and this is the way I remember
it): setting a version limit default on a directory causes any file to
have that version limit unless the file is created with or modified to
have an explicit version limit of its own. Subdirectories take on the
default limit of their superiors, so their dependent files will get the
original limit unless the subdirectory's limit default has been modified
or they are created with an explicit limit. Changing a default in any
directory only affects files subsequently created in that directory and in
any subsequently created subdirectories.

You can specify whether creating a file will supersede a file with equal
version number, or fail due to a file of that name already existing.

So to boil it down still further: if you usually want four versions
retained, set your home directory to default to four before you create any
files or subdirectories and you're all set. If you change your mind for
some file or directory, you can alter its limit/default with SET
FILE/VERSION_LIMIT or SET DIRECTORY/VERSION_LIMIT. The limit is a
per-file attribute with a per-directory default.
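
A minimal sketch of the defaulting rules as described above, written as a toy
model in Python rather than anything VMS actually runs. The class and method
names (Directory, create_file, default_limit) are invented for illustration,
and 0 is assumed to mean "no limit":

```python
# Toy model of the defaulting rules described above; this is not VMS code.
# Directory, create_file and default_limit are names invented for illustration.

class Directory:
    def __init__(self, parent=None, default_limit=0):
        # A subdirectory takes on its parent's default at creation time only.
        if parent is not None and not default_limit:
            default_limit = parent.default_limit
        self.default_limit = default_limit   # 0 means "no limit"
        self.files = {}                      # file name -> version limit

    def create_file(self, name, limit=0):
        # A nonzero explicit limit wins; otherwise use the directory default.
        self.files[name] = limit if limit else self.default_limit
        return self.files[name]


home = Directory(default_limit=4)
old_sub = Directory(parent=home)     # created while the default was 4
home.create_file("A.TXT")            # gets limit 4

home.default_limit = 1               # stands in for SET DIRECTORY/VERSION_LIMIT=1
new_sub = Directory(parent=home)     # created after the change

print(home.files["A.TXT"], old_sub.default_limit, new_sub.default_limit)
```

The existing file and the existing subdirectory keep the old limit of 4; only
the subdirectory created after the change picks up 1, which is the
non-retroactive behavior Steven observed.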

Ordinary users, and even ordinary programmers, never see this stuff; they
use a library called RMS which does the tedious directory-tree walking and
other messy $QIOs. Only a few of us are sick enough to use the ACP-QIO
interface. :-)

The oldest-version deletion magic only chops off the single lowest
version, so if you have four versions of FOO.BAR and you set the top
version's version limit to two, then create a fifth version, only the
first version gets deleted. To sync. up after lowering the version limit,
you must execute a PURGE command (or your code can contain equivalent
logic).

The version number does not wrap; when you get to version 32767 you cannot
create any more versions. This was actually used to foil the only malware
I ever heard of on VMS systems, which tried to create a copy of itself
under a name that had special meaning to the DECnet ACP. Setting that
file's version number to 32767 broke the WANK worm's infection strategy,
and was widely done until the worm-enabling bug was patched.
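
The trimming and no-wrap behavior in the two paragraphs above can be sketched
as follows. This is a simplified model under the assumptions stated in the
text (only the single lowest version is dropped on create; 32767 is the
ceiling); create_version and purge are hypothetical helper names, not VMS
interfaces:

```python
MAX_VERSION = 32767  # per the description above, version numbers do not wrap


def create_version(versions, limit):
    """Add a new highest version; drop only the single lowest if over limit."""
    next_version = (max(versions) if versions else 0) + 1
    if next_version > MAX_VERSION:
        raise OverflowError("no more versions can be created")
    versions = sorted(versions) + [next_version]
    if limit and len(versions) > limit:
        versions = versions[1:]      # chops exactly one version, not down to limit
    return versions


def purge(versions, keep):
    """Rough stand-in for PURGE: retain only the `keep` highest versions (keep >= 1)."""
    return sorted(versions)[-keep:]


after_create = create_version([1, 2, 3, 4], limit=2)
print(after_create)                  # versions 2..5 survive, still over the limit
print(purge(after_create, keep=2))   # only now are we down to two versions
```

With a file already sitting at version 32767, create_version raises instead of
wrapping, which is the property the anti-worm trick relied on.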

We probably ought to take this discussion somewhere else if it is to
continue, since I expect that the vast majority of LKML readers are
heartily sick of all this VMS gobbledygook. None of it has anything to do
with what most people think of as revision control.

--
Mark H. Wood, Lead System Programmer [email protected]
Our lives are forever changed. But *that* is exactly as it always was.

2002-03-12 07:55:06

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing (If you don't like the closed source nature of Bitkeeper, stop your whining and help out with reiserfs.)

Ok guys, those of you who have been saying that somehow link, etc., will
break and Unix cannot handle version control, I am sorry, Clearcase does
all this stuff in the filesystem, and it more or less works, and their
primary disadvantages are that they don't have reiserfs levels of
performance, their software is not tight and clean which makes being a
Clearcase Admin so much work that it funded the creation of Namesys, and
they are an expensive closed source solution. Once Josh's transactions
are in place, we can start convincing application writers that rename is
a sorry ass transactions API, and start thinking about coding basic
version control in the FS. My sysadmin thinks that views are the best
security model for network service offering processes to run with, and I
think he is right (chroot just isn't convenient enough to get used a lot
in practice).

If you don't like the closed source nature of Bitkeeper, stop your
whining and help out with reiserfs v5 (reiser5 development to start in
October, reiser4 is a prereq for reiser5 and feature freeze for reiser4
is in place). In the meantime, give Larry a thank you for giving us
something we wouldn't otherwise have and sorely need.

Hans

Steven Cole wrote:

>On Monday 11 March 2002 12:15 pm, Hans Reiser wrote:
>
>>Steven Cole wrote:
>>
>>>I fiddled around a bit with VMS, and it looks like the following command
>>>set things up for me so that I only have one version for any new files I
>>>create:
>>>
>>>SET DIRECTORY/VERSION_LIMIT=1 SYS$SYSDEVICE:[USERS.STEVEN]
>>>
>>>This change was persistent across logins. Hope this helps.
>>>
>>>Steven
>>>
>>This affects all directories and all files for user steven, or just one
>>directory?
>>
>
>The above example affected all subsequently created files and subsequently
>created directories under user steven, such as DKA300:[USERS.STEVEN.TESTTHIS].
>Previously created directories retain their previous version_limit setting, which
>I checked in DKA300:[USERS.STEVEN.HELLOWORLD]. Previously created files also
>retain their previous version_limit setting.
>
>I also set the version_limit for the whole disk (as SYSTEM) with
>SET DIRECTORY/VERSION_LIMIT=1 DKA300:[000000], but again this only affected
>subsequently created files and directories along with the files they contain.
>
>I have not figured out how to set the version_limit retroactively; perhaps it is
>not possible with a simple command. Obviously, you could do this with a DCL
>script if you really wanted to.
>
>Steven
>
>

So it is fair to say that all those folks who were irritated by the VMS
version control feature were just not VMS sophisticates. Thanks Steven.

2002-03-12 07:59:15

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Clearcase handles all of this in the filesystem, and it all works pretty
much reasonably. Lots of details, but let's worry about them after we
have a patch, shall we?....

Hans

James Antill wrote:

>Hans Reiser <[email protected]> writes:
>
>>Alexander Viro wrote:
>>
>>>On Mon, 11 Mar 2002, Hans Reiser wrote:
>>>
>>>>So the problem was that it was not optional?
>>>>
>>>The problem is that it doesn't play well with other things.
>>>
>>Your statement is information free so far, but could be the intro to
>>an informative statement....;-)
>>
>
> Think about what people want to do with SCM, think about how the
>filesystem can help.
> Just having a special flag to open() that enables versioning on close()
>is useless to 99% of people IMO.
>
> For something like that to be worth it it'd need to support rename(),
>symlink(), link(), unlink() and _importantly_ chmod()/chown() (you
>don't want previous versions of a file becoming readable just because
>you chmod() the final version).
>
> Off the top of my head the places where I might find a use for it...
>
>1. tar -x ... hack ... tell fs to generate diff
> (can be done via. cp -al now, but possibly easier with fs support)
>
>2. Version control my mail box (the good readers have the mark deleted
>and then purge, which removes some of the need). Version control
>might be clumsy (say you delete 3 mails, and want only one back), and
>doesn't work with a mailer that does any caching on the mailbox.
>
>3. Putting an entire website under it.
>
>...and all but 3. are probably better done a different way (a shared
>library might be nice ... and would also be fs/kernel independent) and
>I'd imagine it's overkill for 3. as the biggest problem is saving over
>the wrong filename so undelete is enough.
>

2002-03-12 18:09:11

by Thunder from the hill

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Hans Reiser wrote that versioning is just fine as long as you remember to
handle it.
I'd suggest setting the number of versions to one by default, else we
might run into the same trouble as they did with VMS...

Thunder

--
begin-base64 755 -
IyEgL3Vzci9iaW4vcGVybApteSAgICAgJHNheWluZyA9CSMgVGhlIHNjcmlw
dCBvbiB0aGUgbGVmdCBpcyB0aGUgcHJvb2YKIk5lbmEgaXN0IGVpbiIgLgkj
IHRoYXQgaXQgaXNuJ3QgYWxsIHRoZSB3YXkgaXQgc2VlbXMKIiB2ZXJhbHRl
dGVyICIgLgkjIHRvIGJlIChlc3BlY2lhbGx5IG5vdCB3aXRoIG1lKQoiTkRX
LVN0YXIuXG4iICA7CiRzYXlpbmcgPX4Kcy9ORFctU3Rhci9rYW5uXAogdW5z
IHJldHRlbi9nICA7CiRzYXlpbmcgICAgICAgPX4Kcy92ZXJhbHRldGVyL2Rp
XAplIExpZWJlL2c7CiRzYXlpbmcgPX5zL2Vpbi8KbnVyL2c7JHNheWluZyA9
fgpzL2lzdC9zYWd0LC9nICA7CiRzYXlpbmc9fnMvXG4vL2cKO3ByaW50Zigk
c2F5aW5nKQo7cHJpbnRmKCJcbiIpOwo=
====
Extract this and see what will happen if you execute my
signature. Just save it to file and do a
> uudecode $file | perl

2002-03-12 22:56:29

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Tue, Mar 12, 2002 at 10:58:45AM +0300, Hans Reiser wrote:
> Clearcase handles all of this in the filesystem, and it all works pretty
> much reasonably.

This is misleading--Clearcase stores versions on top of a normal
filesystem (like most other RCS's), and all manipulation is entirely
in user-space (over the network to server processes). The only
filesystem magic is that there are directories you cannot list (plus
permission semantics are a little funny).

Seems very different from what you're proposing, IIUC.

Andrew

2002-03-13 08:10:06

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Andrew Pimlott wrote:

>On Tue, Mar 12, 2002 at 10:58:45AM +0300, Hans Reiser wrote:
>
>>Clearcase handles all of this in the filesystem, and it all works pretty
>>much reasonably.
>>
>
>This is misleading--Clearcase stores versions on top of a normal
>filesystem (like most other RCS's), and all manipulation is entirely
>in user-space (over the network to server processes). The only
>filesystem magic is that there are directories you cannot list (plus
>permission semantics are a little funny).
>
>Seems very different from what you're proposing, IIUC.
>
>Andrew
>
>
I am sorry, but arguing over whether network filesystems have their
functionality outside the filesystem is not an argument I respect enough
to engage in. Clearcase is a filesystem. Views are built into the
filesystem.
It has user space utilities. It is still a filesystem despite having
user space utilities, and its functionality is in the filesystem.

Hans

2002-03-13 09:43:31

by Geert Uytterhoeven

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Tue, 12 Mar 2002, Andrew Pimlott wrote:
> On Tue, Mar 12, 2002 at 10:58:45AM +0300, Hans Reiser wrote:
> > Clearcase handles all of this in the filesystem, and it all works pretty
> > much reasonably.
>
> This is misleading--Clearcase stores versions on top of a normal
> filesystem (like most other RCS's), and all manipulation is entirely
^^^^^^^^
> in user-space (over the network to server processes). The only
^^^^^^^^^^^^^
> filesystem magic is that there are directories you cannot list (plus
> permission semantics are a little funny).

So what's that ClearCase file system driver doing in kernel space? If your
claims are true, we wouldn't be limited to running some specific (buggy) Red
Hat kernel when we would want to migrate development from Solaris to Linux.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven ------------- Sony Software Development Center Europe (SDCE)
[email protected] ------------------- Sint-Stevens-Woluwestraat 55
Voice +32-2-2908453 Fax +32-2-7262686 ---------------- B-1130 Brussels, Belgium

2002-03-13 14:58:18

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, Mar 13, 2002 at 10:39:28AM +0100, Geert Uytterhoeven wrote:
> On Tue, 12 Mar 2002, Andrew Pimlott wrote:
> > This is misleading--Clearcase stores versions on top of a normal
> > filesystem (like most other RCS's), and all manipulation is entirely
> ^^^^^^^^
> > in user-space (over the network to server processes). The only
> ^^^^^^^^^^^^^
> > filesystem magic is that there are directories you cannot list (plus
> > permission semantics are a little funny).
>
> So what's that ClearCase file system driver doing in kernel space?

Just providing a convenient view on the repository. The only write
operation you can do through the filesystem is write to the checked
out version. Checkin, checkout, branch, label, create new
file/directory, rename, link, chmod, etc are all user-space.

Also, you can use ClearCase without the filesystem (snapshot view)
and get all the same functionality.

Andrew

2002-03-13 15:28:32

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, Mar 13, 2002 at 11:09:40AM +0300, Hans Reiser wrote:
> I am sorry, but arguing over whether network filesystems have their
> functionality outside the filesystem is not an argument I respect enough
> to engage in. Clearcase is a filesystem. Views are built into the
> filesystem.
> It has user space utilities. It is still a filesystem despite having
> user space utilities, and its functionality is in the filesystem.

I guess I'm not going to convince you, but again I think that "its
functionality is in the filesystem" is misleading. A small fraction
of its functionality is in the filesystem. Checkin, etc is not in
the filesystem. You can use all Clearcase functions without the
filesystem.

You seem to be talking about putting all RCS functions in the
filesystem, using a minimum of new operations (open flags, ioctls),
and I'm saying you can't legitimately point to Clearcase and claim
"it's been done".

Andrew

2002-03-13 16:27:13

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, Mar 13, 2002 at 09:37:20AM -0500, Andrew Pimlott wrote:
> On Wed, Mar 13, 2002 at 10:39:28AM +0100, Geert Uytterhoeven wrote:
> > On Tue, 12 Mar 2002, Andrew Pimlott wrote:
> > > > This is misleading--Clearcase stores versions on top of a normal
> > > filesystem (like most other RCS's), and all manipulation is entirely
> > ^^^^^^^^
> > > > in user-space (over the network to server processes). The only
> > ^^^^^^^^^^^^^
> > > filesystem magic is that there are directories you cannot list (plus
> > > permission semantics are a little funny).
> >
> > So what's that ClearCase file system driver doing in kernel space?
>
> Just providing a convenient view on the repository. The only write
> operation you can do through the filesystem is write to the checked
> out version. Checkin, checkout, branch, label, create new
> file/directory, rename, link, chmod, etc are all user-space.
>
> Also, you can use ClearCase without the filesystem (snapshot view)
> and get all the same functionality.

Are you sure about that? Snapshots are just the cleartext files. The
set of operations you can do with a disconnected snapshot is extremely
limited, last I checked all you could do was edit the files.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2002-03-13 16:49:03

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

On Wed, Mar 13, 2002 at 08:26:47AM -0800, Larry McVoy wrote:
> On Wed, Mar 13, 2002 at 09:37:20AM -0500, Andrew Pimlott wrote:
> > Also, you can use ClearCase without the filesystem (snapshot view)
> > and get all the same functionality.
>
> Are you sure about that? Snapshots are just the cleartext files. The
> set of operations you can do with a disconnected snapshot is extremely
> limited, last I checked all you could do was edit the files.

Right, if you're disconnected from the network, that's all you can
do (and it sucks, because if you haven't already checked out the
file you want to edit, you have to "hijack" it, and clearcase deals
with hijacks in a brain-dead way).

But if you're on the network, you can use a snapshot view in the
same way as a dynamic view. You just don't get a "real-time" view
of the repository and the magic foo.c@@/ directories. (It has the
advantages that you can take it off the network and still have some
limited functionality; it's faster; and you don't have to run an old
Linux kernel and binary modules.)

So to be clear, I was definitely talking about using a snapshot view
while connected to the clearcase servers. In this case, you can do
everything you could do in a dynamic view.

Andrew

2002-03-13 19:18:58

Subject: Re: linux-2.5.4-pre1 - bitkeeper testing

Andrew Pimlott wrote:

>On Wed, Mar 13, 2002 at 08:26:47AM -0800, Larry McVoy wrote:
>
>>On Wed, Mar 13, 2002 at 09:37:20AM -0500, Andrew Pimlott wrote:
>>
>>>Also, you can use ClearCase without the filesystem (snapshot view)
>>>and get all the same functionality.
>>>
>>Are you sure about that? Snapshots are just the cleartext files. The
>>set of operations you can do with a disconnected snapshot is extremely
>>limited, last I checked all you could do was edit the files.
>>
>
>Right, if you're disconnected from the network, that's all you can
>do (and it sucks, because if you haven't already checked out the
>file you want to edit, you have to "hijack" it, and clearcase deals
>with hijacks in a brain-dead way).
>
>But if you're on the network, you can use a snapshot view in the
>same way as a dynamic view. You just don't get a "real-time" view
>of the repository and the magic foo.c@@/ directories. (It has the
>advantages that you can take it off the network and still have some
>limited functionality; it's faster; and you don't have to run an old
>Linux kernel and binary modules.)
>
>So to be clear, I was definitely talking about using a snapshot view
>while connected to the clearcase servers. In this case, you can do
>everything you could do in a dynamic view.
>
>Andrew
>
>
You are right that you aren't going to convince me. I was a clearcase
administrator, and I don't understand you at all. Using the network,
using an API in addition to VFS, using another filesystem as an
underlying cache, none of these things make it any less part of the
filesystem.

Hans

2002-03-13 21:39:39

Subject: filesystem transactions (was Re: linux-2.5.4-pre1 - bitkeeper testing)

On this thread, you (Hans) seem to be referring to some plan you have
for putting versioning functionality in the filesystem, and you seem to
think this somehow gives you (at least significant parts of) revision
control nearly for free. It isn't clear from just the messages in
this thread exactly what plan for versioning you have in mind.

It's an interesting topic, though. Is there a document available that
actually specifies what you have in mind?

Leaving aside the question of remote access, a useful filesystem
primitive for revision control would be the ability to quickly create
copy-on-write clones of trees (much like the Subversion model, but as
a true file system, and without the need to store modified files as
diffs).

One could do that reasonably well entirely in user space in a portable
way by using `link(2)' to create the clones and imposing a layer
between libc `open(2)' and the kernel call, though every program on
the system would have to be linked with that special version of
`open'. An in-kernel implementation would have the slight advantages
that it wouldn't require a special version of `open' and could,
perhaps, at the cost of some complexity, create clone trees more
cheaply when the expected case is that large subtrees will never be
modified in either the original or the copy.
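
A rough user-space sketch of that idea, in Python rather than as a libc shim:
the tree is cloned with hard links, and a wrapper around `open` breaks the
link before the first write. The names clone_tree and cow_open, and the
".cow" temporary suffix, are made up for this illustration; a real version
would have to sit underneath every program's `open`, as noted above.

```python
import os
import shutil


def clone_tree(src, dst):
    """Clone `src` as `dst`, hard-linking every regular file instead of copying it."""
    shutil.copytree(src, dst, copy_function=os.link)


def cow_open(path, mode="r", **kwargs):
    """Like open(), but give `path` a private copy first if it is shared and
    the caller is about to write to it."""
    writing = any(flag in mode for flag in "wa+x")
    if writing and os.path.exists(path) and os.stat(path).st_nlink > 1:
        tmp = path + ".cow"          # hypothetical temporary name
        shutil.copy2(path, tmp)      # private copy of the current contents
        os.replace(tmp, path)        # swap it in; other clones keep the old inode
    return open(path, mode, **kwargs)
```

Reads go straight through and leave the file shared; only the first write to
a shared file pays for a copy. The sketch ignores symlinks, races between
concurrent writers, and files opened by programs that bypass the wrapper.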

Another user-space approach, less successful at creating clones
quickly but portable, venerable, and not requiring a special version
of `open' is to make the clones read-only and create them with a
program that copies modified files, but links unmodified files to
their identical ancestors in earlier clones.
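
That second approach might look roughly like the following Python sketch.
It assumes a working tree `work`, an optional earlier clone `prev`, and a
destination `dest`; make_clone is a hypothetical name, and the byte-for-byte
comparison is just one way to decide that a file is unmodified.

```python
import filecmp
import os
import shutil


def make_clone(work, prev, dest):
    """Create a read-only clone of `work` at `dest`, hard-linking files that
    are identical to their counterparts in the earlier clone `prev`."""
    for root, _dirs, files in os.walk(work):
        rel = os.path.relpath(root, work)
        os.makedirs(os.path.normpath(os.path.join(dest, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.join(dest, rel, name)
            old = os.path.join(prev, rel, name) if prev else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)          # unmodified: share the ancestor's inode
            else:
                shutil.copy2(src, new)     # new or modified: store a fresh copy
                os.chmod(new, 0o444)       # clones are read-only
```

Each clone costs one directory entry per unchanged file plus a full copy of
every changed file, so it is slower to create than a pure link farm but needs
no special `open`.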

One can also do cheap tree cloning reasonably well using directory
stacks and an automounter: a solution based on kernel primitives with
no particular impact on the representation of the filesystem on disk,
implementable at a higher level and compatible with all underlying
disk representations.

Of course, automated file backups of the sort described in this thread
for VMS are not particularly helpful for revision control.

Finally, if clones really are cheap to create, that gives us an 80%
solution for generalized filesystem transactions. Adding the ability
to do page-based copy-on-write for individual files gives us 90%. Put
cheap and well designed user-defined name-spaces in combination
with those features, and we can watch Oracle fall down and go boom.

None of these approaches I've mentioned require anything special from
the filesystem representation on disk. There would be a severe
portability problem and performance limitations to any approach that
does rely on a particular filesystem representation.

So, what exactly is your plan?

-t

2002-03-14 08:26:44