2003-03-12 03:33:03

by Larry McVoy

Subject: [ANNOUNCE] BK->CVS (real time mirror)

We've been working on a gateway between BitKeeper and CVS to provide
the revision history in a form which makes the !BK people happy (or
happier).

We have the first pass of this completed and have a linux 2.5 tree on
kernel.bkbits.net and you can check out the tree as follows (please don't
do this unless you are a programmer and will be using this. Penguin
Computing provided the hardware and the bandwidth for that machine and
if you all melt down the network they could get annoyed. By all means
go for it if you actually write code, though, that's why it is there.)

mkdir ws
cd ws
cvs -d:pserver:[email protected]:/home/cvs co linux-2.5

Each of the releases is tagged; the tags are of the form v2_5_64, etc.

Linus had said in the past that someone other than us should do this but
as it turns out, to do a reasonable job you need BK source. So we did it.
What do we mean by a reasonable job? BitKeeper has an automatic branch
feature which captures all parallel development. It's cool but a bit
pedantic and it makes exporting to a different system almost impossible
if you try and match what BK does exactly. So we didn't. What we
(actually Wayne Scott) did was to write a graph traversal alg which
finds the longest path through the revision history which includes
all tags. For the 2.5 tree, that is currently 8298 distinct points.
Each of those points has been captured in CVS as a commit. If we did
our job correctly, each of these commits has the same timestamp across
all files. So you should be able to get any changeset out of the CVS
tree with the appropriate CVS command based on dates.
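
[Editor's sketch, not BitMover's code: the exporter source is not shown in
this thread, so the following only illustrates the idea of "the longest path
through the revision history which includes all tags". It assumes the history
is a DAG given as a dict of parent -> children, and that the tagged revisions
are supplied in order from oldest to newest. The revision names and the
example graph are hypothetical.]

```python
from functools import lru_cache


def longest_path(graph, start, end):
    """Return the longest path from start to end in a DAG, or None.

    graph maps each node to a list of its children. Recursion is fine for
    a toy graph; a history with thousands of points would want an
    iterative version over a topological order instead.
    """
    @lru_cache(maxsize=None)
    def best(node):
        if node == end:
            return (node,)
        candidates = [best(child) for child in graph.get(node, ())]
        candidates = [c for c in candidates if c is not None]
        if not candidates:
            return None  # dead end: this node never reaches `end`
        return (node,) + max(candidates, key=len)

    return best(start)


def longest_path_through(graph, waypoints):
    """Longest path visiting every waypoint, given oldest-to-newest.

    In a DAG the segments between consecutive waypoints cannot share
    nodes, so stitching together the longest segment for each pair gives
    the longest overall path through all of them.
    """
    path = [waypoints[0]]
    for a, b in zip(waypoints, waypoints[1:]):
        segment = longest_path(graph, a, b)
        if segment is None:
            raise ValueError(f"no path from {a} to {b}")
        path.extend(segment[1:])  # skip the shared waypoint
    return path


# Hypothetical history: a two-delta branch off 1.1 that merges at 1.3.
example_graph = {
    "1.1": ["1.2", "1.1.1.1"],
    "1.1.1.1": ["1.1.1.2"],
    "1.1.1.2": ["1.3"],
    "1.2": ["1.3"],
    "1.3": ["1.4"],
    "1.4": [],
}

# Treat 1.3 as a tagged release; 1.1 is the root and 1.4 the tip.
print(longest_path_through(example_graph, ["1.1", "1.3", "1.4"]))
```

[In this toy graph the branch is the longer route between 1.1 and 1.3, so it
is the one that would be linearized into CVS commits; the shorter parallel
delta 1.2 is what a traversal of this kind would leave out.]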

We also created a ChangeSet file in the CVS tree. It has no contents, it
serves as a place to capture the BK changeset comments. Each file which
is part of a changeset has an extra comment which is of the form

(Logical change 1.%d)

where the "1.%d" matches the changeset rev. So you can look for all files
that have (Logical change 1.300) in their comments to reconstruct the
changeset. NOTE! That information is actually redundant, the timestamps
are supposed to do the same thing, let us know if that is not working, we'll
redo it. I expect we'll find bugs, please be patient, it takes 4 hours of
CPU time on a 2.1GHz Athlon to do the conversion, that's a big part of
why this has taken so long. That's after a week's worth of optimizations.
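
[Editor's sketch: one way to regroup files into changesets from the
"(Logical change 1.%d)" marker described above. It assumes you have already
extracted (filename, commit comment) pairs from the CVS logs by some other
means; that extraction step, the function name and the sample data are all
hypothetical.]

```python
import re
from collections import defaultdict

# Matches the marker format given in the announcement, e.g. "(Logical change 1.300)".
LOGICAL_CHANGE = re.compile(r"\(Logical change (1\.\d+)\)")


def files_by_changeset(log_comments):
    """Group filenames by the changeset rev named in their commit comments.

    log_comments is an iterable of (filename, comment) pairs.
    Returns a dict mapping a rev such as "1.300" to a set of filenames.
    """
    changesets = defaultdict(set)
    for filename, comment in log_comments:
        for rev in LOGICAL_CHANGE.findall(comment):
            changesets[rev].add(filename)
    return changesets


# Made-up sample input, only to show the shape of the data.
sample_comments = [
    ("fs/ext2/inode.c", "fix a typo\n\n(Logical change 1.300)"),
    ("mm/memory.c", "same fix elsewhere\n\n(Logical change 1.300)"),
    ("Makefile", "bump version\n\n(Logical change 1.301)"),
]

for rev, files in sorted(files_by_changeset(sample_comments).items()):
    print(rev, sorted(files))
```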

Each ChangeSet delta has a BK rev associated with it in the comments.
We'll be giving you a small shell script which you can use to send Linus
patches that include the rev and we'll modify BK so that it can take
those patches with no patch rejects if you used that script.

We have a first pass of a real time gateway between BK and this CVS tree
done. Right now it is done by hand (by me) but as soon as it is debugged
you will see this tree being updated about 1-3 minutes after Linus pushes
to bkbits.

Once you guys look this over and decide you like it, we'll do the same
thing for the 2.4 tree.

We're also talking to an unnamed (in case it doesn't work out) Linux
company who may host bkbits.net for us. If they do that, we'll turn on
the GNU patch exporter feature in BKD. That means that you'll be able
to wget any changeset as a GNU patch, complete with checkin comments.
I'm working with Alan on the format, I think we're close though I have
to run the latest version past him.

If all of this sounds nice, it is. It was a lot of work for us to do
this and you might be wondering why we bothered. Well, for a couple of
reasons. First of all, it was only recently that I realized that because
BK is not free software some people won't run BK to get data out of BK.
It may be dense on my part, but I simply did not anticipate that people
would be that extreme, it never occurred to me. We did a ton of work to
make sure anyone could get their data out of BK but you do have to run
BK to get the data. I never thought of people not being willing to run
BK to get at the data. Second, we have maintained SCCS compatible file
formats so that there would be another way to get the data out of BK.
This has held us back in terms of functionality and performance. I had
thought there was some value in the SCCS format but recent discussions
on this list have convinced me that without the changeset information
the file format doesn't have much value.

Our goal is to provide the data in a way that you can get at it without
being dependent on us or BK in any way. As soon as we have this
debugged, I'd like to move the CVS repositories to kernel.org (if I can
get HPA to agree) and then you'll have the revision history and can live
without the fear of the "don't piss Larry off license". Quite frankly,
we don't like the current situation any better than many of you, so if
this addresses your concerns that will take some pressure off of us.

Another goal is to have the freedom to evolve our file formats to be
better, better performance and more features. SCCS is holding us back.
So you should look hard at what we are providing and figure out if it
is enough. If you come back with "well, it's not BitKeeper so it's
not enough" we'll just ignore that. CVS isn't BitKeeper. On the
other hand, we believe we have gone as far as is possible to provide
all of the information, checkin comments, data, timestamps, user names,
everything. The graph traversal alg captures information at an extremely
fine granularity, absolutely as fine as is possible. We have 8298 distinct
points over the 2.5.0 .. 2.5.64 set of changes, so it is 130 times finer
than the official releases. If you think something is missing, tell us,
we'll try and fix it.
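
[Editor's sketch: the "130 times finer" figure follows from the numbers in
the paragraph above, assuming the 2.5.0 .. 2.5.64 range is counted as 64
release-to-release steps. The function name is hypothetical.]

```python
def points_per_release(points, releases):
    """Average number of exported CVS commit points per official release."""
    return points / releases


# 8298 points over the 64 steps from 2.5.0 to 2.5.64.
ratio = points_per_release(8298, 64)
print(round(ratio))
```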

The payoff for you is that you have the data in a format that is not
locked into some tool which could be taken away. The payoff for us is
that we can evolve our tool as we see fit. We have that right today,
we can do whatever we want, but it would be anywhere from annoying
to unethical to do so if that meant that you couldn't get at the data
except through BitKeeper. So the "deal" here is that you get the data
in CVS (and/or patches + comments) and we get to hack the heck out of
the file format. Our changes are going to move far faster than CSSC or
anyone else could keep up without a lot of effort. On the other hand,
our changes are going to make cold cache performance be much closer to
hot cache performance, use a lot less disk space, a lot less memory,
and a lot less CPU.

So take a look and tell me what you think.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm


2003-03-12 04:05:58

by Ben Collins

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> The payoff for you is that you have the data in a format that is not
> locked into some tool which could be taken away. The payoff for us is
> that we can evolve our tool as we see fit. We have that right today,
> we can do whatever we want, but it would be anywhere from annoying
> to unethical to do so if that meant that you couldn't get at the data
> except through BitKeeper. So the "deal" here is that you get the data
> in CVS (and/or patches + comments) and we get to hack the heck out of
> the file format. Our changes are going to move far faster than CSSC or
> anyone else could keep up without a lot of effort. On the other hand,
> our changes are going to make cold cache performance be much closer to
> hot cache performance, use a lot less disk space, a lot less memory,
> and a lot less CPU.

Larry, I don't mean to start yet another anti-bitmover, anti-bitkeeper or
anti-larry flame-fest, but I have to be honest that I am a little bit
worried.

You are giving us approximately 90% of our data in exchange for the one
thing that made using bitkeeper not a total sellout; the fact that the
revision history of the repo was still accessible without proprietary
software.

I honestly appreciate the work that you and BitMover do for the kernel,
but not giving us access to 100% of _our_ data is unacceptable to me.
Quite honestly, I think your move is to restrict the possible
alternatives to the BK client (the CSSC based ones like I and others had
done), which were able to extract 100% of the data, even if they
couldn't make use of it in the same way as bitkeeper. At least it was
there.

You've made quite a marketing move. It's obvious to me, maybe not to
others. By providing this CVS gateway, you make it almost pointless to
work on an alternative client. Also by providing it, you make it easier
to get away with locking the revision history into a proprietary format.



--
Debian - http://www.debian.org/
Linux 1394 - http://www.linux1394.org/
Subversion - http://subversion.tigris.org/
Deqo - http://www.deqo.com/

2003-03-12 04:28:51

by H. Peter Anvin

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Followup to: <[email protected]>
By author: Larry McVoy <[email protected]>
In newsgroup: linux.dev.kernel
>
> If all of this sounds nice, it is. It was a lot of work for us to do
> this and you might be wondering why we bothered. Well, for a couple of
> reasons. First of all, it was only recently that I realized that because
> BK is not free software some people won't run BK to get data out of BK.
> It may be dense on my part, but I simply did not anticipate that people
> would be that extreme, it never occurred to me. We did a ton of work to
> make sure anyone could get their data out of BK but you do have to run
> BK to get the data. I never thought of people not being willing to run
> BK to get at the data. Second, we have maintained SCCS compatible file
> formats so that there would be another way to get the data out of BK.
> This has held us back in terms of functionality and performance. I had
> thought there was some value in the SCCS format but recent discussions
> on this list have convinced me that without the changeset information
> the file format doesn't have much value.
>

I can only speak for myself, but I didn't mind until the license ended
up having the "unless you hack on other tools" exception in it.
Personally, I value my freedom to hack on whatever I want a lot more
than the convenience of BK. This is a personal choice on my part and
may sound "extreme" to you, and other people have made other
tradeoffs, but for me freedom was the reason I started hacking Linux
instead of becoming a Win32 geek.

Having this capability available will certainly make life better for
everyone involved. Besides, "we won't hold your data hostage" is
actually a pretty nice selling argument.

>
> Our goal is to provide the data in a way that you can get at it without
> being dependent on us or BK in any way. As soon as we have this
> debugged, I'd like to move the CVS repositories to kernel.org (if I can
> get HPA to agree) and then you'll have the revision history and can live
> without the fear of the "don't piss Larry off license". Quite frankly,
> we don't like the current situation any better than many of you, so if
> this addresses your concerns that will take some pressure off of us.
>

I'm sure we can work something out. However, at the moment
zeus.kernel.org, our main server with lots and lots of bandwidth, is
starting to run into its limits, so I can't promise *when* that will
happen. Just putting in another server

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

2003-03-12 04:45:59

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Tue, Mar 11, 2003 at 08:39:19PM -0800, H. Peter Anvin wrote:
> I can only speak for myself, but I didn't mind until the license ended
> up having the "unless you hack on other tools" exception in it.
> Personally, I value my freedom to hack on whatever I want a lot more
> than the convenience of BK.

Yeah, that's cool, I don't blame you, it's a pretty extreme clause.
We may well drop it in the future if we feel we have pulled far enough
ahead that everyone else is just playing catchup. I do apologize for
that clause, I know it caused a lot of concern, but try and remember that
I'm "you" in that I'm just an engineer figuring this stuff out as I go.
We try and fix it as we go as well so there is hope.

> Having this capability available will certainly make life better for
> everyone involved. Besides, "we won't hold your data hostage" is
> actually a pretty nice selling argument.

Yup. I really thought that all the export stuff was !hostage but I
didn't factor in the license issues.

> > As soon as we have this
> > debugged, I'd like to move the CVS repositories to kernel.org (if I can
> > get HPA to agree)
>
> I'm sure we can work something out. However, at the moment
> zeus.kernel.org, our main server with lots and lots of bandwidth, is
> starting to run into its limits, so I can't promise *when* that will
> happen. Just putting in another server

We can certainly pay for a server, a server is not that much money and
is not an ongoing cost. So if that's the problem, I can get you a server
in less than a week, ping me off line and we'll work it out.

The main thing is that the CVS server and the tarball of the CVS repository
are *not* under our control. That's the only way some people are going to
believe that we're not out to screw them and it would be oh-so-nice to have
people think that, it really would.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-12 08:44:40

by Jens Axboe

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Tue, Mar 11 2003, Ben Collins wrote:
> You've made quite a marketing move. It's obvious to me, maybe not to
> others. By providing this CVS gateway, you make it almost pointless to
> work on an alternative client. Also by providing it, you make it easier
> to get away with locking the revision history into a proprietary format.

This is a really good point, deserves highlighting imho...

The BK candy is getting increasingly bitter to swallow here, I may just
have to drop it soon. A shame.

--
Jens Axboe

2003-03-12 10:17:13

by Andreas Dilger

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mar 12, 2003 09:55 +0100, Jens Axboe wrote:
> On Tue, Mar 11 2003, Ben Collins wrote:
> > You've made quite a marketing move. It's obvious to me, maybe not to
> > others. By providing this CVS gateway, you make it almost pointless to
> > work on an alternative client. Also by providing it, you make it easier
> > to get away with locking the revision history into a proprietary format.
>
> This is a really good point, deserves highlighting imho...
>
> The BK candy is getting increasingly bitter to swallow here, I may just
> have to drop it soon. A shame.

Sadly, some people see the dark side of everything. I don't see how making
a CVS repository available with comments and an as-good-as-you-can-do-with-CVS
equivalent of a BK changeset equals "locking the revision history into a
proprietary format". Yes, Larry said that this would allow him to change the
BK file format to break compatibility with CSSC, but it is no more "locked
away" now than before for those people who refuse to use BK.

Ironically, SCCS was a former "evil proprietary format" that was reverse
engineered to get CSSC, AFAIK. People are still free to update CSSC to
track BK if they so choose.

Some people will just never be happy no matter what you give them.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2003-03-12 10:21:15

by Jens Axboe

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12 2003, Andreas Dilger wrote:
> Some people will just never be happy no matter what you give them.

I've been very happy with BK, been using it since shortly after Linus started
doing so. Mostly out of curiosity at first, later because it was
actually quite useful. I even see myself as a fairly pragmatic
individual, but even so I do find it increasingly difficult to defend my
BK usage.

So please stop thinking you can judge that easily by pushing me into
your nice little 'some people will never be happy bla bla' category.

--
Jens Axboe

2003-03-12 10:47:10

by Andreas Dilger

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mar 12, 2003 11:31 +0100, Jens Axboe wrote:
> I've been very happy with BK, been using it since shortly after Linus started
> doing so. Mostly out of curiosity at first, later because it was
> actually quite useful. I even see myself as a fairly pragmatic
> individual, but even so I do find it increasingly difficult to defend my
> BK usage.

Interesting. I _had_ lumped you into the "unhappy with BK" camp that has
become so vocal on l-k these days. My apologies. I do find it sort of sad
that you (or anyone) actually have to defend your BK usage to others.

I'm personally a "do what you want and let others do what they want as
long as it doesn't interfere with me" kind of person, but it seems that
lots of people here have the opinion that they know what is better for
everyone else, and have no problem telling the list over and over about it.
Probably time to fork a linux-code-repository mailing list and have everyone
spend their time over there instead of rehashing BK flamewars and/or BK
replacement here every week.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2003-03-12 11:04:34

by Jens Axboe

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12 2003, Andreas Dilger wrote:
> On Mar 12, 2003 11:31 +0100, Jens Axboe wrote:
> > I've been very happy with BK, been using it since shortly after Linus started
> > doing so. Mostly out of curiosity at first, later because it was
> > actually quite useful. I even see myself as a fairly pragmatic
> > individual, but even so I do find it increasingly difficult to defend my
> > BK usage.
>
> Interesting. I _had_ lumped you into the "unhappy with BK" camp that has
> become so vocal on l-k these days. My apologies. I do find it sort of sad
> that you (or anyone) actually have to defend your BK usage to others.

No offense taken, and I personally don't have any sort of political
agenda that I care to voice on lkml :). That's part of where the
pragmatism comes in, I just don't care enough.

About every patch I sent here on lkml has been generated by bk for the
past year. I typically don't do commits, just have trees with pending
deltas and bk -r diffs -u does the job for me.

> I'm personally a "do what you want and let others do what they want as
> long as it doesn't interfere with me" kind of person, but it seems that
> lots of people here have the opinion that they know what is better for
> everyone else, and have no problem telling the list over and over about it.
> Probably time to fork a linux-code-repository mailing list and have everyone
> spend their time over there instead of rehashing BK flamewars and/or BK
> replacement here every week.

For me, I think Andrew's patch management scripts will do the job. And
yeah, the non-stop bk threads make me sick as well and are rarely read
here. So I better make this my last mail on the subject...

--
Jens Axboe

2003-03-12 11:10:14

by Jamie Lokier

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Andreas Dilger wrote:
> Ironically, SCCS was a former "evil proprietary format" that was reverse
> engineered to get CSSC, AFAIK. People are still free to update CSSC to
> track BK if they so choose.

Actually the SCCS format is documented in a manual page (although
there are a few annoying ambiguities in it).

-- Jamie

2003-03-12 16:08:13

by Ben Collins

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 03:26:14AM -0700, Andreas Dilger wrote:
> On Mar 12, 2003 09:55 +0100, Jens Axboe wrote:
> > On Tue, Mar 11 2003, Ben Collins wrote:
> > > You've made quite a marketing move. It's obvious to me, maybe not to
> > > others. By providing this CVS gateway, you make it almost pointless to
> > > work on an alternative client. Also by providing it, you make it easier
> > > to get away with locking the revision history into a proprietary format.
> >
> > This is a really good point, deserves highlighting imho...
> >
> > The BK candy is getting increasingly bitter to swallow here, I may just
> > have to drop it soon. A shame.
>
> Sadly, some people see the dark side of everything. I don't see how making
> a CVS repository available with comments and an as-good-as-you-can-do-with-CVS
> equivalent of a BK changeset equals "locking the revision history into a
> proprietary format". Yes, Larry said that this would allow him to change the
> BK file format to break compatibility with CSSC, but it is no more "locked
> away" now than before for those people who refuse to use BK.
>
> Ironically, SCCS was a former "evil proprietary format" that was reverse
> engineered to get CSSC, AFAIK. People are still free to update CSSC to
> track BK if they so choose.

At least SCCS is mostly ascii. Larry is talking about binary. Who knows,
maybe even encrypted and using some unknown compression method (I'm sure
if it's encrypted, it will be called "compression").


--
Debian - http://www.debian.org/
Linux 1394 - http://www.linux1394.org/
Subversion - http://subversion.tigris.org/
Deqo - http://www.deqo.com/

2003-03-12 16:02:59

by H. Peter Anvin

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Followup to: <[email protected]>
By author: Andreas Dilger <[email protected]>
In newsgroup: linux.dev.kernel
>
> Sadly, some people see the dark side of everything. I don't see how making
> a CVS repository available with comments and an as-good-as-you-can-do-with-CVS
> equivalent of a BK changeset equals "locking the revision history into a
> proprietary format". Yes, Larry said that this would allow him to change the
> BK file format to break compatibility with CSSC, but it is no more "locked
> away" now than before for those people who refuse to use BK.
>
> Ironically, SCCS was a former "evil proprietary format" that was reverse
> engineered to get CSSC, AFAIK. People are still free to update CSSC to
> track BK if they so choose.
>
> Some people will just never be happy no matter what you give them.
>

From what I can gather, the question is very simple:

"Can we get our data out of BK into some kind of open format?"

It's an important question. If the answer is "yes, but only the stuff
that can be mapped onto CVS" then that's a significant data loss, and
if BitMover changes the data format without documentation, then there
is no longer a way to get all the data out.

Presumably the CVS exporter could get augmented with some kind of
metadata export... perhaps an XML schema that describes how the
various points are to be linked or whatnot... it won't turn CVS into
BK overnight (so Larry can still sleep at night), but it would give
BitMover the freedom to change their data format.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

2003-03-12 16:20:16

by Dana Lacoste

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, 2003-03-12 at 11:13, H. Peter Anvin wrote:
> "Can we get our data out of BK into some kind of open format?"

> It's an important question. If the answer is "yes, but only the stuff
> that can be mapped onto CVS" then that's a significant data loss, and
> if BitMover changes the data format without documentation, then there
> is no longer a way to get all the data out.

This sounds like the old GPL argument.

The GPL'd redistributor has to supply the source, they don't have to
supply it in the format that's best for you, being an 80mm tape drive
cuz you're stuck in the punch card age.

Seriously, if CVS loses all that data, is that BK's fault? BK's so
powerful because it has more information than anyone else, but it's
not their fault (and it's not proprietary data) that no-one else can
deal with the data when it's exported, now is it????

It's not a significant data loss when you try to view a 24bpp image
on an 8bpp display, so it's not a significant data loss that CVS can't
handle the BK. If it could, Linus would've switched to CVS instead....

I'm not saying Larry's a God or anything, I'm just hoping you guys can
give it up already. Linus uses BK, nobody else needs to, so move on!

Dana Lacoste
Ottawa, Canada

2003-03-12 16:34:56

by John Bradford

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> Linus uses BK, nobody else needs to, so move on!

Seconded.

John.

2003-03-12 16:37:49

by Lars Marowsky-Bree

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On 2003-03-12T11:18:38,
Ben Collins <[email protected]> said:

> At least SCCS is mostly ascii. Larry is talking about binary. Who knows,
> maybe even encrypted and using some unknown compression method (I'm sure
> if it's encrypted, it will be called "compression").

*sigh* However, all Larry _could_ be talking about is that he wants to replace
the SCCS format by something more powerful. Nowhere did he say that the format
would not be documented.

Granted, he also did not say that it _would_ be, but you all are jumping so
hard on him based on the assumption that it would not be without knowing that
either, so maybe you could have just written to Larry and asked?

I'm rather agnostic to the BK debate: I think it is an awesome tool, and if it
gets the job done, I am all for it. If you don't want to use it for whatever
reason, that's fine too. And asking for the Linux Kernel data to be fully
available without using a proprietary tool also makes lots of sense. I also
agree with Larry that duplicating the work done in BK in an Open Source tool
is going to take quite a while and effort: I'm not going as far as saying that
it cannot be done, because Linux itself is the best example that it _can_ be
done if people really want to. But if you want, more power to you, too.

But the 'BK and Larry are evil to the bone because it is not GPL!' crap is
annoying the hell out of me. Shut up, will you all, pretty please? And could
you please first ask/clarify, then flame?


Sincerely,
Lars Marowsky-Brée <[email protected]>

--
SuSE Labs - Research & Development, SuSE Linux AG

"If anything can go wrong, it will." "Chance favors the prepared (mind)."
-- Capt. Edward A. Murphy -- Louis Pasteur

2003-03-12 16:58:17

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On 12 Mar 2003, Dana Lacoste wrote:

> Seriously, if CVS loses all that data, is that BK's fault? BK's so
> powerful because it has more information than anyone else, but it's
> not their fault (and it's not proprietary data) that no-one else can
> deal with the data when it's exported, now is it????

That's not the point. Larry does not own the data in the Linux repository,
this was one of the conditions for the bk usage, so Larry cannot say, that
you only get all the data if you use bk. If cvs can't represent all the
information, we have to find another solution.

bye, Roman

2003-03-12 17:19:03

by H. Peter Anvin

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Dana Lacoste wrote:
> On Wed, 2003-03-12 at 11:13, H. Peter Anvin wrote:
>
>>"Can we get our data out of BK into some kind of open format?"
>
>
>>It's an important question. If the answer is "yes, but only the stuff
>>that can be mapped onto CVS" then that's a significant data loss, and
>>if BitMover changes the data format without documentation, then there
>>is no longer a way to get all the data out.
>
>
> This sounds like the old GPL argument.
>
> The GPL'd redistributor has to supply the source, they don't have to
> supply it in the format that's best for you, being an 80mm tape drive
> cuz you're stuck in the punch card age.
>
> Seriously, if CVS loses all that data, is that BK's fault? BK's so
> powerful because it has more information than anyone else, but it's
> not their fault (and it's not proprietary data) that no-one else can
> deal with the data when it's exported, now is it????
>
> It's not a significant data loss when you try to view a 24bpp image
> on an 8bpp display, so it's not a significant data loss that CVS can't
> handle the BK. If it could, Linus would've switched to CVS instead....
>

You're missing the point completely.

Of course it's not BK's fault that CVS can't represent the data.
However, one of the (valid!) selling points of BK was "we won't hold
your data hostage." That requires that you can export both the data and
the metadata into some kind of open format. Since CVS clearly can't be
that open format (CVS being insufficiently powerful), the additional
metadata needs to be available in some kind of auxiliary form. It's
then, of course, not BK's fault that CVS can't possibly make use of that
auxiliary metadata.

-hpa


2003-03-12 17:24:00

by Ryan Anderson

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 05:47:41PM +0100, Lars Marowsky-Bree wrote:
> On 2003-03-12T11:18:38,
> Ben Collins <[email protected]> said:
>
> > At least SCCS is mostly ascii. Larry is talking about binary. Who knows,
> > maybe even encrypted and using some unknown compression method (I'm sure
> > if it's encrypted, it will be called "compression").
>
> *sigh* However, all Larry _could_ be talking about is that he wants to replace
> the SCCS format by something more powerful. Nowhere did he say that the format
> would not be documented.
>
> Granted, he also did not say that it _would_ be, but you all are jumping so
> hard on him based on the assumption that it would not be without knowing that
> either, so maybe you could have just written to Larry and asked?

This is, I think, an underlying assumption that people are making.

Larry, given that I *think* you've said that the algorithms are the
important parts, not the file format (at least in the past), would you
consider publicly documenting the file format?

Thanks for the CVS gateway, anyway, though.

--

Ryan Anderson
sometimes Pug Majere

2003-03-12 17:32:13

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

[BK is locking up our data]
[BitMover has to give us our data in an open format]
[The BK pill is oh-so-bitter]
[My tummy hurts and it's Larry's fault]

Boo hoo, cry me a river.

Those of you complaining ought to at least look before you complain.
You just assumed that we were screwing you and you couldn't be bothered
to verify it before you complained. We didn't screw you at all, all
the data is there. And BK itself has always had the ability to export
any data in any format, if you read the man pages you might notice that,
but that would be too much work, it's easier to complain.

If you had actually gone and looked at the CVS repository you would have
seen that there is nothing of value missing, in almost 100% of the files,
the full revision history is preserved:

CVS: 110,076 deltas over all files
BK: 121,891 deltas over all files

You guys don't have that much parallelism in your files and the exporter
is capturing all that it can which is virtually everything. It's worth
noting that many deltas in BK are just event recorders, they are just
empty merge delta noise and in fact many people have asked us to get rid
of them. Once again, it's easier to complain than think. I'm detecting
a trend.
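
[Editor's sketch: the two delta counts above can be turned into a coverage
figure. The function name is hypothetical; the counts are the ones quoted in
this message.]

```python
def delta_coverage(cvs_deltas, bk_deltas):
    """Percentage of BK deltas that have a counterpart in the CVS export."""
    return 100.0 * cvs_deltas / bk_deltas


coverage = delta_coverage(110_076, 121_891)
print(f"{coverage:.1f}% of BK deltas appear in CVS")
print(f"{121_891 - 110_076} deltas are not exported")
```

[That works out to roughly 90% by raw count, with 11,815 deltas not carried
over; the message above attributes many of those to empty merge deltas.]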

The graph traversal managed to capture an amazing amount of information,
it's bloody awesome, which you might have noticed if you had looked.
But, nooooo, let's just piss and moan. What a bunch of friggin' whiners.

The next time you open your mouth, the words that come out of it should be
"thank you". Nothing else, just that. If you can't say something nice,
now is a good time to say nothing at all because we are sick and tired of
dealing with people who complain far more than they code. I'm serious,
we've done way more than anyone could reasonably expect and you react
with no basis in fact, assume bad things that aren't true, don't bother
to look to see if there is a real problem, and don't bother to say thanks.
Aren't you the slightest bit ashamed of your behaviour?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-12 17:45:11

by John Bradford

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> > This sounds like the old GPL argument.
> >
> > The GPL'd redistributor has to supply the source, they don't have to
> > supply it in the format that's best for you, being an 80mm tape drive
> > cuz you're stuck in the punch card age.
> >
> > Seriously, if CVS loses all that data, is that BK's fault? BK's so
> > powerful because it has more information than anyone else, but it's
> > not their fault (and it's not proprietary data) that no-one else can
> > deal with the data when it's exported, now is it????
> >
> > It's not a significant data loss when you try to view a 24bpp image
> > on an 8bpp display, so it's not a significant data loss that CVS can't
> > handle the BK. If it could, Linus would've switched to CVS instead....
> >
>
> You're missing the point completely.
>
> Of course it's not BK's fault that CVS can't represent the data.
> However, one of the (valid!) selling points of BK was "we won't hold
> your data hostage." That requires that you can export both the data and
> the metadata into some kind of open format. Since CVS clearly can't be
> that open format (CVS being insufficiently powerful), the additional
> metadata needs to be available in some kind of auxiliary form. It's
> then, of course, not BK's fault that CVS can't possibly make use of that
> auxiliary metadata.

I thought that BK has been able to export everything to a text file
since the first version.

(Ah, but of course, unless that text file is available in EBCDIC, we
still have a problem...)

John.

2003-03-12 17:50:27

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On Wed, 12 Mar 2003, Larry McVoy wrote:

> Those of you complaining ought to at least look before you complain.
> You just assumed that we were screwing you and you couldn't be bothered
> to verify it before you complained. We didn't screw you at all, all
> the data is there. And BK itself has always had the ability to export
> any data in any format, if you read the man pages you might notice that,
> but that would be too much work, it's easier to complain.

Well, exactly the people who are most interested in the complete data
cannot do this, because they are not allowed to. They have to find
someone else to extract the data.

> But, nooooo, let's just piss and moan. What a bunch of friggin' whiners.
>
> The next time you open your mouth, the words that come out of it should be
> "thank you". Nothing else, just that.

Your whining is getting more and more unbearable. If you don't stop with
your insults, "fuck you" is the only thing you will hear. :-(

bye, Roman

2003-03-12 17:52:28

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> I thought that BK has been able to export everything to a text file
> since the first version.

bk export -tpatch -r1.900 > patch.1.900
bk changes -v -r1.900 > comments.1.900

Been there forever. So has ways to get all the metadata from the command
line without having to reverse engineer the file format. See

http://www.bitkeeper.com/manpages/bk-prs-1.html

it's all there. Always has been.

Wayne wanted me to point out that it is easy to write the BK to CVS exporter
completely from the command line, we prototyped it that way, the only
reason we rewrote part of it in C was for performance. The point being
that you guys could have done this yourself without help from us because
all the metadata is right there. Ditto for anyone else worried about
getting their data out of BK now or in the future. The whole point of
prs is to be able to have a will-always-work way to get at the data or
the metadata, it makes the file format a non-issue.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-12 18:23:46

by Ben Collins

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> CVS: 110,076 deltas over all files
> BK: 121,891 deltas over all files

(I can recalculate this if you tell me how many of the BK ones are empty
merge pointers)

90.31%

I wasn't far off by saying 90%. And don't tell me I can get all the
data, when in fact, I can't. Unless of course you give me an explicit
variance from your license, I pay for a license, or I get someone else
with BK to get me the data.
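
For anyone checking the figure, the 90.31% is simply the CVS delta count
divided by the BK delta count quoted above. A minimal sketch in Python; the
variable names are illustrative only, and the two counts are the ones given
in this thread:

```python
# Delta counts as quoted earlier in this thread.
cvs_deltas = 110_076
bk_deltas = 121_891

# Fraction of BK deltas that have a counterpart in the CVS export.
coverage = cvs_deltas / bk_deltas

# Deltas present in BK but not exported as separate CVS revisions.
missing = bk_deltas - cvs_deltas

print(f"coverage: {coverage:.2%}")
print(f"deltas without a CVS revision: {missing}")
```

The ratio says nothing about what those remaining deltas contain, which is
the part I could recalculate given the number of empty merge pointers.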

Larry, I am not trying to knock your efforts for the kernel. I am going
on record as saying "thank you, Larry". Linus has been much more
productive since using BK. The kernel patch quality and productivity of
the core kernel developers has increased. A new paradigm in source
control has come about.

But being a person who also has certain beliefs, I am not going to stand
on the side lines and watch the fight. Please don't drop me into the
pool of people who believe all source should be free. I work for
companies that retain some of their IP for good reason. I have signed
NDA's to get at source and work on things that I cannot give out for
free. I'm ok with that choice for a company. What I am not ok with, is
seeing something that I work with everyday slowly becoming engulfed in
gray area.

The kernel's revision history is always available. I get the cset
emails. I can extract all the info I want manually.

The problem I have is that you are going to make it so that the original
files that hold this data cannot be extracted in any meaningful way
without your tools. So if bitkeeper suddenly could not be used by Linus
or any others, for whatever reason, we are locked out of that original
dataset.

--
Debian - http://www.debian.org/
Linux 1394 - http://www.linux1394.org/
Subversion - http://subversion.tigris.org/
Deqo - http://www.deqo.com/

2003-03-12 18:27:21

by Diego Calleja

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Tue, 11 Mar 2003 23:16:21 -0500
Ben Collins <[email protected]> wrote:

> You've made quite a marketing move. It's obvious to me, maybe not to
> others. By providing this CVS gateway, you make it almost pointless to
> work on an alternative client. Also by providing it, you make it easier

I don't think so. This also bites Larry. If he does well enough, there'll be
some people here that won't use bitkeeper just because they can use the cvs
gateway and they don't need/miss the features they could get with bk.

And I don't think it prevents anyone from creating a free bk clone. I guess
that there's a lot of people out there interested in such a tool, and not
only for kernel development; this won't stop them.

As far as I can see, Larry is just wasting time (money) to help kernel
development and the people who don't use BK just because it isn't free. And
he's not charging me, so I find this a good move for everybody. I can only
say thanks.


Diego Calleja

2003-03-12 18:36:44

by Ben Collins

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 07:38:06PM +0100, Arador wrote:
> On Tue, 11 Mar 2003 23:16:21 -0500
> Ben Collins <[email protected]> wrote:
>
> > You've made quite a marketing move. It's obvious to me, maybe not to
> > others. By providing this CVS gateway, you make it almost pointless to
> > work on an alternative client. Also by providing it, you make it easier
>
> I don't think so. This also bites Larry. If he does well enough, there'll be
> some people here that won't use bitkeeper just because they can use the cvs
> gateway and they don't need/miss the features they could get with bk.
>
> And I don't think it prevents anyone from creating a free bk clone. I guess
> that there's a lot of people out there interested in such a tool, and not
> only for kernel development; this won't stop them.
>
> As far as I can see, Larry is just wasting time (money) to help kernel
> development and the people who don't use BK just because it isn't free. And
> he's not charging me, so I find this a good move for everybody. I can only
> say thanks.

You're missing the point. I am not against the BK->CVS gateway. I'm all
for it. But it's kind of sour given that he now wants to change the disk
format of the repo to make it harder to get the data from it.

If all he announced was "you now have a BK->CVS repo", I wouldn't be
complaining, I'd be patting him on the back.



--
Debian - http://www.debian.org/
Linux 1394 - http://www.linux1394.org/
Subversion - http://subversion.tigris.org/
Deqo - http://www.deqo.com/

2003-03-12 18:53:03

by Sam Ravnborg

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 01:34:13PM -0500, Ben Collins wrote:
> > CVS: 110,076 deltas over all files
> > BK: 121,891 deltas over all files
>
> (I can recalculate this if you tell me how many of the BK ones are empty
> merge pointers)
>
> 90.31%
In the linux-2.5 tree there are 19300 changesets, of which there are
2705 empty changesets - 14%.
Public information that does not require a license to read.

> I wasn't far off by saying 90%. And don't tell me I can get all the
> data, when in fact, I can't.
What kind of data is actually _missing_ in the CVS repository?
By data I mean something useful!

Judging based on above numbers does not make much sense to me.
How does CVS handle a cset where the same patch got applied twice?
Does that count as a delta or not? Does that count as missing data?
Empty csets touching 20 files - do they count as deltas, etc.?
See, lots of open questions.

> Unless of course you give me an explicit
> variance from your license, I pay for a license, or I get someone else
> with BK to get me the data.
Browsing linux.bkbits.net does not require a license - or?

> What I am not ok with, is
> seeing something that I work with everyday slowly becoming engulfed in
> gray area.

Opinions vary of course. What I have seen is that the S/N ratio has
increased on lkml due to usage of BK, but...
1) Errors are fixed sooner when Linus applies patches that have errors
2) "make defconfig" can always compile on new kernel versions
3) I can follow what has been accepted in the kernel
4) I can generate patches that do not reject due to other changes
in a tree I cannot access
5) My "diff" patches get applied and credited to me
6) Valuable comments are preserved when patches are applied
7) The kernel src has become accessible in more (not less) formats
8) The changelogs posted upon release have been much more informative
So I simply do not recognize the pattern of "becoming engulfed".
I have even better access to the kernel src than I had in the past.
Several options exist, only one of them requires BK.
Now I even have access via CVS (not that I plan to use it)

As a happy BK user I get frustrated reading also this negative
stuff, and wanted to give Larry & Co a heads up.
A lot has improved after introducing BK.

But I see that whatever BitMover does is seen (by some people)
as negative.

Sam

2003-03-12 19:02:14

by Andreas Dilger

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mar 12, 2003 13:47 -0500, Ben Collins wrote:
> You're missing the point. I am not against the BK->CVS gateway. I'm all
> for it. But it's kind of sour given that he now wants to change the disk
> format of the repo to make it harder to get the data from it.

Actually, that is purely YOU reading something into what he wrote. He
didn't say "now I'm going to make the repo harder to get data from it".
What he said was "now I'm free to change the format from SCCS to something
that is more efficient for BK to use". Who knows, maybe the new format
will be _easier_ to reverse engineer/parse using 3rd party tools?

Also, it's not like he can change things overnight, because there are lots
of customers/users who have repos in the old SCCS format, and he doesn't
want to completely throw away his current code just to piss off some whiny
l-k users. At worst, if it bothers you so much, you can take up the now
seemingly forgotten Linux trait of "taking things into your own hands and
fixing it to your own needs" and write bk_evil_format_2_CVS conversion tool
instead of bitching on l-k about it.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2003-03-12 19:11:18

by Nicolas Pitre

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, 12 Mar 2003, Larry McVoy wrote:

> [BK is locking up our data]
> [BitMover has to give us our data in an open format]
> [The BK pill is oh-so-bitter]
> [My tummy hurts and it's Larry's fault]
>
> Boo hoo, cry me a river.
>
> Those of you complaining ought to at least look before you complain.
> You just assumed that we were screwing you and you couldn't be bothered
> to verify it before you complained. We didn't screw you at all, all
> the data is there.

Larry, please don't fall into that trap.

There will _always_ be people to whine at you. For _ever_, and even longer
if your business remains healthy.

There was a conflict issue before since one was forced into using BK in
order to have real time access to the very latest changeset in the reference
development kernel repository. Now that this issue has been resolved there
is rationally nothing else to complain about. Those who want to/can use BK
just do it, those who don't want to/can't use BK aren't penalized anymore
with regard to access to the very data they care about.

Oh then why are they still whining? Because they are humans with their own
ego, pride and beliefs. Maybe they are upset because you just removed the
best argument they had against BK up to now, maybe they have a hard time
convincing themselves that BK is superior because you managed to create a
favorable climate for its rapid development, maybe they are religious
extremists (read Free Software extremists) always about to go on a crusade
for the only True Way of living, etc. There's nothing you can do to help
those irrational issues besides just ignoring them. Linus himself just
mentioned recently that he's getting more and more effective at ignoring BSD
extremists for his Linux licensing choice. It's about time you do the same.

So please Larry don't let your feelings be trapped by criticism that will
_never_ end. There will still be people to disagree with your licensing
decisions, to hate your business model, and to simply hate you. But if
_you_ know that you've done everything to make this part of the world a
better place then it should be enough to make you feel good. Seeking
public rewards is mostly impossible as long as you're alive.

> The next time you open your mouth, the words that come out of it should be
> "thank you". Nothing else, just that.

Larry, I say "Thank you".

> If you can't say something nice, now is a good time to say nothing at all
> because we are sick and tired of dealing with people who complain far more
> than they code.

Then why do you let them turn you down? Why are those people so credible to
you so you feel you must listen to them?

> I'm serious, we've done way more than anyone could reasonably expect and
> you react with no basis in fact, assume bad things that aren't true, don't
> bother to look to see if there is a real problem, and don't bother to say
> thanks.

If you expect that aspect of human beings to change frankly you are
dreaming. I'm sorry to make it sound bad but that's reality. There will
_always_ be people to disagree and whine at something, and statistically
that something will be BK from time to time. Those who are happy with the
system usually just say nothing unfortunately and they go back to coding
right away.


Nicolas

2003-03-12 19:24:04

by Brandon Low

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Well I'm a nobody in linux kernel land, but I'm a long time lurker on LKML
and have been doing patchsets for the Gentoo Linux kernel for a while now.

I'd like to say that before Linus started using BK, close to 50% of the
revision data that is now saved was completely lost in the process of him
merging patches by hand into his repository. I mean be realistic, do you
think that Linus kept perfect track of EVERY single ' ' ';' ')' that he
changed when merging a patch with minor rejects with his repo? Do you
think that every single time that he made a 1 line change or merged a
1 line change that was sent to this list it was documented and recorded?
I doubt it. So now we are able to get a publicly available CVS repository
with close to two times the data that was ever available before, and
infinitely more than was ever available to anyone outside of Linus' own
head.

I personally think that Larry has done an amazing job supporting this
project and its goals, and I will give him a big "Thanks for all your
support and hard work" at this time. I think that those of you complaining
about this as bitmover clearing the road to steal our data should take a
long hard look at what you are really saying and consider what BK has
given us that we never had before because nothing before was ever usable
by Linus.

Thanks for reading (those who do),

--Brandon Low
Gentoo Linux Senior Developer

2003-03-12 19:23:15

by Nicolas Pitre

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, 12 Mar 2003, Ben Collins wrote:

> > CVS: 110,076 deltas over all files
> > BK: 121,891 deltas over all files
>
> (I can recalculate this if you tell me how many of the BK ones are empty
> merge pointers)
>
> 90.31%
>
> I wasn't far off by saying 90%. And don't tell me I can get all the
> data, when in fact, I can't.

What the hell don't you understand in the fact that the remaining 10% is
USELESS DATA WITH NO VALUE WHAT SO EVER ???

Oh of course you won't trust Larry and maybe he's trying to screw you with
that 10% by carefully crafting essential details in there so you'll end up
being forced into buying a BK license otherwise you won't be able to make
any sense of what happened in the source tree, or even make it compile!
Isn't it pure paranoia?

Please, get a life.


Nicolas

2003-03-12 19:29:48

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On Wed, 12 Mar 2003, Sam Ravnborg wrote:

> As a happy BK user I get frustrated reading also this negative
> stuff, and wanted to give Larry & Co a heads up.
> A lot has improved after introducing BK.

Nobody denies that things have become better, but Larry's arrogance makes
it extremely difficult to be thankful for that. Somehow Larry assumes
everyone should just use bk and be happy (and don't work on other SCM
systems), but that would be like asking that all people should just use
vimacs.

bye, Roman

2003-03-12 19:40:49

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> > Boo hoo, cry me a river.
>
> Larry, please don't fall into that trap.

I'm not, I was just blowing off steam. Ya gotta admit it is pretty
unreasonable for people to complain without even checking. I'm tailing
the history log in the CVS repository, there have been 3 or 4 downloads
since last night, that's it. All that whining and no looking at all.

> > If you can't say something nice, now is a good time to say nothing at all
> > because we are sick and tired of dealing with people who complain far more
> > than they code.
>
> Then why do you let them turn you down? Why are those people so credible to
> you so you feel you must listen to them?

Well, I agree that they should be able to get at the information in a
neutral way. I guess it was unrealistic, but I was expecting that people
would go download the CVS tree and poke around and see if it is what they
wanted. We could have had a nice technical discussion about what was
missing, if anything. If the discussion had happened, they would have
found out that even for the missing deltas we captured the information.

Here's an example. Suppose the graph is like

1.1 (torvalds) -> 1.2 (alan) -> 1.3 (sct) -> 1.4 (torvalds)
\ /
\-> 1.1.1.1 (davej) --------/

and we picked the straight 1.1 to 1.4 path. When we created the CVS 1.4
delta, we knew that it was a merge delta and we needed to capture the
data off on the branch. We already capture the contents, the missing part
is what davej may have typed in as comments. We capture that as well, it
looks like this:

revision 1.342
date: 2003/03/07 15:39:16; author: torvalds; state: Exp; lines: +7 -1
[PATCH] kbuild: Smart notation for non-verbose output

2003/03/05 19:50:27-06:00 kai
kbuild: Make build stop on vmlinux link error

(Logical change 1.8166)

That particular example is from the top level Makefile, Linus merged
in Kai's work and we added the "kbuild: Make build stop on vmlinux link
error" comments from the merged in delta. If there were more than one
delta, they get merged as well, so the rlog output is completely accurate.

So we actually captured 100% of the checkin information, both in data
files and in the pseudo ChangeSet file, not one byte of that is lost.
All we did is collapse all the branches into the longest possible straight
line, which is actually for many purposes nicer than the rat's nests that
you get with BK.
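
To make the collapsing concrete, here is a rough Python sketch on the
five-delta graph above. It is not the BitKeeper exporter: the real traversal
also has to pass through every tagged changeset, which this toy ignores, and
the names `PARENTS`, `COMMENTS`, `longest_path` and `linearize`, as well as
the comment strings, are made up for illustration. It picks the longest chain
from the first delta to the tip and, at a merge delta, appends the comments
of the branch deltas that were left off the chain.

```python
from functools import lru_cache

# Hypothetical toy history matching the diagram: delta -> list of parents.
PARENTS = {
    "1.1": [],
    "1.2": ["1.1"],
    "1.3": ["1.2"],
    "1.1.1.1": ["1.1"],
    "1.4": ["1.3", "1.1.1.1"],  # merge delta
}
# Hypothetical checkin comments, one per delta.
COMMENTS = {
    "1.1": "initial",
    "1.2": "alan's change",
    "1.3": "sct's change",
    "1.1.1.1": "davej's change",
    "1.4": "merge",
}


def longest_path(parents, tip):
    """Return the longest root-to-tip chain of deltas, oldest first."""
    @lru_cache(maxsize=None)
    def best(node):
        chains = [best(p) for p in parents[node]]
        longest = max(chains, key=len, default=())
        return longest + (node,)
    return list(best(tip))


def ancestors(parents, node):
    """All deltas reachable from node, including node itself."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen


def linearize(parents, comments, tip):
    """Collapse the graph onto its longest path. Comments of deltas that
    are skipped get folded into the first on-path delta that merges them."""
    path = longest_path(parents, tip)
    covered = set()
    result = []
    for node in path:
        reach = ancestors(parents, node)
        skipped = sorted(reach - covered - {node})
        log = [comments[node]] + [comments[s] for s in skipped]
        result.append((node, log))
        covered |= reach
    return result


history = linearize(PARENTS, COMMENTS, "1.4")
for rev, log in history:
    print(rev, log)
```

Here `history` holds four entries, 1.1 through 1.4, and the log for 1.4
carries davej's comment after the merge comment, the same shape as the rlog
excerpt above.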

Anyway, to get back to your question, what gets me down is that we did
what we believe to be the absolute perfect job. All the data is captured,
all the checkin comments are captured, we made all the dates go forward
properly so that diffs would work, there is nothing wrong with the CVS
tree, it's perfect. It would have been nice if people had actually
looked at it. You can, go look at

http://linux.bkbits.net:8080/linux-2.5/hist/Makefile

and compare it to this:

cvs -d:pserver:[email protected]:/home/cvs rlog linux-2.5/Makefile

Poke around, play with your favorite files, you'll see your checkin
comments, we didn't lose anything at all. Apparently, that's too much
to ask, and that's what gets me down. I don't expect people to say
"rah rah, you guys are great" but I did expect the people who have been
bitching non-stop that they can't get what they want would at least go
see if they could get it now. Silence would be a more than adequate
reward as far as I'm concerned, I don't need the strokes but I am sick
of the baseless whining. Fear not, I'll get over it.

What I expected from Ben was a polite request for a tarball of the CVS
tree so he could go convert it to SVN and see if all his checkin comments
are there. No such request, polite or otherwise, has happened, from him
or anyone else. So it's becoming apparent that the whole data/metadata
whatever is a red herring, they just want to flame. Whatever, somebody
will get some good use out of the CVS trees, if I were in your shoes
I'd want them as a safety net so it's cool they exist.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-12 19:42:56

by Ben Collins

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 02:32:57PM -0500, Nicolas Pitre wrote:
> On Wed, 12 Mar 2003, Ben Collins wrote:
>
> > > CVS: 110,076 deltas over all files
> > > BK: 121,891 deltas over all files
> >
> > (I can recalculate this if you tell me how many of the BK ones are empty
> > merge pointers)
> >
> > 90.31%
> >
> > I wasn't far off by saying 90%. And don't tell me I can get all the
> > data, when in fact, I can't.
>
> What the hell don't you understand in the fact that the remaining 10% is
> USELESS DATA WITH NO VALUE WHAT SO EVER ???
>
> Oh of course you won't trust Larry and maybe he's trying to screw you with
> that 10% by carefully crafting essential details in there so you'll end up
> being forced into buying a BK license otherwise you won't be able to make
> any sense of what happened in the source tree, or even make it compile!
> Isn't it pure paranoia?
>

What part of the structure of the BK repo don't you understand? Didn't
you pay attention to what Larry said? The tree looks like branches that
always return to the trunk. To put this into CVS, he had to choose a
line of those branches that contained the _most_ changesets (which
doesn't always equate to the most important, or largest deltas). There
are some changesets on the side that are not included here. Are all of
those changesets empty merges? No.

--
Debian - http://www.debian.org/
Linux 1394 - http://www.linux1394.org/
Subversion - http://subversion.tigris.org/
Deqo - http://www.deqo.com/

2003-03-12 19:58:35

by Ben Collins

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> So we actually captured 100% of the checkin information, both in data
> files and in the pseudo ChangeSet file, not one byte of that is lost.
> All we did is collapse all the branches into the longest possible straight
> line, which is actually for many purposes nicer than the rat's nests that
> you get with BK.

Now that wasn't apparent from your original post. You made it sound more
like meta-data was missing (we all know that via the merges, as long as
you picked a line, none of the diffs were missing).

That's good to know, and makes my whole rant kind of pointless now.

I still don't like your move to change the SCCS format. Regardless of
your intentions, it makes my gut hurt. That doesn't mean I don't like
you or what you do. I never said I didn't in this discussion, although
some people decided to get rude on my behalf, when they shouldn't which
triggered others to target people in the discussion with rudeness, which
shouldn't have happened.

I can dislike the change, just the same as you can license your product
and develop it any way you want. The day someone takes that right away
from either of us is when we have a real problem.

--
Debian - http://www.debian.org/
Linux 1394 - http://www.linux1394.org/
Subversion - http://subversion.tigris.org/
Deqo - http://www.deqo.com/

2003-03-12 19:59:21

by Ben Collins

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> > Oh of course you won't trust Larry and maybe he's trying to screw you with
> > that 10% by carefully crafting essential details in there so you'll end up
> > being forced into buying a BK license otherwise you won't be able to make
> > any sense of what happened in the source tree, or even make it compile!
> > Isn't it pure paranoia?
> >
>
> What part of the structure of the BK repo don't you understand? Didn't
> you pay attention to what Larry said? The tree looks like branches that
> always return to the trunk. To put this into CVS, he had to choose a
> line of those branches that contained the _most_ changesets (which
> doesn't always equate to the most important, or largest deltas). There
> are some changesets on the side that are not included here. Are all of
> those changesets empty merges? No.

s/changeset/metadata/ to clarify my point, which is of no use now.

--
Debian - http://www.debian.org/
Linux 1394 - http://www.linux1394.org/
Subversion - http://subversion.tigris.org/
Deqo - http://www.deqo.com/

2003-03-12 20:03:33

by Sam Ravnborg

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 11:51:20AM -0800, Larry McVoy wrote:
> is what davej may have typed in as comments. We capture that as well, it
> looks like this:
>
> revision 1.342
> date: 2003/03/07 15:39:16; author: torvalds; state: Exp; lines: +7 -1
> [PATCH] kbuild: Smart notation for non-verbose output

Ho humm, I did this not Linus.
Checked the web which is correct.

Same goes for 1.340 for the Makefile. Kai did it, not Linus.

Sam

2003-03-12 20:07:36

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 09:14:16PM +0100, Sam Ravnborg wrote:
> On Wed, Mar 12, 2003 at 11:51:20AM -0800, Larry McVoy wrote:
> > is what davej may have typed in as comments. We capture that as well, it
> > looks like this:
> >
> > revision 1.342
> > date: 2003/03/07 15:39:16; author: torvalds; state: Exp; lines: +7 -1
> > [PATCH] kbuild: Smart notation for non-verbose output
>
> Ho humm, I did this not Linus.
> Checked the web which is correct.
>
> Same goes for 1.340 for the Makefile. Kai did it, not Linus.

Right you are, looks like a bug. I'm digging into it. I suspect I have
an off by one delta error but I'm not sure. Please be aware that it
takes 4.5 hours of CPU on our fastest machine to do the conversion so
fixed trees are probably due tomorrow.

If you want to help, see if you can find a pattern in the user names, being
aware that the revision numbers don't map one to one.
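
One way to look for such a pattern, sketched in Python under loud
assumptions: you have already lined up the deltas of one file by changeset
(for example via the "(Logical change 1.%d)" comments), giving two equally
ordered author lists, one from the BK web view and one from cvs rlog. The
function name `best_shift` and the sample authors are hypothetical, not real
data from the tree. The sketch tries small offsets and reports which one
makes the CVS authors agree with the BK authors most often; a clear winner
at -1 or +1 would point to an off by one.

```python
def best_shift(bk_authors, cvs_authors, max_shift=2):
    """Try offsets k in [-max_shift, max_shift] and score how often
    cvs_authors[i] equals bk_authors[i + k]. Returns (best_k, scores)."""
    scores = {}
    for k in range(-max_shift, max_shift + 1):
        pairs = [
            (cvs_authors[i], bk_authors[i + k])
            for i in range(len(cvs_authors))
            if 0 <= i + k < len(bk_authors)
        ]
        matches = sum(c == b for c, b in pairs)
        scores[k] = matches / len(pairs) if pairs else 0.0
    return max(scores, key=scores.get), scores


# Hypothetical sample: every CVS delta carries the author of the previous
# BK delta, which is what an off-by-one in the exporter would produce.
bk = ["torvalds", "kai", "sam", "alan", "davej", "torvalds"]
cvs = ["torvalds", "torvalds", "kai", "sam", "alan", "davej"]

shift, scores = best_shift(bk, cvs)
print("best shift:", shift)
print("match rate per shift:", scores)
```

If no offset stands out, the misattribution is probably not a simple shift
and the pattern has to be looked for elsewhere.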

Thanks for pointing this out, this was what I wanted.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-12 20:09:42

by Jeff Garzik

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Ben Collins wrote:
> What part of the structure of the BK repo don't you understand? Didn't
> you pay attention to what Larry said? The tree looks like branches that
> always return to the trunk. To put this into CVS, he had to choose a
> line of those branches that contained the _most_ changesets (which
> doesn't always equate to the most important, or largest deltas). There
> are some changesets on the side that are not included here. Are all of
> those changesets empty merges? No.


What are you attempting to insinuate?

I think BK has had positive benefits for all Linux kernel hackers,
whether or not they use BitKeeper. The current patch flow is pretty
amazing, and we seem to have avoided the Linus burn-out that
threatened to come to pass pre-BK.

It's getting to the point where it seems like every time BitMover does
something to appease the non-BK crowd, people (a) read evil intentions
into the action, and/or (b) just keep asking for more.

Can someone please tell me what the overall point is? What is the endgame
here? What is a concrete technical solution that will satisfy you
(Ben), Pavel, and everybody else?

Jeff



2003-03-12 20:26:54

by Nicolas Pitre

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, 12 Mar 2003, Ben Collins wrote:

> On Wed, Mar 12, 2003 at 02:32:57PM -0500, Nicolas Pitre wrote:
> > On Wed, 12 Mar 2003, Ben Collins wrote:
> >
> > > > CVS: 110,076 deltas over all files
> > > > BK: 121,891 deltas over all files
> > >
> > > (I can recalculate this if you tell me how many of the BK ones are empty
> > > merge pointers)
> > >
> > > 90.31%
> > >
> > > I wasn't far off by saying 90%. And don't tell me I can get all the
> > > data, when in fact, I can't.
> >
> > What the hell don't you understand in the fact that the remaining 10% is
> > USELESS DATA WITH NO VALUE WHAT SO EVER ???
> >
> > Oh of course you won't trust Larry and maybe he's trying to screw you with
> > that 10% by carefully crafting essential details in there so you'll end up
> > being forced into buying a BK license otherwise you won't be able to make
> > any sense of what happened in the source tree, or even make it compile!
> > Isn't it pure paranoia?
> >
>
> What part of the structure of the BK repo don't you understand? Didn't
> you pay attention to what Larry said? The tree looks like branches that
> always return to the trunk. To put this into CVS, he had to choose a
> line of those branches that contained the _most_ changesets (which
> doesn't always equate to the most important, or largest deltas). There
> are some changesets on the side that are not included here. Are all of
> those changesets empty merges? No.

Did you at least care to inspect the resulting CVS repository before
complaining? It looks like you didn't, and yet you are still so quick to
dismiss what Larry and co have done.


Nicolas

2003-03-12 20:32:06

by Alan

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, 2003-03-12 at 17:08, Roman Zippel wrote:
> this was one of the conditions for the bk usage, so Larry cannot say, that
> you only get all the data if you use bk. If cvs can't represent all the
> information, we have to find another solution.

CVS can't represent it all because CVS isn't up to the job. If the rest
exists as comments then it's your problem to write a VCS that can extract
the comment data and represent it in full.

2003-03-12 20:36:22

by Nicolas Pitre

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, 12 Mar 2003, Sam Ravnborg wrote:

> On Wed, Mar 12, 2003 at 11:51:20AM -0800, Larry McVoy wrote:
> > is what davej may have typed in as comments. We capture that as well, it
> > looks like this:
> >
> > revision 1.342
> > date: 2003/03/07 15:39:16; author: torvalds; state: Exp; lines: +7 -1
> > [PATCH] kbuild: Smart notation for non-verbose output
>
> Ho humm, I did this not Linus.
> Checked the web which is correct.
>
> Same goes for 1.340 for the Makefile. Kai did it, not Linus.

It seems that some things that should have been attributed to me (or others)
are listed as from torvalds too.

Example: drivers/char/tty_io.c

revision 1.59
date: 2003/03/04 02:13:05; author: torvalds; state: Exp; lines: +4 -6
small tty irq race fix

(Logical change 1.8144)


Nicolas

2003-03-12 20:39:33

by H. Peter Anvin

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Larry McVoy wrote:
>>I thought that BK has been able to export everything to a text file
>>since the first version.
>
>
> bk export -tpatch -r1.900 > patch.1.900
> bk changes -v -r1.900 > comments.1.900
>
> Been there forever. So has ways to get all the metadata from the command
> line without having to reverse engineer the file format. See
>
> http://www.bitkeeper.com/manpages/bk-prs-1.html
>
> it's all there. Always has been.
>
> Wayne wanted me to point out that it is easy to write the BK to CVS exporter
> completely from the command line, we prototyped it that way, the only
> reason we rewrote part of it in C was for performance. The point being
> that you guys could have done this yourself without help from us because
> all the metadata is right there. Ditto for anyone else worried about
> getting their data out of BK now or in the future. The whole point of
> prs is to be able to have a will-always-work way to get at the data or
> the metadata, it makes the file format a non-issue.
>

This is a Good Thing[TM] for a whole bunch of reasons.

Maybe this output could be made available automatically in addition to
the CVS tree? If bandwidth is a concern then I reiterate what I said
offline yesterday, if you can give me a ballpark idea of what the
requirements seem to be I'll start hunting for a place to park a
.kernel.org server dedicated to this task.

-hpa



2003-03-12 20:52:00

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 03:46:58PM -0500, Nicolas Pitre wrote:
> It seems that some things that should have been attributed to me (or others)
> are listed as from torvalds too.
>
> Example: drivers/char/tty_io.c
>
> revision 1.59
> date: 2003/03/04 02:13:05; author: torvalds; state: Exp; lines: +4 -6
> small tty irq race fix
>
> (Logical change 1.8144)

Yeah, I'm almost there, I'm pretty sure that what is happening is that
the user name is being picked up from the changeset which is current in
the path. We extract the user name and put it in the comments but I
don't see where we set $LOGNAME before doing the ci.

So here's a question. Suppose we have a series of deltas being clumped
together in a file. All made by different people. Whose name wins?
My gut is to sort them, run them through uniq -c, and take the top one.
The other idea is to count up lines inserted/deleted over each delta
and take the user who has done the most work.
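
Both ideas can be sketched in a few lines. This is Python rather than
shell, and it is only an illustration: the (user, inserted, deleted) tuples,
the function names, and the sample numbers are made up and are not taken
from the real exporter.

```python
from collections import Counter

# Hypothetical deltas clumped into one CVS commit: (user, inserted, deleted).
deltas = [
    ("torvalds", 1, 0),
    ("kai", 7, 1),
    ("torvalds", 2, 1),
    ("sam", 20, 5),
]

def by_delta_count(deltas):
    """Rough equivalent of sort | uniq -c: the user with the most deltas wins."""
    counts = Counter(user for user, _, _ in deltas)
    # Ties are broken alphabetically so the result is deterministic.
    return min(counts, key=lambda u: (-counts[u], u))

def by_lines_changed(deltas):
    """The user with the most inserted plus deleted lines wins."""
    work = Counter()
    for user, inserted, deleted in deltas:
        work[user] += inserted + deleted
    return min(work, key=lambda u: (-work[u], u))
```

With this sample the two rules disagree: counting deltas picks torvalds
(two deltas), while counting lines picks sam (25 lines touched).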

Thoughts?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-12 20:56:59

by Daniel Jacobowitz

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 11:51:20AM -0800, Larry McVoy wrote:
> Well, I agree that they should be able to get at the information in a
> neutral way. I guess it was unrealistic, but I was expecting that people
> would go download the CVS tree and poke around and see if it is what they
> wanted. We could have had a nice technical discussion about what was
> missing, if anything. If the discussion had happened, they would have
> found out that even for the missing deltas we captured the information.
>
> Here's an example. Suppose the graph is like
>
> 1.1 (torvalds) -> 1.2 (alan) -> 1.3 (sct) -> 1.4 (torvalds)
> \ /
> \-> 1.1.1.1 (davej) --------/
>
> and we picked the straight 1.1 to 1.4 path. When we created the CVS 1.4
> delta, we knew that it was a merge delta and we needed to capture the
> data off on the branch. We already capture the contents, the missing part
> is what davej may have typed in as comments. We capture that as well, it
> looks like this:
>
> revision 1.342
> date: 2003/03/07 15:39:16; author: torvalds; state: Exp; lines: +7 -1
> [PATCH] kbuild: Smart notation for non-verbose output
>
> 2003/03/05 19:50:27-06:00 kai
> kbuild: Make build stop on vmlinux link error
>
> (Logical change 1.8166)
>
> That particular example is from the top level Makefile, Linus merged
> in Kai's work and we added the "kbuild: Make build stop on vmlinux link
> error" comments from the merged in delta. If there were more than one
> delta, they get merged as well, so the rlog output is completely accurate.

Larry, this brings up something I was meaning to ask you before this
thread exploded. What happens to those "logical change" numbers over
time?

My understanding (since you mentioned ~ 3 minute latency on BK pushes,
not five hour latency) was that future changes would go into CVS
incrementally. As far as I understand the "revision numbers" that BK
uses, they're subject to change. Based on what trees Linus merges
from, a revision number in his repository may not always have the same
number. So the comments will become out-of-date and inaccurate.

Or am I wrong about the potential for change within a single
repository?


--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer

2003-03-12 21:00:05

by Nicolas Pitre

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, 12 Mar 2003, Larry McVoy wrote:

> On Wed, Mar 12, 2003 at 03:46:58PM -0500, Nicolas Pitre wrote:
> > It seems that some things that should have been attributed to me (or others)
> > are listed as from torvalds too.
> >
> > Example: drivers/char/tty_io.c
> >
> > revision 1.59
> > date: 2003/03/04 02:13:05; author: torvalds; state: Exp; lines: +4 -6
> > small tty irq race fix
> >
> > (Logical change 1.8144)
>
> Yeah, I'm almost there, I'm pretty sure that what is happening is that
> the user name is being picked up from the changeset which is current in
> the path. We extract the user name and put it in the comments but I
> don't see where we set $LOGNAME before doing the ci.
>
> So here's a question. Suppose we have a series of deltas being clumped
> together in a file. All made by different people. Whose name wins?
> My gut is to sort them, run them through uniq -c, and take the top one.
> The other idea is to count up lines inserted/deleted over each delta
> and take the user who has done the most work.
>
> Thoughts?

And/or add them all into the CVS log comment, with their full names when
available.

If such information can't be made into CVS directly then adding those to the
log comment is certainly the best thing to do.


Nicolas

2003-03-12 21:07:53

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> Larry, this brings up something I was meaning to ask you before this
> thread exploded. What happens to those "logical change" numbers over
> time?

They are stable in the CVS tree because the CVS tree isn't distributed.
So "Logical change 1.900" in the context of the exported CVS tree is
always the same thing. That's one advantage centralized has, things
don't shift around on you.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-12 21:08:03

by Eli Carter

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Larry McVoy wrote:
> On Wed, Mar 12, 2003 at 03:46:58PM -0500, Nicolas Pitre wrote:
>
>>It seems that some things that should have been attributed to me (or others)
>>are listed as from torvalds too.
>>
>>Example: drivers/char/tty_io.c
>>
>>revision 1.59
>>date: 2003/03/04 02:13:05; author: torvalds; state: Exp; lines: +4 -6
>>small tty irq race fix
>>
>>(Logical change 1.8144)
>
>
> Yeah, I'm almost there, I'm pretty sure that what is happening is that
> the user name is being picked up from the changeset which is current in
> the path. We extract the user name and put it in the comments but I
> don't see where we set $LOGNAME before doing the ci.
>
> So here's a question. Suppose we have a series of deltas being clumped
> together in a file. All made by different people. Whose name wins?
> My gut is to sort them, run them through uniq -c, and take the top one.
> The other idea is to count up lines inserted/deleted over each delta
> and take the user who has done the most work.
>
> Thoughts?

Another option:
Choose the name that _removed_ the most lines.

Reward the desired behaviour. ;)

Wha? Right, back to work.

Eli
--------------------. "If it ain't broke now,
Eli Carter \ it will be soon." -- crypto-gram
eli.carter(a)inet.com `-------------------------------------------------

2003-03-12 21:20:35

by Daniel Jacobowitz

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 01:18:32PM -0800, Larry McVoy wrote:
> > Larry, this brings up something I was meaning to ask you before this
> > thread exploded. What happens to those "logical change" numbers over
> > time?
>
> They are stable in the CVS tree because the CVS tree isn't distributed.
> So "Logical change 1.900" in the context of the exported CVS tree is
> always the same thing. That's one advantage centralized has, things
> don't shift around on you.

OK, so the logical change numbers there are only related to the CVS
tree, not related to revision numbers in the BK tree being converted?
That makes more sense, thank you.

--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer

2003-03-12 21:23:19

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 04:31:08PM -0500, Daniel Jacobowitz wrote:
> On Wed, Mar 12, 2003 at 01:18:32PM -0800, Larry McVoy wrote:
> > > Larry, this brings up something I was meaning to ask you before this
> > > thread exploded. What happens to those "logical change" numbers over
> > > time?
> >
> > They are stable in the CVS tree because the CVS tree isn't distributed.
> > So "Logical change 1.900" in the context of the exported CVS tree is
> > always the same thing. That's one advantage centralized has, things
> > don't shift around on you.
>
> OK, so the logical change numbers there are only related to the CVS
> tree, not related to revision numbers in the BK tree being converted?

Correct. The BK revs are in there though, that's what

BKrev: <long string of bits>

is in the change log.

> That makes more sense, thank you.

Hey cool, you're welcome!
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-12 21:35:02

by Kai Germaschewski

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, 12 Mar 2003, Larry McVoy wrote:

> > Larry, this brings up something I was meaning to ask you before this
> > thread exploded. What happens to those "logical change" numbers over
> > time?
>
> They are stable in the CVS tree because the CVS tree isn't distributed.
> So "Logical change 1.900" in the context of the exported CVS tree is
> always the same thing. That's one advantage centralized has, things
> don't shift around on you.

Isn't there a more general problem, though? (I hope I'm wrong)

You want to update the CVS tree near-realtime. However, the longest-path
through your graph may change with new merges, but CVS of course cannot
cope with already committed data changing (already committed csets may
all of a sudden not be in the longest path anymore)? This is a CVS
limitation, of course, but still a problem AFAICS.
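
To illustrate the concern, here is a small sketch in Python. The graph, the
changeset names, and the longest_path helper are made up for illustration;
it is a plain longest-path search over a DAG, not the actual exporter
algorithm, which also has to keep every tag on the path.

```python
from functools import lru_cache

def longest_path(graph, start, end):
    """Return the longest path from start to end in a DAG.

    graph maps each changeset to the list of its children.
    """
    @lru_cache(maxsize=None)
    def best(node):
        if node == end:
            return (node,)
        candidates = [best(child) for child in graph.get(node, [])]
        candidates = [c for c in candidates if c]  # drop dead ends
        if not candidates:
            return ()
        return (node,) + max(candidates, key=len)
    return list(best(start))

# History as exported so far: trunk a -> b -> c, all three already in CVS.
before = {"a": ["b"], "b": ["c"]}

# Later a side branch a -> x -> y -> z is merged into d, the child of c.
after = {"a": ["b", "x"], "b": ["c"], "c": ["d"],
         "x": ["y"], "y": ["z"], "z": ["d"]}

old = longest_path(before, "a", "c")
new = longest_path(after, "a", "d")
# Changesets that were committed to CVS but are no longer on the longest path.
dropped = [cset for cset in old if cset not in new]
```

In this made-up history b and c end up in dropped: they were already
committed, yet the recomputed longest path goes through x, y and z instead.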

--Kai


2003-03-12 21:51:15

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 03:45:39PM -0600, Kai Germaschewski wrote:
> On Wed, 12 Mar 2003, Larry McVoy wrote:
>
> > > Larry, this brings up something I was meaning to ask you before this
> > > thread exploded. What happens to those "logical change" numbers over
> > > time?
> >
> > They are stable in the CVS tree because the CVS tree isn't distributed.
> > So "Logical change 1.900" in the context of the exported CVS tree is
> > always the same thing. That's one advantage centralized has, things
> > don't shift around on you.
>
> Isn't there a more general problem, though? (I hope I'm wrong)
>
> You want to update the CVS tree near-realtime. However, the longest-path
> through your graph may change with new merges, but CVS of course cannot
> cope with already committed data changing (already committed csets may
> all of a sudden not be in the longest path anymore)? This is a CVS
> limitation, of course, but still a problem AFAICS.

Yup, you're right, there is a tradeoff between real time updates and
best path. We've already seen it in incremental updates.

We were talking about this internally when your mail came in. I suspect
it isn't really a problem in practice because we can always redo the
entire export from scratch and get an optimal path.

Wayne pointed out that in the cases where it collapses a pile of csets
that is usually because Linus pulled some wad from somebody and one could
argue the collapse is a good thing. But it depends, sometimes it is and
sometimes it isn't. Our commercial users have frequently asked for a
way to "collapse the tree and clean up the noise in the graphs", in fact,
one called this morning and said "that BK to CVS thing, could that be a BK
to cleaner-BK thing?" so opinions vary on what is the perfect granularity.

My belief is that the real time updates is something that people value
more than the granularity. You guys can vote and if you reach agreement
we'll do what you want.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-12 22:13:00

by David Lang

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

and if you did real-time updates but once a month or so redid the
'longest-path' thing that would change the CVS version info, correct?

David Lang

On Wed, 12 Mar 2003, Larry McVoy wrote:

> On Wed, Mar 12, 2003 at 03:45:39PM -0600, Kai Germaschewski wrote:
> > On Wed, 12 Mar 2003, Larry McVoy wrote:
> >
> > > > Larry, this brings up something I was meaning to ask you before this
> > > > thread exploded. What happens to those "logical change" numbers over
> > > > time?
> > >
> > > They are stable in the CVS tree because the CVS tree isn't distributed.
> > > So "Logical change 1.900" in the context of the exported CVS tree is
> > > always the same thing. That's one advantage centralized has, things
> > > don't shift around on you.
> >
> > Isn't there a more general problem, though? (I hope I'm wrong)
> >
> > You want to update the CVS tree near-realtime. However, the longest-path
> > through your graph may change with new merges, but CVS of course cannot
> > cope with already committed data changing (already committed csets may
> > all of a sudden not be in the longest path anymore)? This is a CVS
> > limitation, of course, but still a problem AFAICS.
>
> Yup, you're right, there is a tradeoff between real time updates and
> best path. We've already seen it in incremental updates.
>
> We were talking about this internally when your mail came in. I suspect
> it isn't really a problem in practice because we can always redo the
> entire export from scratch and get an optimal path.
>
> Wayne pointed out that in the cases where it collapses a pile of csets
> that is usually because Linus pulled some wad from somebody and one could
> argue the collapse is a good thing. But it depends, sometimes it is and
> sometimes it isn't. Our commercial users have frequently asked for a
> way to "collapse the tree and clean up the noise in the graphs", in fact,
> one called this morning and said "that BK to CVS thing, could that be a BK
> to cleaner-BK thing?" so opinions vary on what is the perfect granularity.
>
> My belief is that the real time updates is something that people value
> more than the granularity. You guys can vote and if you reach agreement
> we'll do what you want.

2003-03-12 22:20:37

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 02:21:49PM -0800, David Lang wrote:
> and if you did real-time updates but once a month or so redid the
> 'longest-path' thing that would change the CVS version info, correct?

Exactly. So if we redo it then anyone who has active CVS workspaces will
get the wrong thing when they update if the revs have moved around and
they will.

I suspect the right answer is that we do the real time updates, see how it
goes, if it starts to suck we'll periodically toss the CVS tree and start
over.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-12 23:08:52

by Andreas Dilger

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mar 12, 2003 14:30 -0800, Larry McVoy wrote:
> On Wed, Mar 12, 2003 at 02:21:49PM -0800, David Lang wrote:
> > and if you did real-time updates but once a month or so redid the
> > 'longest-path' thing that would change the CVS version info, correct?
>
> Exactly. So if we redo it then anyone who has active CVS workspaces will
> get the wrong thing when they update if the revs have moved around and
> they will.
>
> I suspect the right answer is that we do the real time updates, see how it
> goes, if it starts to suck we'll periodically toss the CVS tree and start
> over.

What you could do is have a CVS "realtime" branch which is forked from the
trunk, say once a week, or whenever Linux makes a point release. On this
branch you do incremental updates as they are merged into CVS. When it is
time to create a new branch (say for 2.5.99-pre12), you re-do the export from
the branch base tag (at 2.5.99-pre11) to the current BK head in an "optimal"
way, and retag the "realtime" branch off of the new base tag.

We do this in our current CVS development, and CVS is smart enough to keep
local changes over the update and/or generate conflicts. That way, most
people can have a simple-but-good trunk to follow, but people who want
up-to-the-second updates can have it too, and you don't end up renumbering
the CVS revisions for times in the past. It also avoids the need to re-crunch
the entire BK repository to get an optimal path (which is only going to get
slower as time goes on, and I'm not sure whether Linux development is ahead
of or behind Moore's law).

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2003-03-12 23:47:48

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On Wed, 12 Mar 2003, Jeff Garzik wrote:

> It's getting to the point where it seems like every time BitMover does
> something to appease the non-BK crowd, people (a) read evil intentions
> into the action, and/or (b) just keep asking for more.

Where did you see anyone asking for more? People are worried that they
suddenly have less. If Larry is taking away the ability to extract
information from the SCCS files, something has to replace it. SCM
developers especially are worried that they will lose information: the Linux
repository is a very valuable resource, and since they can't use bk, they
always depend on others to get at that information as completely as
possible.

> Can someone please tell what is the overall point? what is the endgame
> here? what is a concrete technical solution that will satisfy you
> (Ben), Pavel, and everybody else?

Well, all this reminds me a bit of a missionary trying to convert the
pagans to the one true god and all the true believers can't understand,
why they don't want to be converted. If we could get our preacher to
preach a little less, we all could live peacefully together. :-)

bye, Roman

2003-03-13 00:28:52

by Martin J. Bligh

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> You're missing the point. I am not against the BK->CVS gateway. I'm all
> for it. But it's kind of sour given that he now wants to change the disk
> format of the repo to make it harder to get the data from it.
>
> If all he announced was "you now have a BK->CVS repo", I wouldn't be
> complaining, I'd be patting him on the back.

As long as we continue to get all the data in an open format, I'm
not sure this really matters, personally. If there's some data loss,
let's focus on that issue ... but it seems there isn't at the moment.

I'd rather we *didn't* go trying to clone BK and make it file-format
compatible underneath ... that seems more incendiary than useful.
Cloning other products is always a losing game, the best you can do
is catch them. Personally, I'd prefer we spent the effort making a
usable simple SCM that 95% of us can use that does merges and stuff,
and not bother trying to follow someone else in file format.

Of course, I'm in no position to dictate to others what they should
implement, do what you like ... just my personal opinion. But there's
always the possibility we can make something that fits kernel development
*better*, rather than playing catchup to BK all the time ;-)

M.

2003-03-13 00:31:11

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> > > revision 1.59
> > > date: 2003/03/04 02:13:05; author: torvalds; state: Exp; lines: +4 -6
> > > small tty irq race fix
> > >
> > > (Logical change 1.8144)
> >
> > Yeah, I'm almost there, I'm pretty sure that what is happening is that
> > the user name is being picked up from the changeset which is current in

OK, I think I have a fix, I'm starting another conversion and going out to
dinner. This was a lot harder to fix than I thought, it cost me the whole
day. I can't wait until Pavel is doing this crap.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-13 00:45:46

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> Of course, I'm in no position to dictate to others what they should
> implement, do what you like ... just my personal opinion. But there's
> always the possibility we can make something that fits kernel development
> *better*, rather than playing catchup to BK all the time ;-)

I like it when I agree with people, especially you since we've bumped
heads. It's much more fun to agree...

My personal opinion is that BK maps only so so well onto the kernel
development effort. It's not horrible, it's closer than any other SCM,
but it could be better. The kernel guys tend to be "more loose" than
commercial guys, i.e., stuff is tried, it sits in Alan's tree for a
while or DaveJ's tree and then is rejected if it is found to be bad.
You really need a sort of "lossy" SCM system, one which is willing to
throw data away. BK is absolutely not about losing information, we view
everything as valuable, even bad ideas. That matches the commercial
world better than the Linux world.

I _think_ that Arch is closer. You will definitely give up some stuff
if you move to Arch but you will also gain some stuff. Arch is willing
to pick and choose, we aren't, we're sort of an all or nothing answer.
Pavel is all hot and bothered about PRCS but PRCS is sort of BK without
the distribution, gui tools, and scripting. It's a step backwards as
far as I can tell (don't get me wrong, we've acknowledged the coolness
of PRCS on our website for years and I tried to team up with Josh, I'm
a fan). You should really look at Arch, it may be a better fit. And
these days, if you could find a better fit, none of us at BitMover
would shed a tear if you moved off BK. This has *not* been a pleasant
experience for us.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-13 01:48:03

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 09:14:16PM +0100, Sam Ravnborg wrote:
> On Wed, Mar 12, 2003 at 11:51:20AM -0800, Larry McVoy wrote:
> > is what davej may have typed in as comments. We capture that as well, it
> > looks like this:
> >
> > revision 1.342
> > date: 2003/03/07 15:39:16; author: torvalds; state: Exp; lines: +7 -1
> > [PATCH] kbuild: Smart notation for non-verbose output
>
> Ho humm, I did this not Linus.
> Checked the web which is correct.
>
> Same goes for 1.340 for the Makefile. Kai did it, not Linus.

There is a "fixed" (I hope) linux-2.4 tree up. We're still converting the
2.5 tree, ETA is about 6 hours (the fix substantially slowed down the
conversion process, did I mention that this stuff is a pain?). I'm going
out for a while but I'll send out mail when the 2.5 tree is up.

If you have worked on files in 2.4 please go poke at them at

cvs -d:pserver:[email protected]:/home/cvs rlog linux-2.4/<your file>

and see if you think that is accurate. Let me know either way. Thanks.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-13 02:48:01

by Aaron Lehmann

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 01:34:13PM -0500, Ben Collins wrote:
> But being a person who also has certain beliefs, I am not going to stand
> on the side lines and watch the fight.

Then I guess you'll get killfiled by a lot of people. I don't like
bk's licensing either but I'm not going to whine for months about it.
Linus knows how people feel about it and he's made a decision to use
bk anyway. People have tried to change his mind, and they have failed.
If you have a problem with bk, please just ignore it. Send diffs the
old way. Why not pick on Microsoft or Adobe? When you complain about
Larry trying to steal 10% of your data (wtf? Linus still releases
regularly. I don't see how we have any less data than before even
without BK), you're wasting your energy and annoying the huge fraction
of people who read this mailing list who don't care what software
Linus or anyone else wants to run.

The only thing that gets on my nerves is when people don't bother to
make GNU patches available. If Larry is able to turn the diff exporter
on eventually, this should make the problem almost moot. To someone
who doesn't use BK, people posting changesets would essentially just
be hosting their patches on bkbits.net. And if they want to host them
there, who am I to tell them no, no matter how they get there or what
format they're stored in internally?

2003-03-13 07:48:58

by Theodore Ts'o

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 10:03:04AM -0800, Larry McVoy wrote:
> > I thought that BK has been able to export everything to a text file
> > since the first version.
>
> bk export -tpatch -r1.900 > patch.1.900
> bk changes -v -r1.900 > comments.1.900
>
> Been there forever.

More importantly, even if someone isn't allowed to use the BK command
line tool because once upon a time, a long time ago, they submitted a
patch to arch or subversion, they can still find someone who is allowed to
set up a bk daemon under the terms of the FUL, connect to the BK
daemon using an HTTP client, and extract the full diff of any changeset
that way. This doesn't have to be the bkd on bkbits.net; anyone who
is authorized to use BK under the terms of the FUL can set up a bk
daemon to be listening on a port of any machine for which they have
shell access (it doesn't even require root privs). And every last
changeset can always be made available using this path.

So to the people who are complaining that they won't be able to get out
their data if a future version of BK uses a more powerful
representation than SCCS files ---- would you like some more whine
with your cheese?

- Ted

2003-03-13 09:33:16

by Geert Uytterhoeven

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)


Apparently linux-2.5/ChangeSet is an empty file?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2003-03-13 09:48:16

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On Thu, 13 Mar 2003, Theodore Ts'o wrote:

> So to the people who are complaining that they won't be able to get out
> their data if a future version of BK uses a more powerful
> representation than SCCS files ---- would you like some more whine
> with your cheese?

Oh, thank you, but I'd like some of the stuff that all the bk fan boys
must be taking.

bye, Roman

2003-03-13 15:27:36

by David Mansfield

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)


Hi Larry,

I've been reading this thread, and I think the CVS repository you set up
is a great service. I have a request to improve the quality of the data.

If you want to skip straight to the suggestion, goto SUGGESTION.

I am maintainer of a handy GPLed utility called 'cvsps' (plug:
http://www.cobite.com/cvsps) which extracts 'patchset' information from a
cvs repository by parsing the 'cvs log' output. It attempts to recreate a
commit as a single atomic action, and all the branch and tag gook that
goes with it.

It's a read-only tool that I find useful to see what is going on in a cvs
repository.

Back to you: I've looked at the CVS log output from your repository, and
had my program parse it back into 'patchsets' but it's not doing a great
job because the log messages from separate parts of the commit are
different.

This is fine, because I can easily write a 'hack' to look explicitly for
the '(Logical change x.yyyyy)' text to group individual file commits back
into patchsets.

But this text is missing from the 'main' file commit (to the ChangeSet
file) that has the BKrev: tag in it.

SUGGESTION:
Put the '(Logical change x.yyyy)' text into EVERY log message that is a
part of the logical change, including the 'main' commit to the ChangeSet
file, the one that has the BKrev: in it (the text is missing from this one
file's log message).

Then I can make a '--bk' hack to my program to use this 'key' to recreate
the commits.
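
Roughly, the grouping would look like the sketch below. It is Python and it
is not cvsps code; the (file, revision, log message) record layout is an
assumption, and the sample entries are pieced together from log excerpts
quoted earlier in this thread, with the ChangeSet revision made up.

```python
import re
from collections import defaultdict

# Matches the marker the exporter appends to each file's log message.
LOGICAL_CHANGE = re.compile(r"\(Logical change (\d+\.\d+)\)")

def group_by_logical_change(entries):
    """Group (file, revision, log message) tuples by logical change key."""
    patchsets = defaultdict(list)
    ungrouped = []
    for path, rev, message in entries:
        match = LOGICAL_CHANGE.search(message)
        if match:
            patchsets[match.group(1)].append((path, rev))
        else:
            ungrouped.append((path, rev))
    return dict(patchsets), ungrouped

entries = [
    ("drivers/char/tty_io.c", "1.59",
     "small tty irq race fix\n\n(Logical change 1.8144)"),
    ("Makefile", "1.342",
     "[PATCH] kbuild: Smart notation for non-verbose output\n\n"
     "(Logical change 1.8166)"),
    # Hypothetical ChangeSet entry: it carries BKrev: but no marker.
    ("ChangeSet", "1.8166", "BKrev: <long string of bits>"),
]
patchsets, ungrouped = group_by_logical_change(entries)
```

The two file commits land in their own patchsets, while the ChangeSet commit
ends up in ungrouped because its message has no marker to key on.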

Let me know what you think,
David

--
/==============================\
| David Mansfield |
| [email protected] |
\==============================/

2003-03-13 15:31:47

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> SUGGESTION:
> Put the '(Logical change x.yyyy)' text into EVERY log message that is a
> part of the logical change, including the 'main' commit to the ChangeSet,
> that commit has the BKrev: in it (it's missing from this one file's log
> message).

The x.yyyy is the revision number of the ChangeSet file. So the information
is redundant.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-13 21:37:02

by Horst H. von Brand

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Larry McVoy <[email protected]> said:

[...]

> So here's a question. Suppose we have a series of deltas being clumped
> together in a file. All made by different people. Whose name wins?
> My gut is to sort them, run them through uniq -c, and take the top one.
> The other idea is to count up lines inserted/deleted over each delta
> and take the user who has done the most work.

Say Several and add the list to the comment (after sort | uniq)?
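
The three options floated here can be compared side by side. This is only a sketch: the tuple layout (author, lines inserted, lines deleted), the function names and the sample authors are assumptions made for illustration, and ties are broken alphabetically, which the thread does not specify.

```python
from collections import Counter

def author_by_count(deltas):
    """'sort | uniq -c' and take the top one: the most frequent author wins."""
    counts = Counter(author for author, _inserted, _deleted in deltas)
    # Ties are broken alphabetically so the result is deterministic.
    return min(counts, key=lambda a: (-counts[a], a))

def author_by_lines(deltas):
    """The author with the most lines inserted plus deleted wins."""
    work = Counter()
    for author, inserted, deleted in deltas:
        work[author] += inserted + deleted
    return min(work, key=lambda a: (-work[a], a))

def author_list(deltas):
    """Horst's variant: say 'Several' and list the unique authors."""
    return "Several", sorted({author for author, _i, _d in deltas})

# Hypothetical deltas clumped into one CVS commit.
deltas = [("alice", 2, 1), ("bob", 40, 10), ("alice", 1, 0)]
print(author_by_count(deltas))   # most deltas
print(author_by_lines(deltas))   # most lines touched
print(author_list(deltas))
```

The sample is chosen so that the first two rules disagree, which is the case where the choice matters.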
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2003-03-13 21:52:29

by Horst H. von Brand

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Larry McVoy <[email protected]> said:
> On Wed, Mar 12, 2003 at 03:45:39PM -0600, Kai Germaschewski wrote:
> > On Wed, 12 Mar 2003, Larry McVoy wrote:
> > > > Larry, this brings up something I was meaning to ask you before this
> > > > thread exploded. What happens to those "logical change" numbers over
> > > > time?
> > >
> > > They are stable in the CVS tree because the CVS tree isn't distributed.
> > > So "Logical change 1.900" in the context of the exported CVS tree is
> > > always the same thing. That's one advantage centralized has, things
> > > don't shift around on you.
> >
> > Isn't there a more general problem, though? (I hope I'm wrong)
> >
> > You want to update the CVS tree near-realtime. However, the longest-path
> > through your graph may change with new merges, but CVS of course cannot
> > cope with already committed data changing (already committed csets may
> > all of a sudden not be in the longest path anymore)? This is a CVS
> > limitation, of course, but still a problem AFAICS.
>
> Yup, you're right, there is a tradeoff between real time updates and
> best path. We've already seen it in incremental updates.
>
> We were talking about this internally when your mail came in. I suspect
> it isn't really a problem in practice because we can always redo the
> entire export from scratch and get an optimal path.

Then the CVS tree won't be stable, and so useless to remote people just
wanting to "cvs update" their stuff. Or am I missing something here?
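
The instability Kai describes can be reproduced on a toy graph. The sketch below is not Wayne Scott's algorithm: it ignores the "must include all tags" constraint and simply finds the longest root-to-tip path in a DAG given as a child-to-parents mapping. The node names are hypothetical.

```python
from functools import lru_cache

def longest_path(parents, tip):
    """Return the longest root-to-tip path in a DAG.

    parents maps each node to the list of its parent nodes; a node that is
    absent from the mapping is treated as a root.
    """
    @lru_cache(maxsize=None)
    def best(node):
        candidates = [best(p) for p in parents.get(node, [])]
        longest = max(candidates, key=len, default=())
        return longest + (node,)

    return list(best(tip))

# First export: a straight line a -> b -> c.
before = {"b": ["a"], "c": ["b"]}
print(longest_path(before, "c"))

# Later a longer side branch a -> x -> y -> z is merged with c into m.
after = dict(before, x=["a"], y=["x"], z=["y"], m=["c", "z"])
print(longest_path(after, "m"))
```

In the second graph b and c are no longer on the longest path even though an incremental export would already have committed them, so an incremental export and a from-scratch export can disagree.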

> Wayne pointed out that in the cases where it collapses a pile of csets
> that is usually because Linus pulled some wad from somebody and one could
> argue the collapse is a good thing. But it depends, sometimes it is and
> sometimes it isn't. Our commercial users have frequently asked for a
> way to "collapse the tree and clean up the noise in the graphs", in fact,
> one called this morning and said "that BK to CVS thing, could that be a BK
> to cleaner-BK thing?" so opinions vary on what is the perfect
> granularity.

I'd add the possibility to group csets into super-csets (and so on, why
not?), without ever losing the individual pieces. Masoch^Wadventurous folks
could then grovel around inside as needed. Linus' 2.5.63 --> 2.5.64 would
then just be such a super-cset, separately manipulable.

No, I have no clue of how to do this.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2003-03-13 23:21:43

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On 12 Mar 2003, Alan Cox wrote:

> > this was one of the conditions for the bk usage, so Larry cannot say, that
> > you only get all the data if you use bk. If cvs can't represent all the
> > information, we have to find another solution.
>
> CVS can't represent it all because CVS isn't up to the job. If the rest
> exists as comments then it's your problem to write a VCS that can extract
> the comment data and represent it in full

This would require the full data. I looked at it, and neither the CVS tree
nor the Web interface has everything:
- the changeset ids are missing; if bk does its renumbering thing,
  ids in comments become useless
- complete changesets are missing; RCS is quite capable of branches, so
  there is no need to merge patches into a single patch
- merge information is missing, e.g. which branch was merged into which
  changeset

I attached an example RCS file showing how that could look. I had to guess
the merging information, and tags are missing.
It would be really useful to have this information as well.

bye, Roman


Attachments:
sys_ia32.c,v.gz (59.37 kB)

2003-03-13 23:16:53

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Thu, Mar 13, 2003 at 10:43:53AM +0100, Geert Uytterhoeven wrote:
> Apparently linux-2.5/ChangeSet is an empty file?

Yeah, it's just a place holder for the ChangeSet comments. The changeset
boundaries are implicit in the dates and explicit with the (Logical change x.y)
markers.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-13 23:29:42

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 09:14:16PM +0100, Sam Ravnborg wrote:
> On Wed, Mar 12, 2003 at 11:51:20AM -0800, Larry McVoy wrote:
> > is what davej may have typed in as comments. We capture that as well, it
> > looks like this:
> >
> > revision 1.342
> > date: 2003/03/07 15:39:16; author: torvalds; state: Exp; lines: +7 -1
> > [PATCH] kbuild: Smart notation for non-verbose output
>
> Ho humm, I did this not Linus.
> Checked the web which is correct.
>
> Same goes for 1.340 for the Makefile. Kai did it, not Linus.

Can you look over the linux-2.4 tree? I've done another pass on it.
The fixes slowed down the export process, we're looking at about 7 hours
of CPU time to get the 2.5 tree converted and I just started a run.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-14 08:43:16

by Geert Uytterhoeven

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Thu, 13 Mar 2003, Larry McVoy wrote:
> On Thu, Mar 13, 2003 at 10:43:53AM +0100, Geert Uytterhoeven wrote:
> > Apparently linux-2.5/ChangeSet is an empty file?
>
> Yeah, it's just a place holder for the ChangeSet comments. The changeset
> boundaries are implicit in the dates and explicit with the (Logical change x.y)
> markers.

Ah, IC. Then I must have misread your mail, and incorrectly understood that you
added a ChangeSet file with a list of comments.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2003-03-14 19:04:17

by Larry McVoy

Subject: BK->CVS (2.4 + 2.5 updates)

Updates from the latest 2.4 and 2.5 BK trees have been applied to the
kernel.bkbits.net tree. If you did a checkout, try an update and let
me know how it goes.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-15 16:41:42

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

> > I suspect the right answer is that we do the real time updates, see how it
> > goes, if it starts to suck we'll periodically toss the CVS tree and start
> > over.
>
> What you could do is have a CVS "realtime" branch which is forked from the
> trunk, say once a week, or whenever Linux makes a point release.

I'm not sure it is worth it. If you are using BK, run revtool and look at
the recent history in 2.5. I just updated the CVS tree on kernel.bkbits.net
and looked carefully at the collapsing it did. It collapsed a pile of Greg's
stuff into one cset, but that's actually OK as far as I can tell, it's all
related. And there was more work on the other path.

We've done several updates to the 2.5 tree and so far the number of changesets
we would have gotten if we had done it all in one pass and the number that
we actually got are identical. So maybe the reality of the incremental updates
is better than we expected.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-15 20:18:35

by Pavel Machek

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi!

> (actually Wayne Scott) did was to write a graph traversal alg which
> finds the longest path through the revision history which includes
> all tags. For the 2.5 tree, that is currently 8298 distinct points.
> Each of those points has been captured in CVS as a commit. If we did

As far as I can see, the linux-2.5 repository has over 17000 ChangeSets,
so that means half the granularity. Would it be possible to use cvs branches
to capture the tree structure and have a special form of commit comment,
"this is merge of changeset 1.2.3.4"? That way the BK->CVS conversion could
preserve all the data...
Pavel
--
Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...

2003-03-16 03:04:23

by Andrea Arcangeli

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Tue, Mar 11, 2003 at 08:39:19PM -0800, H. Peter Anvin wrote:
> Personally, I value my freedom to hack on whatever I want a lot more
> than the convenience of BK. [..]

Same here ;)

Andrea

2003-03-16 03:34:34

by Andrea Arcangeli

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 01:47:10PM -0500, Ben Collins wrote:
> On Wed, Mar 12, 2003 at 07:38:06PM +0100, Arador wrote:
> > On Tue, 11 Mar 2003 23:16:21 -0500
> > Ben Collins <[email protected]> wrote:
> >
> > > You've made quite a marketing move. It's obvious to me, maybe not to
> > > others. By providing this CVS gateway, you make it almost pointless to
> > > work on an alternative client. Also by providing it, you make it easier
> >
> > I don't think so. This also bites Larry. If he does well enough, there'll be
> > some people here that won't use bitkeeper just because they can use the cvs
> > gateway and they don't need/miss the features they could get with bk.
> >
> > And i don't think it avoids creating a free bk clone. I guess that there's
> > a lot of people out there interested in such tool, and not only for kernel
> > development; this won't stop them.
> >
> > As far as i can see; Larry is just wasting time (money) to help the kernel
> > development and people who don't use BK just because it isn't free. And
> > he's not charging me, so i find this a good movement for everybody. I only
> > can say thanks.
>
> You're missing the point. I am not against the BK->CVS gateway. I'm all
> for it. But it's kind of sour given that he now wants to change the disk
> format of the repo to make it harder to get the data from it.

Ben,

You shouldn't care about the disk format. You *can't* run bk in the
first place to reach those files; it's by pure luck that somebody has
been fine with giving away his right to write free software (oh and
proprietary software too but we don't care :) in the SCM arena and with
providing this info to you via rsync or whatever proxy or open protocol
that tytso mentioned is doable.

Before you can even remotely care about the disk format you have to reverse
engineer the network protocol first; having more proprietary stuff there
won't make a difference for us. And of course it makes perfect sense for
Larry to hide the stuff better, but even if he encrypts it, the secret
key has to be in the bk binary. I mean, it's all in open source assembly
anyway; if you figured out the network protocol, you shouldn't have an
order of magnitude more trouble figuring out the new file format
too. NOTE: I don't want to discuss the legal details of reading the
open source assembly, this was only an example ;).

Really, what we care about is the data, and what I discussed in the last weeks
with Larry about the kernel CVS at first sight seems enough for kernel
developers; what matters is the _mainline_ evolution. All other trees
matter much less (and NOTE: all important non-mainline trees don't use
bitkeeper anyway). If getting the changesets with dates turns out to be too
hard I assume Larry could help with it. Some script should do it pretty
well thanks to the logical tag in the log. I know it's not the most
useful format for export, but it is reliable, documented and open, and
it makes it trivial to check out and search the file logs, which makes it
very usable immediately without the need for new software. That is good for
us kernel developers in the short term. This is a good short/mid term solution.

IMHO cloning bitkeeper would be an option if Larry were supporting
it, but that is obviously not the case.

There is no point in complaining about the change of format of files in the [...]
Larry has every right to change the file format even after you have
reverse engineered it the first, second, third, fourth time, so all your
effort will break in seconds. You can spend the rest of your life trying to
keep up with Larry and he'll always be ahead of you. We have to do this
with the SMB protocol because there's no open "exporter", but here
Larry provided the data, and the data belongs to the community in the
first place, so there's no need to slow down innovation here trying to
catch up with closed proprietary protocols. And note: if you don't like
that linux is developed with bk you should speak with Linus, not with
Larry. That is Linus's choice; Larry couldn't make that change.

If you complain about the file format change, it means you have only now
realized that you made a mistake in depending on bk in the first place.

I think we reached a point of balance here that will solve all the
collisions. The CVS is a "stability" point. The lack of
data availability via CVS or a similar open protocol would force us to
reverse engineer bk to access the data, whereas the availability of CVS
immediately makes reverse engineering bk a waste of time for us. Cloning
bitkeeper is a waste of time if the CVS just exports the data correctly.

Please focus on this: the only thing we miss is the visibility of the
jfs tree and similar other bits that aren't even guaranteed to be merged
in mainline. But that doesn't worry me at all, in one year from now if
the jfs tree didn't merge correctly it won't matter what was in such
dead tree.

If you want to contribute, stop these threads, and start importing CVS
into a more powerful SCM and let us know a URL where we can access the
data from there. I will only answer to a working URL; either that or
live with CVS. The SCM can be evolved over time. If this new underground
domain turns out better than bitkeeper, then jfs and Linus as well could
join us in the future. In the meantime CVS will do fine and it
guarantees the openness of the linux info. As you probably know I don't
have much time to help with SCM development, but I can try to give my
$.02 (or at the very least I want to be still allowed to give my $.02! ;).

I won't answer further emails about this issue to avoid hurting the l-k
traffic too much (the last bk threads even made me overlook the BK->CVS
announcement after a day and a half of email backlog, go figure ;).

And of course many thanks to Larry for the BK->CVS effort! While I think
it was due, it certainly takes some relevant effort to do it.

Andrea

2003-03-16 03:37:47

by Andrea Arcangeli

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Wed, Mar 12, 2003 at 01:34:13PM -0500, Ben Collins wrote:
> or any others, for whatever reason, we are locked out of that original
> dataset.

true but the missing bits are nearly worthless, I wouldn't be ok with
CVS if this wasn't the case. I mean, in the very worst case, we're not
totally screwed, we probably won't even notice the difference.

Andrea

2003-03-16 17:34:22

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On Sun, 16 Mar 2003, Andrea Arcangeli wrote:

> true but the missing bits are nearly worthless, I wouldn't be ok with
> CVS if this wasn't the case. I mean, in the very worst case, we're not
> totally screwed, we probably won't even notice the difference.

The missing bits are absolutely not worthless. They are very useful when
you want to test other SCM systems to simulate distributed development.

bye, Roman

2003-03-16 18:43:56

by Nicolas Pitre

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Sun, 16 Mar 2003, Roman Zippel wrote:

> Hi,
>
> On Sun, 16 Mar 2003, Andrea Arcangeli wrote:
>
> > true but the missing bits are nearly worthless, I wouldn't be ok with
> > CVS if this wasn't the case. I mean, in the very worst case, we're not
> > totally screwed, we probably won't even notice the difference.
>
> The missing bits are absolutely not worthless. They are very useful when
> you want to test other SCM system to simulate distributed development.

This is completely ridiculous. Isn't this a bit too demanding? What will
be next?

Be realistic. The missing bits are worthless and add absolutely no value to
kernel development which is supposed to be the topic for this mailing list.

It's not the missing bits that will prevent you from making a better
alternative to BK or whatever either. If it really does you should consider
spending your time on another project. But since I truly believe you are
more clever than that I suspect you're just trying to stretch the issue out
of reasonable bounds because of your political beliefs.


Nicolas

2003-03-16 19:22:37

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On Sun, 16 Mar 2003, Nicolas Pitre wrote:

> > The missing bits are absolutely not worthless. They are very useful when
> > you want to test other SCM system to simulate distributed development.
>
> This is completely ridiculous. Isn't this a bit too demanding?

Not really, it's actually simpler than what Larry is currently offering.
A simple SCCS to RCS converter would be enough. Merging information is
easy to add as well. If you now also add a sequence number, it is quite simple
to modify a CVS server so that it can export the data reliably.

> Be realistic. The missing bits are worthless and add absolutely no value to
> kernel development which is supposed to be the topic for this mailing list.

If you want to test an alternative system to see whether it's usable for
kernel development, what better data is there? How could you compare it
against bk?

bye, Roman

2003-03-16 19:19:29

by Shawn

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Sun, 2003-03-16 at 12:54, Nicolas Pitre wrote:
> On Sun, 16 Mar 2003, Roman Zippel wrote:
> > Hi,
> > On Sun, 16 Mar 2003, Andrea Arcangeli wrote:
> > > true but the missing bits are nearly worthless, I wouldn't be ok with
> > > CVS if this wasn't the case. I mean, in the very worst case, we're not
> > > totally screwed, we probably won't even notice the difference.
> > The missing bits are absolutely not worthless. They are very useful when
> > you want to test other SCM system to simulate distributed development.
> This is completely ridiculous. Isn't this a bit too demanding?

I think so.

> Be realistic. The missing bits are worthless and add absolutely no value to
> kernel development which is supposed to be the topic for this mailing list.

They are not useless. They definitely add value. It's just that it's more
than developers had before, and they want all the extra BK gives them
without having to say "ok" to a license which they do not like.

Fact is, folks simply forget to make sense sometimes.

> It's not the missing bits that will prevent you from making a better
> alternative to BK or whatever either. If it really does you should consider
> spending your time on another project. But since I truly believe you are
> more clever than that I suspect you're just trying to stretch the issue out
> of reasonable bounds because of your political beliefs.

2003-03-16 21:41:36

by Andrea Arcangeli

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

[ hoping this is the last email in this thread, I know I'm not
contributing to reaching this objective :) ]

On Sun, Mar 16, 2003 at 08:33:18PM +0100, Roman Zippel wrote:
> Hi,
>
> On Sun, 16 Mar 2003, Nicolas Pitre wrote:
>
> > > The missing bits are absolutely not worthless. They are very useful when
> > > you want to test other SCM system to simulate distributed development.
> >
> > This is completely ridiculous. Isn't this a bit too demanding?
>
> Not really, it's actually more simple to what Larry is currently offering.
> A simply SCCS to RCS converter would be enough. Merging information is
> easy to add as well. If you now also add a sequence number is quite simple
> to modify a CVS server which can export the data reliably.

CVS basically exports RCS through the network, your argument makes no
sense to me, what's the difference, I don't see what you mean.

> > Be realistic. The missing bits are worthless and add absolutely no value to
> > kernel development which is supposed to be the topic for this mailing list.
>
> If you want to test an alternative system to see whether it's usable for
> kernel development, what better data is there? How could you compare it
> against bk?

Larry has every right not to help provide a testcase; it makes
no sense for you to complain that he's not providing a testcase for a
competing system. It makes no more sense than complaining if Larry
changes the bk format to something encrypted, compressed or .doc: he has
the right to do it, so please stop raising pointless arguments.

Your argument would make a bit more sense if it were that you still
lack visibility into the jfs development or similar, but complaining that a
"testcase" for a competing SCM is missing makes no sense at
all. In fact such a testcase is not missing! Just give us the alternative
open SCM and we'll be glad to try using it in real life, which is an
order of magnitude better testcase than feeding the old data into the
repository offline.

If you're still unhappy now that the mainline data is open it means
you're either a jfs developer (but I assume they're all fine with bk
since they're just using it, so I doubt this is the case) or your
problem is that you don't like the fact that Linux is still developed
with proprietary software but in such case go speak with Linus not with
Larry.

From my part - now that the full data and metadata of the main branch is
available in the open in a usable form - I have no problem anymore with
Linus using bitkeeper. I'm not religious about Linux, I'm only religious
about my freedom. Sure, now I would like it if Linus and Marcelo were
the only ones using bitkeeper (so CVS would miss zero info), yes, but
really all other branches are of nearly zero interest to me compared to
the main branch, and usually important branches like the jfs one (I don't
know the others since I can't see them; I only know the jfs one because
it gave me trouble with bkweb, which is why I'm only mentioning that
one) can be retrieved via other methods (like asking the developers by
email).

Now I need to write tools to extract the stuff and parse it with more
intelligent software than CVS. One of those tools is already available, so
please stop these complaints, and help write a reliable changelog
extractor using the dates and verifying the stuff with the logical tag. It
doesn't matter if the CVS protocol is good or bad, as long as the whole
mainline kernel evolution data is available reliably in the open, so
from my part I'm extremely happy because I finally have a chance to
start appreciating the advantages of Linus and Marcelo using bitkeeper ;)
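
One possible shape for such an extractor, sketched in Python under stated assumptions: the log has already been parsed into (path, date, message) tuples, every file revision of one changeset carries the same timestamp (as the announcement says it should), and the "(Logical change x.y)" marker is used only as a cross-check. The function name and the sample entries are hypothetical.

```python
import re
from collections import defaultdict

# Marker format as quoted in the thread, e.g. "(Logical change 1.900)".
LOGICAL_CHANGE = re.compile(r"\(Logical change (\d+(?:\.\d+)+)\)")

def changesets_by_date(file_commits):
    """Group (path, date, message) tuples by timestamp and verify each group.

    A group is 'consistent' when its log messages carry exactly one distinct
    logical-change number.
    """
    by_date = defaultdict(list)
    for path, date, message in file_commits:
        by_date[date].append((path, message))

    report = []
    for date in sorted(by_date):
        tags = set()
        for _path, message in by_date[date]:
            tags.update(LOGICAL_CHANGE.findall(message))
        report.append({
            "date": date,
            "files": sorted(path for path, _message in by_date[date]),
            "tags": sorted(tags),
            "consistent": len(tags) == 1,
        })
    return report

# Hypothetical entries; the second timestamp mixes two logical changes.
sample = [
    ("Makefile", "2003/03/07 15:39:16", "x (Logical change 1.900)"),
    ("kernel/fork.c", "2003/03/07 15:39:16", "y (Logical change 1.900)"),
    ("mm/slab.c", "2003/03/07 16:02:41", "z (Logical change 1.901)"),
    ("fs/inode.c", "2003/03/07 16:02:41", "w (Logical change 1.902)"),
]
report = changesets_by_date(sample)
for entry in report:
    print(entry)
```

Flagging inconsistent groups instead of silently splitting them keeps the dates as the primary key and leaves the decision to whoever reads the report.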

Andrea

2003-03-17 01:07:17

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On Sun, 16 Mar 2003, Andrea Arcangeli wrote:

> > Not really, it's actually more simple to what Larry is currently offering.
> > A simply SCCS to RCS converter would be enough. Merging information is
> > easy to add as well. If you now also add a sequence number is quite simple
> > to modify a CVS server which can export the data reliably.
>
> CVS basically exports RCS through the network, your argument makes no
> sense to me, what's the difference, I don't see what you mean.

RCS is not that limited. It's very simple to take all deltas from an SCCS
file and put them into an RCS file:

for delta in (all sorted deltas)
    get -rdelta foo.c
    ci -rdelta foo.c

Now one needs a little knowledge about the SCCS format. New deltas are
added at the top and deltas have their own sequence numbers; it's no
problem to add this sequence number to an RCS delta. This sequence number
can be used by CVS to export the data reliably; the client would simply
see 1.sequencenr as the version number.
If one now looks at the bk SCCS files, it's pretty easy to guess which
deltas are merges, so it should really be no problem to add this info to
the RCS file as well.
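
The numbering idea can be illustrated without touching SCCS or RCS at all. The sketch below assumes the deltas are already available as (sequence number, SCCS revision) pairs; the function name and the sample revisions are made up, and nothing here reads a real s-file.

```python
def client_revisions(deltas):
    """Map SCCS deltas to the linear revision numbers a client would see.

    deltas is an iterable of (sequence_number, sccs_revision) pairs. The
    result is ordered by sequence number, and each delta is exposed as
    '1.<sequence number>' regardless of the branch it came from.
    """
    ordered = sorted(deltas, key=lambda d: d[0])
    return [(f"1.{seq}", sccs_rev) for seq, sccs_rev in ordered]

# Hypothetical deltas: the one with sequence number 2 sits on a branch.
deltas = [(3, "1.2"), (1, "1.1"), (2, "1.1.1.1")]
mapping = client_revisions(deltas)
for visible, original in mapping:
    print(visible, "<-", original)
```

Because sequence numbers only grow as deltas are added, the visible numbers stay stable even when branch revisions are involved.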

> > If you want to test an alternative system to see whether it's usable for
> > kernel development, what better data is there? How could you compare it
> > against bk?
>
> Larry has all the rights to not to help providing a testcase, it makes
> no sense for you to complain he's not providing a testcase for a
> competitive system. It make no sense just like complaining that if Larry
> changes the bk format to something encrypted compressed or .doc. he has
> the rights to do it, so please stop raising pointless arguments.

Well, this wasn't the deal. Larry doesn't own the data, he can't say that
you only get all the data, if you use bk. One of the arguments for the
move to bk was that the format was open and the data wasn't locked in.
Technically it makes a lot of sense to move to a different format, but if
he needs help to convert the data into proper RCS format, he only needs to
ask, I'd be happy to help.

bye, Roman

2003-03-17 01:25:08

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mon, Mar 17, 2003 at 02:18:03AM +0100, Roman Zippel wrote:
> Well, this wasn't the deal. Larry doesn't own the data, he can't say that
> you only get all the data, if you use bk.

Perhaps you can explain the benefits to BitMover of this "deal". If you
are going to say that we get tons of marketing and sales from the free
use of BK, forget about it. Sales in this product space are made at
the CEO/CTO/VP level and I can assure you they don't read this mailing
list, they don't read slashdot, and if they did they would view what
we are doing as too risky. I tend to agree with them.

So what's the part of the "deal" that benefits BitMover?

As for the data, you are right, we don't own that. As for the metadata
which makes BK work, that's ours, not yours. BK made that metadata,
you did not. If you don't like those terms, convince Linus and friends
to get off of BK. That would be just fine with us.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-17 01:46:06

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On Sun, 16 Mar 2003, Larry McVoy wrote:

> As for the data, you are right, we don't own that. As for the metadata
> which makes BK work, that's ours, not yours. BK made that metadata,
> you did not. If you don't like those terms, convince Linus and friends
> to get off of BK. That would be just fine with us.

This is getting ridiculous. Thank you, I have no doubts anymore that for
you your ego is more important, than working with the community.

bye, Roman

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Larry McVoy <[email protected]> writes:

>use of BK, forget about it. Sales in this product space are made at
>the CEO/CTO/VP level and I can assure you they don't read this mailing
>list, they don't read slashdot, and if they did they would view what
>we are doing as too risky. I tend to agree with them.

>So what's the part of the "deal" that benefits BitMover?

Sales in this product space are made at the CEO/CTO/VP level after their
engineers, who will have to use the product later, have evaluated and
investigated the product. No (rotm) CEO/CTO/VP will ever _use_ this
product. He will buy whatever gets recommended by the engineers because
they have to use it.

And they _do_ read the mailing lists / Slashdot.

You're throwing up smoke screens again. Larry, please try to be honest
once. It is no bad thing that you lobby your (obviously fine) product
by giving it away for free for kernel development. Why don't you
simply admit it and be proud of it? Why do you try to be "more holy
than holy?"

Your hidden agenda is showing once again quite clearly.

Regards
Henning

--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen INTERMETA GmbH
[email protected] +49 9131 50 654 0 http://www.intermeta.de/

Java, perl, Solaris, Linux, xSP Consulting, Web Services
freelance consultant -- Jakarta Turbine Development -- hero for hire

2003-03-17 14:08:04

by Wayne Scott

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

From: Pavel Machek <[email protected]>
> As far as I can see, linux-2.5 repository has over 17000 ChangeSets,
> that means half the granularity.

I assume this has already been answered since this is Monday morning
and I haven't finished my mountain of email (I try not to read it on
weekends), but I will answer this anyway.

The ChangeSet file has many csets and we only capture around 1/2 of
them in CVS ChangeSet file. The extra ChangeSets are grouped together
with the merge cset where they were added to the path we are
recording. That is correct, but it is not the whole story.

What happens is that most csets modify a non-overlapping set of
files. So while we didn't get every delta to the ChangeSet file, we
did capture >90% of the actual changes to the source files in the
tree.
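
A toy calculation shows why collapsing changesets loses few file-level deltas when the changesets touch different files. This is a sketch, not the exporter's accounting: each group stands for the BK csets folded into one CVS commit, each cset is modelled as the set of files it touches, and the file names are hypothetical.

```python
def captured_fraction(groups):
    """Fraction of per-file deltas that survive as separate CVS deltas.

    groups is a list of groups; each group is a list of csets that end up in
    one CVS commit; each cset is a set of file paths. Within a group, a file
    touched by several csets yields only one CVS delta.
    """
    original = sum(len(cset) for group in groups for cset in group)
    exported = sum(len(set().union(*group)) for group in groups)
    return exported / original

groups = [
    [{"a.c"}],                          # cset on the exported path
    [{"b.c", "c.c"}, {"d.c"}, {"b.c"}],  # three csets folded into a merge
]
print(captured_fraction(groups))
```

Only the second change to b.c is folded into an existing delta here; every other file delta still appears on its own, even though three csets became one commit.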

Perhaps that will help explain things.

-Wayne

2003-03-17 14:35:01

by Pavel Machek

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi!

> > As far as I can see, linux-2.5 repository has over 17000 ChangeSets,
> > that means half the granularity.
>
> I assume this has already been answered since this is Monday morning
> and I haven't finished my mountain of email (I try not to read it on
> weekends), but I will answer this anyway.
>
> The ChangeSet file has many csets and we only capture around 1/2 of
> them in CVS ChangeSet file. The extra ChangeSets are grouped together
> with the merge cset where they were added to the path we are
> recording. That is correct, but it is not the whole story.
>
> What happens is that most csets modify a non-overlapping set of
> files. So while we didn't get every delta to the ChangeSet file, we
> did capture >90% of the actual changes to the source files in the
> tree.

Oh, so there's extra magic.

Question, though: why is it impossible / infeasible to use CVS
branches to capture *full* information? Merge would then say
"(changeset 1.2345, merge from 1.23.4.5)" or similar...

Pavel

--
Horseback riding is like software...
...vgf orggre jura vgf serr.

2003-03-17 17:31:47

by Horst H. von Brand

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Roman Zippel <[email protected]> said:

[...]

> If you want to test an alternative system to see whether it's usable for
> kernel development, what better data is there? How could you compare it
> against bk?

Either it is a bk clone of some sort (which adds little value, and probably
won't get the head hackers to switch) or it works on different principles.
In the first case the info you request might be useful to peek at (but
Larry (quite understandably IMVHO) will veto that), but you won't need it
for day-to-day LKML business; or it is utterly useless because it is from
an incompatible worldview.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2003-03-17 17:32:15

by Daniel Phillips

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mon 17 Mar 03 02:35, Larry McVoy wrote:
> On Mon, Mar 17, 2003 at 02:18:03AM +0100, Roman Zippel wrote:
> > Well, this wasn't the deal. Larry doesn't own the data, he can't say that
> > you only get all the data, if you use bk.
>
> Perhaps you can explain the benefits to BitMover of this "deal". If you
> are going to say that we get tons of marketing and sales from the free
> use of BK, forget about it. Sales in this product space are made at
> the CEO/CTO/VP level and I can assure you they don't read this mailing
> list, they don't read slashdot, and if they did they would view what
> we are doing as too risky. I tend to agree with them.
>
> So what's the part of the "deal" that benefits BitMover?

Let me reinsert the essential part of Roman's message that was
censo^H^H^Homitted:

On Mon 17 Mar 03 02:18, Roman Zippel wrote:
> you only get all the data, if you use bk. One of the arguments for the
> move to bk was that the format was open and the data wasn't locked in.

Being seen to keep the promise would be the benefit to BitMover. Personally,
I no longer hold any hope of that - the migration to increasingly proprietary
formats and secret/patented protocols will be inexorable. Hence, no more
flaming, just doing.

Regards,

Daniel

2003-03-17 17:54:02

by Petr Baudis

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Dear diary, on Mon, Mar 17, 2003 at 06:41:33PM CET, I got a letter,
where Horst von Brand <[email protected]> told me, that...
> Roman Zippel <[email protected]> said:
>
> [...]
>
> > If you want to test an alternative system to see whether it's usable for
> > kernel development, what better data is there? How could you compare it
> > against bk?

It is perfectly reasonable for Larry not to support such a thing (unless he is
so beaten down by lkml people already that he seeks a way to escape, though
still with grace ;).

Still, a potential competing version control system would have a *MUCH* larger
test base than without BitKeeper at all --- it is IMHO fine to politely ask for
more, but if Larry doesn't want to do it, why beat it out of him so
aggressively? It's an added value which is here *thanks* to BitKeeper (and
no one was able to name anything that BitKeeper *removed*, so the value has to
be positive, mathematically speaking ;) so I think it is within the competence
of the KitBeeper maintainers to regulate the size of that value --- BitKeeper
will probably be used as long as the value stays high, at least short-term (for
runtime maintenance of various trees), and even if the value approaches
zero over time (i.e. only partial records in CVS, and it looks like they are
almost complete), as long as it's not negative I can't see why people whine
about it so much.

> Either it is a bk clone of some sort (which adds little value, and probably
> won't get the head hackers to switch)
..snip..

If it approaches the feature set AND usability of bk (or at least the subset
which matters for kernel development), the information is somehow translatable
from bk's format AND it has an acceptable licence, I can see a big potential
audience for it, at least to keep private trees (and merge together busily
behind Linus' back ;). Also the licence can be a big plus given the current
state of things, even for Linus (he even said so some time ago, IIRC).

Kind regards,

--

Petr "Pasky busy playing with own
supposedly-BK-alike SCM" Baudis
.
The pure and simple truth is rarely pure and never simple.
-- Oscar Wilde
.
Stuff: http://pasky.ji.cz/

2003-03-17 17:53:43

by Jeff Garzik

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mon, Mar 17, 2003 at 06:46:43PM +0100, Daniel Phillips wrote:
> On Mon 17 Mar 03 02:18, Roman Zippel wrote:
> > you only get all the data, if you use bk. One of the arguments for the
> > move to bk was that the format was open and the data wasn't locked in.
>
> Being seen to keep the promise would be the benefit to BitMover. Personally,
> I no longer hold any hope of that - the migration to increasingly proprietary
> formats and secret/patented protocols will be inexorable.

$subject proves you wrong. cvs is an open format, and open protocol.

We hope to have cset retrieval via http available soon, too.


> Hence, no more
> flaming, just doing.

Larry seems to be the only one 'doing', ATM. :)

Jeff



2003-03-17 19:21:48

by Jamie Lokier

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

I still don't understand why bkbits.net does not simply make the SCCS
files available.

That would render every objection moot.

-- Jamie

2003-03-17 19:31:54

by David Lang

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

they do (and they have been available for quite a while), just not through
bkbits.net due to bandwidth requirements. I think it was David M that has
them available via rsync.

David Lang

On Mon, 17 Mar 2003, Jamie Lokier wrote:

> I still don't understand why bkbits.net does not simply make the SCCS
> files available.
>
> That would render every objection moot.
>
> -- Jamie

2003-03-17 19:49:57

by Jamie Lokier

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

David Lang wrote:
> they do (and they have been available for quite a while), just not through
> bkbits.net due to bandwidth requirements. I think it was David M that has
> them available via rsync.

No, that's Rik van Riel you're thinking of.

ftp://nl.linux.org/pub/linux/bk2patch/

<whine>

But they aren't the originals (they contain lots of "merging into
Rik's tree" entries) and they aren't real time!

</whine> :)

-- Jamie

2003-03-17 20:01:32

by Roman Zippel

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi,

On Mon, 17 Mar 2003, Jeff Garzik wrote:

> > Hence, no more
> > flaming, just doing.
>
> Larry seems to be the only one 'doing', ATM. :)

That's not really true, I posted an example RCS file. I explained how the
data can be represented in a CVS tree.
The problem is Larry has the tools to easily extract all the data, I
don't. I offered my help, all Larry has to do is to ask, but it seems he
prefers to stay in control of which data you will get back.
BTW the data Larry is trying to hide is actually quite simple. The merge
delta is simply the sum of all deltas from the branch (+ the conflict
fixes you've done), which was merged.

bye, Roman


2003-03-17 20:32:37

by Andrea Arcangeli

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mon, Mar 17, 2003 at 07:32:09PM +0000, Jamie Lokier wrote:
> I still don't understand why bkbits.net does not simply make the SCCS
> files available.
>
> That would render every objection moot.

tell me how to get the data and metadata out of the SCCS files for
changeset 1.786.202.5 like in the below:

http://linux.bkbits.net:8080/linux-2.5/user=akpm/[email protected]?nav=!-|index.html|stats|!+|index.html|ChangeSet@-6M|[email protected]

then tell me how to find the number "1.786.202.5" by looking at the SCCS
history of kernel/timer.c.

SCCS provides per-file info, not the global metadata of the tree. That
other information is stored in a proprietary format that the bitbucket
project is trying to parse. CVS exports the whole info in an open format,
which is why it's so much better than trying to parse the proprietary
format, which could be changed at any time to a compressed and maybe
encrypted format too.

Andrea

2003-03-17 21:45:46

by Pavel Machek

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Hi!

> If you're still unhappy now that the mainline data is open it means
> you're either a jfs developer (but I assume they're all fine with bk
> since they're just using it, so I doubt this is the case) or your

Actually, the fact that the "longest path" algorithm may well choose a
non-mainline branch because it likes it more worries me a bit.
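
To make the concern concrete, here is a minimal sketch of a plain
longest-path search on a toy revision graph. It is not the BK exporter's
algorithm (that code is not shown in this thread, and per the announcement
it must also pass through every tag); the revision names and the
`longest_path` helper are made up for illustration.

```python
from functools import lru_cache

# Hypothetical toy history: each revision maps to its children.
# Trunk:       1.1 -> 1.2 -> 1.3
# Side branch: 1.1 -> 1.1.1.1 -> 1.1.1.2 -> 1.3 (merged back at 1.3)
GRAPH = {
    "1.1": ["1.2", "1.1.1.1"],
    "1.2": ["1.3"],
    "1.1.1.1": ["1.1.1.2"],
    "1.1.1.2": ["1.3"],
    "1.3": [],
}


def longest_path(graph, start, end):
    """Return the longest start-to-end path in an acyclic graph."""

    @lru_cache(maxsize=None)
    def best(node):
        if node == end:
            return (node,)
        # Longest continuation through each child; () means a dead end.
        tails = [best(child) for child in graph[node]]
        tails = [tail for tail in tails if tail]
        if not tails:
            return ()
        return (node,) + max(tails, key=len)

    return list(best(start))


PATH = longest_path(GRAPH, "1.1", "1.3")
print(PATH)
```

On this toy graph the search prefers the two-revision side branch over the
single trunk revision 1.2, simply because it is longer.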

Pavel
--
Horseback riding is like software...
...vgf orggre jura vgf serr.

2003-03-17 21:57:36

by Andrea Arcangeli

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mon, Mar 17, 2003 at 10:56:39PM +0100, Pavel Machek wrote:
> Hi!
>
> > If you're still unhappy now that the mainline data is open it means
> > you're either a jfs developer (but I assume they're all fine with bk
> > since they're just using it, so I doubt this is the case) or your
>
> Actually, the fact that the "longest path" algorithm may well choose a
> non-mainline branch because it likes it more worries me a bit.

AFAIK it's supposed to be the "longest path" of Linus's and Marcelo's
branches, which means it'll reproduce all the modifications of the
mainline trees only.

Andrea

2003-03-17 22:57:27

by David Mansfield

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)


Andrea,

FWIW, I have already written a program called cvsps (http://www.cobite.com/cvsps)
which extracts 'patchset' information from cvs log output.

Currently, this program doesn't work with the bk-cvs because the log
messages that are committed with each file in a changeset can be
different, and cvsps assumes the log message will be the same.

However, about a 5 line hack to my program (in progress) will allow it to
recreate the ChangeSet information, since Larry has promised that all files
touched by a changeset will share one timestamp, unique to that changeset.

This might help you out. I'll let you know when the '--bk-cvs' option has
been implemented ;-)

David

--
/==============================\
| David Mansfield |
| [email protected] |
\==============================/

2003-03-17 23:14:50

by Andrea Arcangeli

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mon, Mar 17, 2003 at 06:08:11PM -0500, David Mansfield wrote:
>
> Andrea,
>
> FWIW, I have already written a program called cvsps (http://www.cobite.com/cvsps)
> which extracts 'patchset' information from cvs log output.
>
> Currently, this program doesn't work with the bk-cvs because the log
> messages that are committed with each file in a changeset can be
> different, and cvsps assumes the log message will be the same.
>
> However, about a 5 line hack to my program (in progress) will allow it to
> recreate the ChangeSet information, since Larry has promised that all files
> touched by a changeset will share one timestamp, unique to that changeset.
>
> This might help you out. I'll let you know when the '--bk-cvs' option has
> been implemented ;-)

yes, this is very helpful thanks ;). I'd suggest that you also parse the
logic tag and print a warning if there's an error, rather than only
trusting the timestamps. In general I don't love depending on timestamps,
so I appreciate the availability of the logical tag in the cvs log.

In fact it would be nice to also be able to ask for the extraction of a
certain logic tag out of the tree. This logic tag will be the
"changeset" number for us, but one that is also persistent and not only
unique (unlike in bk where the changeset number of a changeset can
change anytime AFAIK)

I also wonder if it wouldn't be better if Larry simply tagged the CVS
tree with the logic tag number in the first place, rather than writing it
in the logs and having to parse the stuff with an external utility.
Personally I would prefer the logical tag to be applied to the CVS tree with
a true `cvs tag`, not only written into the logs. Ten thousand or so tags
(i.e. changesets) shouldn't be a problem for cvs. Making this change
should be trivial; it should be easier than embedding the logical tag in
the cvs comments. What do you think?

Andrea

2003-03-17 23:22:44

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Tue, Mar 18, 2003 at 12:25:44AM +0100, Andrea Arcangeli wrote:
> yes, this is very helpful thanks ;). I'd suggest you to also parse the
> logic tag and to print a warning if there's an error and not only to
> trust the timestamps.

The time stamps we're talking about are *in* the revision history.
We do all checkins to all files with the same timestamp in the same
changeset.

If you thought that we were talking about on disk timestamps, that's
way too fragile but these are fine.

> certain logic tag out of the tree. This logic tag will be the
> "changeset" number for us, but one that is also persistent and not only
> unique

(Logical tag 1.XXXX)

is in each file's checkin comments and the 1.XXXX is the ChangeSet file's
rev for that changeset.
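
As a rough illustration of how a reader of the CVS log might use that
marker, the sketch below pulls the ChangeSet rev out of a checkin comment
and flags timestamps whose files disagree about it. The original
announcement gives the marker as "(Logical change 1.%d)" while this message
writes "Logical tag", so the pattern accepts both; the helper names and the
sample comments are hypothetical, not taken from any real tool.

```python
import re

# Accepts "(Logical change 1.8156)" and the "Logical tag" spelling used above.
LOGICAL_RE = re.compile(
    r"\(Logical (?:change|tag) (\d+(?:\.\d+)+)\)", re.IGNORECASE
)


def logical_rev(comment):
    """Return the ChangeSet rev named in a checkin comment, or None."""
    match = LOGICAL_RE.search(comment)
    return match.group(1) if match else None


def inconsistent_stamps(records):
    """Given (timestamp, comment) pairs, list the timestamps whose
    comments do not all name the same logical change."""
    seen = {}
    for stamp, comment in records:
        seen.setdefault(stamp, set()).add(logical_rev(comment))
    return sorted(stamp for stamp, revs in seen.items() if len(revs) > 1)


# Made-up sample data: the second timestamp deliberately disagrees.
SAMPLE = [
    ("2003/03/05 03:11:19", "Linux 2.5.64\n\n(Logical change 1.8156)"),
    ("2003/03/05 03:11:19", "bump version\n\n(Logical change 1.8156)"),
    ("2003/03/04 22:10:05", "fix a race\n\n(Logical change 1.8155)"),
    ("2003/03/04 22:10:05", "fix a race\n\n(Logical change 1.8154)"),
]
BAD = inconsistent_stamps(SAMPLE)
print(BAD)
```

A check like this uses the marker as a cross-check on the timestamps rather
than a replacement for them.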

> I also wonder if it wouldn't be better if Larry would simply tag the CVS
> with the logic tag number in the first place, rather than writing it

That means that *all* files get tags. There would be 8300 x 15,000 files
times sizeof(tag). That's too big.
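
That estimate can be worked through as plain arithmetic. The size of one
tag entry is not given in the thread, so the 20 bytes used below is an
assumed placeholder; only the 8300 and 15,000 figures come from the message.

```python
CHANGESETS = 8300      # commits in the CVS tree, from the message above
FILES = 15_000         # files in the tree, from the message above
BYTES_PER_TAG = 20     # ASSUMPTION: rough size of one "name:rev" tag entry

# A real `cvs tag` marks every file in the tree, not just the files touched.
ENTRIES = CHANGESETS * FILES
TOTAL_BYTES = ENTRIES * BYTES_PER_TAG

print(f"{ENTRIES:,} tag entries")
print(f"about {TOTAL_BYTES / 2**30:.1f} GiB at {BYTES_PER_TAG} bytes each")
```

Whatever the real per-entry size is, the count of entries is the product of
changesets and files, which is what makes per-changeset tags expensive.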
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-17 23:46:56

by Andrea Arcangeli

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mon, Mar 17, 2003 at 03:33:32PM -0800, Larry McVoy wrote:
> On Tue, Mar 18, 2003 at 12:25:44AM +0100, Andrea Arcangeli wrote:
> > yes, this is very helpful thanks ;). I'd suggest you to also parse the
> > logic tag and to print a warning if there's an error and not only to
> > trust the timestamps.
>
> The time stamps we're talking about are *in* the revision history.
> We do all checkins to all files with the same timestamp in the same
> changeset.
>
> If you thought that we were talking about on disk timestamps, that's
> way too fragile but these are fine.

ok, I see. But then why not use the logical number by default? That
sounds simpler to parse and to work with.

> > certain logic tag out of the tree. This logic tag will be the
> > "changeset" number for us, but one that is also persistent and not only
> > unique
>
> (Logical tag 1.XXXX)
>
> is in each file's checkin comments and the 1.XXXX is the ChangeSet file's
> rev for that changeset.
>
> > I also wonder if it wouldn't be better if Larry would simply tag the CVS
> > with the logic tag number in the first place, rather than writing it
>
> That means that *all* files get tags. There would be 8300 x 15,000 files
> times sizeof(tag). That's too big.

you're writing this tag in the textual log anyway, so wouldn't it only
move the too-big space from one place to another? I'm saying this
because cvs already provides a means of diffing one tag against another, so
it looks more efficient (especially in terms of saving bandwidth on
your side) to use the cvs feature, rather than doing it by hand with
multiple transfers.

Andrea

2003-03-18 01:37:32

by David Mansfield

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mon, 17 Mar 2003, Larry McVoy wrote:

> On Tue, Mar 18, 2003 at 12:25:44AM +0100, Andrea Arcangeli wrote:
> > yes, this is very helpful thanks ;). I'd suggest you to also parse the
> > logic tag and to print a warning if there's an error and not only to
> > trust the timestamps.
>
> The time stamps we're talking about are *in* the revision history.
> We do all checkins to all files with the same timestamp in the same
> changeset.

Ok. A version of 'cvsps' which can correctly parse Larry's log format
into patchsets is up on the website (http://www.cobite.com/cvsps). It's version
2.0b3.

Larry's timestamps actually made this hack really easy. All files
committed with the exact same time are recreated into a patchset. Here, for
example, is 'patchset 8156':

--------------------
PatchSet 8156
Date: 2003/03/05 03:11:19
Author: torvalds
Branch: HEAD
Tag: v2_5_64
Log:
Linux 2.5.64

BKrev: 3e656ad75XghvjRCVNEGWy20cX0qwg

Members:
ChangeSet:1.8156->1.8157
Makefile:1.340->1.341


(note, this is a pretty boring patchset, I just wanted to show it
basically works).

The patchset id is in-sync with Larry's ChangeSet commits (but off by one
from the beginning).

You can do: 'cvsps -s 8156 -g' to generate a diff of this entire
patchset, and even get the correct results.
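
The grouping rule described above (every file revision committed at the
exact same time belongs to one patchset) can be sketched in a few lines.
This is only an illustration of the idea, not cvsps's actual code; the
record layout and the `group_by_timestamp` helper are assumptions, and the
kernel/timer.c entry is invented sample data.

```python
from collections import defaultdict

# (file, revision, commit timestamp) records, as one might extract them
# from `cvs log` output. The first two match the example patchset above;
# the third is made up.
RECORDS = [
    ("ChangeSet", "1.8157", "2003/03/05 03:11:19"),
    ("Makefile", "1.341", "2003/03/05 03:11:19"),
    ("kernel/timer.c", "1.42", "2003/03/04 22:10:05"),
]


def group_by_timestamp(records):
    """Group file revisions into patchsets keyed by identical timestamp."""
    groups = defaultdict(list)
    for path, rev, stamp in records:
        groups[stamp].append((path, rev))
    # This timestamp format sorts chronologically as plain text.
    return [(stamp, sorted(groups[stamp])) for stamp in sorted(groups)]


PATCHSETS = group_by_timestamp(RECORDS)
for number, (stamp, members) in enumerate(PATCHSETS, start=1):
    print(number, stamp, members)
```

Because the exporter writes one timestamp per changeset, no fuzzy time
window or log-message comparison is needed for this grouping.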

>
> If you thought that we were talking about on disk timestamps, that's
> way too fragile but these are fine.
>
> > certain logic tag out of the tree. This logic tag will be the
> > "changeset" number for us, but one that is also persistent and not only
> > unique
>
> (Logical tag 1.XXXX)
>

The checkin (Logical Tag x.yyy) log messages are currently not validated,
and are discarded. Only the 'main' message with the BKrev: is associated
with each patchset.

> is in each file's checkin comments and the 1.XXXX is the ChangeSet file's
> rev for that changeset.

Seems to work.

> > I also wonder if it wouldn't be better if Larry would simply tag the CVS
> > with the logic tag number in the first place, rather than writing it

Not necessary, each changeset is available via cvsps with 'cvsps -s
<logical change number>' as well as searching by file, by date, by tag, by
log message etc.

> That means that *all* files get tags. There would be 8300 x 15,000 files
> times sizeof(tag). That's too big.
>

Uggh.

David

--
/==============================\
| David Mansfield |
| [email protected] |
\==============================/

2003-03-18 02:32:41

by Andrea Arcangeli

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mon, Mar 17, 2003 at 08:48:16PM -0500, David Mansfield wrote:
> Not necessary, each changeset is available via cvsps with 'cvsps -s
> <logical change number>' as well as searching by file, by date, by tag, by
> log message etc.

what you did sounds great, thanks!

> > That means that *all* files get tags. There would be 8300 x 15,000 files
> > times sizeof(tag). That's too big.
> >
>
> Uggh.

;) but what kind of network overhead do you expect compared to the tag
in the tree? I hope it's minor.

NOTE: it is possible I'm missing something and you can do it much faster
than I thought w/o the tag, I'll try it now.

Andrea

2003-03-21 14:05:25

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Mon, Mar 17, 2003 at 11:08:30PM +0100, Andrea Arcangeli wrote:
> > Actually, the fact that the "longest path" algorithm may well choose a
> > non-mainline branch because it likes it more worries me a bit.
>
> AFAIK it's supposed to be the "longest path" of Linus's and Marcelo's
> branches, which means it'll reproduce all the modifications of the
> mainline trees only.

By the way, we've been incrementally updating both trees and while in
theory the incremental could result in shorter paths with less detail,
so far the incremental export and the one pass export result in exactly
the same path:

slovax $ bk _eventpath 1.0 + | wc -l
8498
slovax $ cd ../linux-2.5-cvs/linux-2.5
slovax $ rlog -r -N ChangeSet | grep revision
revision 1.8498

I've actually reimported the data in one pass and diffed the RCS files,
it's the same.

HPA, should we be mirroring the CVS tarballs to kernel.org?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-21 17:32:30

by Andrea Arcangeli

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Fri, Mar 21, 2003 at 06:16:20AM -0800, Larry McVoy wrote:
> On Mon, Mar 17, 2003 at 11:08:30PM +0100, Andrea Arcangeli wrote:
> > > Actually, the fact that the "longest path" algorithm may well choose a
> > > non-mainline branch because it likes it more worries me a bit.
> >
> > AFAIK it's supposed to be the "longest path" of Linus's and Marcelo's
> > branches, which means it'll reproduce all the modifications of the
> > mainline trees only.
>
> By the way, we've been incrementally updating both trees and while in
> theory the incremental could result in shorter paths with less detail,
> so far the incremental export and the one pass export result in exactly
> the same path:
>
> slovax $ bk _eventpath 1.0 + | wc -l
> 8498
> slovax $ cd ../linux-2.5-cvs/linux-2.5
> slovax $ rlog -r -N ChangeSet | grep revision
> revision 1.8498
>
> I've actually reimported the data in one pass and diffed the RCS files,
> it's the same.
>
> HPA, should we be mirroring the CVS tarballs to kernel.org?

fine thanks!

BTW, CVS kernel + cvsps has been extremely useful to me so far.

I also ran into some huge patches like PatchSet 4711 in the 2.5 tree
that I would love to see split up properly, but I understand that's
impossible, right?

Thank you very much again for this great open service!

Andrea

2003-03-21 19:30:46

by H. Peter Anvin

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Followup to: <[email protected]>
By author: Larry McVoy <[email protected]>
In newsgroup: linux.dev.kernel
>
> HPA, should we be mirroring the CVS tarballs to kernel.org?
>

That would be highly useful. I would also like to see the bk export
text file, whatever it's called, mirrored there.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

2003-03-22 00:04:44

by Larry McVoy

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

On Fri, Mar 21, 2003 at 11:40:26AM -0800, H. Peter Anvin wrote:
> Followup to: <[email protected]>
> By author: Larry McVoy <[email protected]>
> In newsgroup: linux.dev.kernel
> >
> > HPA, should we be mirroring the CVS tarballs to kernel.org?
> >
>
> That would be highly useful. I would also like to see the bk export
> text file, whatever it's called, mirrored there.

There is no bk export text file, the output of the export is the CVS
repository, there isn't anything else. Everything that we could
extract has been extracted and put in CVS. It's a fairly complete
and accurate extraction, too. Far more than the traditional releases
and pre-releases, I don't know how many of those there have been in
the 2.5 timeframe but I can't imagine more than a couple hundred;
there are 8500 commits in the 2.5 CVS tree.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-03-22 00:41:26

by H. Peter Anvin

Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)

Followup to: <[email protected]>
By author: Larry McVoy <[email protected]>
In newsgroup: linux.dev.kernel
>
> On Fri, Mar 21, 2003 at 11:40:26AM -0800, H. Peter Anvin wrote:
> > Followup to: <[email protected]>
> > By author: Larry McVoy <[email protected]>
> > In newsgroup: linux.dev.kernel
> > >
> > > HPA, should we be mirroring the CVS tarballs to kernel.org?
> > >
> >
> > That would be highly useful. I would also like to see the bk export
> > text file, whatever it's called, mirrored there.
>
> There is no bk export text file, the output of the export is the CVS
> repository, there isn't anything else. Everything that we could
> extract has been extracted and put in CVS. It's a fairly complete
> and accurate extraction, too. Far more than the traditional releases
> and pre-releases, I don't know how many of those there have been in
> the 2.5 timeframe but I can't imagine more than a couple hundred;
> there are 8500 commits in the 2.5 CVS tree.
>

I was referring to this stuff:

Date: Wed, 12 Mar 2003 10:03:04 -0800
X-Hdr-Sender: [email protected]
From: Larry McVoy <[email protected]>
Subject: Re: [ANNOUNCE] BK->CVS (real time mirror)
Message-ID: <[email protected]>
References: <[email protected]> <[email protected]>

> I thought that BK has been able to export everything to a text file
> since the first version.

bk export -tpatch -r1.900 > patch.1.900
bk changes -v -r1.900 > comments.1.900

Been there forever. So has ways to get all the metadata from the command
line without having to reverse engineer the file format. See

http://www.bitkeeper.com/manpages/bk-prs-1.html

it's all there. Always has been.

Wayne wanted me to point out that it is easy to write the BK to CVS exporter
completely from the command line, we prototyped it that way, the only
reason we rewrote part of it in C was for performance. The point being
that you guys could have done this yourself without help from us because
all the metadata is right there. Ditto for anyone else worried about
getting their data out of BK now or in the future. The whole point of
prs is to be able to have a will-always-work way to get at the data or
the metadata, it makes the file format a non-issue.
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64