LinuxLists.cc - Git training wheels for the pimple faced maintainer

2006-10-19 21:17:24

Subject: Git training wheels for the pimple faced maintainer

Hi guys,

In an effort to change my work flow into a manner that is more suitable
for upstream merging and publishing my trees, I thought I could ask for
some input from the more experienced.

My intended work flow is to work on stuff on temporary topic branches,
and cherry-pick or diff|patch them into other trees when they are mature
enough.

Stuff that needs a bit more testing will be put in a public "for-andrew"
branch. From what I gather, Andrew does a pull and a diff of these kinds
of branches before putting together a -mm set. So this should be
sufficient for your needs? Do you also prefer getting "[GIT PULL]"
requests, or do you do the pull periodically anyway?

Patches that are considered stable, either directly or by virtue of
being in -mm for a while, will be moved into a "for-linus" tree and a
"[GIT PULL]" sent to herr Torvalds.

Now, the patch in "for-linus" will be a duplicate of one or several
commits in "for-andrew". Will I get any problems from git once I do a
new pull from Linus' tree into "for-andrew"?

Another concern is all the merges. As I have modifications in my tree,
every merge should generate at least one commit and one tree object. Is
this kind of noise in the git history something that needs concern?

Looking forward to your kind words and ruthless flames :)

Rgds
--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2006-10-19 22:25:16

by Andrew Morton

Subject: Re: Git training wheels for the pimple faced maintainer

On Thu, 19 Oct 2006 23:17:27 +0200
Pierre Ossman <[email protected]> wrote:

> Hi guys,
>
> In an effort to change my work flow into a manner that is more suitable
> for upstream merging and publishing my trees, I thought I could ask for
> some input from the more experienced.
>
>
> My intended work flow is to work on stuff on temporary topic branches,
> and cherry-pick or diff|patch them into other trees when they are mature
> enough.
>
> Stuff that needs a bit more testing will be put in a public "for-andrew"
> branch. From what I gather, Andrew does a pull and a diff of these kinds
> of branches before putting together a -mm set. So this should be
> sufficient for your needs? Do you also prefer getting "[GIT PULL]"
> requests, or do you do the pull periodically anyway?

Just send me the url&branch-name for a tree which you want included in -mm.
I typically pull all the trees once per day. I usually won't even look at
the contents of what I pulled from you unless it breaks.

IOW, -mm is like a tree to which 70-odd people have commit permissions,
except it's 70 separate trees and I independently jam them all into one
tree daily.

> Patches that are considered stable, either directly or by virtue of
> being in -mm for a while, will be moved into a "for-linus" tree and a
> "[GIT PULL]" sent to herr Torvalds.

yup.

> Now, the patch in "for-linus" will be a duplicate of one or several
> commits in "for-andrew". Will I get any problems from git once I do a
> new pull from Linus' tree into "for-andrew"?

git will sort that out.

> Another concern is all the merges. As I have modifications in my tree,
> every merge should generate at least one commit and one tree object. Is
> this kind of noise in the git history something that needs concern?

I'll leave that question to a gittier responder. But yes, you'll get
shouted at if Linus's final commit contains irrelevant commit and merge
stuff.

2006-10-19 23:44:53

by Linus Torvalds

Subject: Re: Git training wheels for the pimple faced maintainer

On Thu, 19 Oct 2006, Pierre Ossman wrote:
>
> Stuff that needs a bit more testing will be put in a public "for-andrew"
> branch. From what I gather, Andrew does a pull and a diff of these kinds
> of branches before putting together a -mm set. So this should be
> sufficient for your needs? Do you also prefer getting "[GIT PULL]"
> requests, or do you do the pull periodically anyway?
>
> Patches that are considered stable, either directly or by virtue of
> being in -mm for a while, will be moved into a "for-linus" tree and a
> "[GIT PULL]" sent to herr Torvalds.

That all sounds fine. Please just check the format for the "[GIT PULL]"
message: Andrew pulls people's trees on his own and largely automatically,
so he doesn't much care _what_ is in the tree, but I care deeply. So I
want the diffstat and shortlog listings, and preferably a few sentences at
the top of the email describing what's going on and why things are
happening.

I think people have seen the messages that other people send out (eg at
least Greg KH tends to Cc: those messages to linux-kernel, so others can
see what's going on too - although not all other maintainers do that).

Basically, I want to know that the thing I pull makes sense for the stage
I'm pulling in (ie if it's after -rc1, think about trying to explain why
the fixes are all important bug-fixes for example), and what it affects.
The diffstat is part of that, but please include any other explanations
that seem meaningful.

> Now, the patch in "for-linus" will be a duplicate of one or several
> commits in "for-andrew". Will I get any problems from git once I do a
> new pull from Linus' tree into "for-andrew"?

No, git will take care of it, unless, of course, your extra patches
conflict with the ones you sent me.

> Another concern is all the merges. As I have modifications in my tree,
> every merge should generate at least one commit and one tree object. Is
> this kind of noise in the git history something that needs concern?

Yes and no.

An occasional merge by you is fine, and if the merge is about _you_
merging your own branches into one branch for me or Andrew to pull, then
the merge actually adds valid information.

HOWEVER. Please don't just pull from my tree just to keep your
development tree "up-to-date". Your development tree is YOUR development
tree, it should be about the stuff _you_ did - not about merging stuff
that I merged from others. See?

So there's simply no point in merging from me, unless you know that there
are clashes due to other development, and you actually want to fix them
up. You will just cause unnecessary criss-cross merges if you pull from my
tree after you've started development, and the history gets really really
messy.

There's several ways to handle this:

- just base your work on a certain release, and ignore all the other
changes. Then, when you're ready, just ask me to pull your changes. git
will just merge it up to my current version, and everything will be
fine.

(Of course, once I _have_ merged your changes, if you pull at that
point, you'll just fast-forward, and there will be no unnecessary
back-and-forth merging)

- If you actually want your development tree to "track" my tree, I'd
suggest you have your "for-linus" branch that you put the work you want
to track into, and then a plain "linus" branch which tracks _my_ tree.
Then you can just fetch my tree (to keep your "linus" branch
up-to-date), and if you want your development branch to track those
changes, you can just do a "git rebase linus" in your "for-linus"
branch.

- If you actually notice that the stuff in my tree conflicts with the
stuff you develop, _then_ you obviously want to merge (you can try to
"rebase" things too and fix it up during the rebase, but merging is
often easier, and at this point the merge is no longer "unnecessary
noise", it's actually a real action of you doing a real merge to handle
the conflict).
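
A minimal sketch of the second option in the list above (a plain "linus" branch plus "git rebase linus" on "for-linus"). The throwaway repository, file names and commit messages are made up for illustration, and the upstream update is simulated with a local commit where a real setup would fetch from the upstream tree:

```shell
set -eu
# Throwaway identity and repository so the sketch runs anywhere.
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/linus   # name the first branch "linus"

# "linus" stands in for the branch that tracks the upstream tree.
echo base > core.c
git add core.c
git commit -q -m "upstream: base"

# Own work lives on "for-linus", started from that upstream point.
git checkout -q -b for-linus
echo mmc > mmc.c
git add mmc.c
git commit -q -m "mmc: my change"

# Upstream moves on (simulated here; normally this is a fetch).
git checkout -q linus
echo more >> core.c
git commit -q -a -m "upstream: more work"

# Replay own commits on top of the new upstream instead of merging it in.
git checkout -q for-linus
git rebase -q linus

git --no-pager log --oneline --graph for-linus
```

After the rebase, "for-linus" is a straight line of own commits on top of "linus", with no merge commit recording the update.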

And hey, if there is occasionally an unnecessary merge, nobody really
cares. So don't be _too_ worried about it. But a clean history makes
things simpler to track, so I'm asking people to not generate noise that
simply doesn't help.

Other git maintainers may have other hints about how they work. Anybody?

Linus

2006-10-20 01:07:34

by Mark Fasheh

Subject: Re: Git training wheels for the pimple faced maintainer

On Thu, Oct 19, 2006 at 04:44:41PM -0700, Linus Torvalds wrote:
> I think people have seen the messages that other people send out (eg at
> least Greg KH tends to Cc: those messages to linux-kernel, so others can
> see what's going on too - although not all other maintainers do that).
I noticed also that people started sending out "What's in XX.git" type
messages at the beginning of a merge window to describe what might shortly
get sent upstream.

> Other git maintainers may have other hints about how they work. Anybody?
I think I have a slightly different workflow than what Pierre describes. I
find that it works well for me and it keeps things very organized in
ocfs2.git. It's also probably a little more work than other methods for
managing a git tree that people employ. Hopefully a description of my
process will be useful to someone.

Basically I have two trees, ocfs2.git which is the main ocfs2 repository and
my own personal linux-2.6.git which I actually hack in.

All my hacking happens on topic branches based off of master in my personal
tree. I keep master pristine so that it's always a direct copy of what Linus
has in his tree. If somebody sends me ocfs2 patches, I'll make a topic
branch for the patches in my personal tree and import them, typically via
git-applymbox. Pull requests (which I typically get from Joel for configfs
changes) go directly to ocfs2.git (described below). The point of this is
that I hack elsewhere and keep ocfs2.git for merging stuff that's ready for
people to see.

In ocfs2.git, I will also make topic branches and pull from my linux-2.6.git
(for my work, or patches e-mailed to me), or directly from somebody else's
git tree. I make an ALL branch, based off of master which gets a merge of
everything ocfs2 related that I want to go in -mm (which so far has been
anything I eventually want to go upstream to Linus).

Once I'm ready to send an upstream pull request, I'll update the master
branch of ocfs2.git. I then make a for-linus branch based off of it, and
git-cherry-pick each individual patch into that branch and send my request.

I do the cherry pick because I like that it allows me to do one last review
of each patch, and it helps avoid lots of merge records in my pull. This
also makes it easy for me to tailor which patches I want to go upstream in a
given pull - sometimes I hold things back if I feel they need more testing,
or if they're features that need to wait for the next merge window.
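
A rough sketch of the cherry-pick step described above, under simplifying assumptions: the two patches are committed directly on ALL rather than merged in from topic branches, and all file names and commit messages are invented for the example.

```shell
set -eu
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master  # name the first branch "master"

# master is kept as a pristine copy of upstream.
echo base > base.c
git add base.c
git commit -q -m "upstream: base"

# ALL collects everything that should get exposure in -mm.
git checkout -q -b ALL
echo a > a.c
git add a.c
git commit -q -m "ocfs2: fix that is ready"
ready=$(git rev-parse HEAD)
echo b > b.c
git add b.c
git commit -q -m "ocfs2: feature that needs more testing"

# for-linus starts from master; only the reviewed patch is picked.
git checkout -q -b for-linus master
git cherry-pick "$ready" >/dev/null

git --no-pager log --oneline master..for-linus
```

Only the picked commit shows up in master..for-linus, so the pull request carries no merge records and the held-back feature stays on ALL.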

Once Linus pulls, I'll re-make the ALL branch for Andrew by re-pulling all
the patchsets which weren't a part of that pull request.

Btw, I cannot over state how important and useful it is to have patches go
to -mm first.

Hope this helps,
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
[email protected]

2006-10-20 04:28:44

by Kyle Moffett

Subject: Re: Git training wheels for the pimple faced maintainer

On Oct 19, 2006, at 19:44:41, Linus Torvalds wrote:
> - If you actually want your development tree to "track" my tree,
> I'd suggest you have your "for-linus" branch that you put the work
> you want to track into, and then a plain "linus" branch which
> tracks _my_ tree. Then you can just fetch my tree (to keep your
> "linus" branch up-to-date), and if you want your development branch
> to track those changes, you can just do a "git rebase linus" in
> your "for-linus" branch.

I'm no official maintainer, but I have several random local GIT trees
I use for local development and patches which don't make a lot of
sense outside my personal systems. Because of this it's important
for me to be able to migrate my changes over to newer versions
easily, but I don't care all that much about maintaining old history,
I just want my separate patches to work on latest Linus.

It also matters a lot to me to be able to wipe out a devel tree with
a quick rm and create it from scratch.

As a result, I have an "upstream" tree with various "linus",
"libata-dev", etc upstream branches that I pull from for various
reasons. I then have a "linux-template" tree which has no objects of
its own but references the "upstream" tree's object directory and
"pulls" from some branch in the upstream tree. To create a new devel
tree I just copy the "upstream" tree which is the size of a single
checkout, and start patching. To update all my patchsets to latest linus:

### This gets the upstream tree to the latest state
$ cd ~/git/linux/upstream
$ cg-fetch linus
$ cg-fetch libata-dev
$ cg-fetch $OTHER_UPSTREAM_SRC

### Optionally repack the single copy of the upstream objects for
### better speed
$ git-repack -a -d -l -f

### Now fast-forward the patches in $MY_TREE to be based on the
### latest version of the random upstream tree
$ cd ~/git/linux/$MY_TREE
$ cg-fetch upstream
$ git-rebase upstream
### Now I resolve rebase-conflicts

When I want to export a GIT tree for somebody else to look at I just
pull into the HTTP-accessible GIT directories from my various
development trees, optionally merging if necessary.

This isn't "The Best Solution(TM)" for everyone, but it works really
well for me and has the advantage of only storing one easily-repacked
copy of the upstream sources; the rest of my dev trees have only the
overhead of my local changesets and a single copy of the kernel
sources. In the event I have to wipe out a dev tree it's a very fast
"rm -rf $OLD_TREE", and creating one is also a fast
"cp -a linux-template $NEW_TREE".
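
For readers without Cogito, a comparable shared-object setup can be sketched with plain git. This swaps in "git clone --shared" for the hand-made template tree described above; it is an analogous mechanism, not the exact procedure, and the directory names are placeholders:

```shell
set -eu
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.invalid
work=$(mktemp -d)
cd "$work"

# "upstream" holds the single full copy of the objects.
git init -q upstream
(
  cd upstream
  echo base > core.c
  git add core.c
  git commit -q -m "upstream: base"
)

# --shared makes the new tree borrow objects from upstream by recording
# upstream's object directory in objects/info/alternates.
git clone -q --shared upstream devtree

cat devtree/.git/objects/info/alternates
```

A tree created this way depends on the objects staying in "upstream", so pruning objects there can break the dev trees that borrow from it.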

Cheers,
Kyle Moffett

2006-10-20 06:26:45

by Pierre Ossman

Subject: Re: Git training wheels for the pimple faced maintainer

Andrew Morton wrote:
> Just send me the url&branch-name for a tree which you want included in -mm.
> I typically pull all the trees once per day. I usually won't even look at
> the contents of what I pulled from you unless it breaks.
>
> IOW, -mm is like a tree to which 70-odd people have commit permissions,
> except it's 70 separate trees and I independently jam them all into one
> tree daily.
>

So, in other words, you have no issues with a lot of merges in the
branch you're pulling from? Do you do a fresh pull each time or do you
update an existing copy? If you do the latter, then I assume it is
critical that my "for-andrew" branch has a continuous history? (Which it
won't naturally have as the changes will be replaced by identical
changes coming from Linus' tree)

Rgds

--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2006-10-20 06:34:51

by Pierre Ossman

Subject: Re: Git training wheels for the pimple faced maintainer

Linus Torvalds wrote:
> That all sounds fine. Please just check the format for the "[GIT PULL]"
> message: Andrew pulls people's trees on his own and largely automatically,
> so he doesn't much care _what_ is in the tree, but I care deeply. So I
> want the diffstat and shortlog listings, and preferably a few sentences at
> the top of the email describing what's going on and why things are
> happening.
>

I'm still learning the more fancy parts of git, but I think that would be:

git diff master..for-linus | diffstat
git log master..for-linus | git shortlog

right?

> I think people have seen the messages that other people send out (eg at
> least Greg KH tends to Cc: those messages to linux-kernel, so others can
> see what's going on too - although not all other maintainers do that).
>
> Basically, I want to know that the thing I pull makes sense for the stage
> I'm pulling in (ie if it's after -rc1, think about trying to explain why
> the fixes are all important bug-fixes for example), and what it affects.
> The diffstat is part of that, but please include any other explanations
> that seem meaningful.
>
>

That was the basic idea. I've been looking at these kinds of messages on
LKML, trying to deduce how people do their work.

> So there's simply no point in merging from me, unless you know that there
> are clashes due to other development, and you actually want to fix them
> up. You will just cause unnecessary criss-cross merges if you pull from my
> tree after you've started development, and the history gets really really
> messy.
>

And in order to test for conflicts, I assume I should have a "test tree"
that I merge all my local stuff in, together with your current HEAD?

> There's several ways to handle this:
>
> - just base your work on a certain release, and ignore all the other
> changes. Then, when you're ready, just ask me to pull your changes. git
> will just merge it up to my current version, and everything will be
> fine.
>
> (Of course, once I _have_ merged your changes, if you pull at that
> point, you'll just fast-forward, and there will be no unnecessary
> back-and-forth merging)
>
> - If you actually want your development tree to "track" my tree, I'd
> suggest you have your "for-linus" branch that you put the work you want
> to track into, and then a plain "linus" branch which tracks _my_ tree.
> Then you can just fetch my tree (to keep your "linus" branch
> up-to-date), and if you want your development branch to track those
> changes, you can just do a "git rebase linus" in your "for-linus"
> branch.
>

If I've understood git correctly, a rebase is a big no-no once I've
published those changes as it reverts some history. Right?

> - If you actually notice that the stuff in my tree conflicts with the
> stuff you develop, _then_ you obviously want to merge (you can try to
> "rebase" things too and fix it up during the rebase, but merging is
> often easier, and at this point the merge is no longer "unnecessary
> noise", it's actually a real action of you doing a real merge to handle
> the conflict).
>
> And hey, if there is occasionally an unnecessary merge, nobody really
> cares. So don't be _too_ worried about it. But a clean history makes
> things simpler to track, so I'm asking people to not generate noise that
> simply doesn't help.
>

A load off my mind. ;)

Big thanks for all the pointers. I have my account at kernel.org, so it
won't be long until my first [GIT PULL]. Be gentle.

Rgds

--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2006-10-20 06:35:40

by Kyle Moffett

Subject: Re: Git training wheels for the pimple faced maintainer

On Oct 20, 2006, at 02:26:49, Pierre Ossman wrote:
> Andrew Morton wrote:
>> Just send me the url&branch-name for a tree which you want included
>> in -mm. I typically pull all the trees once per day. I usually won't
>> even look at the contents of what I pulled from you unless it breaks.
>>
>> IOW, -mm is like a tree to which 70-odd people have commit
>> permissions, except it's 70 separate trees and I independently jam
>> them all into one tree daily.
>>
>
> So, in other words, you have no issues with a lot of merges in the
> branch you're pulling from? Do you do a fresh pull each time or do
> you update an existing copy? If you do the latter, then I assume it
> is critical that my "for-andrew" branch has a continuous history?
> (Which it won't naturally have as the changes will be replaced by
> identical changes coming from Linus' tree)

I seem to remember Andrew saying something like (paraphrased) "In the
event that your tree doesn't have a continuous history for whatever
reason, I'll just pull a fresh copy and work from there". Given that
he maintains -mm as a quilt patchset and only uses GIT for incoming
pulls, I would guess that either way is probably OK, although the
continuous history makes merging and fixing rejects mildly nicer.

Cheers,
Kyle Moffett

2006-10-20 06:37:18

by Andrew Morton

Subject: Re: Git training wheels for the pimple faced maintainer

On Fri, 20 Oct 2006 08:26:49 +0200
Pierre Ossman <[email protected]> wrote:

> Andrew Morton wrote:
> > Just send me the url&branch-name for a tree which you want included in -mm.
> > I typically pull all the trees once per day. I usually won't even look at
> > the contents of what I pulled from you unless it breaks.
> >
> > IOW, -mm is like a tree to which 70-odd people have commit permissions,
> > except it's 70 separate trees and I independently jam them all into one
> > tree daily.
> >
>
> So, in other words, you have no issues with a lot of merges in the
> branch you're pulling from? Do you do a fresh pull each time or do you
> update an existing copy? If you do the latter, then I assume it is
> critical that my "for-andrew" branch has a continuous history? (Which it
> won't naturally have as the changes will be replaced by identical
> changes coming from Linus' tree)
>

I don't care what the history is. I fetch the whole thing then generate
(you - linus) as a single unified diff then whack it into the patch pile.
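
One way to produce a "(you - linus)" style diff with plain git is the three-dot form of git diff, which compares the branch against its merge base with upstream. This is only a sketch of the idea, not necessarily the script used for -mm; the branch and file names are made up:

```shell
set -eu
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/linus   # name the first branch "linus"

echo base > core.c
git add core.c
git commit -q -m "upstream: base"

# The maintainer's branch adds one file on top of that base.
git checkout -q -b for-andrew
echo mmc > mmc.c
git add mmc.c
git commit -q -m "mmc: my change"

# Upstream keeps moving after the branch point.
git checkout -q linus
echo more >> core.c
git commit -q -a -m "upstream: more work"

# Three dots: diff from the merge base of the two branches to for-andrew,
# so later upstream changes do not show up reversed in the patch.
patch=$(mktemp)
git diff linus...for-andrew > "$patch"
cat "$patch"
```

The resulting patch touches only mmc.c, whatever the history of the branch looks like.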

2006-10-20 06:45:32

by Pierre Ossman

Subject: Re: Git training wheels for the pimple faced maintainer

Mark Fasheh wrote:
> On Thu, Oct 19, 2006 at 04:44:41PM -0700, Linus Torvalds wrote:
>
>> I think people have seen the messages that other people send out (eg at
>> least Greg KH tends to Cc: those messages to linux-kernel, so others can
>> see what's going on too - although not all other maintainers do that).
>>
> I noticed also that people started sending out "What's in XX.git" type
> messages at the beginning of a merge window to describe what might shortly
> get sent upstream.
>
>

Yes, I've found those to be quite nice. I'll try to remember to send my own.

>
>> Other git maintainers may have other hints about how they work. Anybody?
>>
> I think I have a slightly different workflow than what Pierre describes. I
> find that it works well for me and it keeps things very organized in
> ocfs2.git. It's also probably a little more work than other methods for
> managing a git tree that people employ. Hopefully a description of my
> process will be useful to someone.
>
> Basically I have two trees, ocfs2.git which is the main ocfs2 repository and
> my own personal linux-2.6.git which I actually hack in.
>

Hmm.. What is the gain of having two trees instead of just more branches?

> Once I'm ready to send an upstream pull request, I'll update the master
> branch of ocfs2.git. I then make a for-linus branch based off of it, and
> git-cherry-pick each individual patch into that branch and send my request.
>

This should be equivalent of just keeping the "for-linus" branch around
as it will just fast-forward along with Linus' tree when it doesn't
contain any local changes. Or am I missing something?

> Once Linus pulls, I'll re-make the ALL branch for Andrew by re-pulling all
> the patchsets which weren't a part of that pull request.
>

In other words, you destroy all the old history of your ALL branch and
create a new one? So you couldn't continuously pull from that branch?

> Btw, I cannot overstate how important and useful it is to have patches go
> to -mm first.
>

My intention was always to send him everything but the most trivial patches.

One question related to that, though. Previously, I've always sent plain
patches to Andrew. After they have simmered a bit in -mm, he usually
pushes them on to Linus, even though they do not qualify as being just
bug fixes. As I will now be the one moving stuff from "for-andrew" to
"for-linus", will the decision of what to move now fall on me? I would
probably be more inclined to wait for the next merge window than Andrew is.

Thanks

--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2006-10-20 07:30:45

by Kyle Moffett

Subject: Re: Git training wheels for the pimple faced maintainer

On Oct 20, 2006, at 02:34:54, Pierre Ossman wrote:
> Linus Torvalds wrote:
>> That all sounds fine. Please just check the format for the "[GIT
>> PULL]" message: Andrew pulls people's trees on his own and largely
>> automatically, so he doesn't much care _what_ is in the tree, but
>> I care deeply. So I want the diffstat and shortlog listings, and
>> preferably a few sentences at the top of the email describing
>> what's going on and why things are happening.
>
> I'm still learning the more fancy parts of git, but I think that
> would be:
>
> git diff master..for-linus | diffstat
> git log master..for-linus | git shortlog

Not quite. diffstat doesn't understand renames and such, so you want to
use something more like this:

git diff -M --stat --summary master..for-linus
git log --pretty=short master..for-linus | git shortlog

As an example, if you rename foo/bar/baz.c to foo/bar/quux.c and
change a few lines, you'll get something like this:

foo/bar/{baz.c => quux.c} | 8 +--

It similarly makes renames between directories look nice.

>> So there's simply no point in merging from me, unless you know
>> that there are clashes due to other development, and you actually
>> want to fix them up. You will just cause unnecessary criss-cross
>> merges if you pull from my tree after you've started development,
>> and the history gets really really messy.
>
> And in order to test for conflicts, I assume I should have a "test
> tree" that I merge all my local stuff in, together with your
> current HEAD?

Yes

>> If you actually want your development tree to "track" my tree, I'd
>> suggest you have your "for-linus" branch that you put the work you
>> want to track into, and then a plain "linus" branch which tracks
>> _my_ tree. Then you can just fetch my tree (to keep your "linus"
>> branch up-to-date), and if you want your development branch to
>> track those changes, you can just do a "git rebase linus" in your
>> "for-linus" branch.
>
> If I've understood git correctly, a rebase is a big no-no once I've
> published those changes as it reverts some history. Right?

Well, sorta. If it's a pseudo-published development and you actually
_don't_ _care_ what the old history was (because it was broken or
incorrect or one of the patches got corrupted during import) then go
ahead and wipe it out. On the other hand if you have random people
pulling from your published tree then you can't safely git-rebase or
cg-admin-rewrite-hist or similar. Luckily GIT will just complain
about a discontinuous history as opposed to losing data.

Cheers,
Kyle Moffett

2006-10-20 15:26:23

by Linus Torvalds

Subject: Re: Git training wheels for the pimple faced maintainer

On Fri, 20 Oct 2006, Pierre Ossman wrote:
>
> I'm still learning the more fancy parts of git, but I think that would be:
>
> git diff master..for-linus | diffstat

Use "git diff -M --stat master..for-linus" instead.

The "-M" enables rename detection, and the "--stat" does the diffstat for
you (and better than plain diffstat, since it knows about renames, copies
and deletes).

HOWEVER! The above obviously only really works correctly if "master" is a
strict subset of "for-linus".
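
A small sketch of how the "strict subset" condition could be checked before generating the diffstat. It assumes a git new enough to have "git merge-base --is-ancestor"; the fallback to the three-dot range (diff against the merge base) is a suggestion for the example, not part of the advice above, and the files are invented:

```shell
set -eu
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master  # name the first branch "master"

echo base > core.c
git add core.c
git commit -q -m "upstream: base"

git checkout -q -b for-linus
echo mmc > mmc.c
git add mmc.c
git commit -q -m "mmc: my change"

# master gains a commit that for-linus does not contain.
git checkout -q master
echo more >> core.c
git commit -q -a -m "upstream: more work"

# Is every commit on master also on for-linus?
if git merge-base --is-ancestor master for-linus; then
  range=master..for-linus    # plain two-dot diff is accurate
else
  range=master...for-linus   # diverged: diff against the merge base
fi

stat=$(git diff -M --stat --summary "$range")
echo "$range"
echo "$stat"
```

In this setup master has moved ahead, so the check fails and the diffstat is taken against the merge base, listing only mmc.c.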

> git log master..for-linus | git shortlog

Yes.

> And in order to test for conflicts, I assume I should have a "test tree"
> that I merge all my local stuff in, together with your current HEAD?

Exactly. It can be either just a random temporary branch (it's cheap), or
it can just be your "work tree" that you can keep as messy as you want,
and then the "for-linus" branch is the cleaned-up version.
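
The "random temporary branch" variant might look like the sketch below: merge upstream into a scratch branch, note whether it conflicts, then throw the branch away so "for-linus" itself never records the merge. The branch name "test-merge" and the conflicting file are hypothetical:

```shell
set -eu
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/linus   # name the first branch "linus"

echo "timeout = 10" > shared.c
git add shared.c
git commit -q -m "upstream: base"

# Own work and upstream both change the same line.
git checkout -q -b for-linus
echo "timeout = 20" > shared.c
git commit -q -a -m "mmc: raise timeout"
git checkout -q linus
echo "timeout = 30" > shared.c
git commit -q -a -m "upstream: raise timeout"

# Trial merge on a scratch branch.
git checkout -q -b test-merge for-linus
if git merge -q --no-edit linus >/dev/null 2>&1; then
  result=clean
else
  result=conflict
  git merge --abort          # leave the scratch branch as it was
fi

# Discard the scratch branch; for-linus is untouched either way.
git checkout -q for-linus
git branch -q -D test-merge
echo "trial merge: $result"
```

Here both sides edit the same line, so the trial reports a conflict, which is the case where a real merge (or a rebase) becomes worth doing.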

And quite frankly, most of the time you don't even really need one. It
depends on what you work on, but a _lot_ of the kernel is so independent
of anything else, that you know that the only thing that will ever really
conflict is trivial things, and hey, one of the things I do is to fix up
those conflicts.

In fact, quite often the _right_ thing to do for most developers is to
just entirely ignore what everybody else is doing, because if there are
trivial conflicts, I'll take care of them, and if there are more serious
conflicts, I'll just let you know myself - and you may not even be in a
position to _know_ about it, because the conflicts could come from
somebody else's git tree that I just happened to pull before.

So don't worry too much about it. As already mentioned, the _worst_ thing
you can do is probably to continually pull from my tree to "stay on the
edge". The way we keep the kernel maintainable is not by having everybody
try to keep up with everybody else, but by trying to keep things so
independent that people don't _need_ to keep up with everybody else.

A lot of people seem to just synchronize up at major releases, and then
rebase their work (which they may even have kept in quilt or something
else: you don't even have to use "git" for this) on that, and ask me to
merge the result.

So don't worry too much.

Also - different people work in different ways, and it's _ok_.

> If I've understood git correctly, a rebase is a big no-no once I've
> published those changes as it reverts some history. Right?

That is mostly correct. It's a big no-no if somebody has already pulled
from you, and you want them to pull again. Because at that point, you're
essentially asking them to pull two totally different versions of the same
thing - git will do the right thing (since all the duplicates will usually
merge perfectly), but it will look like two different histories, and
you'll see every commit twice. That's just ugly.

On the other hand, things like the -mm tree are "throw-away" anyway:
Andrew re-creates the tree every time he pulls. So you can rebase the
branch you send to Andrew as much as you want. So it's not _entirely_
about whether something is "published" or not, it's literally more about
how something is actively _used_.

But yes - in general, the rule of thumb should be: rebase as much as you
want in your own _private_ sandbox to make things look nice, but once
you've exposed it to anybody else, it's set in stone.

> Big thanks for all the pointers. I have my account at kernel.org, so it
> won't be long until my first [GIT PULL]. Be gentle.

Now, I may not be "gentle", because if there is something wrong with the
end result I'll tell you so and I'm not exactly known for always being
excessively polite ;)

But don't worry, it can be fixed up. At worst, you'll just get an email
back saying "I'm not going to pull this one, because you've been a
complete klutz, and did something really stupid wrt XYZ", and I'll ask you
to fix it up. Or I might say "I'll pull it this time, but I don't want to
see XYZ again in the future".

Or I might not say anything at all, and you'll just notice that I've
pulled from you.

Linus

2006-10-20 15:35:32

by Linus Torvalds

Subject: Re: Git training wheels for the pimple faced maintainer

On Fri, 20 Oct 2006, Linus Torvalds wrote:
>
> Use "git diff -M --stat master..for-linus" instead.

Actually, use "git diff -M --stat --summary master..for-linus".

The "--summary" thing generates an additional summary at the end of the
diffstat that lists deleted/created/moved/copied files, which is nice to
see. There's a difference between a

drivers/char/myserial.c | 50 ++++++++
1 file changed, 50 insertions(+), 0 deletions(-)

and

drivers/char/myserial.c | 50 ++++++++
1 file changed, 50 insertions(+), 0 deletions(-)
create mode 100644 drivers/char/myserial.c

because the latter tells you that the new lines are actually in a new file,
while the former says that you just added lines to an old one.

(Without "--summary", you can't tell the difference between these two
cases)

And the "-M" flag obviously means the difference between:

drivers/pci/hotplug/pci_hotplug.h | 236 ----------------------
include/linux/pci_hotplug.h | 236 ++++++++++++++++++++++
2 files changed, 236 insertions(+), 236 deletions(-)
delete mode 100644 drivers/pci/hotplug/pci_hotplug.h
create mode 100644 include/linux/pci_hotplug.h

and

.../pci/hotplug => include/linux}/pci_hotplug.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
rename drivers/pci/hotplug/pci_hotplug.h => include/linux/pci_hotplug.h (99%)

where the latter version clearly tells you a whole lot more about the
patch than the non-renaming one.

The reason rename detection isn't on by default is that non-git tools
don't understand the rename diffs. But if anybody sends me patches, please
feel free to use "git diff -M" to make them smaller and more readable in
the face of renames.

Linus

2006-10-20 21:08:43

by Mark Fasheh

Subject: Re: Git training wheels for the pimple faced maintainer

On Fri, Oct 20, 2006 at 08:45:36AM +0200, Pierre Ossman wrote:
> Hmm.. What is the gain of having two tree instead of just more branches?
That way I have my own private playground where I can mess around with
patches, prototype new ideas, etc. It also serves as my local repository of
patches I got from other folks. I treat ocfs2.git as a 'public' repository,
so I don't want to pollute it with junk branches, etc.

> > Once I'm ready to send an upstream pull request, I'll update the master
> > branch of ocfs2.git. I then make a for-linus branch based off of it, and
> > git-cherry-pick each individual patch into that branch and send my request.
> >
>
> This should be equivalent of just keeping the "for-linus" branch around
> as it will just fast-forward along with Linus' tree when it doesn't
> contain any local changes. Or am I missing something?
Yeah. I just remove it after the merge and re-create later, but I could just
fast-forward it if I wanted to. I guess it's personal preference - it
doesn't really cost me much to re-create the for-linus branch.

> In other words, you destroy all the old history of your ALL branch and
> create a new one? So you couldn't continuously pull from that branch?
Yep. ALL is essentially a throwaway branch. Keep in mind that the topic
branches don't get thrown out until they've been merged upstream.

Typically people aren't pulling continuously from ALL. Most patches are
against Linus' tree - I take it as part of my "job" to handle merging stuff
into ALL.

> On questions related to that though. Previously, I've always sent plain
> patches to Andrew. After they have simmered a bit in -mm, he usually
> pushes them on to Linus, even though they do not qualify as being just
> bug fixes. As I will now be the one moving stuff from "for-andrew" to
> "for-linus", will the decision of what to move now fall on me? I would
> probably be more inclined to wait for the next merge window than Andrew is.
Yes, generally you now have the responsibility of deciding what patches in
your git tree are appropriate to be pushed upstream at any given time :)
There are rules that people should follow of course, which Linus outlined
earlier in this thread.

Good Luck!
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
[email protected]

2006-10-21 09:44:25

by Pierre Ossman

Subject: Re: Git training wheels for the pimple faced maintainer

Linus Torvalds wrote:
> On Fri, 20 Oct 2006, Pierre Ossman wrote:
>
>> I'm still learning the more fancy parts of git, but I think that would be:
>>
>> git diff master..for-linus | diffstat
>>
>
> Use "git diff -M --stat master..for-linus" instead.
>
> The "-M" enables rename detection, and the "--stat" does the diffstat for
> you (and better than plain diffstat, since it knows about renames, copies
> and deletes).
>
> HOWEVER! The above obviously only really works correctly if "master" is a
> strict subset of "for-linus".
>
>

Ah, that's a bit of a gotcha. Any nice tricks to keep track of where you
were in sync with upstream last? Create a dummy branch/tag perhaps?

Rgds

--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2006-10-21 16:10:15

by Linus Torvalds

Subject: Re: Git training wheels for the pimple faced maintainer

On Sat, 21 Oct 2006, Pierre Ossman wrote:
> >
> > HOWEVER! The above obviously only really works correctly if "master" is a
> > strict subset of "for-linus".
>
> Ah, that's a bit of a gotcha. Any nice tricks to keep track of where you
> were in sync with upstream last? Create a dummy branch/tag perhaps?

You don't need to. Git keeps track of the fork-point, and you can always
get it with

git merge-base a b

where "a" and "b" are the two branches.

HOWEVER. If you have _merged_ since (ie your branch contains merges _from_
the branch that you are tracking), this will give you the last
merge-point (since that's the last common base), and as such a "diff" from
that point will _ignore_ your changes from before the merge. See?

But holding a tag to the "original fork point" is equally useless in that
case, since if you have merged from my tree since that tag, and you do a
"git diff tag..for-linus", then the diff will contain all the new stuff
that came from _me_ through your merge as well. See?

In other words: in both cases you really really shouldn't merge from me
after you started developing. And the reason in both cases is really
fundamentally the same: because merging from me obviously brings in commits
_from_me_, so any single diff thus obviously turns pointless: it will
_not_ talk about all your new work.
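To make the merge-base behaviour concrete, here is a minimal sketch in Python rather than git itself. The commit names and the toy graph are made up for illustration, and the function only models the idea of "best common ancestor"; it is not how git implements it.

```python
# Toy commit graph: each commit maps to its list of parents.
# All names here are hypothetical, purely for illustration.
parents = {
    "A": [],
    "B": ["A"],        # upstream commit where you forked
    "C": ["B"],        # upstream work after the fork
    "D": ["C"],        # upstream keeps moving
    "X": ["B"],        # your work, based on B
    "M": ["X", "C"],   # you merge upstream commit C into your branch
    "Y": ["M"],        # more of your work after the merge
}

def ancestors(commit):
    """Return the set of commits reachable from `commit`, itself included."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def merge_bases(a, b):
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(a) & ancestors(b)
    return {c for c in common
            if not any(c != d and c in ancestors(d) for d in common)}

print(merge_bases("X", "D"))  # {'B'}: the original fork point
print(merge_bases("Y", "D"))  # {'C'}: the last merge point, once you have merged
```

As long as your branch has not merged from upstream, the result stays at the fork point no matter how far upstream moves. After the merge, the base jumps forward to the merged commit.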

Anyway, notice the "single diff" caveat above. Git obviously does actually
keep track of individual commits, so the individual commits that are
unique to your repository are _still_ unique to your repository even after
you've merged with me - since I haven't merged with you. So you _can_ get
the information, but now you have to do something fundamentally
different..

So if you've done merges with me since you started development, you cannot
now just say "what's the difference between <this> point and <that> point
in the development tree", because clearly there is no _single_ line of
development that shows just _your_ changes. But that doesn't mean that
your development isn't separable, and you can do one of two things:

(a) work on a "individual commit" level:

git log -p linus..for-linus

will show each commit that is in your "for-linus" branch but is _not_
in your "linus" tracker branch. This does the right thing even in the
presence of merges: it will show the merge commit you did (since that
individual commit is _yours_), but it will not show the commits
merged (since those came from _my_ line of development)

So now a

git log -p linus..for-linus | diffstat

will give something that _approximates_ the diffstat I will see when
merging. I say _approximates_, because it only really gives the right
answer if all the commits are entirely independent, and you never
have one commit that changes a line one way, and then a subsequent
commit that changes the same lines another way.

If you have commits that are inter-dependent, the diffstat above will
show the "sum" of the diffs, which is not what I will see when I
actually merge. I will see just the end result, which is more like
the "union" of the diffs. And the two are the same only for
independent diffs, of course.
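A small sketch of why the per-commit "sum" can differ from the end result, assuming two hypothetical commits that touch the same line. It uses Python's difflib to count insertions and deletions; it only illustrates the arithmetic and is not what diffstat or git do internally.

```python
import difflib

def stat(old, new):
    """Return (insertions, deletions) between two lists of lines."""
    ins = dels = 0
    for line in difflib.ndiff(old, new):
        if line.startswith("+ "):
            ins += 1
        elif line.startswith("- "):
            dels += 1
    return ins, dels

# Three versions of a made-up file: two commits change the same line.
v0 = ["int timeout = 10;", "return 0;"]
v1 = ["int timeout = 20;", "return 0;"]   # first commit
v2 = ["int timeout = 30;", "return 0;"]   # second commit, same line again

first = stat(v0, v1)
second = stat(v1, v2)
summed = (first[0] + second[0], first[1] + second[1])

print(summed)        # (2, 2): what piping each commit's diff into diffstat adds up to
print(stat(v0, v2))  # (1, 1): what a single diff of the end result shows
```

If the two commits had touched different lines, both numbers would agree, which is the "independent diffs" case.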

So the above is simple, and gives _almost_ the right answer. The other
alternative is slightly smarter, and more involved, and gives the exact
right answer:

(b) create a temporary new merge, and see what the difference of the
merge is, as seen by me (eg as seen from "linus"). So this is
basically:

git checkout -b test-branch for-linus
git pull . linus
git diff -M --stat --summary linus..

will create a new branch ("checkout -b") based on your current
"for-linus" state, and within that branch, do a merge of the "linus"
branch (or you could have done it the other way around and made the
merge as if you were me: check out the state of "linus" and then
pull the "for-linus" branch instead).

And then, the final step is to just diff the result of the merge
against the "linus" branch. This obviously gives the same diffstat
as the one _I_ should see when I merge, because you basically
"pre-tested" my merge for me.

See? git does give you the tools, but if you merge from me and don't have
a branch that is a nice clear superset of what you started off with, but
have mixed in changes from _my_ tree since you started developing, you end
up having to do some extra work to separate out all the new changes.

So that's why I suggest not doing a lot of criss-crossing merges. It
generates an uglier history that is much harder to follow visually in
"gitk", but it also generates some extra work for you. Not a lot, but
considering that there are seldom any real upsides, this hopefully
explains why I suggest against it.

And again, as a final note: none of this is "set in stone". These are all
_suggestions_. Notice the "seldom any real upsides". I say "seldom" on
purpose, because quite frankly, sometimes it's just easier for you to
merge (especially if you know there are likely to be clashes), so that you
can fix up any issues that the merge brings.

Anyway, I hope this clarified the issue. I don't think we've actually had
a lot of problems with these things in practice. None of this is really
"hard", and a lot of it is just getting used to the model. Once you are
comfortable with how git works (and using "gitk" to show history tends to
be a very visual way to see what is going on in the presence of merges),
and get used to working with me, you'll do all of this without even
thinking about it.

It really just _sounds_ more complicated than it really is.

Linus

2006-10-21 16:47:17

by Roland Dreier

Subject: Re: Git training wheels for the pimple faced maintainer

> Other git maintainers may have other hints about how they work. Anybody?

I use StGIT (http://www.procode.org/stgit/) to have sort of a hybrid
git/quilt workflow. My infiniband.git tree has the following main
branches (I also keep other topic branches around):

for-2.6.19
for-2.6.20
for-linus
for-mm
master

I use master to track Linus's tree. for-2.6.19 and for-2.6.20 are
StGIT branches that have patches queued up for 2.6.19 and 2.6.20
(duh). The advantages of StGIT are:

- I can do "stg pull" to do the equivalent of "git rebase" in a
slightly cleaner way.
- If I queue a patch and then someone later says "oops, that patch
needs this fix," I can go back and revise the patch easily. This
means I avoid cluttering the main kernel history with "change X"
followed by "fix for change X" followed by "update change X"
- StGIT works within git, so when it is time to send the changes to
Linus, I can just do "git merge blah for-linus for-2.6.19" and
then ask Linus to pull the for-linus branch.

- R.

2006-10-21 18:05:49

by Pierre Ossman

Subject: Re: Git training wheels for the pimple faced maintainer

Linus Torvalds wrote:
> On Sat, 21 Oct 2006, Pierre Ossman wrote:
>
>>> HOWEVER! The above obviously only really works correctly if "master" is a
>>> strict subset of "for-linus".
>>>
>> Ah, that's a bit of a gotcha. Any nice tricks to keep track of where you
>> were in sync with upstream last? Create a dummy branch/tag perhaps?
>>
>
> You don't need to. Git keeps track of the fork-point, and you can always
> get it with
>
> git merge-base a b
>
> where "a" and "b" are the two branches.
>
> HOWEVER. If you have _merged_ since (ie your branch contains merges _from_
> the branch that you are tracking), this will give you the last
> merge-point (since that's the last common base), and as such a "diff" from
> that point will _ignore_ your changes from before the merge. See?
>

This sounds sufficient. My idea was to freeze my outgoing branches (and
possibly topic branches that are "done"). I would like to keep my
development branches up to date though.

In other words, I have a branch "linus" which keeps your current tree.
From this I'll fork off branches for things going upstream. Until these
have been merged, I won't do any more syncs with "linus". But my
development branch will keep moving with the "linus" branch.

If I read your response above and the man page for git-merge-base, it
will do the right thing even if "linus" now is further in the future
than the point I forked it.

>
> (a) work on a "individual commit" level:
>
> git log -p linus..for-linus
>
> will show each commit that is in your "for-linus" branch but is _not_
> in your "linus" tracker branch. This does the right thing even in the
> presence of merges: it will show the merge commit you did (since that
> individual commit is _yours_), but it will not show the commits
> merged (since those came from _my_ line of development)
>
>

Ah, so "git log" will not show the commits that have popped up on
"linus" after "for-linus" branched off? Neat. :)

One concern I had was how to find stuff to cherry-pick when doing a
stable review.

> Anyway, I hope this clarified the issue. I don't think we've actually had
> a lot of problems with these things in practice. None of this is really
> "hard", and a lot of it is just getting used to the model. Once you are
> comfortable with how git works (and using "gitk" to show history tends to
> be a very visual way to see what is going on in the presence of merges),
> and get used to working with me, you'll do all of this without even
> thinking about it.
>
> It really just _sounds_ more complicated than it really is.
>
>

git has a lot of these hidden features and ways of doing
less-than-obvious things, so I'm just trying to broaden my repertoire by
consulting those who have been using it on a more daily basis.

I am just thankful git has a reset command ;)

Thanks

--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2006-10-21 18:15:09

by Pierre Ossman

Subject: Re: Git training wheels for the pimple faced maintainer

Roland Dreier wrote:
> > Other git maintainers may have other hints about how they work. Anybody?
>
> I use StGIT (http://www.procode.org/stgit/) to have sort of a hybrid
> git/quilt workflow. My infiniband.git tree has the following main
> branches (I also keep other topic branches around):
>

I've actually been using StGIT up until now. But I've started to feel a
need for sharing my tree, and StGIT isn't really suited for that.

How have you handled collaborative development on stuff that isn't ready
for Linus yet? Simply sending patches back and forth?

Rgds

--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2006-10-21 19:07:54

by Linus Torvalds

Subject: Re: Git training wheels for the pimple faced maintainer

On Sat, 21 Oct 2006, Pierre Ossman wrote:
>
> If I read your response above and the man page for git-merge-base, it
> will do the right thing even if "linus" now is further in the future
> than the point I forked it.

Yes. You can continue to track my state in the "linus" branch as much as
you want, and "git merge-base" will show where your branch and mine
diverged, so you don't need to remember it explicitly.

Only if you start _mixing_ the branches (ie you merge "linus" into your
branch) do you end up in the situation where there now is no longer a
single-threaded line of development, so you can no longer expect to be
able to just use a direct "git diff".

> > (a) work on a "individual commit" level:
> >
> > git log -p linus..for-linus
> >
> > will show each commit that is in your "for-linus" branch but is _not_
> > in your "linus" tracker branch. This does the right thing even in the
> > presence of merges: it will show the merge commit you did (since that
> > individual commit is _yours_), but it will not show the commits
> > merged (since those came from _my_ line of development)
>
> Ah, so "git log" will not show the commits that have popped up on
> "linus" after "for-linus" branched off? Neat. :)

That is what the git "a..b" syntax means for everything _but_ "diff".
Doing a "git diff" really is actually the special case: to create a diff,
you need two end-points. For all other git commands, "a..b" really means
"all commits that are in 'b' but not in 'a'", ie it's _not_ really about
two end-points, it's about a _set_ operation.

You should think of "a..b" as the "set difference" operation, or "b-a".

There's also a "symmetric difference", which is called "a...b" (three
dots). That's the "union of the differences both ways", in other words,
"a...b" is the set of commits that exist in a _or_ b, but not in both.
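The set view of "a..b" and "a...b" can be sketched in a few lines of Python. The commit graph and the helper names (`reachable`, `two_dot`, `three_dot`) are made up for illustration; this models the semantics only, not git's implementation.

```python
# Toy history: "linus" is at D, "for-linus" forked at B and is at Y.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
    "X": ["B"],
    "Y": ["X"],
}
branches = {"linus": "D", "for-linus": "Y"}

def reachable(branch):
    """All commits reachable from the tip of `branch`."""
    seen, stack = set(), [branches[branch]]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def two_dot(a, b):
    """a..b: commits in b but not in a (set difference, "b - a")."""
    return reachable(b) - reachable(a)

def three_dot(a, b):
    """a...b: commits in a or b, but not in both (symmetric difference)."""
    return reachable(a) ^ reachable(b)

print(sorted(two_dot("linus", "for-linus")))    # ['X', 'Y']
print(sorted(three_dot("linus", "for-linus")))  # ['C', 'D', 'X', 'Y']
```

Note that commits C and D, which appeared on "linus" after the fork, never show up in "linus..for-linus".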

You can do some even more complex operations, and one that I find
reasonably useful at times is for example

gitk --all --not HEAD

which basically means: "show all commits in all branches, but subtract
everything that is reachable from the current HEAD". In other words, it
shows what commits exist in all the other branches that have not been
merged into the current one.

(The "--not HEAD" thing is mostly written as "^HEAD", but I wrote it out
in long-hand here because it is perhaps a bit more readable that way.)

> One concern I had was how to find stuff to cherry-pick when doing a
> stable review.

So looking at the above, what you can do is literally

gitk --all ^linus

which shows all your branches _except_ stuff that is already merged into
the "linus" branch that tracks what I have merged.

Git really is very clever.

HOWEVER! A word of warning: especially when you start doing
cherry-picking, git will consider a commit that has been cherry-picked to
be totally _separate_ from the original one. So when you do things like
the above, and you have commits that have "identical patches" as the ones
I have already applied, they will show up as "not being in linus' branch".

That's because the identity of a commit is really not the patch it
describes at all: the commit is defined by the exact spot in the history,
and by the exact contents of that commit (which include date, time,
committer info, parents, exact tree state etc). So when you do a
"cherry-pick", you are very much creating a totally new commit - it just
happens to generate the same (or similar) _diff_.

There are tools to help you filter out cherry-picked commits too, by
literally looking at the diff and saying "oh, that same diff already
exists upstream", but that's different. If you really care, you can look
at what "git cherry" does (and it's not very efficient).
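The difference between the identity of a commit and the identity of its patch can be sketched as follows. This is a deliberately simplified model in Python: the field layout, names and values are assumptions for illustration and do not match git's real object format or the normalisation a real patch comparison would do.

```python
import hashlib

def commit_id(commit):
    """Hash everything that makes up the commit: tree, parents, author, date, message."""
    payload = "\n".join([
        commit["tree"],
        ",".join(commit["parents"]),
        commit["author"],
        commit["date"],
        commit["message"],
    ])
    return hashlib.sha1(payload.encode()).hexdigest()

def patch_id(commit):
    """Hash only the change itself, ignoring where and when it was applied."""
    return hashlib.sha1(commit["diff"].encode()).hexdigest()

diff = "-old line\n+new line\n"

# A hypothetical commit on a topic branch ...
original = {"tree": "tree-1", "parents": ["parent-a"], "author": "pierre",
            "date": "2006-10-20", "message": "mmc: fix foo", "diff": diff}
# ... and the same change cherry-picked onto a different parent a day later.
picked = {"tree": "tree-2", "parents": ["parent-b"], "author": "pierre",
          "date": "2006-10-21", "message": "mmc: fix foo", "diff": diff}

print(commit_id(original) == commit_id(picked))  # False: two distinct commits
print(patch_id(original) == patch_id(picked))    # True: the same patch
```

A filter for already-applied patches has to compare on something like the second hash, which is why it needs to look at the diffs themselves.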

> git has a lot of these hidden features and ways of doing
> less-than-obvious things, so I'm just trying to broaden my repertoire by
> consulting those who have been using it on a more daily basis.

You really can do a _lot_ with git. Part of what seems to scare some
people is that git really allows for a lot of power and flexibility, and
you can do some very fancy stuff.

At the same time, you can mostly also use it as if it were a lot dumber
than it really is. There are ways to limit your usage so that you'll never
even need to worry about things like multiple branches or cherry-picking
or merging or anything else, and try to just see your work as a linear
progression on top of a particular release version.

I'll happily explain all the grotty details, but keep in mind that you
don't _need_ to use the features if you don't want to.

> I am just thankful git has a reset command ;)

You can undo almost any mess you get yourself into (you _can_ really screw
that up too, if you do a combination of "reset" and "git prune", but you
have to work at it).

The bigger problem may be that if you get yourself into a real mess, you
need to understand how you got there: you can always get back to a
previous state, sometimes you just need to know what that state _was_, and
if you get confused enough, even that can be a problem.

"gitk" really does tend to help clarify what happened.

Linus

2006-10-21 21:27:10

by Roland Dreier

Subject: Re: Git training wheels for the pimple faced maintainer

> I've actually been using StGIT up until now. But I've started to feel a
> need for sharing my tree, and StGIT isn't really suited for that.
>
> How have you handled collaborative development on stuff that isn't ready
> for Linus yet? Simply sending patches back and forth?

I don't use StGIT for collaborative development. My StGIT branches
are really just patch queues (as the names for-2.6.19 and for-2.6.20
imply). Usually development is just about done before things wind up
in a maintainer tree, so being able to apply updates to patches
already in my tree is more important than fully automated merges (as
native git gives you).

2006-10-25 21:50:15

by Pierre Ossman

Subject: Re: Git training wheels for the pimple faced maintainer

Andrew Morton wrote:
> I don't care what the history is. I fetch the whole thing then generate
> (you - linus) as a single unified diff then whack it into the patch pile.
>

How do you handle when I'm a bit after Linus (which will be the case
most of the time)? Will me doing pulls left and right disrupt this?

Rgds

--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2006-10-25 22:07:04

by Andrew Morton

Subject: Re: Git training wheels for the pimple faced maintainer

> On Wed, 25 Oct 2006 23:50:11 +0200 Pierre Ossman <[email protected]> wrote:
> Andrew Morton wrote:
> > I don't care what the history is. I fetch the whole thing then generate
> > (you - linus) as a single unified diff then whack it into the patch pile.
> >
>
> How do you handle when I'm a bit after Linus (which will be the case
> most of the time)? Will me doing pulls left and right disrupt this?
>

Nope, that's fine. git gives me a diff which is "things which are in Pierre's
tree but which aren't in Linus's".

Just go ahead and do whatever it is you want to do and don't bother about
-mm. If something goes badly wrong (it probably won't) then we can take a
look at it.