2021-06-02 15:25:01

by Mathieu Desnoyers

[permalink] [raw]
Subject: git feature request: git blame --ignore-cleanup/--ignore-trivial

Hi,

Following a discussion with Peter Zijlstra about whether code cleanup
and functional changes done to the Linux kernel scheduler belong to separate
patches or should be folded together, the argument for folding cleanup
and function changes came to be mainly motivated by the current behavior
of git blame: code cleanup patches end up burying the important changes so
it becomes cumbersome to find them using git blame.

Considering the added value brought by splitting cleanups from functional changes
from a maintainer perspective (easier reverts) and from a reviewer perspective
(easier to focus on the functional changes), I think it would be good to improve
the git tooling to allow easily filtering out the noise from git blame.

Perhaps a new git blame "--ignore-trivial" and/or "--ignore-cleanup" could solve
this by filtering out "trivial" and "cleanup" patches from the history it considers.

Tagging patches as trivial and cleanup should be done in the patch commit message
(possibly in the title), and enforcing proper tagging of commits is already the
responsibility of the maintainer merging those cleanup/trivial commits into the
Linux kernel anyway.

Under the hood, I suspect it could use something similar to git log --grep=<pattern>
--invert-grep.

This should allow git blame users to easily filter out the noise and focus on the relevant
functional changes.

Any maybe the patterns associated to "cleanup" and "trivial" commits should be something
that can be configured through a git config file.

Thoughts ?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


2021-06-02 15:33:49

by Al Viro

[permalink] [raw]
Subject: Re: git feature request: git blame --ignore-cleanup/--ignore-trivial

On Wed, Jun 02, 2021 at 11:20:35AM -0400, Mathieu Desnoyers wrote:
> Hi,
>
> Following a discussion with Peter Zijlstra about whether code cleanup
> and functional changes done to the Linux kernel scheduler belong to separate
> patches or should be folded together, the argument for folding cleanup
> and function changes came to be mainly motivated by the current behavior
> of git blame: code cleanup patches end up burying the important changes so
> it becomes cumbersome to find them using git blame.
>
> Considering the added value brought by splitting cleanups from functional changes
> from a maintainer perspective (easier reverts) and from a reviewer perspective
> (easier to focus on the functional changes), I think it would be good to improve
> the git tooling to allow easily filtering out the noise from git blame.
>
> Perhaps a new git blame "--ignore-trivial" and/or "--ignore-cleanup" could solve
> this by filtering out "trivial" and "cleanup" patches from the history it considers.
>
> Tagging patches as trivial and cleanup should be done in the patch commit message
> (possibly in the title), and enforcing proper tagging of commits is already the
> responsibility of the maintainer merging those cleanup/trivial commits into the
> Linux kernel anyway.
>
> Under the hood, I suspect it could use something similar to git log --grep=<pattern>
> --invert-grep.
>
> This should allow git blame users to easily filter out the noise and focus on the relevant
> functional changes.
>
> Any maybe the patterns associated to "cleanup" and "trivial" commits should be something
> that can be configured through a git config file.
>
> Thoughts ?

Just an observation: quite a few subtle bugs arise from mistakes in
what should've been a trivial cleanup. Hell, I've seen bugs coming
from rebase of provably no-op patches - with commit message unchanged.
So IME this is counterproductive...

2021-06-02 19:43:28

by Taylor Blau

[permalink] [raw]
Subject: Re: git feature request: git blame --ignore-cleanup/--ignore-trivial

On Wed, Jun 02, 2021 at 03:29:44PM +0000, Al Viro wrote:
> > Any maybe the patterns associated to "cleanup" and "trivial" commits
> > should be something that can be configured through a git config
> > file.
>
> Just an observation: quite a few subtle bugs arise from mistakes in
> what should've been a trivial cleanup. Hell, I've seen bugs coming
> from rebase of provably no-op patches - with commit message unchanged.
> So IME this is counterproductive...

Yes, I find excluding revisions from 'git blame' to be rarely useful,
exactly for this reason.

You could probably use the '--ignore-revs-file' option of 'git blame' to
exclude commits you consider trivial ahead of time. If you had an
'Is-trivial' trailer, I would probably do something like:

$ git log --format='%H %(trailers:key=Is-trivial)' |
grep "Is-trivial: true" | cut -d" " -f1 >exclude
$ git blame --ignore-revs-file exclude ...

Thanks,
Taylor

2021-06-02 19:45:36

by Jeff King

[permalink] [raw]
Subject: Re: git feature request: git blame --ignore-cleanup/--ignore-trivial

On Wed, Jun 02, 2021 at 11:20:35AM -0400, Mathieu Desnoyers wrote:

> Considering the added value brought by splitting cleanups from functional changes
> from a maintainer perspective (easier reverts) and from a reviewer perspective
> (easier to focus on the functional changes), I think it would be good to improve
> the git tooling to allow easily filtering out the noise from git blame.
>
> Perhaps a new git blame "--ignore-trivial" and/or "--ignore-cleanup" could solve
> this by filtering out "trivial" and "cleanup" patches from the history it considers.

There's "-w" to ignore whitespace-only changes. Since Git v2.23.0,
there's also "--ignore-rev-file", which lets you ignore arbitrary
commits. Since you have to generate a separate list of commits to feed
it, it's a little more involved than your "--invert-grep" example. But
it's also much more flexible (you can generate the file however you
like, and even tweak it by hand).

I do tend to agree with Al's notion that commits which are _supposed_ to
be trivial sometimes end up not being so. Or another way of thinking
about it is: relevance is a property of the query you're making, not the
original change.

So IMHO the best tool for this kind of thing is the "re-blame from
parent" feature that many interactive blame viewers have (I use tig, but
I'm sure other tools like magic have a similar feature). There when you
land on a boring commit, it's a single key to skip past it and see how
the earlier code came about.

-Peff

2021-06-02 21:18:24

by Felipe Contreras

[permalink] [raw]
Subject: RE: git feature request: git blame --ignore-cleanup/--ignore-trivial

Mathieu Desnoyers wrote:
> Perhaps a new git blame "--ignore-trivial" and/or "--ignore-cleanup" could solve
> this by filtering out "trivial" and "cleanup" patches from the history it considers.

While this feature would be a good addition, more often than not I need
to look behind the latest commit regardless of whether or not it's a
trivial one.

So what I often end up doing is a `git blame --incremental`:

git blame --incremental -L100,+6 file.c | grep -o -e '^[0-9a-f]\{40\}'

This can be fed to `git log --stdin --oneline --no-walk` for more
user-friendliness.

Of course you could just do:

git log -L100,+6:file.c --oneline --no-patch

But for some reason that's much slower on my system.

Cheers.

--
Felipe Contreras

2021-06-03 10:19:56

by David Sterba

[permalink] [raw]
Subject: Re: git feature request: git blame --ignore-cleanup/--ignore-trivial

On Wed, Jun 02, 2021 at 03:29:44PM +0000, Al Viro wrote:
> On Wed, Jun 02, 2021 at 11:20:35AM -0400, Mathieu Desnoyers wrote:
> > Hi,
> >
> > Following a discussion with Peter Zijlstra about whether code cleanup
> > and functional changes done to the Linux kernel scheduler belong to separate
> > patches or should be folded together, the argument for folding cleanup
> > and function changes came to be mainly motivated by the current behavior
> > of git blame: code cleanup patches end up burying the important changes so
> > it becomes cumbersome to find them using git blame.
> >
> > Considering the added value brought by splitting cleanups from functional changes
> > from a maintainer perspective (easier reverts) and from a reviewer perspective
> > (easier to focus on the functional changes), I think it would be good to improve
> > the git tooling to allow easily filtering out the noise from git blame.
> >
> > Perhaps a new git blame "--ignore-trivial" and/or "--ignore-cleanup" could solve
> > this by filtering out "trivial" and "cleanup" patches from the history it considers.
> >
> > Tagging patches as trivial and cleanup should be done in the patch commit message
> > (possibly in the title), and enforcing proper tagging of commits is already the
> > responsibility of the maintainer merging those cleanup/trivial commits into the
> > Linux kernel anyway.
> >
> > Under the hood, I suspect it could use something similar to git log --grep=<pattern>
> > --invert-grep.
> >
> > This should allow git blame users to easily filter out the noise and focus on the relevant
> > functional changes.
> >
> > Any maybe the patterns associated to "cleanup" and "trivial" commits should be something
> > that can be configured through a git config file.
> >
> > Thoughts ?
>
> Just an observation: quite a few subtle bugs arise from mistakes in
> what should've been a trivial cleanup.

I was about to write such comment. Cleanups that are eg. mechanically
switching names/types/variables/... and need some manual fixup are hot
candidates for buggy patches.

2021-06-03 10:29:39

by David Sterba

[permalink] [raw]
Subject: Re: git feature request: git blame --ignore-cleanup/--ignore-trivial

On Wed, Jun 02, 2021 at 11:20:35AM -0400, Mathieu Desnoyers wrote:
> Following a discussion with Peter Zijlstra about whether code cleanup
> and functional changes done to the Linux kernel scheduler belong to separate
> patches or should be folded together, the argument for folding cleanup
> and function changes came to be mainly motivated by the current behavior
> of git blame: code cleanup patches end up burying the important changes so
> it becomes cumbersome to find them using git blame.
>
> Considering the added value brought by splitting cleanups from functional changes
> from a maintainer perspective (easier reverts) and from a reviewer perspective
> (easier to focus on the functional changes), I think it would be good to improve
> the git tooling to allow easily filtering out the noise from git blame.
>
> Perhaps a new git blame "--ignore-trivial" and/or "--ignore-cleanup" could solve
> this by filtering out "trivial" and "cleanup" patches from the history it considers.
>
> Tagging patches as trivial and cleanup should be done in the patch commit message
> (possibly in the title), and enforcing proper tagging of commits is already the
> responsibility of the maintainer merging those cleanup/trivial commits into the
> Linux kernel anyway.
>
> Under the hood, I suspect it could use something similar to git log --grep=<pattern>
> --invert-grep.
>
> This should allow git blame users to easily filter out the noise and focus on the relevant
> functional changes.
>
> Any maybe the patterns associated to "cleanup" and "trivial" commits should be something
> that can be configured through a git config file.

As long as the subsystem policy is consistent, eg. always split cleanups
from functional changes, and make the trivial cleanups really obvious
either from code or mentioned in the changelog, I don't see a need for
a tooling.

Going through unrelated cleanups when digging in the git history is
annoying and I think it's up to maintainers' and developers' decision
what kind of cleanups are desired (because they make the code better,
not just because they're trivial). Mandating some sort of tagging is
just another burden, if it's not applied consistently it won't be
reliable so it won't help much.

2021-06-03 15:36:49

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: git feature request: git blame --ignore-cleanup/--ignore-trivial

----- On Jun 2, 2021, at 3:41 PM, Taylor Blau [email protected] wrote:

> On Wed, Jun 02, 2021 at 03:29:44PM +0000, Al Viro wrote:
>> > Any maybe the patterns associated to "cleanup" and "trivial" commits
>> > should be something that can be configured through a git config
>> > file.
>>
>> Just an observation: quite a few subtle bugs arise from mistakes in
>> what should've been a trivial cleanup. Hell, I've seen bugs coming
>> from rebase of provably no-op patches - with commit message unchanged.
>> So IME this is counterproductive...
>
> Yes, I find excluding revisions from 'git blame' to be rarely useful,
> exactly for this reason.
>
> You could probably use the '--ignore-revs-file' option of 'git blame' to
> exclude commits you consider trivial ahead of time. If you had an
> 'Is-trivial' trailer, I would probably do something like:
>
> $ git log --format='%H %(trailers:key=Is-trivial)' |
> grep "Is-trivial: true" | cut -d" " -f1 >exclude
> $ git blame --ignore-revs-file exclude ...

Nice trick! So within a project which standardize on a "Cleanup: " prefix
at the beginning of the patch subject, this would look like:

git log --format='%H Subject=("%s")' file.c | grep 'Subject=(\"Cleanup: ' | cut -d" " -f1 > exclude.txt
git blame --ignore-revs-file exclude.txt file.c

I fully understand that in many cases having the entire set of revisions is
needed, because even a cleanup patch could be buggy, but IMHO it's nice to
have a way to achieve this in situations where the cleanup patches get in the
way of figuring out the most recent behavior changes in a given area of the
code.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com