2020-12-02 23:48:22

by Vlastimil Babka

[permalink] [raw]
Subject: crediting bug reports and fixes folded into original patch

Hi,

there was a bit of debate on Twitter about this, so I thought I would bring it
here. Imagine a scenario where patch sits as a commit in -next and there's a bug
report or fix, possibly by a bot or with some static analysis. The maintainer
decides to fold it into the original patch, which makes sense for e.g.
bisectability. But there seem to be no clear rules about attribution in this
case, which looks like there should be, probably in
Documentation/maintainer/modifying-patches.rst

The original bug fix might include a From: $author, a Reported-by: (e.g.
syzbot), Fixes: $next-commit, some tag such as Addresses-Coverity: to credit the
static analysis tool, and an SoB. After folding, all that's left might be a line
as "include fix from $author" in the SoB area. This is a loss of
metadata/attribution just due to folding, and might make contributors unhappy.
Had they sent the fix after the original commit was mainline and immutable, all
the info above would "survive" in the form of new commit.

So I think we could decide what the proper format would be, and document it
properly. I personally wouldn't mind just copy/pasting the whole commit message
of the fix (with just a short issue description, no need to include stacktraces
etc if the fix is folded), we could just standardize where, and how to delimit
it from the main commit message. If it's a report (person or bot) of a bug that
the main author then fixed, preserve the Reported-by in the same way (making
clear it's not a Reported-By for the "main thing" addressed by the commit).

In the debate one less verbose alternatve proposed was a SoB with comment
describing it's for a fix and not whole patch, as some see SoB as the main mark
of contribution, that can be easily found and counted etc. I'm not so sure about
it myself, as AFAIK SoB is mainly a DCO thing, and for a maintainer it means
something else ("passed through my tree") than for a patch author. And this
approach would still lose the other tags.

Thoughts?
Vlastimil


2020-12-03 04:07:42

by Dan Williams

[permalink] [raw]
Subject: Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

On Wed, Dec 2, 2020 at 3:44 PM Vlastimil Babka <[email protected]> wrote:
>
> Hi,
>
> there was a bit of debate on Twitter about this, so I thought I would bring it
> here. Imagine a scenario where patch sits as a commit in -next and there's a bug
> report or fix, possibly by a bot or with some static analysis. The maintainer
> decides to fold it into the original patch, which makes sense for e.g.
> bisectability. But there seem to be no clear rules about attribution in this
> case, which looks like there should be, probably in
> Documentation/maintainer/modifying-patches.rst
>
> The original bug fix might include a From: $author, a Reported-by: (e.g.
> syzbot), Fixes: $next-commit, some tag such as Addresses-Coverity: to credit the
> static analysis tool, and an SoB. After folding, all that's left might be a line
> as "include fix from $author" in the SoB area. This is a loss of
> metadata/attribution just due to folding, and might make contributors unhappy.
> Had they sent the fix after the original commit was mainline and immutable, all
> the info above would "survive" in the form of new commit.
>
> So I think we could decide what the proper format would be, and document it
> properly. I personally wouldn't mind just copy/pasting the whole commit message
> of the fix (with just a short issue description, no need to include stacktraces
> etc if the fix is folded), we could just standardize where, and how to delimit
> it from the main commit message. If it's a report (person or bot) of a bug that
> the main author then fixed, preserve the Reported-by in the same way (making
> clear it's not a Reported-By for the "main thing" addressed by the commit).
>
> In the debate one less verbose alternatve proposed was a SoB with comment
> describing it's for a fix and not whole patch, as some see SoB as the main mark
> of contribution, that can be easily found and counted etc. I'm not so sure about
> it myself, as AFAIK SoB is mainly a DCO thing, and for a maintainer it means
> something else ("passed through my tree") than for a patch author. And this
> approach would still lose the other tags.
>
> Thoughts?

How about a convention to add a Reported-by: and a Link: to the
incremental fixup discussion? It's just polite to credit helpful
feedback, not sure it needs a more formal process.

Along those lines, how is this situation different than the feedback
that helps improve a patch that does not necessarily get credited by
Reviewed-by:? Links to thank you notes in cover letters seems more
appealing than moving more review / fix logs into the main history.

2020-12-03 09:39:50

by Leon Romanovsky

[permalink] [raw]
Subject: Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

On Wed, Dec 02, 2020 at 08:02:27PM -0800, Dan Williams wrote:
> On Wed, Dec 2, 2020 at 3:44 PM Vlastimil Babka <[email protected]> wrote:
> >
> > Hi,
> >
> > there was a bit of debate on Twitter about this, so I thought I would bring it
> > here. Imagine a scenario where patch sits as a commit in -next and there's a bug
> > report or fix, possibly by a bot or with some static analysis. The maintainer
> > decides to fold it into the original patch, which makes sense for e.g.
> > bisectability. But there seem to be no clear rules about attribution in this
> > case, which looks like there should be, probably in
> > Documentation/maintainer/modifying-patches.rst
> >
> > The original bug fix might include a From: $author, a Reported-by: (e.g.
> > syzbot), Fixes: $next-commit, some tag such as Addresses-Coverity: to credit the
> > static analysis tool, and an SoB. After folding, all that's left might be a line
> > as "include fix from $author" in the SoB area. This is a loss of
> > metadata/attribution just due to folding, and might make contributors unhappy.
> > Had they sent the fix after the original commit was mainline and immutable, all
> > the info above would "survive" in the form of new commit.
> >
> > So I think we could decide what the proper format would be, and document it
> > properly. I personally wouldn't mind just copy/pasting the whole commit message
> > of the fix (with just a short issue description, no need to include stacktraces
> > etc if the fix is folded), we could just standardize where, and how to delimit
> > it from the main commit message. If it's a report (person or bot) of a bug that
> > the main author then fixed, preserve the Reported-by in the same way (making
> > clear it's not a Reported-By for the "main thing" addressed by the commit).
> >
> > In the debate one less verbose alternatve proposed was a SoB with comment
> > describing it's for a fix and not whole patch, as some see SoB as the main mark
> > of contribution, that can be easily found and counted etc. I'm not so sure about
> > it myself, as AFAIK SoB is mainly a DCO thing, and for a maintainer it means
> > something else ("passed through my tree") than for a patch author. And this
> > approach would still lose the other tags.
> >
> > Thoughts?
>
> How about a convention to add a Reported-by: and a Link: to the
> incremental fixup discussion? It's just polite to credit helpful
> feedback, not sure it needs a more formal process.

Maybe "Fixup-Reported-by:" and "Fixup-Link:"?

Thanks

2020-12-03 10:36:24

by Dan Carpenter

[permalink] [raw]
Subject: Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

I'd like a "Fixes-from: Name email" tag for when someone spots a bug in
a patch.

I think we should not give credit for style complaints, because those
are their own reward and we already have enough bike shedding.

regards,
dan carpenter

2020-12-03 13:44:10

by Julia Lawall

[permalink] [raw]
Subject: Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch



On Thu, 3 Dec 2020, Dan Carpenter wrote:

> I'd like a "Fixes-from: Name email" tag for when someone spots a bug in
> a patch.
>
> I think we should not give credit for style complaints, because those
> are their own reward and we already have enough bike shedding.

I agree with Dan, although I'm quite ok with the current situation as
well.

julia

2020-12-03 14:04:13

by James Bottomley

[permalink] [raw]
Subject: Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

On Thu, 2020-12-03 at 00:43 +0100, Vlastimil Babka wrote:
> Hi,
>
> there was a bit of debate on Twitter about this, so I thought I would
> bring it here. Imagine a scenario where patch sits as a commit in
> -next and there's a bug report or fix, possibly by a bot or with some
> static analysis. The maintainer decides to fold it into the original
> patch, which makes sense for e.g. bisectability. But there seem to be
> no clear rules about attribution in this case, which looks like there
> should be, probably in
> Documentation/maintainer/modifying-patches.rst
>
> The original bug fix might include a From: $author, a Reported-by:
> (e.g. syzbot), Fixes: $next-commit, some tag such as Addresses-
> Coverity: to credit the static analysis tool, and an SoB. After
> folding, all that's left might be a line as "include fix from
> $author" in the SoB area. This is a loss of metadata/attribution just
> due to folding, and might make contributors unhappy. Had they sent
> the fix after the original commit was mainline and immutable, all
> the info above would "survive" in the form of new commit.

It has been the case since forever that discussion which improves an
uncommitted patch is only captured in email (which now may be preserved
in a link tag). Patch updates that come in after the patch is
committed get their own commit. We've tried to move people away from
counting commits as an indicator of upstream eminence, but it's still a
fact of life that this is what matters to a lot of open source
community managers. The tension we have is between liking a clean
commit in the tree as opposed to a sequence of commits tracking the
evolution of the patch and this community manager desire to track
patches.

So there are two embedded questions here: firstly, should we be as
wedded to clean history as we are, because showing the evolution would
simply solve this? Secondly, if we are agreed on clean history, how
can we make engagement via email as important as engagement via commit
for the community managers so the Link tag is enough? I've got to say
I think trying to add tags to recognize patch evolution is a mistake
and we instead investigate one of the two proposals above.

James


2020-12-03 16:59:18

by Joe Perches

[permalink] [raw]
Subject: Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

On Thu, 2020-12-03 at 05:58 -0800, James Bottomley wrote:
> So there are two embedded questions here: firstly, should we be as
> wedded to clean history as we are, because showing the evolution would
> simply solve this? Secondly, if we are agreed on clean history, how
> can we make engagement via email as important as engagement via commit
> for the community managers so the Link tag is enough? I've got to say
> I think trying to add tags to recognize patch evolution is a mistake
> and we instead investigate one of the two proposals above.

I don't care that any trivial style notes I give to anyone
are tracked for posterity.

Who are these 'community managers' that use these?

Signatures are a mechanism for credit tracking isn't great.

One style that seems to have been generally accepted is for
patch revision change logs to be noted below a --- line.

Often that change log will shows various improvements made
to a patch and the people and reasoning that helped make
those improvements.

Perhaps automate a mechanism to capture that information as
git notes for the patches when applied.


2020-12-03 19:24:44

by Konstantin Ryabitsev

[permalink] [raw]
Subject: Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

On Thu, Dec 03, 2020 at 08:55:54AM -0800, Joe Perches wrote:
> Perhaps automate a mechanism to capture that information as
> git notes for the patches when applied.

Git notes have a limited usefulness for this -- they are indeed part of
the repository, but they aren't replicated unless someone does a
--mirror clone (or specifically fetches refs/notes/*). If the goal is to
improve visibility for contributors, then putting this info into a git
note will hardly make more difference than providing a Link: that
someone has to follow to a list archival service.

I can offer the following proposal:

- kernel.org already monitors all mailing lists that are archived on
lore.kernel.org for the purposes of pull request tracking
(pr-tracker-bot).
- in the near future, we will add a separate process that will
auto-explode all pull requests into individual patches and add them
to a separate public-inbox archive (think of it as another
transparency log, since pull requests are transient and opaque).

We can additionally:

- identify all Link: and Message-Id: entries in commit messages,
retrieve the threads they refer to, and archive them as part of the
same (or adjacent) transparency log.

This offers an improvement over the status quo, because if
lore.kernel.org becomes unavailable, someone would have to have access
to all backend archive repositories it is currently tracking in order to
be able to reconstitute relevant conversations -- whereas with this
change, it should be sufficient to just have the copy of the
transparency log to have a fully self-contained high-relevancy archive
of both individual commits and conversations that happened around them.

I'm just not sure if this will help with the subject of the
conversation, or if it does not serve the goal of recognizing developer
contributions by making them more visible.

-K

2020-12-03 19:27:17

by Joe Perches

[permalink] [raw]
Subject: Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

On Thu, 2020-12-03 at 14:17 -0500, Konstantin Ryabitsev wrote:
> On Thu, Dec 03, 2020 at 08:55:54AM -0800, Joe Perches wrote:
> > Perhaps automate a mechanism to capture that information as
> > git notes for the patches when applied.
>
> Git notes have a limited usefulness for this -- they are indeed part of
> the repository, but they aren't replicated unless someone does a
> --mirror clone (or specifically fetches refs/notes/*). If the goal is to
> improve visibility for contributors, then putting this info into a git
> note will hardly make more difference than providing a Link: that
> someone has to follow to a list archival service.

Or it becomes standard to fetch the refs/notes/... at the pull time.

> I can offer the following proposal:
>
> - kernel.org already monitors all mailing lists that are archived on
> ??lore.kernel.org for the purposes of pull request tracking
> ??(pr-tracker-bot).
> - in the near future, we will add a separate process that will
> ??auto-explode all pull requests into individual patches and add them
> ??to a separate public-inbox archive (think of it as another
> ??transparency log, since pull requests are transient and opaque).
>
> We can additionally:
>
> - identify all Link: and Message-Id: entries in commit messages,
> ??retrieve the threads they refer to, and archive them as part of the
> ??same (or adjacent) transparency log.
>
> This offers an improvement over the status quo, because if
> lore.kernel.org becomes unavailable, someone would have to have access
> to all backend archive repositories it is currently tracking in order to
> be able to reconstitute relevant conversations -- whereas with this
> change, it should be sufficient to just have the copy of the
> transparency log to have a fully self-contained high-relevancy archive
> of both individual commits and conversations that happened around them.

I think that would be great. Thanks.

I think all the requests for additional -by: -from: signature/bylines
becoe unnecessary if and when this proposal is implemented.


2020-12-03 21:17:18

by James Bottomley

[permalink] [raw]
Subject: Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

On Thu, 2020-12-03 at 14:17 -0500, Konstantin Ryabitsev wrote:
> On Thu, Dec 03, 2020 at 08:55:54AM -0800, Joe Perches wrote:
> > Perhaps automate a mechanism to capture that information as
> > git notes for the patches when applied.
>
> Git notes have a limited usefulness for this -- they are indeed part
> of the repository, but they aren't replicated unless someone does a
> --mirror clone (or specifically fetches refs/notes/*). If the goal is
> to improve visibility for contributors, then putting this info into a
> git note will hardly make more difference than providing a Link: that
> someone has to follow to a list archival service.
>
> I can offer the following proposal:
>
> - kernel.org already monitors all mailing lists that are archived on
> lore.kernel.org for the purposes of pull request tracking
> (pr-tracker-bot).
> - in the near future, we will add a separate process that will
> auto-explode all pull requests into individual patches and add them
> to a separate public-inbox archive (think of it as another
> transparency log, since pull requests are transient and opaque).
>
> We can additionally:
>
> - identify all Link: and Message-Id: entries in commit messages,
> retrieve the threads they refer to, and archive them as part of

> the same (or adjacent) transparency log.
>
> This offers an improvement over the status quo, because if
> lore.kernel.org becomes unavailable, someone would have to have
> access to all backend archive repositories it is currently tracking
> in order to be able to reconstitute relevant conversations -- whereas
> with this change, it should be sufficient to just have the copy of
> the transparency log to have a fully self-contained high-relevancy
> archive of both individual commits and conversations that happened
> around them.

I don't think this is strictly necessary because there's more than lore
that archive's our lists, but the people at the internet history
project would remind me not to look a gift horse in the mouth, so I
think this would certainly be a useful addition.

The thing which Link: doesn't necessarily track is iterations, so if
you replied to v2 and your feedback got incorporated, there's a v3
iteration which has a different msgid. Is there a way of getting this
full history, not just the current thread?

> I'm just not sure if this will help with the subject of the
> conversation, or if it does not serve the goal of recognizing
> developer contributions by making them more visible.

I added Jon to the cc since a lot of managers (community or otherwise)
do use the lwn.net stats as a performance guide. The real question is
could we get something measurable out of the data? say number of
replies to an accepted patch counting in the reviewer stats or
something?

James


2020-12-04 04:56:52

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

On Thu, Dec 03, 2020 at 12:43:52AM +0100, Vlastimil Babka wrote:
>
> there was a bit of debate on Twitter about this, so I thought I would bring it
> here. Imagine a scenario where patch sits as a commit in -next and there's a bug
> report or fix, possibly by a bot or with some static analysis. The maintainer
> decides to fold it into the original patch, which makes sense for e.g.
> bisectability. But there seem to be no clear rules about attribution in this
> case, which looks like there should be, probably in
> Documentation/maintainer/modifying-patches.rst

I don't think there should be any kind of fixed, inflexible rules
about this.

1) Sometimes there will be a *huge* number of comments and
suggestions. Do we really want to require links to dozens of mail
message id's, and/or dozens or more e-mail addresses?

2) Sometimes a fixup is pretty trivial; even if it is expressed in the
form of a one-line patch, versus someone who does a detailed review of
a patch, but doesn't actually end up appending an explicit
Reviewed-by, perhaps because he or she didn't completely agree with
the final version of the patch.

3) I think this very much should be up to the maintainer's discretion,
as opposed to making rules that may result in some rediculous amount
of bloat in the git log.

4) It's really unhealthy, in my opinion for people to be fixed on
counting attributions. If we create fixed rules, this can turn into
people try to game the system. It's the same reason why I'm not
terribly enthusiastic about people trying to game Signed-off-by counts
by sending gazillions of white space or spelling fixes.

If the fix is large enough that for copyright reasons we need to
acknowledge the work, then folding in the SoB as for DCO reason makes
perfect sense. But if it's a trivial patch (the kind where projects
that require copyright assignment wouldn't require executed legal
agreements), then perhaps attribution is not always a requirement.
Again, there are times when people who spend a lot of work discussing
patch may not get attributiionm even if they didn't actually create
the one-line whitespace fix and sent it in as a patch with a
signed-off-by with a demand that the attribution be preserved.

Common sense really needs to prevale here, and I'm concerned that
people who like to create rules don't realize what a mess this can
create when contributors approach their participation with a sense of
entitlement.

Cheers,

- Ted