2022-12-04 12:22:38

by Kai Wasserbäch

[permalink] [raw]
Subject: [PATCH 0/2] feat: checkpatch: prohibit Buglink: and warn about missing Link:

Hey,
Thorsten Leemhuis suggested the following two changes to checkpatch, which
I hereby humbly submit for review. Please let me know if any changes
would be required for acceptance.

NOTES:
- checkpatch is rather long, so I might have chosen the wrong section to
add these two checks. Any better placement suggestion is welcome.
- checkpatch implements custum versions of Perl core modules, that might
be an future area for clean-ups? Eg. right off the bat there is a
`uniq` implementation. List::Util (core module since 5.8.0 (5.7.3 to
be pedantic)) has a far better performing version in XS.

Cheers,
Kai

Suggested-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Kai Wasserbäch <[email protected]>


Kai Wasserbäch (2):
feat: checkpatch: error on usage of a Buglink tag in the commit log
feat: checkpatch: Warn about Reported-by: not being followed by a
Link:

scripts/checkpatch.pl | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

--
2.35.1


2022-12-08 19:40:48

by Kai Wasserbäch

[permalink] [raw]
Subject: [PATCH 2/2] checkpatch: warn when Reported-by: is not followed by Link:

Encourage patch authors to link to reports by issuing a warning, if
a Reported-by: is not accompanied by a link to the report. Those links
are often extremely useful for any code archaeologist that wants to know
more about the backstory of a change than the commit message provides.
That includes maintainers higher up in the patch-flow hierarchy, which
is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:

> Again, the commit has a link to the patch *submission*, which is
> almost entirely useless. There's no link to the actual problem the
> patch fixes.
>
> [...]
>
> Put another way: I can see that
>
> Reported-by: Zhangfei Gao <[email protected]>
>
> in the commit, but I don't have a clue what the actual report was, and
> there really isn't enough information in the commit itself, except for
> a fairly handwavy "Device drivers might, for instance, still need to
> flush operations.."
>
> I don't want to know what device drivers _might_ do. I would want to
> have an actual pointer to what they do and where.

Another reason why these links are wanted: the ongoing regression
tracking efforts can only scale with them, as they allow the regression
tracking bot 'regzbot' to automatically connect tracked reports with
patches that are posted or committed to fix tracked regressions.

Link: https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+GSYyFBTh0f-DgvdBWg@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/ [2]
Link: https://lore.kernel.org/all/CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com/ [3]
Signed-off-by: Kai Wasserbäch <[email protected]>
Co-developed-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Thorsten Leemhuis <[email protected]>
---
scripts/checkpatch.pl | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index f2f997f487..d4547c5fd4 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3144,6 +3144,20 @@ sub process {
"Co-developed-by and Signed-off-by: name/email do not match \n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
}
}
+
+ # check if Reported-by: is followed by a Link:
+ if ($sign_off =~ /^reported-by:$/i) {
+ if (!defined $lines[$linenr]) {
+ WARN("BAD_REPORTED_BY_LINK",
+ "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline);
+ } elsif ($rawlines[$linenr] !~ /^\s*link:\s*(.*)/i) {
+ WARN("BAD_REPORTED_BY_LINK",
+ "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
+ } elsif ($lines[$linenr] !~ /https?:\/\//i) {
+ WARN("BAD_REPORTED_BY_LINK",
+ "Link: following Reported-by: should contain an URL\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
+ }
+ }
}

# Check Fixes: styles is correct
--
2.35.1

2022-12-08 20:08:24

by Kai Wasserbäch

[permalink] [raw]
Subject: [PATCH 1/2] checkpatch: warn when unknown tags are used for links

Issue a warning when encountering URLs behind unknown tags, as Linus
recently stated ```please stop making up random tags that make no sense.
Just use "Link:"```[1]. That statement was triggered by an use of
'BugLink', but that's not the only tag people invented:

$ git log -100000 --format=email -P --grep='^\w+:[ \t]*http' | \
grep -Poh '^\w+:[ \t]*http' | \
sort | uniq -c | sort -rn | head -n 20
103889 Link: http
415 BugLink: http
372 Patchwork: http
270 Closes: http
221 Bug: http
121 References: http
101 v1: http
77 Bugzilla: http
60 URL: http
59 v2: http
37 Datasheet: http
35 v3: http
19 v4: http
12 v5: http
9 Ref: http
9 Fixes: http
9 Buglink: http
8 v6: http
8 Reference: http
7 V1: http

Some of these non-standard tags make it harder for external tools that
rely on use of proper tags. One of those tools is the regression
tracking bot 'regzbot', which looks out for "Link:" tags pointing to
reports of tracked regressions.

The initial idea was to use a disallow list to raise an error when
encountering known unwanted tags like BugLink:; during review it was
requested to use a list of allowed tags instead[2].

Link: https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/[email protected]/ [2]
Signed-off-by: Kai Wasserbäch <[email protected]>
Co-developed-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Thorsten Leemhuis <[email protected]>
---
scripts/checkpatch.pl | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 1e5e66ae5a..f2f997f487 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3239,6 +3239,13 @@ sub process {
$commit_log_possible_stack_dump = 0;
}

+# Check for odd tags before a URI/URL
+ if ($in_commit_log &&
+ $line =~ /^\s*(\w+):\s*http/ && $1 !~ /^Link/) {
+ WARN("COMMIT_LOG_USE_LINK",
+ "Unknown link reference '$1:', use 'Link:' instead.\n" . $herecurr);
+ }
+
# Check for lines starting with a #
if ($in_commit_log && $line =~ /^#/) {
if (WARN("COMMIT_COMMENT_SYMBOL",
--
2.35.1

2022-12-08 20:14:38

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH 1/2] checkpatch: warn when unknown tags are used for links

On Thu, 2022-12-08 at 20:32 +0100, Kai Wasserb?ch wrote:
> Issue a warning when encountering URLs behind unknown tags, as Linus
> recently stated ```please stop making up random tags that make no sense.
> Just use "Link:"```[1]. That statement was triggered by an use of
> 'BugLink', but that's not the only tag people invented:
>
> $ git log -100000 --format=email -P --grep='^\w+:[ \t]*http' | \

Please use the --no-merges output

> grep -Poh '^\w+:[ \t]*http' | \
> sort | uniq -c | sort -rn | head -n 20
> 103889 Link: http
> 415 BugLink: http
> 372 Patchwork: http
> 270 Closes: http
> 221 Bug: http
> 121 References: http
> 101 v1: http
> 77 Bugzilla: http
> 60 URL: http
> 59 v2: http
> 37 Datasheet: http
> 35 v3: http
> 19 v4: http
> 12 v5: http
> 9 Ref: http
> 9 Fixes: http
> 9 Buglink: http
> 8 v6: http
> 8 Reference: http
> 7 V1: http
>
> Some of these non-standard tags make it harder for external tools that
> rely on use of proper tags. One of those tools is the regression
> tracking bot 'regzbot', which looks out for "Link:" tags pointing to
> reports of tracked regressions.
>
> The initial idea was to use a disallow list to raise an error when
> encountering known unwanted tags like BugLink:; during review it was
> requested to use a list of allowed tags instead[2].
[]
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -3239,6 +3239,13 @@ sub process {
> $commit_log_possible_stack_dump = 0;
> }
>
> +# Check for odd tags before a URI/URL
> + if ($in_commit_log &&
> + $line =~ /^\s*(\w+):\s*http/ && $1 !~ /^Link/) {
> + WARN("COMMIT_LOG_USE_LINK",
> + "Unknown link reference '$1:', use 'Link:' instead.\n" . $herecurr);

This would allow LinkFoo: so better would be

$line =~ /^\s*(\w+):\s*http/ && $1 ne 'Link')
or
$line =~ /^\s*(\w+):\s*http/ && $1 !~ /^Link$/) {

(and checkpatch doesn't use periods after output messages)

Maybe better as well would be to use something like the below to
better describe the preferred location of patch versioning info.

# Check for odd tags before a URI/URL
if ($in_commit_log &&
$line =~ /^\s*(\w+):\s*http/ && $1 ne 'Link') {
if ($1 =~ /^v(?:ersion)?\d+/i) {
WARN("COMMIT_LOG_VERSIONING",
"Patch version information should be after the --- line\n" . $herecurr);
} else {
WARN("COMMIT_LOG_USE_LINK",
"Unknown link reference '$1:', use 'Link:' instead\n" . $herecurr);
}
}

2022-12-08 20:25:45

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH 2/2] checkpatch: warn when Reported-by: is not followed by Link:

On Thu, 2022-12-08 at 20:32 +0100, Kai Wasserb?ch wrote:
> Encourage patch authors to link to reports by issuing a warning, if
> a Reported-by: is not accompanied by a link to the report. Those links
> are often extremely useful for any code archaeologist that wants to know
> more about the backstory of a change than the commit message provides.
> That includes maintainers higher up in the patch-flow hierarchy, which
> is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:
>
> > Again, the commit has a link to the patch *submission*, which is
> > almost entirely useless. There's no link to the actual problem the
> > patch fixes.
> >
> > [...]
> >
> > Put another way: I can see that
> >
> > Reported-by: Zhangfei Gao <[email protected]>
> >
> > in the commit, but I don't have a clue what the actual report was, and
> > there really isn't enough information in the commit itself, except for
> > a fairly handwavy "Device drivers might, for instance, still need to
> > flush operations.."
> >
> > I don't want to know what device drivers _might_ do. I would want to
> > have an actual pointer to what they do and where.
>
> Another reason why these links are wanted: the ongoing regression
> tracking efforts can only scale with them, as they allow the regression
> tracking bot 'regzbot' to automatically connect tracked reports with
> patches that are posted or committed to fix tracked regressions.
[]
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -3144,6 +3144,20 @@ sub process {
> "Co-developed-by and Signed-off-by: name/email do not match \n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);

I believe this use of '"$here\n" . $rawline . "\n"' to be a defect.
I think this should just use $herecurr

And the unnecessary space before a newline is an abomination ;)

"Co-developed-by and Signed-off-by: name/email do not match\n" . $herecurr . $rawlines[$linenr]);

> +
> + # check if Reported-by: is followed by a Link:
> + if ($sign_off =~ /^reported-by:$/i) {
> + if (!defined $lines[$linenr]) {
> + WARN("BAD_REPORTED_BY_LINK",
> + "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline);
> + } elsif ($rawlines[$linenr] !~ /^\s*link:\s*(.*)/i) {
> + WARN("BAD_REPORTED_BY_LINK",
> + "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
> + } elsif ($lines[$linenr] !~ /https?:\/\//i) {
> + WARN("BAD_REPORTED_BY_LINK",
> + "Link: following Reported-by: should contain an URL\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);

Please use a space before and after a string concatenation '.'

English generally uses "a URL" and not "an URL"
https://www.techtarget.com/whatis/feature/Which-is-correct-a-URL-or-an-URL

Also the actual link line should likely be from lore

So maybe:
} elsif ($lines[$linenr] !~ m{https?://lore.kernel.org/.+}i) {
WARN("BAD_REPORTED_BY_LINK",
"Link: following Reported-by: should use a lore.kernel.org URL\n" . $herecurr . $rawlines[$linenr]);

etc...

2022-12-08 21:34:06

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH 2/2] checkpatch: warn when Reported-by: is not followed by Link:

Joe, many thx for your time and your valuable feedback. I agree with
most of it and will look into addressing it tomorrow, but there is one
area where I have a different option:

On 08.12.22 21:21, Joe Perches wrote:
> On Thu, 2022-12-08 at 20:32 +0100, Kai Wasserbäch wrote:
>> +
>> + # check if Reported-by: is followed by a Link:
>> + if ($sign_off =~ /^reported-by:$/i) {
>> + if (!defined $lines[$linenr]) {
>> + WARN("BAD_REPORTED_BY_LINK",
>> + "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline);
>> + } elsif ($rawlines[$linenr] !~ /^\s*link:\s*(.*)/i) {
>> + WARN("BAD_REPORTED_BY_LINK",
>> + "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
>> + } elsif ($lines[$linenr] !~ /https?:\/\//i) {
>> + WARN("BAD_REPORTED_BY_LINK",
>> + "Link: following Reported-by: should contain an URL\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
>
> [...]
>
> Also the actual link line should likely be from lore
>
> So maybe:
> } elsif ($lines[$linenr] !~ m{https?://lore.kernel.org/.+}i) {
> WARN("BAD_REPORTED_BY_LINK",
> "Link: following Reported-by: should use a lore.kernel.org URL\n" . $herecurr . $rawlines[$linenr]);

I'm pretty sure that's not want we want, as from regression tracking I
known that we have other links that should continue to work here. Links
to bugzilla.kernel.org immediately come to my mind, for example. Then
there are some subsystems that use issue trackers on github or in gitlab
instances (DRM for example), which must work here, too; and we
occasionally even have links to bugs in bugzilla instances from RH or
SUSE there, as sometimes valuable details for code archeologists can be
found there.

Ciao, Thorsten

2022-12-08 22:36:44

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH 2/2] checkpatch: warn when Reported-by: is not followed by Link:

On Thu, 2022-12-08 at 22:11 +0100, Thorsten Leemhuis wrote:
> Joe, many thx for your time and your valuable feedback. I agree with
> most of it and will look into addressing it tomorrow, but there is one
> area where I have a different option:
>
> On 08.12.22 21:21, Joe Perches wrote:
> > On Thu, 2022-12-08 at 20:32 +0100, Kai Wasserb?ch wrote:
> > > +
> > > + # check if Reported-by: is followed by a Link:
> > > + if ($sign_off =~ /^reported-by:$/i) {
> > > + if (!defined $lines[$linenr]) {
> > > + WARN("BAD_REPORTED_BY_LINK",
> > > + "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline);
> > > + } elsif ($rawlines[$linenr] !~ /^\s*link:\s*(.*)/i) {
> > > + WARN("BAD_REPORTED_BY_LINK",
> > > + "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
> > > + } elsif ($lines[$linenr] !~ /https?:\/\//i) {
> > > + WARN("BAD_REPORTED_BY_LINK",
> > > + "Link: following Reported-by: should contain an URL\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
> >
> > [...]
> >
> > Also the actual link line should likely be from lore
> >
> > So maybe:
> > } elsif ($lines[$linenr] !~ m{https?://lore.kernel.org/.+}i) {
> > WARN("BAD_REPORTED_BY_LINK",
> > "Link: following Reported-by: should use a lore.kernel.org URL\n" . $herecurr . $rawlines[$linenr]);
>
> I'm pretty sure that's not want we want, as from regression tracking I
> known that we have other links that should continue to work here. Links
> to bugzilla.kernel.org immediately come to my mind, for example. Then
> there are some subsystems that use issue trackers on github or in gitlab
> instances (DRM for example), which must work here, too; and we
> occasionally even have links to bugs in bugzilla instances from RH or
> SUSE there, as sometimes valuable details for code archeologists can be
> found there.

Outside the fact there are relatively few existing uses of Link: after
Reported-by:, it appears that "Fixes:" should also be allowed.

$ git log --no-merges --format=email -P -i --grep "^Reported-by:" -100000 | \
grep -A1 "^Reported-by:" | \
grep -v -P '^(Reported-by:|\-\-)' | \
cut -f1 -d":" | sort | uniq -c | sort -rn | head -20
31719 Signed-off-by
4556 Tested-by
4276 Cc
2976 Fixes
2304 Reviewed-by
1526 Acked-by
949 Suggested-by
821 Link
387
228 CC
139 Bugzilla
82 Reported-and-tested-by
69 Debugged-by
68 References
63 Bisected-by
43 Co-developed-by
34 LKML-Reference
29 Diagnosed-by
27 [ paulmck
26 Addresses-Coverity-ID

and lore definitely dominates the Link: uses.

And IMO: the lkml.kernel.org entries should instead use lore.

Maybe there should be some other patterns allowed. Dunno.

$ git log --no-merges --format=email -P -i --grep "^Reported-by:" -100000 | \
grep -A1 "^Reported-by:" | \
grep '^Link:' | \
cut -f1-3 -d"/" | sort | uniq -c | sort -rn | head -20
551 Link: https://lore.kernel.org
67 Link: http://lkml.kernel.org
42 Link: https://bugzilla.kernel.org
28 Link: https://github.com
16 Link: https://lkml.kernel.org
11 Link: lore.kernel.org/r/[email protected]
10 Link: https://patchwork.freedesktop.org
10 Link: https://github.blog
9 Link: https://syzkaller.appspot.com
9 Link: https://lkml.org
6 Link: https://www.spinics.net
5 Link: https://gitlab.freedesktop.org
5 Link: https://bugs.debian.org
5 Link: http://lore.kernel.org
4 Link: https://www.openwall.com
4 Link: https://patchwork.kernel.org
4 Link: http://patchwork.freedesktop.org
3 Link: https://marc.info
3 Link: https://bugzilla.linux-nfs.org
2 Link: https://bugzilla.suse.com

2022-12-09 09:13:17

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH 2/2] checkpatch: warn when Reported-by: is not followed by Link:

On 08.12.22 22:34, Joe Perches wrote:
> On Thu, 2022-12-08 at 22:11 +0100, Thorsten Leemhuis wrote:
>> Joe, many thx for your time and your valuable feedback. I agree with
>> most of it and will look into addressing it tomorrow, but there is one
>> area where I have a different option:
>>
>> On 08.12.22 21:21, Joe Perches wrote:
>>> On Thu, 2022-12-08 at 20:32 +0100, Kai Wasserbäch wrote:
>>>> +
>>>> + # check if Reported-by: is followed by a Link:
>>>> + if ($sign_off =~ /^reported-by:$/i) {
>>>> + if (!defined $lines[$linenr]) {
>>>> + WARN("BAD_REPORTED_BY_LINK",
>>>> + "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline);
>>>> + } elsif ($rawlines[$linenr] !~ /^\s*link:\s*(.*)/i) {
>>>> + WARN("BAD_REPORTED_BY_LINK",
>>>> + "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
>>>> + } elsif ($lines[$linenr] !~ /https?:\/\//i) {
>>>> + WARN("BAD_REPORTED_BY_LINK",
>>>> + "Link: following Reported-by: should contain an URL\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
>>>
>>> [...]
>>>
>>> Also the actual link line should likely be from lore
>>>
>>> So maybe:
>>> } elsif ($lines[$linenr] !~ m{https?://lore.kernel.org/.+}i) {
>>> WARN("BAD_REPORTED_BY_LINK",
>>> "Link: following Reported-by: should use a lore.kernel.org URL\n" . $herecurr . $rawlines[$linenr]);
>>
>> I'm pretty sure that's not want we want, as from regression tracking I
>> known that we have other links that should continue to work here. Links
>> to bugzilla.kernel.org immediately come to my mind, for example. Then
>> there are some subsystems that use issue trackers on github or in gitlab
>> instances (DRM for example), which must work here, too; and we
>> occasionally even have links to bugs in bugzilla instances from RH or
>> SUSE there, as sometimes valuable details for code archeologists can be
>> found there.
>
> Outside the fact there are relatively few existing uses of Link: after
> Reported-by:,

Well, sure, because many people forgot to place them -- and this patch
is trying to change that for reasons outlined in the commit description.

> it appears that "Fixes:" should also be allowed.

Well, that would defeat the purpose of this patch -- and there is no
need to, as developers can still put Fixes: tag above or below the combo
of Reported-by: and Link:.

> and lore definitely dominates the Link: uses.
>
> And IMO: the lkml.kernel.org entries should instead use lore.

Well, that might be something nice to have (if Konstantin thinks it's
worth the trouble, as lkml.kernel.org likely needs to stick around
anyway to not break existing links). But that IMHO is something
independent of this patch-set, because the proper solution afaics then
would be to check all Link: tags in the commit message, not only those
next to a Reported-by.

Ciao, Thorsten

2022-12-09 09:46:13

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH 1/2] checkpatch: warn when unknown tags are used for links

On 08.12.22 20:58, Joe Perches wrote:
> On Thu, 2022-12-08 at 20:32 +0100, Kai Wasserbäch wrote:
>> Issue a warning when encountering URLs behind unknown tags, as Linus
>> recently stated ```please stop making up random tags that make no sense.
>> Just use "Link:"```[1]. That statement was triggered by an use of
>> 'BugLink', but that's not the only tag people invented:
>>
>> $ git log -100000 --format=email -P --grep='^\w+:[ \t]*http' | \
>
> Please use the --no-merges output

Done, good point.

> [...]
>> @@ -3239,6 +3239,13 @@ sub process {
>> $commit_log_possible_stack_dump = 0;
>> }
>>
>> +# Check for odd tags before a URI/URL
>> + if ($in_commit_log &&
>> + $line =~ /^\s*(\w+):\s*http/ && $1 !~ /^Link/) {
>> + WARN("COMMIT_LOG_USE_LINK",
>> + "Unknown link reference '$1:', use 'Link:' instead.\n" . $herecurr);
>
> This would allow LinkFoo: so better would be
>
> $line =~ /^\s*(\w+):\s*http/ && $1 ne 'Link')
> or
> $line =~ /^\s*(\w+):\s*http/ && $1 !~ /^Link$/) {
>
> (and checkpatch doesn't use periods after output messages)
>
> Maybe better as well would be to use something like the below to
> better describe the preferred location of patch versioning info.
>
> # Check for odd tags before a URI/URL
> if ($in_commit_log &&
> $line =~ /^\s*(\w+):\s*http/ && $1 ne 'Link') {
> if ($1 =~ /^v(?:ersion)?\d+/i) {
> WARN("COMMIT_LOG_VERSIONING",
> "Patch version information should be after the --- line\n" . $herecurr);
> } else {
> WARN("COMMIT_LOG_USE_LINK",
> "Unknown link reference '$1:', use 'Link:' instead\n" . $herecurr);
> }
> }
>

Yeah, that looks like a really good idea. I went with that. But I'd say
this is the point where this really warrants a Co-developed-by: that
mentions you (and thus a Signed-off-by: from your side, too), don't you
think so?

Ciao, Thorsten

2022-12-09 10:25:21

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH 2/2] checkpatch: warn when Reported-by: is not followed by Link:

On 08.12.22 21:21, Joe Perches wrote:
> On Thu, 2022-12-08 at 20:32 +0100, Kai Wasserbäch wrote:
>> Encourage patch authors to link to reports by issuing a warning, if
>> a Reported-by: is not accompanied by a link to the report. Those links
>> are often extremely useful for any code archaeologist that wants to know
>> more about the backstory of a change than the commit message provides.
>> That includes maintainers higher up in the patch-flow hierarchy, which
>> is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:
>>
>>> Again, the commit has a link to the patch *submission*, which is
>>> almost entirely useless. There's no link to the actual problem the
>>> patch fixes.
>>>
>>> [...]
>>>
>>> Put another way: I can see that
>>>
>>> Reported-by: Zhangfei Gao <[email protected]>
>>>
>>> in the commit, but I don't have a clue what the actual report was, and
>>> there really isn't enough information in the commit itself, except for
>>> a fairly handwavy "Device drivers might, for instance, still need to
>>> flush operations.."
>>>
>>> I don't want to know what device drivers _might_ do. I would want to
>>> have an actual pointer to what they do and where.
>>
>> Another reason why these links are wanted: the ongoing regression
>> tracking efforts can only scale with them, as they allow the regression
>> tracking bot 'regzbot' to automatically connect tracked reports with
>> patches that are posted or committed to fix tracked regressions.
> []
>> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> []
>> @@ -3144,6 +3144,20 @@ sub process {
>> "Co-developed-by and Signed-off-by: name/email do not match \n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
>
> I believe this use of '"$here\n" . $rawline . "\n"' to be a defect.
> I think this should just use $herecurr
>
> And the unnecessary space before a newline is an abomination ;)
>
> "Co-developed-by and Signed-off-by: name/email do not match\n" . $herecurr . $rawlines[$linenr]);

Well, that's existing code. Want me to add a separate patch that fixes
both of these aspects up in that area?

>> +
>> + # check if Reported-by: is followed by a Link:
>> + if ($sign_off =~ /^reported-by:$/i) {
>> + if (!defined $lines[$linenr]) {
>> + WARN("BAD_REPORTED_BY_LINK",
>> + "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline);
>> + } elsif ($rawlines[$linenr] !~ /^\s*link:\s*(.*)/i) {
>> + WARN("BAD_REPORTED_BY_LINK",
>> + "Reported-by: should be immediately followed by Link: to the report\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
>> + } elsif ($lines[$linenr] !~ /https?:\/\//i) {
>> + WARN("BAD_REPORTED_BY_LINK",
>> + "Link: following Reported-by: should contain an URL\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
>
> Please use a space before and after a string concatenation '.'

Okay, went with " . $herecurr . $rawlines[$linenr]" here, the result
seems to be the same afaics.

> English generally uses "a URL" and not "an URL"
> https://www.techtarget.com/whatis/feature/Which-is-correct-a-URL-or-an-URL

Thx!

/me grumbles, he should have remembered that

Ciao, Thorsten

2022-12-09 17:12:01

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH 1/2] checkpatch: warn when unknown tags are used for links

On Fri, 2022-12-09 at 10:33 +0100, Thorsten Leemhuis wrote:
> I'd say
> this is the point where this really warrants a Co-developed-by: that
> mentions you (and thus a Signed-off-by: from your side, too), don't you
> think so?

No. I think it's just being a good maintainer and offering advice.

I'll ack it when it's submitted and I don't have any other comments.

cheers, Joe

2022-12-09 18:20:03

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH 2/2] checkpatch: warn when Reported-by: is not followed by Link:

On Fri, 2022-12-09 at 10:54 +0100, Thorsten Leemhuis wrote:
> On 08.12.22 21:21, Joe Perches wrote:
> > On Thu, 2022-12-08 at 20:32 +0100, Kai Wasserb?ch wrote:
[]
> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > []
> > > @@ -3144,6 +3144,20 @@ sub process {
> > > "Co-developed-by and Signed-off-by: name/email do not match \n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
> >
> > I believe this use of '"$here\n" . $rawline . "\n"' to be a defect.
> > I think this should just use $herecurr
> >
> > And the unnecessary space before a newline is an abomination ;)
> >
> > "Co-developed-by and Signed-off-by: name/email do not match\n" . $herecurr . $rawlines[$linenr]);
>
> Well, that's existing code. Want me to add a separate patch that fixes
> both of these aspects up in that area?

Sure, thanks.

2022-12-15 14:50:30

by Kai Wasserbäch

[permalink] [raw]
Subject: [PATCH 0/3, v3] feat: checkpatch: warn about dicouraged link tags and missing links

Hey,
Thorsten Leemhuis suggested the following two changes to checkpatch, which
I hereby humbly submit for review. Please let me know if any changes
would be required for acceptance.

The third patch came out of the preview rounds and fixes an odd
behaviour noticed then.

The two main changes should help developers write commit messages in a way
that's preferred by Linus and will greatly help Thorsten's regression
tracking efforts.

Patches are against mm/mm-nonmm-stable (which apparently is the tree
checkpatch changes are merged through).

Cheers,
Kai


v2->v3:
- address review feedback from Joe (grammer fixes, use of $herecurr, use
of --no-merges in commit log quote) (Thorsten)
- warn when people try to add version information to the commit log
using code suggested by Joe (Thorsten)
- add a patch to make existing code in a nearby area use $herecurr where
it should, which Joe noticed during review

v1->v2:
- add commit messages explaining the reasoning (Thorsten)
- approach the usage of wrong tags with an allow list as suggested by
Joe (Thorsten)

Suggested-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Kai Wasserbäch <[email protected]>
Signed-off-by: Thorsten Leemhuis <[email protected]>

Kai Wasserbäch (2):
checkpatch: warn when unknown tags are used for links
checkpatch: warn when Reported-by: is not followed by Link:

Thorsten Leemhuis (1):
checkpatch: use proper way for show problematic line

scripts/checkpatch.pl | 34 ++++++++++++++++++++++++++++++----
1 file changed, 30 insertions(+), 4 deletions(-)

--
2.35.1

2022-12-15 14:50:43

by Kai Wasserbäch

[permalink] [raw]
Subject: [PATCH 3/3] checkpatch: use proper way for show problematic line

From: Thorsten Leemhuis <[email protected]>

Instead of using an unnecessarily complicated approach to print a line
that is warned about, use `$herecurr` instead, just like everywhere else
in checkpatch. While at it, remove a superfluous space in one of the
changed lines, too.

Both problems were found by Joe Perches.

Signed-off-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Kai Wasserbäch <[email protected]>
---
scripts/checkpatch.pl | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9434f94d2d..f7ebbdf4f0 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3142,17 +3142,17 @@ sub process {
if ($sign_off =~ /^co-developed-by:$/i) {
if ($email eq $author) {
WARN("BAD_SIGN_OFF",
- "Co-developed-by: should not be used to attribute nominal patch author '$author'\n" . "$here\n" . $rawline);
+ "Co-developed-by: should not be used to attribute nominal patch author '$author'\n" . $herecurr);
}
if (!defined $lines[$linenr]) {
WARN("BAD_SIGN_OFF",
- "Co-developed-by: must be immediately followed by Signed-off-by:\n" . "$here\n" . $rawline);
+ "Co-developed-by: must be immediately followed by Signed-off-by:\n" . $herecurr);
} elsif ($rawlines[$linenr] !~ /^\s*signed-off-by:\s*(.*)/i) {
WARN("BAD_SIGN_OFF",
- "Co-developed-by: must be immediately followed by Signed-off-by:\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
+ "Co-developed-by: must be immediately followed by Signed-off-by:\n" . $herecurr . $rawlines[$linenr] . "\n");
} elsif ($1 ne $email) {
WARN("BAD_SIGN_OFF",
- "Co-developed-by and Signed-off-by: name/email do not match \n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
+ "Co-developed-by and Signed-off-by: name/email do not match\n" . $herecurr . $rawlines[$linenr] . "\n");
}
}

--
2.35.1

2022-12-15 14:51:59

by Kai Wasserbäch

[permalink] [raw]
Subject: [PATCH 1/3] checkpatch: warn when unknown tags are used for links

Issue a warning when encountering URLs behind unknown tags, as Linus
recently stated ```please stop making up random tags that make no sense.
Just use "Link:"```[1]. That statement was triggered by an use of
'BugLink', but that's not the only tag people invented:

$ git log -100000 --no-merges --format=email -P \
--grep='^\w+:[ \t]*http' | grep -Poh '^\w+:[ \t]*http' | \
sort | uniq -c | sort -rn | head -n 20
103958 Link: http
418 BugLink: http
372 Patchwork: http
280 Closes: http
224 Bug: http
123 References: http
84 Bugzilla: http
61 URL: http
42 v1: http
38 Datasheet: http
20 v2: http
9 Ref: http
9 Fixes: http
9 Buglink: http
8 v3: http
8 Reference: http
7 See: http
6 1: http
5 link: http
3 Link:http

Some of these non-standard tags make it harder for external tools that
rely on use of proper tags. One of those tools is the regression
tracking bot 'regzbot', which looks out for "Link:" tags pointing to
reports of tracked regressions.

The initial idea was to use a disallow list to raise an error when
encountering known unwanted tags like BugLink:; during review it was
requested to use a list of allowed tags instead[2].

Link: https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/[email protected]/ [2]
Co-developed-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Kai Wasserbäch <[email protected]>
---
scripts/checkpatch.pl | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 78cc595b98..d739ce0909 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3250,6 +3250,18 @@ sub process {
$commit_log_possible_stack_dump = 0;
}

+# Check for odd tags before a URI/URL
+ if ($in_commit_log &&
+ $line =~ /^\s*(\w+):\s*http/ && $1 ne 'Link') {
+ if ($1 =~ /^v(?:ersion)?\d+/i) {
+ WARN("COMMIT_LOG_VERSIONING",
+ "Patch version information should be after the --- line\n" . $herecurr);
+ } else {
+ WARN("COMMIT_LOG_USE_LINK",
+ "Unknown link reference '$1:', use 'Link:' instead\n" . $herecurr);
+ }
+ }
+
# Check for lines starting with a #
if ($in_commit_log && $line =~ /^#/) {
if (WARN("COMMIT_COMMENT_SYMBOL",
--
2.35.1

2022-12-15 14:53:13

by Kai Wasserbäch

[permalink] [raw]
Subject: [PATCH 2/3] checkpatch: warn when Reported-by: is not followed by Link:

Encourage patch authors to link to reports by issuing a warning, if
a Reported-by: is not accompanied by a link to the report. Those links
are often extremely useful for any code archaeologist that wants to know
more about the backstory of a change than the commit message provides.
That includes maintainers higher up in the patch-flow hierarchy, which
is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:

> Again, the commit has a link to the patch *submission*, which is
> almost entirely useless. There's no link to the actual problem the
> patch fixes.
>
> [...]
>
> Put another way: I can see that
>
> Reported-by: Zhangfei Gao <[email protected]>
>
> in the commit, but I don't have a clue what the actual report was, and
> there really isn't enough information in the commit itself, except for
> a fairly handwavy "Device drivers might, for instance, still need to
> flush operations.."
>
> I don't want to know what device drivers _might_ do. I would want to
> have an actual pointer to what they do and where.

Another reason why these links are wanted: the ongoing regression
tracking efforts can only scale with them, as they allow the regression
tracking bot 'regzbot' to automatically connect tracked reports with
patches that are posted or committed to fix tracked regressions.

Link: https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+GSYyFBTh0f-DgvdBWg@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/ [2]
Link: https://lore.kernel.org/all/CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com/ [3]
Co-developed-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Kai Wasserbäch <[email protected]>
---
scripts/checkpatch.pl | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index d739ce0909..9434f94d2d 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3155,6 +3155,20 @@ sub process {
"Co-developed-by and Signed-off-by: name/email do not match \n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
}
}
+
+ # check if Reported-by: is followed by a Link:
+ if ($sign_off =~ /^reported-by:$/i) {
+ if (!defined $lines[$linenr]) {
+ WARN("BAD_REPORTED_BY_LINK",
+ "Reported-by: should be immediately followed by Link: to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
+ } elsif ($rawlines[$linenr] !~ /^\s*link:\s*(.*)/i) {
+ WARN("BAD_REPORTED_BY_LINK",
+ "Reported-by: should be immediately followed by Link: to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
+ } elsif ($lines[$linenr] !~ /https?:\/\//i) {
+ WARN("BAD_REPORTED_BY_LINK",
+ "Link: following Reported-by: should contain a URL\n" . $herecurr . $rawlines[$linenr] . "\n");
+ }
+ }
}

# Check Fixes: styles is correct
--
2.35.1

2022-12-15 18:22:49

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On Thu, 2022-12-15 at 15:43 +0100, Kai Wasserb?ch wrote:
> Encourage patch authors to link to reports by issuing a warning, if
> a Reported-by: is not accompanied by a link to the report.

Please also expand Documentation/ as appropriate to encourage this too.

> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -3155,6 +3155,20 @@ sub process {
> "Co-developed-by and Signed-off-by: name/email do not match \n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
> }
> }
> +
> + # check if Reported-by: is followed by a Link:
> + if ($sign_off =~ /^reported-by:$/i) {
> + if (!defined $lines[$linenr]) {
> + WARN("BAD_REPORTED_BY_LINK",
> + "Reported-by: should be immediately followed by Link: to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
> + } elsif ($rawlines[$linenr] !~ /^\s*link:\s*(.*)/i) {

This capture group isn't necessary and does not guarantee there is
an actual link. Also this is allowing spaces before "Link:" when
the reported-by test above does not allow spaces. Please be
consistent. My preference would be to not allow spaces.

> + WARN("BAD_REPORTED_BY_LINK",
> + "Reported-by: should be immediately followed by Link: to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
> + } elsif ($lines[$linenr] !~ /https?:\/\//i) {

To make reading the match pattern easier, prefer m{} when the expected
content uses / instead of using / as a delimiter.

> + WARN("BAD_REPORTED_BY_LINK",
> + "Link: following Reported-by: should contain a URL\n" . $herecurr . $rawlines[$linenr] . "\n");
> + }
> + }
> }
>
> # Check Fixes: styles is correct

2022-12-18 15:26:26

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On 15.12.22 19:00, Joe Perches wrote:
> On Thu, 2022-12-15 at 15:43 +0100, Kai Wasserbäch wrote:
>> Encourage patch authors to link to reports by issuing a warning, if
>> a Reported-by: is not accompanied by a link to the report.
>
> Please also expand Documentation/ as appropriate to encourage this too.

Well, I already updated it a few moths ago, but yeah, will revisit the
docs and make it even clearer if needed (I assume that's the case).

>> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> []
>> @@ -3155,6 +3155,20 @@ sub process {
>> "Co-developed-by and Signed-off-by: name/email do not match \n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
>> }
>> }
>> +
>> + # check if Reported-by: is followed by a Link:
>> + if ($sign_off =~ /^reported-by:$/i) {
>> + if (!defined $lines[$linenr]) {
>> + WARN("BAD_REPORTED_BY_LINK",
>> + "Reported-by: should be immediately followed by Link: to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
>> + } elsif ($rawlines[$linenr] !~ /^\s*link:\s*(.*)/i) {
>
> This capture group isn't necessary and does not guarantee there is
> an actual link. Also this is allowing spaces before "Link:" when
> the reported-by test above does not allow spaces. Please be
> consistent. My preference would be to not allow spaces.

Good idea. FWIW, the section patch 3 changes has a space here as well:
```
> } elsif ($rawlines[$linenr] !~ /^\s*signed-off-by:\s*(.*)/i) {
```
Hence I guess I'll remove it as well, unless you for some reason tell me
that's a bad idea.

>> + WARN("BAD_REPORTED_BY_LINK",
>> + "Reported-by: should be immediately followed by Link: to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
>> + } elsif ($lines[$linenr] !~ /https?:\/\//i) {
>
> To make reading the match pattern easier, prefer m{} when the expected
> content uses / instead of using / as a delimiter.

Also a good idea. That being said:

I think I'll drop this check, as your comment made me think: why should
we check for "https?://" here at all? If we want to check that a "Link:"
is followed by a URI, then we should do that all the time, not just in
this single case.

Or am I missing something?

>> + WARN("BAD_REPORTED_BY_LINK",
>> + "Link: following Reported-by: should contain a URL\n" . $herecurr . $rawlines[$linenr] . "\n");
>> + }
>> + }
>> }
>>
>> # Check Fixes: styles is correct

Ciao, Thorsten

2023-01-20 13:05:18

by Thorsten Leemhuis

[permalink] [raw]
Subject: [PATCH v4 0/3] checkpatch.pl: warn about discouraged tags and missing Link: tags

Hi, please consider the following checkpatch.pl patches for review.

The first two changes make checkpatch.pl check for a few mistakes wrt to
links to bug reports Linus recently complained about a few times.
Avoiding those is also important for my regression tracking efforts a
lot, as the automated tracking performed by regzbot relies on the proper
usage of the Link: tag.

The third patch fixes a few small oddities noticed in existing code
during review of the two changes.

Ciao, Thorsten
---
v3->v4:
- address review feedback from Joe (do not allow leading spaces in
matches, check if Link: is actually followed by a URL in one go,
and use m{} for matching https://) (Thorsten)
- catch Reported-and-tested-by as well (Thorsten)

v2->v3:
- address review feedback from Joe (grammer fixes, use of $herecurr, use
of --no-merges in commit log quote) (Thorsten)
- warn when people try to add version information to the commit log
using code suggested by Joe (Thorsten)
- add a patch to make existing code in a nearby area use $herecurr where
it should, which Joe noticed during review

v1->v2:
- add commit messages explaining the reasoning (Thorsten)
- approach the usage of wrong tags with an allow list as suggested by
Joe (Thorsten)

Kai Wasserbäch (2):
checkpatch: warn when unknown tags are used for links
checkpatch: warn when Reported-by: is not followed by Link:

Thorsten Leemhuis (1):
checkpatch: use proper way for show problematic line

scripts/checkpatch.pl | 34 +++++++++++++++++++++++++++++-----
1 file changed, 29 insertions(+), 5 deletions(-)


base-commit: 5dc4c995db9eb45f6373a956eb1f69460e69e6d4
--
2.39.0

2023-01-20 13:07:11

by Thorsten Leemhuis

[permalink] [raw]
Subject: [PATCH v4 1/3] checkpatch: warn when unknown tags are used for links

From: Kai Wasserbäch <[email protected]>

Issue a warning when encountering URLs behind unknown tags, as Linus
recently stated ```please stop making up random tags that make no sense.
Just use "Link:"```[1]. That statement was triggered by an use of
'BugLink', but that's not the only tag people invented:

$ git log -100000 --no-merges --format=email -P \
--grep='^\w+:[ \t]*http' | grep -Poh '^\w+:[ \t]*http' | \
sort | uniq -c | sort -rn | head -n 20
103958 Link: http
418 BugLink: http
372 Patchwork: http
280 Closes: http
224 Bug: http
123 References: http
84 Bugzilla: http
61 URL: http
42 v1: http
38 Datasheet: http
20 v2: http
9 Ref: http
9 Fixes: http
9 Buglink: http
8 v3: http
8 Reference: http
7 See: http
6 1: http
5 link: http
3 Link:http

Some of these non-standard tags make it harder for external tools that
rely on use of proper tags. One of those tools is the regression
tracking bot 'regzbot', which looks out for "Link:" tags pointing to
reports of tracked regressions.

The initial idea was to use a disallow list to raise an error when
encountering known unwanted tags like BugLink:; during review it was
requested to use a list of allowed tags instead[2].

Link: https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/[email protected]/ [2]
Signed-off-by: Kai Wasserbäch <[email protected]>
Co-developed-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Thorsten Leemhuis <[email protected]>
---
scripts/checkpatch.pl | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 78cc595b98ce..d739ce0909b1 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3250,6 +3250,18 @@ sub process {
$commit_log_possible_stack_dump = 0;
}

+# Check for odd tags before a URI/URL
+ if ($in_commit_log &&
+ $line =~ /^\s*(\w+):\s*http/ && $1 ne 'Link') {
+ if ($1 =~ /^v(?:ersion)?\d+/i) {
+ WARN("COMMIT_LOG_VERSIONING",
+ "Patch version information should be after the --- line\n" . $herecurr);
+ } else {
+ WARN("COMMIT_LOG_USE_LINK",
+ "Unknown link reference '$1:', use 'Link:' instead\n" . $herecurr);
+ }
+ }
+
# Check for lines starting with a #
if ($in_commit_log && $line =~ /^#/) {
if (WARN("COMMIT_COMMENT_SYMBOL",
--
2.39.0

2023-01-20 13:19:28

by Thorsten Leemhuis

[permalink] [raw]
Subject: [PATCH v4 3/3] checkpatch: use proper way for show problematic line

Instead of using an unnecessarily complicated approach to print a line
that is warned about, use `$herecurr` instead, just like everywhere else
in checkpatch.

While at it, remove a superfluous space in one of the changed lines,
too. In a unmodified line also remove a superfluous check for a space
before a signed-off-by tag, to me consistent with the check at the
start of the section.

All three problems were found by Joe Perches during review of new code
inspired by the code modified here.

Signed-off-by: Thorsten Leemhuis <[email protected]>
---
scripts/checkpatch.pl | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index b74d6002f773..b1bf0c7b03a7 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3142,17 +3142,17 @@ sub process {
if ($sign_off =~ /^co-developed-by:$/i) {
if ($email eq $author) {
WARN("BAD_SIGN_OFF",
- "Co-developed-by: should not be used to attribute nominal patch author '$author'\n" . "$here\n" . $rawline);
+ "Co-developed-by: should not be used to attribute nominal patch author '$author'\n" . $herecurr);
}
if (!defined $lines[$linenr]) {
WARN("BAD_SIGN_OFF",
- "Co-developed-by: must be immediately followed by Signed-off-by:\n" . "$here\n" . $rawline);
- } elsif ($rawlines[$linenr] !~ /^\s*signed-off-by:\s*(.*)/i) {
+ "Co-developed-by: must be immediately followed by Signed-off-by:\n" . $herecurr);
+ } elsif ($rawlines[$linenr] !~ /^signed-off-by:\s*(.*)/i) {
WARN("BAD_SIGN_OFF",
- "Co-developed-by: must be immediately followed by Signed-off-by:\n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
+ "Co-developed-by: must be immediately followed by Signed-off-by:\n" . $herecurr . $rawlines[$linenr] . "\n");
} elsif ($1 ne $email) {
WARN("BAD_SIGN_OFF",
- "Co-developed-by and Signed-off-by: name/email do not match \n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
+ "Co-developed-by and Signed-off-by: name/email do not match\n" . $herecurr . $rawlines[$linenr] . "\n");
}
}

--
2.39.0

2023-01-20 13:19:59

by Thorsten Leemhuis

[permalink] [raw]
Subject: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

From: Kai Wasserbäch <[email protected]>

Encourage patch authors to link to reports by issuing a warning, if
a Reported-by: is not accompanied by a link to the report. Those links
are often extremely useful for any code archaeologist that wants to know
more about the backstory of a change than the commit message provides.
That includes maintainers higher up in the patch-flow hierarchy, which
is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:

> Again, the commit has a link to the patch *submission*, which is
> almost entirely useless. There's no link to the actual problem the
> patch fixes.
>
> [...]
>
> Put another way: I can see that
>
> Reported-by: Zhangfei Gao <[email protected]>
>
> in the commit, but I don't have a clue what the actual report was, and
> there really isn't enough information in the commit itself, except for
> a fairly handwavy "Device drivers might, for instance, still need to
> flush operations.."
>
> I don't want to know what device drivers _might_ do. I would want to
> have an actual pointer to what they do and where.

Another reason why these links are wanted: the ongoing regression
tracking efforts can only scale with them, as they allow the regression
tracking bot 'regzbot' to automatically connect tracked reports with
patches that are posted or committed to fix tracked regressions.

Link: https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+GSYyFBTh0f-DgvdBWg@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/ [2]
Link: https://lore.kernel.org/all/CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com/ [3]
Signed-off-by: Kai Wasserbäch <[email protected]>
Co-developed-by: Thorsten Leemhuis <[email protected]>
Signed-off-by: Thorsten Leemhuis <[email protected]>
---
scripts/checkpatch.pl | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index d739ce0909b1..b74d6002f773 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3155,8 +3155,20 @@ sub process {
"Co-developed-by and Signed-off-by: name/email do not match \n" . "$here\n" . $rawline . "\n" .$rawlines[$linenr]);
}
}
+
+# check if Reported-by: is followed by a Link:
+ if ($sign_off =~ /^reported(?:|-and-tested)-by:$/i) {
+ if (!defined $lines[$linenr]) {
+ WARN("BAD_REPORTED_BY_LINK",
+ "Reported-by: should be immediately followed by Link: to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
+ } elsif ($rawlines[$linenr] !~ m{^link:\s*https?://}i) {
+ WARN("BAD_REPORTED_BY_LINK",
+ "Reported-by: should be immediately followed by Link: with a URL to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
+ }
+ }
}

+
# Check Fixes: styles is correct
if (!$in_header_lines &&
$line =~ /^\s*fixes:?\s*(?:commit\s*)?[0-9a-f]{5,}\b/i) {
--
2.39.0

2023-01-20 15:36:16

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH v4 0/3] checkpatch.pl: warn about discouraged tags and missing Link: tags

On Fri, 2023-01-20 at 13:35 +0100, Thorsten Leemhuis wrote:
> Hi, please consider the following checkpatch.pl patches for review.
>
> The first two changes make checkpatch.pl check for a few mistakes wrt to
> links to bug reports Linus recently complained about a few times.
> Avoiding those is also important for my regression tracking efforts a
> lot, as the automated tracking performed by regzbot relies on the proper
> usage of the Link: tag.
>
> The third patch fixes a few small oddities noticed in existing code
> during review of the two changes.

Hey Andrew. Please forward this patch series upstream.

Thanks, Joe

>
> Ciao, Thorsten
> ---
> v3->v4:
> - address review feedback from Joe (do not allow leading spaces in
> matches, check if Link: is actually followed by a URL in one go,
> and use m{} for matching https://) (Thorsten)
> - catch Reported-and-tested-by as well (Thorsten)
>
> v2->v3:
> - address review feedback from Joe (grammer fixes, use of $herecurr, use
> of --no-merges in commit log quote) (Thorsten)
> - warn when people try to add version information to the commit log
> using code suggested by Joe (Thorsten)
> - add a patch to make existing code in a nearby area use $herecurr where
> it should, which Joe noticed during review
>
> v1->v2:
> - add commit messages explaining the reasoning (Thorsten)
> - approach the usage of wrong tags with an allow list as suggested by
> Joe (Thorsten)
>
> Kai Wasserb?ch (2):
> checkpatch: warn when unknown tags are used for links
> checkpatch: warn when Reported-by: is not followed by Link:
>
> Thorsten Leemhuis (1):
> checkpatch: use proper way for show problematic line
>
> scripts/checkpatch.pl | 34 +++++++++++++++++++++++++++++-----
> 1 file changed, 29 insertions(+), 5 deletions(-)
>
>
> base-commit: 5dc4c995db9eb45f6373a956eb1f69460e69e6d4

2023-02-27 13:25:53

by Matthieu Baerts

[permalink] [raw]
Subject: Re: [PATCH v4 1/3] checkpatch: warn when unknown tags are used for links

Hello,

On 20/01/2023 13:35, Thorsten Leemhuis wrote:
> From: Kai Wasserbäch <[email protected]>
>
> Issue a warning when encountering URLs behind unknown tags, as Linus
> recently stated ```please stop making up random tags that make no sense.
> Just use "Link:"```[1]. That statement was triggered by an use of
> 'BugLink', but that's not the only tag people invented:
>
> $ git log -100000 --no-merges --format=email -P \
> --grep='^\w+:[ \t]*http' | grep -Poh '^\w+:[ \t]*http' | \
> sort | uniq -c | sort -rn | head -n 20
> 103958 Link: http
> 418 BugLink: http
> 372 Patchwork: http
> 280 Closes: http
> 224 Bug: http
> 123 References: http
> 84 Bugzilla: http
> 61 URL: http
> 42 v1: http
> 38 Datasheet: http
> 20 v2: http
> 9 Ref: http
> 9 Fixes: http
> 9 Buglink: http
> 8 v3: http
> 8 Reference: http
> 7 See: http
> 6 1: http
> 5 link: http
> 3 Link:http
>
> Some of these non-standard tags make it harder for external tools that
> rely on use of proper tags. One of those tools is the regression
> tracking bot 'regzbot', which looks out for "Link:" tags pointing to
> reports of tracked regressions.

I'm sorry for the late feedback but would it be possible to add an
exception for the "Closes" tag followed by a URL?

This tag is useful -- at least for us when maintaining the MPTCP subtree
-- to have tickets being automatically closed when a patch is accepted.
I don't think this "Closes" tag is a "random one that makes no sense"
but I agree it is not an "official" one described in the documentation.

On our side, we are using GitHub to manage issues but this also works
with GitLab and probably others. Other keywords are also accepted [1][2]
but I guess it is best to stick with one, especially when it is already
used according to the list provided above.

Would it then be OK to allow this "Closes" tag in checkpatch.pl and
mention it in the documentation (Submitting patches)?

Or should we switch to the "Link" tag instead (and re-do the tracking
manually)?

Cheers,
Matt

[1]
https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests
[2]
https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#default-closing-pattern
--
Tessares | Belgium | Hybrid Access Solutions
http://www.tessares.net

2023-03-02 04:46:15

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On Fri, 20 Jan 2023 13:35:19 +0100 Thorsten Leemhuis wrote:
> From: Kai Wasserbäch <[email protected]>
>
> Encourage patch authors to link to reports by issuing a warning, if
> a Reported-by: is not accompanied by a link to the report. Those links
> are often extremely useful for any code archaeologist that wants to know
> more about the backstory of a change than the commit message provides.
> That includes maintainers higher up in the patch-flow hierarchy, which
> is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:

Is it okay if we exclude syzbot reports from this rule?
If full syzbot report ID is provided - it's as good as a link.
And regression tracking doesn't seem to happen much on syzbot
reports either.

I like the addition otherwise, it's already catching missing links
in netdev land!

2023-03-02 05:17:31

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On 02.03.23 05:46, Jakub Kicinski wrote:
> On Fri, 20 Jan 2023 13:35:19 +0100 Thorsten Leemhuis wrote:
>> From: Kai Wasserbäch <[email protected]>
>>
>> Encourage patch authors to link to reports by issuing a warning, if
>> a Reported-by: is not accompanied by a link to the report. Those links
>> are often extremely useful for any code archaeologist that wants to know
>> more about the backstory of a change than the commit message provides.
>> That includes maintainers higher up in the patch-flow hierarchy, which
>> is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:
>
> Is it okay if we exclude syzbot reports from this rule?
> If full syzbot report ID is provided - it's as good as a link.

Hmmm. Not sure. Every special case makes things harder for humans and
software that looks at a commits downstream. Clicking on a link also
makes things easy for code archaeologists that might look into the issue
months or years later (which might not even know how to find the report
and potential discussions on lore from the syzbot report ID).

Hence, wouldn't it be better to ask the syzbot folks to change their
reporting slightly and suggest something like this instead in their
reports (the last line is the new one):

```
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
Link: https://lore.kernel.org/r/[email protected]/
```

This might not be to hard if they known the message-id in advance. Maybe
they could even use the syzbot report ID as msg-id to make things even
easier. And for developers not much would change afaics, they just need
to copy and paste two lines instead of one.

> And regression tracking doesn't seem to happen much on syzbot
> reports either.

Yeah, right now I most of the time stay away from CI reports and leave
the tracking to the people that run the CI (unless it's something I
consider worth tracking), but I hope that might change over time to have
things in one place.

> I like the addition otherwise, it's already catching missing links
> in netdev land!

Thx for saying this!

Ciao, Thorsten

2023-03-02 05:36:25

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH v4 1/3] checkpatch: warn when unknown tags are used for links

On 27.02.23 14:25, Matthieu Baerts wrote:
> On 20/01/2023 13:35, Thorsten Leemhuis wrote:
>> From: Kai Wasserbäch <[email protected]>
>>
>> Issue a warning when encountering URLs behind unknown tags, as Linus
>> recently stated ```please stop making up random tags that make no sense.
>> Just use "Link:"```[1]. That statement was triggered by an use of
>> 'BugLink', but that's not the only tag people invented:
>>
>> $ git log -100000 --no-merges --format=email -P \
>> --grep='^\w+:[ \t]*http' | grep -Poh '^\w+:[ \t]*http' | \
>> sort | uniq -c | sort -rn | head -n 20
>> 103958 Link: http
>> 418 BugLink: http
>> 372 Patchwork: http
>> 280 Closes: http
>> 224 Bug: http
>> 123 References: http
>> [...]
>>
>> Some of these non-standard tags make it harder for external tools that
>> rely on use of proper tags. One of those tools is the regression
>> tracking bot 'regzbot', which looks out for "Link:" tags pointing to
>> reports of tracked regressions.
>
> I'm sorry for the late feedback but would it be possible to add an
> exception for the "Closes" tag followed by a URL?

As I just wrote in a reply to Jakub: Not sure. Every special case makes
things harder for humans and software that looks at a commits downstream.

> This tag is useful -- at least for us when maintaining the MPTCP subtree
> -- to have tickets being automatically closed when a patch is accepted.
> I don't think this "Closes" tag is a "random one that makes no sense"
> but I agree it is not an "official" one described in the documentation.
>
> On our side, we are using GitHub to manage issues but this also works
> with GitLab and probably others. Other keywords are also accepted [1][2]
> but I guess it is best to stick with one, especially when it is already
> used according to the list provided above.
>
> Would it then be OK to allow this "Closes" tag in checkpatch.pl and
> mention it in the documentation (Submitting patches)?
>
> Or should we switch to the "Link" tag instead (and re-do the tracking
> manually)?
>
> [1]
> https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests
> [2]
> https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#default-closing-pattern

For the record, let me repeat and further elaborate what I already said
on social media before you wrote your mail:

* I'm not mostly neutral here, but it was Linus who wrote "please stop
making up random tags that make no sense." in [1]. This was triggered by
a use of "BugLink:"; maybe there are tools out there that rely on that
tag, hence their users might ask for a exception as well. That's why I
think it's Linus call to grant any exceptions.

[1]
https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/

* if such an exception is made, it IMHO must be documented in our
documentation, so any software and humans that rely on these tags are
aware of it.

Ciao, Thorsten

2023-03-02 05:40:35

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On Thu, 2 Mar 2023 06:17:22 +0100 Thorsten Leemhuis wrote:
> On 02.03.23 05:46, Jakub Kicinski wrote:
> > On Fri, 20 Jan 2023 13:35:19 +0100 Thorsten Leemhuis wrote:
> >> Encourage patch authors to link to reports by issuing a warning, if
> >> a Reported-by: is not accompanied by a link to the report. Those links
> >> are often extremely useful for any code archaeologist that wants to know
> >> more about the backstory of a change than the commit message provides.
> >> That includes maintainers higher up in the patch-flow hierarchy, which
> >> is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:
> >
> > Is it okay if we exclude syzbot reports from this rule?
> > If full syzbot report ID is provided - it's as good as a link.
>
> Hmmm. Not sure. Every special case makes things harder for humans and
> software that looks at a commits downstream. Clicking on a link also
> makes things easy for code archaeologists that might look into the issue
> months or years later (which might not even know how to find the report
> and potential discussions on lore from the syzbot report ID).

No other system comes close to syzbot in terms of reporting meaningful
bugs, IMHO special casing it doesn't risk creep.

Interestingly other bots attach links which are 100% pointless noise:

Reported-by: Abaci Robot <[email protected]>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4174

Oh, eh. Let's see how noisy this check is once the merge window is over.

> Hence, wouldn't it be better to ask the syzbot folks to change their
> reporting slightly and suggest something like this instead in their
> reports (the last line is the new one):
>
> ```
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: [email protected]
> Link: https://lore.kernel.org/r/[email protected]/
> ```
>
> This might not be to hard if they known the message-id in advance. Maybe
> they could even use the syzbot report ID as msg-id to make things even
> easier. And for developers not much would change afaics, they just need
> to copy and paste two lines instead of one.

Dmitry, WDYT?

2023-03-02 08:27:56

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On Thu, 2 Mar 2023 at 06:40, Jakub Kicinski <[email protected]> wrote:
>
> On Thu, 2 Mar 2023 06:17:22 +0100 Thorsten Leemhuis wrote:
> > On 02.03.23 05:46, Jakub Kicinski wrote:
> > > On Fri, 20 Jan 2023 13:35:19 +0100 Thorsten Leemhuis wrote:
> > >> Encourage patch authors to link to reports by issuing a warning, if
> > >> a Reported-by: is not accompanied by a link to the report. Those links
> > >> are often extremely useful for any code archaeologist that wants to know
> > >> more about the backstory of a change than the commit message provides.
> > >> That includes maintainers higher up in the patch-flow hierarchy, which
> > >> is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:
> > >
> > > Is it okay if we exclude syzbot reports from this rule?
> > > If full syzbot report ID is provided - it's as good as a link.
> >
> > Hmmm. Not sure. Every special case makes things harder for humans and
> > software that looks at a commits downstream. Clicking on a link also
> > makes things easy for code archaeologists that might look into the issue
> > months or years later (which might not even know how to find the report
> > and potential discussions on lore from the syzbot report ID).
>
> No other system comes close to syzbot in terms of reporting meaningful
> bugs, IMHO special casing it doesn't risk creep.
>
> Interestingly other bots attach links which are 100% pointless noise:
>
> Reported-by: Abaci Robot <[email protected]>
> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4174
>
> Oh, eh. Let's see how noisy this check is once the merge window is over.
>
> > Hence, wouldn't it be better to ask the syzbot folks to change their
> > reporting slightly and suggest something like this instead in their
> > reports (the last line is the new one):
> >
> > ```
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: [email protected]
> > Link: https://lore.kernel.org/r/[email protected]/
> > ```
> >
> > This might not be to hard if they known the message-id in advance. Maybe
> > they could even use the syzbot report ID as msg-id to make things even
> > easier. And for developers not much would change afaics, they just need
> > to copy and paste two lines instead of one.
>
> Dmitry, WDYT?

Hi Jakub, Thorsten,

Adding a Link to syzbot reports should be relatively trivial.

Ted proposed to use Link _instead_ of Reported-by:

https://github.com/google/syzkaller/issues/3596
> in fact, it might be nice if we could encourage upstream developers
> put in the commit trailer:
> Link: https://syzkaller.appspot.com/bug?id=5266d464285a03cee9dbfda7d2452a72c3c2ae7c
> in addition to, or better yet, instead of:
> Reported-by: [email protected]

We could also use a link in the Reported-by tag, e.g.:

Reported-by: https://syzkaller.appspot.com/b/5266d464285a03cee9db

Some folks parse Reported-by to collect stats.

What is better?

2023-03-02 09:05:14

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On 02.03.23 09:27, Dmitry Vyukov wrote:
> On Thu, 2 Mar 2023 at 06:40, Jakub Kicinski <[email protected]> wrote:
>> On Thu, 2 Mar 2023 06:17:22 +0100 Thorsten Leemhuis wrote:
>>> On 02.03.23 05:46, Jakub Kicinski wrote:
>>>> On Fri, 20 Jan 2023 13:35:19 +0100 Thorsten Leemhuis wrote:
>>>>> Encourage patch authors to link to reports by issuing a warning, if
>>>>> a Reported-by: is not accompanied by a link to the report. Those links
>>>>> are often extremely useful for any code archaeologist that wants to know
>>>>> more about the backstory of a change than the commit message provides.
>>>>> That includes maintainers higher up in the patch-flow hierarchy, which
>>>>> is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:
>>>>
>>>> Is it okay if we exclude syzbot reports from this rule?
>>>> If full syzbot report ID is provided - it's as good as a link.
>>>
>>> Hmmm. Not sure. Every special case makes things harder for humans and
>>> software that looks at a commits downstream. Clicking on a link also
>>> makes things easy for code archaeologists that might look into the issue
>>> months or years later (which might not even know how to find the report
>>> and potential discussions on lore from the syzbot report ID).
>>
>> No other system comes close to syzbot in terms of reporting meaningful
>> bugs, IMHO special casing it doesn't risk creep.
>>
>> Interestingly other bots attach links which are 100% pointless noise:
>>
>> Reported-by: Abaci Robot <[email protected]>
>> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4174
>>
>> Oh, eh. Let's see how noisy this check is once the merge window is over.
>>
>>> Hence, wouldn't it be better to ask the syzbot folks to change their
>>> reporting slightly and suggest something like this instead in their
>>> reports (the last line is the new one):
>>>
>>> ```
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: [email protected]
>>> Link: https://lore.kernel.org/r/[email protected]/
>>> ```
>>>
>>> This might not be to hard if they known the message-id in advance. Maybe
>>> they could even use the syzbot report ID as msg-id to make things even
>>> easier. And for developers not much would change afaics, they just need
>>> to copy and paste two lines instead of one.
>>
>> Dmitry, WDYT?
>
> Adding a Link to syzbot reports should be relatively trivial.

Sounds good.

> Ted proposed to use Link _instead_ of Reported-by:
> https://github.com/google/syzkaller/issues/3596
>> in fact, it might be nice if we could encourage upstream developers
>> put in the commit trailer:
>> Link: https://syzkaller.appspot.com/bug?id=5266d464285a03cee9dbfda7d2452a72c3c2ae7c
>> in addition to, or better yet, instead of:
>> Reported-by: [email protected]
>
> We could also use a link in the Reported-by tag, e.g.:
>
> Reported-by: https://syzkaller.appspot.com/b/5266d464285a03cee9db
>
> Some folks parse Reported-by to collect stats.
>
> What is better?

Here are my thoughts:

* we should definitely have a "Link:" to the report in lore, as that's
the long-term archive under our own control and also where discussions
happen after the report was posted; but I'm biased here, as such a tag
would make tracking with regzbot a no-brainer ;)

* "Reported-by:" IMHO should stay for the hat tip and stats aspects; I
don't care if it includes the syzbot report ID or not (omitting it might
be good for the stats aspects and is more friendly to the eyes, but
those are just details)

* a Link: to the syzkaller web ui might be nice, too -- and likely is
the easiest thing to look out for on the syzbot server side

IOW something like this maybe:

Reported-by: [email protected]
Link: https://lore.kernel.org/r/[email protected]/
Link: https://syzkaller.appspot.com/b/cafecafecaca0cafecafe

Something like the following would look more normal, but of course is
only possible if syzbot starts out to look for such Link: tags (not sure
if the msgid is valid here, but you get the idea):

Reported-by: [email protected]
Link:
https://lore.kernel.org/r/syzbot+cafecafecaca0cafecafe-syzkaller-appspotmail-com@google.com/

Ciao, Thorsten

2023-03-02 09:11:54

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On Thu, 2 Mar 2023 at 10:04, Thorsten Leemhuis <[email protected]> wrote:
>
> On 02.03.23 09:27, Dmitry Vyukov wrote:
> > On Thu, 2 Mar 2023 at 06:40, Jakub Kicinski <[email protected]> wrote:
> >> On Thu, 2 Mar 2023 06:17:22 +0100 Thorsten Leemhuis wrote:
> >>> On 02.03.23 05:46, Jakub Kicinski wrote:
> >>>> On Fri, 20 Jan 2023 13:35:19 +0100 Thorsten Leemhuis wrote:
> >>>>> Encourage patch authors to link to reports by issuing a warning, if
> >>>>> a Reported-by: is not accompanied by a link to the report. Those links
> >>>>> are often extremely useful for any code archaeologist that wants to know
> >>>>> more about the backstory of a change than the commit message provides.
> >>>>> That includes maintainers higher up in the patch-flow hierarchy, which
> >>>>> is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:
> >>>>
> >>>> Is it okay if we exclude syzbot reports from this rule?
> >>>> If full syzbot report ID is provided - it's as good as a link.
> >>>
> >>> Hmmm. Not sure. Every special case makes things harder for humans and
> >>> software that looks at a commits downstream. Clicking on a link also
> >>> makes things easy for code archaeologists that might look into the issue
> >>> months or years later (which might not even know how to find the report
> >>> and potential discussions on lore from the syzbot report ID).
> >>
> >> No other system comes close to syzbot in terms of reporting meaningful
> >> bugs, IMHO special casing it doesn't risk creep.
> >>
> >> Interestingly other bots attach links which are 100% pointless noise:
> >>
> >> Reported-by: Abaci Robot <[email protected]>
> >> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4174
> >>
> >> Oh, eh. Let's see how noisy this check is once the merge window is over.
> >>
> >>> Hence, wouldn't it be better to ask the syzbot folks to change their
> >>> reporting slightly and suggest something like this instead in their
> >>> reports (the last line is the new one):
> >>>
> >>> ```
> >>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>> Reported-by: [email protected]
> >>> Link: https://lore.kernel.org/r/[email protected]/
> >>> ```
> >>>
> >>> This might not be to hard if they known the message-id in advance. Maybe
> >>> they could even use the syzbot report ID as msg-id to make things even
> >>> easier. And for developers not much would change afaics, they just need
> >>> to copy and paste two lines instead of one.
> >>
> >> Dmitry, WDYT?
> >
> > Adding a Link to syzbot reports should be relatively trivial.
>
> Sounds good.
>
> > Ted proposed to use Link _instead_ of Reported-by:
> > https://github.com/google/syzkaller/issues/3596
> >> in fact, it might be nice if we could encourage upstream developers
> >> put in the commit trailer:
> >> Link: https://syzkaller.appspot.com/bug?id=5266d464285a03cee9dbfda7d2452a72c3c2ae7c
> >> in addition to, or better yet, instead of:
> >> Reported-by: [email protected]
> >
> > We could also use a link in the Reported-by tag, e.g.:
> >
> > Reported-by: https://syzkaller.appspot.com/b/5266d464285a03cee9db
> >
> > Some folks parse Reported-by to collect stats.
> >
> > What is better?
>
> Here are my thoughts:
>
> * we should definitely have a "Link:" to the report in lore, as that's
> the long-term archive under our own control and also where discussions
> happen after the report was posted; but I'm biased here, as such a tag
> would make tracking with regzbot a no-brainer ;)
>
> * "Reported-by:" IMHO should stay for the hat tip and stats aspects; I
> don't care if it includes the syzbot report ID or not (omitting it might
> be good for the stats aspects and is more friendly to the eyes, but
> those are just details)
>
> * a Link: to the syzkaller web ui might be nice, too -- and likely is
> the easiest thing to look out for on the syzbot server side
>
> IOW something like this maybe:
>
> Reported-by: [email protected]
> Link: https://lore.kernel.org/r/[email protected]/
> Link: https://syzkaller.appspot.com/b/cafecafecaca0cafecafe
>
> Something like the following would look more normal, but of course is
> only possible if syzbot starts out to look for such Link: tags (not sure
> if the msgid is valid here, but you get the idea):
>
> Reported-by: [email protected]
> Link:
> https://lore.kernel.org/r/syzbot+cafecafecaca0cafecafe-syzkaller-appspotmail-com@google.com/

Oh, you mean lore link.

We can parse out our hash from any tag, but the problem is that the
current email api we use, does not allow to specify Message-ID before
sending, so we don't know it when generating the text.
We don't even know it after sending, the API is super simple:
https://pkg.go.dev/google.golang.org/appengine/mail
So we don't know what the lore link will be...

2023-03-02 09:49:16

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On 02.03.23 10:11, Dmitry Vyukov wrote:
> On Thu, 2 Mar 2023 at 10:04, Thorsten Leemhuis <[email protected]> wrote:
>> On 02.03.23 09:27, Dmitry Vyukov wrote:
>>> On Thu, 2 Mar 2023 at 06:40, Jakub Kicinski <[email protected]> wrote:
>>>> On Thu, 2 Mar 2023 06:17:22 +0100 Thorsten Leemhuis wrote:
>>>>> On 02.03.23 05:46, Jakub Kicinski wrote:
>>>>>> On Fri, 20 Jan 2023 13:35:19 +0100 Thorsten Leemhuis wrote:
>>>>>>> Encourage patch authors to link to reports by issuing a warning, if
>>>>>>> a Reported-by: is not accompanied by a link to the report. Those links
>>>>>>> are often extremely useful for any code archaeologist that wants to know
>>>>>>> more about the backstory of a change than the commit message provides.
>>>>>>> That includes maintainers higher up in the patch-flow hierarchy, which
>>>>>>> is why Linus asks developers to add such links [1, 2, 3]. To quote [1]:
>>>>>>
>>>>>> Is it okay if we exclude syzbot reports from this rule?
>>>>>> If full syzbot report ID is provided - it's as good as a link.
>>>>>
>>>>> Hmmm. Not sure. Every special case makes things harder for humans and
>>>>> software that looks at a commits downstream. Clicking on a link also
>>>>> makes things easy for code archaeologists that might look into the issue
>>>>> months or years later (which might not even know how to find the report
>>>>> and potential discussions on lore from the syzbot report ID).
>>>>
>>>> No other system comes close to syzbot in terms of reporting meaningful
>>>> bugs, IMHO special casing it doesn't risk creep.
>>>>
>>>> Interestingly other bots attach links which are 100% pointless noise:
>>>>
>>>> Reported-by: Abaci Robot <[email protected]>
>>>> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4174
>>>>
>>>> Oh, eh. Let's see how noisy this check is once the merge window is over.
>>>>
>>>>> Hence, wouldn't it be better to ask the syzbot folks to change their
>>>>> reporting slightly and suggest something like this instead in their
>>>>> reports (the last line is the new one):
>>>>>
>>>>> ```
>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>> Reported-by: [email protected]
>>>>> Link: https://lore.kernel.org/r/[email protected]/
>>>>> ```
>>>>>
>>>>> This might not be to hard if they known the message-id in advance. Maybe
>>>>> they could even use the syzbot report ID as msg-id to make things even
>>>>> easier. And for developers not much would change afaics, they just need
>>>>> to copy and paste two lines instead of one.
>>>>
>>>> Dmitry, WDYT?
>>>
>>> Adding a Link to syzbot reports should be relatively trivial.
>>
>> Sounds good.
>>
>>> Ted proposed to use Link _instead_ of Reported-by:
>>> https://github.com/google/syzkaller/issues/3596
>>>> in fact, it might be nice if we could encourage upstream developers
>>>> put in the commit trailer:
>>>> Link: https://syzkaller.appspot.com/bug?id=5266d464285a03cee9dbfda7d2452a72c3c2ae7c
>>>> in addition to, or better yet, instead of:
>>>> Reported-by: [email protected]
>>>
>>> We could also use a link in the Reported-by tag, e.g.:
>>>
>>> Reported-by: https://syzkaller.appspot.com/b/5266d464285a03cee9db
>>>
>>> Some folks parse Reported-by to collect stats.
>>>
>>> What is better?
>>
>> Here are my thoughts:
>>
>> * we should definitely have a "Link:" to the report in lore, as that's
>> the long-term archive under our own control and also where discussions
>> happen after the report was posted; but I'm biased here, as such a tag
>> would make tracking with regzbot a no-brainer ;)
>>
>> * "Reported-by:" IMHO should stay for the hat tip and stats aspects; I
>> don't care if it includes the syzbot report ID or not (omitting it might
>> be good for the stats aspects and is more friendly to the eyes, but
>> those are just details)
>>
>> * a Link: to the syzkaller web ui might be nice, too -- and likely is
>> the easiest thing to look out for on the syzbot server side
>>
>> IOW something like this maybe:
>>
>> Reported-by: [email protected]
>> Link: https://lore.kernel.org/r/[email protected]/
>> Link: https://syzkaller.appspot.com/b/cafecafecaca0cafecafe
>>
>> Something like the following would look more normal, but of course is
>> only possible if syzbot starts out to look for such Link: tags (not sure
>> if the msgid is valid here, but you get the idea):
>>
>> Reported-by: [email protected]
>> Link:
>> https://lore.kernel.org/r/syzbot+cafecafecaca0cafecafe-syzkaller-appspotmail-com@google.com/
>
> Oh, you mean lore link.
>
> We can parse out our hash from any tag, but the problem is that the
> current email api we use, does not allow to specify Message-ID before
> sending, so we don't know it when generating the text.
> We don't even know it after sending, the API is super simple:
> https://pkg.go.dev/google.golang.org/appengine/mail
> So we don't know what the lore link will be...

That's... unfortunate, as from my understanding of things that would be
the most important "Link:" to have in any patches that fix issues report
by syzbot. But well, that's how it is for now. In that case I'd vote for
this:

Reported-by: [email protected]
Link: https://syzkaller.appspot.com/b/cafecafecaca0cafecafe

Regzbot can handle this, as long as somebody tells it about that URL.
IOW: it creates a little extra work for some human. But that is not much
of a problem, especially as of now, as I only track syzbot reports that
for one reason or another make me go "I should better track this".

Ciao, Thorsten

2023-03-03 02:10:18

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On Thu, 2 Mar 2023 10:48:22 +0100 Thorsten Leemhuis <[email protected]> wrote:

> > We can parse out our hash from any tag, but the problem is that the
> > current email api we use, does not allow to specify Message-ID before
> > sending, so we don't know it when generating the text.
> > We don't even know it after sending, the API is super simple:
> > https://pkg.go.dev/google.golang.org/appengine/mail
> > So we don't know what the lore link will be...
>
> That's... unfortunate, as from my understanding of things that would be
> the most important "Link:" to have in any patches that fix issues report
> by syzbot. But well, that's how it is for now. In that case I'd vote for
> this:
>
> Reported-by: [email protected]
> Link: https://syzkaller.appspot.com/b/cafecafecaca0cafecafe

As you previously mentioned, patch preparers should also include
the lore link so any followup discussion is easily located.

2023-03-06 08:53:32

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On Fri, 3 Mar 2023 at 03:10, Andrew Morton <[email protected]> wrote:
> > > We can parse out our hash from any tag, but the problem is that the
> > > current email api we use, does not allow to specify Message-ID before
> > > sending, so we don't know it when generating the text.
> > > We don't even know it after sending, the API is super simple:
> > > https://pkg.go.dev/google.golang.org/appengine/mail
> > > So we don't know what the lore link will be...
> >
> > That's... unfortunate, as from my understanding of things that would be
> > the most important "Link:" to have in any patches that fix issues report
> > by syzbot. But well, that's how it is for now. In that case I'd vote for
> > this:
> >
> > Reported-by: [email protected]
> > Link: https://syzkaller.appspot.com/b/cafecafecaca0cafecafe
>
> As you previously mentioned, patch preparers should also include
> the lore link so any followup discussion is easily located.

If the link we need to include is to lore, then we don't need to
change the current syzbot Reported-by, right? Instead of asking 3
tags, we can ask only for:

Reported-by: [email protected]
Link: https://lore.kernel.org/r/[email protected]/

But as I mentioned we can't provide the lore link at the moment, we
can only add a text to ask to include it.

This also means that checkpatch does not need special casing for syzbot.

2023-03-07 11:36:57

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] checkpatch: warn when Reported-by: is not followed by Link:

On 06.03.23 09:53, Dmitry Vyukov wrote:
> On Fri, 3 Mar 2023 at 03:10, Andrew Morton <[email protected]> wrote:
>>>> We can parse out our hash from any tag, but the problem is that the
>>>> current email api we use, does not allow to specify Message-ID before
>>>> sending, so we don't know it when generating the text.
>>>> We don't even know it after sending, the API is super simple:
>>>> https://pkg.go.dev/google.golang.org/appengine/mail
>>>> So we don't know what the lore link will be...
>>>
>>> That's... unfortunate, as from my understanding of things that would be
>>> the most important "Link:" to have in any patches that fix issues report
>>> by syzbot. But well, that's how it is for now. In that case I'd vote for
>>> this:
>>>
>>> Reported-by: [email protected]
>>> Link: https://syzkaller.appspot.com/b/cafecafecaca0cafecafe
>>
>> As you previously mentioned, patch preparers should also include
>> the lore link so any followup discussion is easily located.
>
> If the link we need to include is to lore, then we don't need to
> change the current syzbot Reported-by, right? Instead of asking 3
> tags, we can ask only for:
>
> Reported-by: [email protected]
> Link: https://lore.kernel.org/r/[email protected]/

Yeah, that's not perfect (see below), but at least better. As mentioned
earlier: if the Reported-by: includes the sysbot-id (e.g. the
df61b36319e045c00a08) is up to you.

> But as I mentioned we can't provide the lore link at the moment, we
> can only add a text to ask to include it.

Yeah, that would be good. Normally it's the oblation of the developer
anyway to add Link: tags to any report (which most of the time means: in
lore) when fixing things. Obviously the chance that they actually do it
is a lot bigger when syzbot would suggest it.

> This also means that checkpatch does not need special casing for syzbot.

Yup

Ciao, Thorsten