2022-12-06 09:46:08

by Joe Perches

Subject: Re: Fw: [PATCH 0/2] feat: checkpatch: prohibit Buglink: and warn about missing Link:

On Tue, 2022-12-06 at 09:50 +0100, Thorsten Leemhuis wrote:
> On 06.12.22 08:44, Joe Perches wrote:
> > On Tue, 2022-12-06 at 08:17 +0100, Thorsten Leemhuis wrote:
> > > On 06.12.22 07:27, Thorsten Leemhuis wrote:
> > > > On 06.12.22 06:54, Joe Perches wrote:
> > []
> > > > > and perhaps a more
> > > > > generic, "is the thing in front of a URI/URL" a known/supported entry,
> > > > > instead of using a known invalid test would be a better mechanism.
> > > >
> > > > Are you sure about that? It's not that I disagree completely, but it
> > > > sounds overly restrictive to me and makes it harder for new tags to
> > > > evolve in case we might want them.
> >
> > It's easy to add newly supported values to a list.
> >
> > > > And what tags would be on this allow-list? Anything else than "Link" and
> > > > "Patchwork"? Those are the ones that looked common and valid to me when
> > > > I ran
> > > >
> > > > git log --grep='http' v4.0.. | grep http | grep -v ' Link: ' | less
> > > >
> > > > and skimmed the output. Maybe "Datasheet" should be allowed, too -- not
> > > > sure.
> > []
> > > > But I found a few others that likely should be on the disallow list:
> > > > "Closes:", "Bug:", "Gitlab issue:", "References:", "Ref:", "Bugzilla:",
> > > > "RHBZ:", and "link", as "Link" should be used instead in all of these
> > > > cases afaics.
> >
> > Do understand please that checkpatch will never be perfect.
> > At best, it's just a guidance tool.
>
> Of course -- and that's actually a reason why I prefer a disallow list
> over an allow list, as that gives guidance in the way of "don't use this
> tag, use Link instead" instead of enforcing "always use Link: when
> linking somewhere" (now that I've written it like that it feels even
> more odd, because it's obvious that it's a link, so why bother with a
> tag; but whatever).
>
> I also think the approach with a disallow list will not bother
> developers much, while the other forces them a bit too much into a scheme.
>
> > To me most of these are in the noise level, but perhaps all should just
> > use Link:
> >
> > $ git log -100000 --format=email -P --grep='^\w+:[ \t]*http' | \
> > grep -Poh '^\w+:[ \t]*http' | \
> > sort | uniq -c | sort -rn
> > 103889 Link: http
> > 415 BugLink: http
> > 372 Patchwork: http
> > 270 Closes: http
> > 221 Bug: http
> > 121 References: http
> > [...]
>
> Ha, I considered doing something like that when I wrote my earlier mail,
> but was too lazy. :-D thx!
>
> Yeah, they are not that often, but I grew tired arguing about that,
> that's why I think checkpatch is the better place and in the better
> position to handle that.

I'm not sure that "Patchwork:" is a reasonable prefix.
Is that documented anywhere?

> Anyway, so how to move forward now? Do you insist on an allow list (IOW:
> a Link: or Patchwork: before every http...)? Or is a disallow list with
> the most common unwanted tags for links (that you thankfully compiled)
> fine for you as well?

Maybe
---
scripts/checkpatch.pl | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 1c3d13e65c2d0..a526a354cdfbc 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3250,6 +3250,13 @@ sub process {
 			$commit_log_possible_stack_dump = 0;
 		}
 
+# Check for odd prefixes before a URI/URL
+		if ($in_commit_log &&
+		    $line =~ /^\s*(\w+):\s*http/ && $1 !~ /^(?:Link|Patchwork)/) {
+			WARN("PREFER_LINK",
+			     "Unusual link reference '$1:', prefer 'Link:'\n" . $herecurr);
+		}
+
 # Check for lines starting with a #
 		if ($in_commit_log && $line =~ /^#/) {
 			if (WARN("COMMIT_COMMENT_SYMBOL",
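For readers who want to try the check outside checkpatch, the two regexes from the hunk above can be transliterated into a small Python sketch. This is only an illustration and not part of the proposed patch: the function name `odd_link_prefix` is hypothetical, the `$in_commit_log` condition is left out, and Python's `\w` is assumed to behave like Perl's for plain ASCII input.

```python
import re

# Same patterns as in the proposed checkpatch.pl hunk.
PREFIX_BEFORE_URL = re.compile(r"^\s*(\w+):\s*http")
ACCEPTED_PREFIX = re.compile(r"^(?:Link|Patchwork)")


def odd_link_prefix(line):
    """Return the prefix that would trigger PREFER_LINK, or None.

    Hypothetical helper: it mirrors only the regex logic, not the
    commit-log state tracking that checkpatch does around it.
    """
    match = PREFIX_BEFORE_URL.search(line)
    if match and not ACCEPTED_PREFIX.search(match.group(1)):
        return match.group(1)
    return None


if __name__ == "__main__":
    for sample in (
        "Link: https://example.com/a",
        "BugLink: https://example.com/b",
        "link: https://example.com/c",
        "Links: https://example.com/d",
        "Gitlab issue: https://example.com/e",
    ):
        print(repr(sample), "->", odd_link_prefix(sample))
```

Because the accepted-prefix pattern is anchored only at the start, a prefix such as `Links` passes, while `link` and `BugLink` are flagged. A two-word prefix like `Gitlab issue:` is not matched by the first pattern at all, so it would go unreported.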


2022-12-06 11:05:55

by Thorsten Leemhuis

Subject: Re: Fw: [PATCH 0/2] feat: checkpatch: prohibit Buglink: and warn about missing Link:

On 06.12.22 10:21, Joe Perches wrote:
> On Tue, 2022-12-06 at 09:50 +0100, Thorsten Leemhuis wrote:
>> On 06.12.22 08:44, Joe Perches wrote:
>>> On Tue, 2022-12-06 at 08:17 +0100, Thorsten Leemhuis wrote:
>>>> On 06.12.22 07:27, Thorsten Leemhuis wrote:
>>>>> On 06.12.22 06:54, Joe Perches wrote:
> [...]
>> Ha, I considered doing something like that when I wrote my earlier mail,
>> but was too lazy. :-D thx!
>>
>> Yeah, they are not that often, but I grew tired arguing about that,
>> that's why I think checkpatch is the better place and in the better
>> position to handle that.
>
> I'm not sure that "Patchwork:" is a reasonable prefix.

/me neither

> Is that documented anywhere?

Couldn't find anything.

>> Anyway, so how to move forward now? Do you insist on an allow list (IOW:
>> a Link: or Patchwork: before every http...)? Or is a disallow list with
>> the most common unwanted tags for links (that you thankfully compiled)
>> fine for you as well?
>
> Maybe
> ---
> scripts/checkpatch.pl | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 1c3d13e65c2d0..a526a354cdfbc 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -3250,6 +3250,13 @@ sub process {
> $commit_log_possible_stack_dump = 0;
> }
>
> +# Check for odd prefixes before a URI/URL
> + if ($in_commit_log &&
> + $line =~ /^\s*(\w+):\s*http/ && $1 !~ /^(?:Link|Patchwork)/) {
> + WARN("PREFER_LINK",
> + "Unusual link reference '$1:', prefer 'Link:'\n" . $herecurr);
> + }
> +

LGTM: I did some tests and it seems to do the right thing. Can we have
your Signed-off-by: for that snippet?

Ciao, Thorsten

2022-12-06 11:24:57

by Joe Perches

Subject: Re: Fw: [PATCH 0/2] feat: checkpatch: prohibit Buglink: and warn about missing Link:

On Tue, 2022-12-06 at 11:06 +0100, Thorsten Leemhuis wrote:
> On 06.12.22 10:21, Joe Perches wrote:

> > I'm not sure that "Patchwork:" is a reasonable prefix.
>
> /me neither
>
> > Is that documented anywhere?
>
> Couldn't find anything.

I knew that.

btw:

there are a _lot_ more uses of Link: with patchwork content than
Patchwork: prefix uses, so maybe just "Link:" should be accepted.

$ git log --format=email -i --grep "patchwork" | grep -i "patchwork" | \
cut -f1-3 -d'/' | sort | uniq -c | sort -rn | head -10
25789 Link: https://patchwork.freedesktop.org
7160 Link: http://patchwork.freedesktop.org
4109 Patchwork: https://patchwork.linux-mips.org
777 Patchwork: http://patchwork.linux-mips.org
372 Patchwork: https://patchwork.freedesktop.org
200 https://patchwork.kernel.org
154 Link: https://patchwork.kernel.org
116 [1] https://patchwork.kernel.org
76 https://patchwork.ozlabs.org
33 http://patchwork.ozlabs.org

> LGTM: I did some tests and it seems to do the right thing. Can we have
> your Signed-off-by: for that snippet?

It's your patch, I'm just suggesting...

cheers, Joe
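The pipeline above can also be approximated in Python, which may be handy for slicing the data differently. The sketch below assumes the commit messages are already available as a list of lines (for example, captured from `git log --format=email`). The function name `tally_patchwork_refs` and the sample lines are made up for illustration, and ties may come out in a different order than with `sort -rn`.

```python
from collections import Counter


def tally_patchwork_refs(lines, top=10):
    """Count lines mentioning patchwork, keyed by prefix plus host.

    Rough equivalent of:
      grep -i patchwork | cut -f1-3 -d'/' | sort | uniq -c | sort -rn | head
    """
    hits = (line for line in lines if "patchwork" in line.lower())
    # cut -f1-3 -d'/' keeps everything up to the third slash,
    # i.e. "<prefix> <scheme>://<host>".
    keys = ("/".join(line.split("/")[:3]) for line in hits)
    return Counter(keys).most_common(top)


if __name__ == "__main__":
    # Hypothetical input, not taken from the kernel history.
    sample = [
        "Link: https://patchwork.freedesktop.org/patch/1/",
        "Link: https://patchwork.freedesktop.org/patch/2/",
        "Patchwork: https://patchwork.linux-mips.org/patch/3/",
        "Link: https://lore.kernel.org/r/unrelated",
    ]
    for key, count in tally_patchwork_refs(sample):
        print(count, key)
```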

2022-12-08 14:20:58

by Thorsten Leemhuis

Subject: Re: Fw: [PATCH 0/2] feat: checkpatch: prohibit Buglink: and warn about missing Link:



On 06.12.22 10:21, Joe Perches wrote:
> On Tue, 2022-12-06 at 09:50 +0100, Thorsten Leemhuis wrote:
>> On 06.12.22 08:44, Joe Perches wrote:
>>> On Tue, 2022-12-06 at 08:17 +0100, Thorsten Leemhuis wrote:
>>>> On 06.12.22 07:27, Thorsten Leemhuis wrote:
>>>>> On 06.12.22 06:54, Joe Perches wrote:
>>> []
>>>>>> and perhaps a more
>>>>>> generic, "is the thing in front of a URI/URL" a known/supported entry,
>>>>>> instead of using a known invalid test would be a better mechanism.
>>>>>
>>>>> Are you sure about that? It's not that I disagree completely, but it
>>>>> sounds overly restrictive to me and makes it harder for new tags to
>>>>> evolve in case we might want them.
>>>
>>> It's easy to add newly supported values to a list.
>>>
>>>>> And what tags would be on this allow-list? Anything else than "Link" and
>>>>> "Patchwork"? Those are the ones that looked common and valid to me when
>>>>> I ran
>>>>>
>>>>> git log --grep='http' v4.0.. | grep http | grep -v ' Link: ' | less
>>>>>
>>>>> and skimmed the output. Maybe "Datasheet" should be allowed, too -- not
>>>>> sure.
>>> []
>>>>> But I found a few others that likely should be on the disallow list:
>>>>> "Closes:", "Bug:", "Gitlab issue:", "References:", "Ref:", "Bugzilla:",
>>>>> "RHBZ:", and "link", as "Link" should be used instead in all of these
>>>>> cases afaics.
>>>
>>> Do understand please that checkpatch will never be perfect.
>>> At best, it's just a guidance tool.
>>
>> Of course -- and that's actually a reason why I prefer a disallow list
>> over an allow list, as that gives guidance in the way of "don't use this
>> tag, use Link instead" instead of enforcing "always use Link: when
>> linking somewhere" (now that I've written it like that it feels even
>> more odd, because it's obvious that it's a link, so why bother with a
>> tag; but whatever).
>>
>> I also think the approach with a disallow list will not bother
>> developers much, while the other forces them a bit too much into a scheme.
>>
>>> To me most of these are in the noise level, but perhaps all should just
>>> use Link:
>>>
>>> $ git log -100000 --format=email -P --grep='^\w+:[ \t]*http' | \
>>> grep -Poh '^\w+:[ \t]*http' | \
>>> sort | uniq -c | sort -rn
>>> 103889 Link: http
>>> 415 BugLink: http
>>> 372 Patchwork: http
>>> 270 Closes: http
>>> 221 Bug: http
>>> 121 References: http
>>> 101 v1: http
>>> 77 Bugzilla: http
>>> 60 URL: http
>>> 59 v2: http
>>> 37 Datasheet: http
>>> 35 v3: http
>>> 19 v4: http
>>> 12 v5: http
>
>> Ha, I considered doing something like that when I wrote my earlier mail,
>> but was too lazy. :-D thx!
>>
>> Yeah, they are not that often, but I grew tired arguing about that,
>> that's why I think checkpatch is the better place and in the better
>> position to handle that.
>
> I'm not sure that "Patchwork:" is a reasonable prefix.
> Is that documented anywhere?
>
>> Anyway, so how to move forward now? Do you insist on an allow list (IOW:
>> a Link: or Patchwork: before every http...)? Or is a disallow list with
>> the most common unwanted tags for links (that you thankfully compiled)
>> fine for you as well?
>
> Maybe
> ---
> scripts/checkpatch.pl | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 1c3d13e65c2d0..a526a354cdfbc 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -3250,6 +3250,13 @@ sub process {
> $commit_log_possible_stack_dump = 0;
> }
>
> +# Check for odd prefixes before a URI/URL
> + if ($in_commit_log &&
> + $line =~ /^\s*(\w+):\s*http/ && $1 !~ /^(?:Link|Patchwork)/) {
> + WARN("PREFER_LINK",
> + "Unusual link reference '$1:', prefer 'Link:'\n" . $herecurr);
> + }
> +

One more thing: That afaics would result in a warning when people use
things like "v1: https://example.com/somewhere", which some people
apparently like. Those imho are not considered tags, hence I'd say we
allow them, unless you disagree.

Ciao, Thorsten

2022-12-08 18:07:54

by Joe Perches

Subject: Re: Fw: [PATCH 0/2] feat: checkpatch: prohibit Buglink: and warn about missing Link:

On Thu, 2022-12-08 at 14:18 +0100, Thorsten Leemhuis wrote:
> On 06.12.22 10:21, Joe Perches wrote:
> > On Tue, 2022-12-06 at 09:50 +0100, Thorsten Leemhuis wrote:
> > > On 06.12.22 08:44, Joe Perches wrote:
[]
> > > > To me most of these are in the noise level, but perhaps all should just
> > > > use Link:
> > > >
> > > > $ git log -100000 --format=email -P --grep='^\w+:[ \t]*http' | \
> > > > grep -Poh '^\w+:[ \t]*http' | \
> > > > sort | uniq -c | sort -rn
> > > > 103889 Link: http
> > > > 415 BugLink: http
> > > > 372 Patchwork: http
> > > > 270 Closes: http
> > > > 221 Bug: http
> > > > 121 References: http
> > > > 101 v1: http
> > > > 77 Bugzilla: http
> > > > 60 URL: http
> > > > 59 v2: http
> > > > 37 Datasheet: http
> > > > 35 v3: http
> > > > 19 v4: http
> > > > 12 v5: http
> >
> > > Ha, I considered doing something like that when I wrote my earlier mail,
> > > but was too lazy. :-D thx!
[]
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> > @@ -3250,6 +3250,13 @@ sub process {
> > $commit_log_possible_stack_dump = 0;
> > }
> >
> > +# Check for odd prefixes before a URI/URL
> > + if ($in_commit_log &&
> > + $line =~ /^\s*(\w+):\s*http/ && $1 !~ /^(?:Link|Patchwork)/) {
> > + WARN("PREFER_LINK",
> > + "Unusual link reference '$1:', prefer 'Link:'\n" . $herecurr);
> > + }
> > +
>
> One more thing: That afaics would result in a warning when people use
> things like "v1: https://example.com/somewhere", which some people
> apparently like. Those imho are not considered tags, hence I'd say we
> allow them, unless you disagree.

I'd say no as almost all of those are when patches contain
references to previous patch submissions that should instead
be below the --- line. Perhaps there should be a separate
warning for those v<n>: uses saying to move them below the ---
but there really aren't many uses.

Most of the v<n>: style prefixes are from git pulls/merges.
Redoing the git log grep with --no-merges shows that fairly well.

$ git log -100000 --no-merges --format=email -P --grep='^\w+:[ \t]*http' | \
grep -Poh '^\w+:[ \t]*http' | \
sort | uniq -c | sort -rn
103958 Link: http
418 BugLink: http
372 Patchwork: http
280 Closes: http
224 Bug: http
123 References: http
84 Bugzilla: http
61 URL: http
42 v1: http
38 Datasheet: http
20 v2: http
9 Ref: http
9 Fixes: http
9 Buglink: http
8 v3: http
8 Reference: http
7 See: http
6 1: http
5 link: http
3 Link:http
3 IGT: http
3 0: http
2 Website: http
2 Schematics: http
2 RHBZ: http
2 Reported: http
2 MR: http
2 Links: http
2 LINK: http
2 Link: http
2 Bugs: http
2 BUG: http
2 2: http
1 v5: http
1 v4: http
1 V1: http
1 v1:http
1 Twitter: http
1 tree: http
1 tool: http
1 tests: http
1 tasks: http
1 Source: http
1 SoM: http
1 scctc: http
1 Reproducer: http
1 reliable: http
1 Related: http
1 Reference:http
1 oscca: http
1 Mesa: http
1 Lore: http
1 Links:http
1 ink: http
1 in: http
1 IETF: http
1 here: http
1 Examples: http
1 bz: http
1 Bug:http
1 AlsaInfo: http
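The idea of a separate warning for `v<n>:` prefixes could look roughly like the Python sketch below. It is a guess at the shape, not something proposed in the thread as code: the message type `CHANGELOG_LINK`, its wording, the `VERSION_PREFIX` pattern, and the function name `classify_link_line` are all hypothetical. Only the first two regexes and the `PREFER_LINK` text come from the posted hunk.

```python
import re

# From the posted checkpatch.pl hunk.
PREFIX_BEFORE_URL = re.compile(r"^\s*(\w+):\s*http")
ACCEPTED_PREFIX = re.compile(r"^(?:Link|Patchwork)")
# Assumption: a revision reference is 'v' or 'V' followed by digits.
VERSION_PREFIX = re.compile(r"^[vV]\d+$")


def classify_link_line(line):
    """Return (type, message) for a commit-log line, or None if it is fine."""
    match = PREFIX_BEFORE_URL.search(line)
    if not match:
        return None
    prefix = match.group(1)
    if ACCEPTED_PREFIX.search(prefix):
        return None
    if VERSION_PREFIX.search(prefix):
        # Hypothetical second warning for links to earlier revisions.
        return (
            "CHANGELOG_LINK",
            f"'{prefix}:' looks like a reference to an earlier revision, "
            "consider moving it below the --- line",
        )
    return ("PREFER_LINK", f"Unusual link reference '{prefix}:', prefer 'Link:'")


if __name__ == "__main__":
    for sample in (
        "v2: https://example.com/previous-posting",
        "Bugzilla: https://example.com/bug/1",
        "Link: https://example.com/discussion",
    ):
        print(repr(sample), "->", classify_link_line(sample))
```

Bare numeric prefixes such as `1:` or `0:` from the list above are deliberately left in the `PREFER_LINK` bucket here, since it is not clear from the counts alone whether they are revision references or footnote-style links.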