2021-07-02 22:47:14

by John 'Warthog9' Hawley

[permalink] [raw]
Subject: [PATCH] checkpatch: Add check for common mailing list helper checks

From: John 'Warthog9' Hawley <[email protected]>

Mailing lists in an attempt to try and avoid sending certain
administrative e-mails to the list, will check the first few lines
(usually ~10) looking for keywords. If those key words are found it
shunts the e-mail to the list admin contact instead of potentially
passing it through to the list.

This adds a known list of the potential things that are checked for, and
while it may search deeper into the message (it goes till it believes
it's into the patch section) than the mailing list software this should
help give some warning if the wrong word is potentially added in the
wrong place in the beginning of a patch message.

Signed-off-by: John 'Warthog9' Hawley (VMware) <[email protected]>
---
scripts/checkpatch.pl | 34 ++++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 461d4221e4a4..c2f6e349f304 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2852,6 +2852,40 @@ sub process {

$cnt_lines++ if ($realcnt != 0);

+# check if this may trip up common mailing list helpers to redirect email to the admin contact
+ if ($in_commit_log &&
+ ($line =~ /\bcancel\b/i ||
+ $line =~ /\badd me\b/i ||
+ $line =~ /\bdelete me\b/i ||
+ $line =~ /\bremove\s+me\b/i ||
+ $line =~ /\bchange\b.*\baddress\b/ ||
+ $line =~ /\bsubscribe\b/i ||
+ $line =~ /^sub\b/i ||
+ $line =~ /\bunsubscribe\b/i ||
+ $line =~ /^unsub\b/i ||
+ $line =~ /^\s*help\s*$/i ||
+ $line =~ /^\s*info\s*$/i ||
+ $line =~ /^\s*info\s+\S+\s*$/i ||
+ $line =~ /^\s*lists\s*$/i ||
+ $line =~ /^\s*which\s*$/i ||
+ $line =~ /^\s*which\s+\S+\s*$/i ||
+ $line =~ /^\s*index\s*$/i ||
+ $line =~ /^\s*index\s+\S+\s*$/i ||
+ $line =~ /^\s*who\s*$/i ||
+ $line =~ /^\s*who\s+\S+\s*$/i ||
+ $line =~ /^\s*get\s+\S+\s*$/i ||
+ $line =~ /^\s*get\s+\S+\s+\S+\s*$/i ||
+ $line =~ /^\s*approve\b/i ||
+ $line =~ /^\s*passwd\b/i ||
+ $line =~ /^\s*newinfo\b/i ||
+ $line =~ /^\s*config\b/i ||
+ $line =~ /^\s*newconfig\b/i ||
+ $line =~ /^\s*writeconfig\b/i ||
+ $line =~ /^\s*mkdigest\b/i)){
+ WARN("MAILING LIST HELPER",
+ "Line matches common mailing list helpers, and may not be delivered correctly. Consider rewording (particularly the first word)\n" . $herecurr);
+ }
+
# Verify the existence of a commit log if appropriate
# 2 is used because a $signature is counted in $commit_log_lines
if ($in_commit_log) {
--
2.26.3


2021-07-03 03:28:47

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH] checkpatch: Add check for common mailing list helper checks

On Fri, 2021-07-02 at 15:37 -0700, John 'Warthog9' Hawley (VMware)
wrote:
> From: John 'Warthog9' Hawley <[email protected]>
>
> Mailing lists in an attempt to try and avoid sending certain
> administrative e-mails to the list, will check the first few lines
> (usually ~10) looking for keywords. If those key words are found it
> shunts the e-mail to the list admin contact instead of potentially
> passing it through to the list.
>
> This adds a known list of the potential things that are checked for, and
> while it may search deeper into the message (it goes till it believes
> it's into the patch section) than the mailing list software this should
> help give some warning if the wrong word is potentially added in the
> wrong place in the beginning of a patch message.
[]
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -2852,6 +2852,40 @@ sub process {
[]
> +# check if this may trip up common mailing list helpers to redirect email to the admin contact
> + if ($in_commit_log &&
> + ($line =~ /\bcancel\b/i ||
> + $line =~ /\badd me\b/i ||
> + $line =~ /\bdelete me\b/i ||
> + $line =~ /\bremove\s+me\b/i ||
> + $line =~ /\bchange\b.*\baddress\b/ ||
> + $line =~ /\bsubscribe\b/i ||
> + $line =~ /^sub\b/i ||
> + $line =~ /\bunsubscribe\b/i ||
> + $line =~ /^unsub\b/i ||
> + $line =~ /^\s*help\s*$/i ||
> + $line =~ /^\s*info\s*$/i ||
> + $line =~ /^\s*info\s+\S+\s*$/i ||
> + $line =~ /^\s*lists\s*$/i ||
> + $line =~ /^\s*which\s*$/i ||
> + $line =~ /^\s*which\s+\S+\s*$/i ||
> + $line =~ /^\s*index\s*$/i ||
> + $line =~ /^\s*index\s+\S+\s*$/i ||
> + $line =~ /^\s*who\s*$/i ||
> + $line =~ /^\s*who\s+\S+\s*$/i ||
> + $line =~ /^\s*get\s+\S+\s*$/i ||
> + $line =~ /^\s*get\s+\S+\s+\S+\s*$/i ||
> + $line =~ /^\s*approve\b/i ||
> + $line =~ /^\s*passwd\b/i ||
> + $line =~ /^\s*newinfo\b/i ||
> + $line =~ /^\s*config\b/i ||
> + $line =~ /^\s*newconfig\b/i ||
> + $line =~ /^\s*writeconfig\b/i ||
> + $line =~ /^\s*mkdigest\b/i)){

Have you checked this list against actual commits?

> + WARN("MAILING LIST HELPER",

This must use underscores for the spaces

MAILING_LIST_HELPER


2021-07-03 18:41:01

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH] checkpatch: Add check for common mailing list helper checks

On Fri, 2021-07-02 at 15:37 -0700, John 'Warthog9' Hawley (VMware)
wrote:
> From: John 'Warthog9' Hawley <[email protected]>
>
> Mailing lists in an attempt to try and avoid sending certain
> administrative e-mails to the list, will check the first few lines
> (usually ~10) looking for keywords. If those key words are found it
> shunts the e-mail to the list admin contact instead of potentially
> passing it through to the list.

Perhaps the below is a bit better, but I believe a few of the tests
are going to be tripped a bit too often.

Especially "cancel", "config" and maybe "subscribe" too.

For instance:

$ git log --grep='\bcancel\b' -P -i --pretty=oneline -10000 | wc -l
1693

$ git log --grep='^config\b' -P -i --pretty=oneline -10000 | wc -l
890

$ git log --grep='\bsubscribe\b' -P -i --pretty=oneline -10000 | wc -l
123

---
scripts/checkpatch.pl | 45 ++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 461d4221e4a4a..fcbcc26da875e 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -865,6 +865,37 @@ our $allowed_asm_includes = qr{(?x:
)};
# memory.h: ARM has a custom one

+our $mailing_list_phrases = qr{(?xi:
+ \bcancel\b |
+ \badd\s+me\b |
+ \bdelete\s+me\b |
+ \bremove\s+me\b |
+ \bchange\b.*\baddress\b |
+ \bsubscribe\b |
+ ^sub\b |
+ \bunsubscribe\b |
+ ^unsub\b |
+ ^\s*help\s*$ |
+ ^\s*info\s*$ |
+ ^\s*info\s+\S+\s*$ |
+ ^\s*lists\s*$ |
+ ^\s*which\s*$ |
+ ^\s*which\s+\S+\s*$ |
+ ^\s*index\s*$ |
+ ^\s*index\s+\S+\s*$ |
+ ^\s*who\s*$ |
+ ^\s*who\s+\S+\s*$ |
+ ^\s*get\s+\S+\s*$ |
+ ^\s*get\s+\S+\s+\S+\s*$ |
+ ^\s*approve\b |
+ ^\s*passwd\b |
+ ^\s*newinfo\b |
+ ^\s*config\b |
+ ^\s*newconfig\b |
+ ^\s*writeconfig\b |
+ ^\s*mkdigest\b
+)};
+
# Load common spelling mistakes and build regular expression list.
my $misspellings;
my %spelling_fix;
@@ -2581,6 +2612,7 @@ sub process {
my $has_patch_separator = 0; #Found a --- line
my $has_commit_log = 0; #Encountered lines before patch
my $commit_log_lines = 0; #Number of commit log lines
+ my $commit_log_missing = 0; #Emitted a missing commit message warning
my $commit_log_possible_stack_dump = 0;
my $commit_log_long_line = 0;
my $commit_log_has_diff = 0;
@@ -2852,16 +2884,23 @@ sub process {

$cnt_lines++ if ($realcnt != 0);

+# check if words in the commit message may trip up common mailing list helpers
+# to redirect email to the admin contact
+ if ($in_commit_log && $commit_log_lines < 10 &&
+ $line =~ /($mailing_list_phrases)/) {
+ WARN("MAILING_LIST_HELPER",
+ "Line matches common mailing list helpers and may not be delivered correctly - consider rewording '$1'\n" . $herecurr);
+ }
+
# Verify the existence of a commit log if appropriate
-# 2 is used because a $signature is counted in $commit_log_lines
if ($in_commit_log) {
if ($line !~ /^\s*$/) {
$commit_log_lines++; #could be a $signature
}
- } elsif ($has_commit_log && $commit_log_lines < 2) {
+ } elsif ($has_commit_log && !$commit_log_missing) {
WARN("COMMIT_MESSAGE",
"Missing commit description - Add an appropriate one\n");
- $commit_log_lines = 2; #warn only once
+ $commit_log_missing = 1; #warn only once
}

# Check if the commit log has what seems like a diff which can confuse patch

2021-07-06 19:33:21

by John 'Warthog9' Hawley

[permalink] [raw]
Subject: Re: [PATCH] checkpatch: Add check for common mailing list helper checks

On 7/3/21 11:39 AM, Joe Perches wrote:
> On Fri, 2021-07-02 at 15:37 -0700, John 'Warthog9' Hawley (VMware)
> wrote:
>> From: John 'Warthog9' Hawley <[email protected]>
>>
>> Mailing lists in an attempt to try and avoid sending certain
>> administrative e-mails to the list, will check the first few lines
>> (usually ~10) looking for keywords. If those key words are found it
>> shunts the e-mail to the list admin contact instead of potentially
>> passing it through to the list.
>
> Perhaps the below is a bit better, but I believe a few of the tests
> are going to be tripped a bit too often.
>
> Especially "cancel", "config" and maybe "subscribe" too.
>
> For instance:
>
> $ git log --grep='\bcancel\b' -P -i --pretty=oneline -10000 | wc -l
> 1693
>
> $ git log --grep='^config\b' -P -i --pretty=oneline -10000 | wc -l
> 890
>
> $ git log --grep='\bsubscribe\b' -P -i --pretty=oneline -10000 | wc -l
> 123

A part of getting this into checkpatch.pl is getting some better
feedback mechanisms for why patches may not be passing through the list
correctly with regexes that have been in place for at least 14 years.
These, aren't tripped over often, but have run into a instance at least
recently that triggered me trying to get at least some self check, and
notification, pieces in place.

>
> ---
> scripts/checkpatch.pl | 45 ++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 42 insertions(+), 3 deletions(-)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 461d4221e4a4a..fcbcc26da875e 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -865,6 +865,37 @@ our $allowed_asm_includes = qr{(?x:
> )};
> # memory.h: ARM has a custom one
>
> +our $mailing_list_phrases = qr{(?xi:
> + \bcancel\b |
> + \badd\s+me\b |
> + \bdelete\s+me\b |
> + \bremove\s+me\b |
> + \bchange\b.*\baddress\b |
> + \bsubscribe\b |
> + ^sub\b |
> + \bunsubscribe\b |
> + ^unsub\b |
> + ^\s*help\s*$ |
> + ^\s*info\s*$ |
> + ^\s*info\s+\S+\s*$ |
> + ^\s*lists\s*$ |
> + ^\s*which\s*$ |
> + ^\s*which\s+\S+\s*$ |
> + ^\s*index\s*$ |
> + ^\s*index\s+\S+\s*$ |
> + ^\s*who\s*$ |
> + ^\s*who\s+\S+\s*$ |
> + ^\s*get\s+\S+\s*$ |
> + ^\s*get\s+\S+\s+\S+\s*$ |
> + ^\s*approve\b |
> + ^\s*passwd\b |
> + ^\s*newinfo\b |
> + ^\s*config\b |
> + ^\s*newconfig\b |
> + ^\s*writeconfig\b |
> + ^\s*mkdigest\b
> +)};
> +
> # Load common spelling mistakes and build regular expression list.
> my $misspellings;
> my %spelling_fix;
> @@ -2581,6 +2612,7 @@ sub process {
> my $has_patch_separator = 0; #Found a --- line
> my $has_commit_log = 0; #Encountered lines before patch
> my $commit_log_lines = 0; #Number of commit log lines
> + my $commit_log_missing = 0; #Emitted a missing commit message warning
> my $commit_log_possible_stack_dump = 0;
> my $commit_log_long_line = 0;
> my $commit_log_has_diff = 0;
> @@ -2852,16 +2884,23 @@ sub process {
>
> $cnt_lines++ if ($realcnt != 0);
>
> +# check if words in the commit message may trip up common mailing list helpers
> +# to redirect email to the admin contact
> + if ($in_commit_log && $commit_log_lines < 10 &&
> + $line =~ /($mailing_list_phrases)/) {
> + WARN("MAILING_LIST_HELPER",
> + "Line matches common mailing list helpers and may not be delivered correctly - consider rewording '$1'\n" . $herecurr);
> + }
> +
> # Verify the existence of a commit log if appropriate
> -# 2 is used because a $signature is counted in $commit_log_lines
> if ($in_commit_log) {
> if ($line !~ /^\s*$/) {
> $commit_log_lines++; #could be a $signature
> }
> - } elsif ($has_commit_log && $commit_log_lines < 2) {
> + } elsif ($has_commit_log && !$commit_log_missing) {
> WARN("COMMIT_MESSAGE",
> "Missing commit description - Add an appropriate one\n");
> - $commit_log_lines = 2; #warn only once
> + $commit_log_missing = 1; #warn only once
> }
>
> # Check if the commit log has what seems like a diff which can confuse patch
>

2021-07-07 18:54:50

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH] checkpatch: Add check for common mailing list helper checks

On Tue, 2021-07-06 at 12:31 -0700, John 'Warthog9' Hawley wrote:
> On 7/3/21 11:39 AM, Joe Perches wrote:
> > On Fri, 2021-07-02 at 15:37 -0700, John 'Warthog9' Hawley (VMware)
> > wrote:
> > > From: John 'Warthog9' Hawley <[email protected]>
> > >
> > > Mailing lists in an attempt to try and avoid sending certain
> > > administrative e-mails to the list, will check the first few lines
> > > (usually ~10) looking for keywords. If those key words are found it
> > > shunts the e-mail to the list admin contact instead of potentially
> > > passing it through to the list.
> >
> > Perhaps the below is a bit better, but I believe a few of the tests
> > are going to be tripped a bit too often.
> >
> > Especially "cancel", "config" and maybe "subscribe" too.
> >
> > For instance:
> >
> > $ git log --grep='\bcancel\b' -P -i --pretty=oneline -10000 | wc -l
> > 1693
> >
> > $ git log --grep='^config\b' -P -i --pretty=oneline -10000 | wc -l
> > 890
> >
> > $ git log --grep='\bsubscribe\b' -P -i --pretty=oneline -10000 | wc -l
> > 123
>
> A part of getting this into checkpatch.pl is getting some better
> feedback mechanisms for why patches may not be passing through the list
> correctly with regexes that have been in place for at least 14 years.
> These, aren't tripped over often,

3000+ commits with regex matches seem rather a lot to me.

> but have run into a instance at least
> recently that triggered me trying to get at least some self check, and
> notification, pieces in place.

No worries, but perhaps the message might be reworded to
say something about possible mailing list moderation rather
than imply rejection.

> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> > @@ -865,6 +865,37 @@ our $allowed_asm_includes = qr{(?x:
> > ?)};
> > ?# memory.h: ARM has a custom one
> >
> > +our $mailing_list_phrases = qr{(?xi:
> > + \bcancel\b |

Mere use of the word "cancel" in the commit description seems undesirable to me.

> > +# check if words in the commit message may trip up common mailing list helpers
> > +# to redirect email to the admin contact
> > + if ($in_commit_log && $commit_log_lines < 10 &&
> > + $line =~ /($mailing_list_phrases)/) {
> > + WARN("MAILING_LIST_HELPER",
> > + "Line matches common mailing list helpers and may not be delivered correctly - consider rewording '$1'\n" . $herecurr);

Maybe FILTERS for phrases and helpers

Maybe something like:

"Use of '$1' in this patch's commit description might cause mailing list moderation or rejection\n"


2021-07-08 00:27:39

by John 'Warthog9' Hawley

[permalink] [raw]
Subject: Re: [PATCH] checkpatch: Add check for common mailing list helper checks

On 7/7/21 10:53 AM, Joe Perches wrote:
> On Tue, 2021-07-06 at 12:31 -0700, John 'Warthog9' Hawley wrote:
>> On 7/3/21 11:39 AM, Joe Perches wrote:
>>> On Fri, 2021-07-02 at 15:37 -0700, John 'Warthog9' Hawley (VMware)
>>> wrote:
>>>> From: John 'Warthog9' Hawley <[email protected]>
>>>>
>>>> Mailing lists in an attempt to try and avoid sending certain
>>>> administrative e-mails to the list, will check the first few lines
>>>> (usually ~10) looking for keywords. If those key words are found it
>>>> shunts the e-mail to the list admin contact instead of potentially
>>>> passing it through to the list.
>>>
>>> Perhaps the below is a bit better, but I believe a few of the tests
>>> are going to be tripped a bit too often.
>>>
>>> Especially "cancel", "config" and maybe "subscribe" too.
>>>
>>> For instance:
>>>
>>> $ git log --grep='\bcancel\b' -P -i --pretty=oneline -10000 | wc -l
>>> 1693
>>>
>>> $ git log --grep='^config\b' -P -i --pretty=oneline -10000 | wc -l
>>> 890
>>>
>>> $ git log --grep='\bsubscribe\b' -P -i --pretty=oneline -10000 | wc -l
>>> 123
>>
>> A part of getting this into checkpatch.pl is getting some better
>> feedback mechanisms for why patches may not be passing through the list
>> correctly with regexes that have been in place for at least 14 years.
>> These, aren't tripped over often,
>
> 3000+ commits with regex matches seem rather a lot to me.>
>> but have run into a instance at least
>> recently that triggered me trying to get at least some self check, and
>> notification, pieces in place.
>
> No worries, but perhaps the message might be reworded to
> say something about possible mailing list moderation rather
> than imply rejection.
>>>> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> []
>>> @@ -865,6 +865,37 @@ our $allowed_asm_includes = qr{(?x:
>>>  )};
>>>  # memory.h: ARM has a custom one
>>>
>>> +our $mailing_list_phrases = qr{(?xi:
>>> + \bcancel\b |
>
> Mere use of the word "cancel" in the commit description seems undesirable to me.>
>>> +# check if words in the commit message may trip up common mailing list helpers
>>> +# to redirect email to the admin contact
>>> + if ($in_commit_log && $commit_log_lines < 10 &&
>>> + $line =~ /($mailing_list_phrases)/) {
>>> + WARN("MAILING_LIST_HELPER",
>>> + "Line matches common mailing list helpers and may not be delivered correctly - consider rewording '$1'\n" . $herecurr);
>
> Maybe FILTERS for phrases and helpers
>
> Maybe something like:
>
> "Use of '$1' in this patch's commit description might cause mailing list moderation or rejection\n"

"Use of '$1' in this patch's commit description, may cause the message
to be considered an administrative message and may be misdirected and
undelivered to the mailing list\n"

might be more overall accurate since the mailing list is likely push the
message into the admin queue as if it was a control message, and it may
not be easy to get it back into the main stream for delivery.

I'm not picky on the exact wording of the message though, just need to
convey that the commit_log may have something in it that may trip up the
mailing list, and the message may not make it through.