2015-11-16 22:43:25

by Brian Norris

[permalink] [raw]
Subject: [BUG] checkpatch: false positive for commits with quote characters

Hi,

What is the Blessed (TM) style for referencing commits that have quote
characters in their subject line? e.g., this commit:

commit 43163022927b6e7d202a7e6f939c3f392465494d
Author: Brian Norris <[email protected]>
Date: Tue May 19 14:38:22 2015 -0700

mtd: m25p80: allow arbitrary OF matching for "jedec,spi-nor"

Checkpatch reports false positive errors like this:

ERROR: Please use git commit description style 'commit <12+ chars of
sha1> ("<title line>")'

when I try to reference it on this patch:

https://lkml.org/lkml/2015/11/16/826

I understand the double quoting is a little nasty to parse, but I think
that just means we should relax the regexes in checkpatch.pl. I could
try to patch myself, but I figured I'd just follow checkpatch's advice
instead:

NOTE: If any of the errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Brian


2015-11-17 17:48:33

by Joe Perches

[permalink] [raw]
Subject: Re: [BUG] checkpatch: false positive for commits with quote characters

On Mon, 2015-11-16 at 14:43 -0800, Brian Norris wrote:
> Hi,
>
> What is the Blessed (TM) style for referencing commits that have quote
> characters in their subject line? e.g., this commit:
>
> commit 43163022927b6e7d202a7e6f939c3f392465494d
> Author: Brian Norris <[email protected]>
> Date: Tue May 19 14:38:22 2015 -0700
>
> mtd: m25p80: allow arbitrary OF matching for "jedec,spi-nor"
>
> Checkpatch reports false positive errors like this:
>
> ERROR: Please use git commit description style 'commit <12+ chars of
> sha1> ("")'

Hi Brian.

What version of checkpatch are you using?

Using linux-next:

$ git log --stat -p -1 --format=email 43163022927b6e7d202a7e6f939c3f392465494d | ./scripts/checkpatch.pl --strict -
total: 0 errors, 0 warnings, 0 checks, 53 lines checked

Your patch has no obvious style problems and is ready for submission.

2015-11-17 18:03:40

by Brian Norris

[permalink] [raw]
Subject: Re: [BUG] checkpatch: false positive for commits with quote characters

On Tue, Nov 17, 2015 at 09:48:27AM -0800, Joe Perches wrote:
> On Mon, 2015-11-16 at 14:43 -0800, Brian Norris wrote:
> > Hi,
> >
> > What is the Blessed (TM) style for referencing commits that have quote
> > characters in their subject line? e.g., this commit:
> >
> > commit 43163022927b6e7d202a7e6f939c3f392465494d
> > Author: Brian Norris <[email protected]>
> > Date: Tue May 19 14:38:22 2015 -0700
> >
> > mtd: m25p80: allow arbitrary OF matching for "jedec,spi-nor"
> >
> > Checkpatch reports false positive errors like this:
> >
> > ERROR: Please use git commit description style 'commit <12+ chars of
> > sha1> ("")'
>
> Hi Brian.
>
> What version of checkpatch are you using?
>
> Using linux-next:
>
> $ git log --stat -p -1 --format=email 43163022927b6e7d202a7e6f939c3f392465494d | ./scripts/checkpatch.pl --strict -

I was referring to running checkpatch on this:

https://lkml.org/lkml/2015/11/16/826

which *referenced* commit 43163022927b6e7d202a7e6f939c3f392465494d.
Sorry if that wasn't clear.

See below,
Brian

$ curl http://patchwork.ozlabs.org/patch/545234/mbox/ | scripts/checkpatch.pl -
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 3741 0 3741 0 0 11525 0 --:--:-- --:--:-- --:--:-- 11510
ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'Commit 43163022927b ("mtd: m25p80: allow arbitrary OF matching for "jedec,spi-nor"")'
#17:
Commit 43163022927b ("mtd: m25p80: allow arbitrary OF matching for

total: 1 errors, 0 warnings, 29 lines checked

Your patch has style problems, please review.

NOTE: Ignored message types: FILE_PATH_CHANGES

NOTE: If any of the errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

2015-12-04 00:13:26

by Brian Norris

[permalink] [raw]
Subject: Re: [BUG] checkpatch: false positive for commits with quote characters

Ping? I've hit some different false positives today on the same rule.
I'll stop bothering to report them if no one cares.

On Tue, Nov 17, 2015 at 10:03:36AM -0800, Brian Norris wrote:
> On Tue, Nov 17, 2015 at 09:48:27AM -0800, Joe Perches wrote:
> > On Mon, 2015-11-16 at 14:43 -0800, Brian Norris wrote:
> > > Hi,
> > >
> > > What is the Blessed (TM) style for referencing commits that have quote
> > > characters in their subject line? e.g., this commit:
> > >
> > > commit 43163022927b6e7d202a7e6f939c3f392465494d
> > > Author: Brian Norris <[email protected]>
> > > Date: Tue May 19 14:38:22 2015 -0700
> > >
> > > mtd: m25p80: allow arbitrary OF matching for "jedec,spi-nor"
> > >
> > > Checkpatch reports false positive errors like this:
> > >
> > > ERROR: Please use git commit description style 'commit <12+ chars of
> > > sha1> ("")'
> >
> > Hi Brian.
> >
> > What version of checkpatch are you using?
> >
> > Using linux-next:
> >
> > $ git log --stat -p -1 --format=email 43163022927b6e7d202a7e6f939c3f392465494d | ./scripts/checkpatch.pl --strict -
>
> I was referring to running checkpatch on this:
>
> https://lkml.org/lkml/2015/11/16/826
>
> which *referenced* commit 43163022927b6e7d202a7e6f939c3f392465494d.
> Sorry if that wasn't clear.
>
> See below,
> Brian
>
> $ curl http://patchwork.ozlabs.org/patch/545234/mbox/ | scripts/checkpatch.pl -
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 100 3741 0 3741 0 0 11525 0 --:--:-- --:--:-- --:--:-- 11510
> ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'Commit 43163022927b ("mtd: m25p80: allow arbitrary OF matching for "jedec,spi-nor"")'
> #17:
> Commit 43163022927b ("mtd: m25p80: allow arbitrary OF matching for
>
> total: 1 errors, 0 warnings, 29 lines checked
>
> Your patch has style problems, please review.
>
> NOTE: Ignored message types: FILE_PATH_CHANGES
>
> NOTE: If any of the errors are false positives, please report
> them to the maintainer, see CHECKPATCH in MAINTAINERS.
>

2015-12-04 00:29:44

by Joe Perches

[permalink] [raw]
Subject: Re: [BUG] checkpatch: false positive for commits with quote characters

On Thu, 2015-12-03 at 16:13 -0800, Brian Norris wrote:
> Ping? I've hit some different false positives today on the same rule.
> I'll stop bothering to report them if no one cares.

Perhaps this:
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9f0949b..196b77b 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2397,22 +2397,26 @@ sub process {
? $long = 1 if ($line =~ /\bcommit\s+[0-9a-f]{41,}/i);
? $space = 0 if ($line =~ /\bcommit [0-9a-f]/i);
? $case = 0 if ($line =~ /\b[Cc]ommit\s+[0-9a-f]{5,40}[^A-F]/);
- if ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)"\)/i) {
+ if ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("(.*)"\)/i) {
? $orig_desc = $1;
? $hasparens = 1;
+ print("here1\n");
? } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s*$/i &&
? ?defined $rawlines[$linenr] &&
- ?$rawlines[$linenr] =~ /^\s*\("([^"]+)"\)/) {
+ ?$rawlines[$linenr] =~ /^\s*\("(.*)"\)/) {
? $orig_desc = $1;
? $hasparens = 1;
+ print("here2\n");
? } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("[^"]+$/i &&
? ?defined $rawlines[$linenr] &&
- ?$rawlines[$linenr] =~ /^\s*[^"]+"\)/) {
+ ?$rawlines[$linenr] =~ /^\s*.*"\)/) {
? $line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)$/i;
? $orig_desc = $1;
- $rawlines[$linenr] =~ /^\s*([^"]+)"\)/;
+ $rawlines[$linenr] =~ /^\s*(.*)"\)/;
? $orig_desc .= " " . $1;
? $hasparens = 1;
+ print("orig_desc: <$orig_desc>\n");
+ print("here3\n");
? }
?
? ($id, $description) = git_commit_info($orig_commit,

2015-12-04 00:36:33

by Joe Perches

[permalink] [raw]
Subject: Re: [BUG] checkpatch: false positive for commits with quote characters

On Thu, 2015-12-03 at 16:29 -0800, Joe Perches wrote:
> On Thu, 2015-12-03 at 16:13 -0800, Brian Norris wrote:
> > Ping? I've hit some different false positives today on the same rule.
> > I'll stop bothering to report them if no one cares.
>
> Perhaps this:

(minus the debugging this time...)
---
?scripts/checkpatch.pl | 8 ++++----
?1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index d4960f7..b23dff8 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2397,20 +2397,20 @@ sub process {
? $long = 1 if ($line =~ /\bcommit\s+[0-9a-f]{41,}/i);
? $space = 0 if ($line =~ /\bcommit [0-9a-f]/i);
? $case = 0 if ($line =~ /\b[Cc]ommit\s+[0-9a-f]{5,40}[^A-F]/);
- if ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)"\)/i) {
+ if ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("(.*)"\)/i) {
? $orig_desc = $1;
? $hasparens = 1;
? } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s*$/i &&
? ?defined $rawlines[$linenr] &&
- ?$rawlines[$linenr] =~ /^\s*\("([^"]+)"\)/) {
+ ?$rawlines[$linenr] =~ /^\s*\("(.*)"\)/) {
? $orig_desc = $1;
? $hasparens = 1;
? } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("[^"]+$/i &&
? ?defined $rawlines[$linenr] &&
- ?$rawlines[$linenr] =~ /^\s*[^"]+"\)/) {
+ ?$rawlines[$linenr] =~ /^\s*.*"\)/) {
? $line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)$/i;
? $orig_desc = $1;
- $rawlines[$linenr] =~ /^\s*([^"]+)"\)/;
+ $rawlines[$linenr] =~ /^\s*(.*)"\)/;
? $orig_desc .= " " . $1;
? $hasparens = 1;
? }

2015-12-04 01:39:53

by Brian Norris

[permalink] [raw]
Subject: Re: [BUG] checkpatch: false positive for commits with quote characters

On Thu, Dec 03, 2015 at 04:36:28PM -0800, Joe Perches wrote:
> On Thu, 2015-12-03 at 16:29 -0800, Joe Perches wrote:
> > On Thu, 2015-12-03 at 16:13 -0800, Brian Norris wrote:
> > > Ping? I've hit some different false positives today on the same rule.
> > > I'll stop bothering to report them if no one cares.
> >
> > Perhaps this:

Aside from the non-breaking spaces your copy of Evolution inserted, it's
an improvement. (It doesn't give a false positive for the patch I
reported.) But it's still got some holes.

> (minus the debugging this time...)
> ---
> ?scripts/checkpatch.pl | 8 ++++----
> ?1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index d4960f7..b23dff8 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -2397,20 +2397,20 @@ sub process {
> ? $long = 1 if ($line =~ /\bcommit\s+[0-9a-f]{41,}/i);
> ? $space = 0 if ($line =~ /\bcommit [0-9a-f]/i);
> ? $case = 0 if ($line =~ /\b[Cc]ommit\s+[0-9a-f]{5,40}[^A-F]/);
> - if ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)"\)/i) {
> + if ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("(.*)"\)/i) {

You're got the reverse problem now. This will match too broadly. For
instance, if there is another instance of terminating quote +
parentheses on the same line, then we'll get a false postive. e.g., if I
wrote a patch that included something like this:

Commit 6dc0dcde406b ("parisc: Use platform_device_register_simple("rtc-generic")") is cool (as in "cool")

Or even if it's wrapped like this:

Commit 6dc0dcde406b ("parisc: Use
platform_device_register_simple("rtc-generic")") is cool (as in "cool")

Then the regexes will think the description was:

parisc: Use platform_device_register_simple("rtc-generic")") is cool (as in "cool

A hacky workaround for that one: only check for (loosely, not proper
regex syntax):

parsed_description =~ description . ")";

rather than:

description == parsed_description

Not sure how far you want to go on chasing false positives...

> ? $orig_desc = $1;
> ? $hasparens = 1;
> ? } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s*$/i &&
> ? ?defined $rawlines[$linenr] &&
> - ?$rawlines[$linenr] =~ /^\s*\("([^"]+)"\)/) {
> + ?$rawlines[$linenr] =~ /^\s*\("(.*)"\)/) {
> ? $orig_desc = $1;
> ? $hasparens = 1;
> ? } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("[^"]+$/i &&
> ? ?defined $rawlines[$linenr] &&
> - ?$rawlines[$linenr] =~ /^\s*[^"]+"\)/) {
> + ?$rawlines[$linenr] =~ /^\s*.*"\)/) {
> ? $line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)$/i;
> ? $orig_desc = $1;
> - $rawlines[$linenr] =~ /^\s*([^"]+)"\)/;
> + $rawlines[$linenr] =~ /^\s*(.*)"\)/;
> ? $orig_desc .= " " . $1;
> ? $hasparens = 1;
> ? }

BTW, another false positive: just including this text in a commit
triggers a different one:

http://lkml.kernel.org/g/[email protected]

A simple hack (appended, in addition to yours) would be to assume that
when people are trying to include badly-formatted commit hashes, they
will be preceding them with whitespace, the beginning of a line, or
encapsulating punctuation. Or use a URL parser.

Brian

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index d2993f19df3f..e7110ba3242b 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2349,7 +2349,7 @@ sub process {
# Check for git id commit length and improperly formed commit descriptions
if ($in_commit_log && !$commit_log_possible_stack_dump &&
($line =~ /\bcommit\s+[0-9a-f]{5,}\b/i ||
- ($line =~ /\b[0-9a-f]{12,40}\b/i &&
+ ($line =~ /\b[\s^\("][0-9a-f]{12,40}\b/i &&
$line !~ /[\<\[][0-9a-f]{12,40}[\>\]]/i &&
$line !~ /\bfixes:\s*[0-9a-f]{12,40}/i))) {
my $init_char = "c";