2021-09-23 14:41:32

by Jerome Forissier

[permalink] [raw]
Subject: [PATCH] checkpatch: relax regexp for COMMIT_LOG_LONG_LINE

One exceptions to the COMMIT_LOG_LONG_LINE rule is a file path followed
by :. That is typically some sort diagnostic message from a compiler or
a build tool, in which case we don't want to wrap the lines but keep the
message unmodified.
The regular expression used to match this pattern currently doesn't
accept absolute paths or + characters. This can result in false
positives as in the following (out-of-tree) example:

...
/home/jerome/work/optee_repo_qemu/build/../toolchains/aarch32/bin/arm-linux-gnueabihf-ld.bfd: /home/jerome/work/toolchains-gcc10.2/aarch32/bin/../lib/gcc/arm-none-linux-gnueabihf/10.2.1/../../../../arm-none-linux-gnueabihf/lib/libstdc++.a(eh_alloc.o): in function `__cxa_allocate_exception':
/tmp/dgboter/bbs/build03--cen7x86_64/buildbot/cen7x86_64--arm-none-linux-gnueabihf/build/src/gcc/libstdc++-v3/libsupc++/eh_alloc.cc:284: undefined reference to `malloc'
...

Update the regular expression to match the above paths.

Signed-off-by: Jerome Forissier <[email protected]>
---
scripts/checkpatch.pl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index c27d2312cfc3..bf2094cb4271 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3147,7 +3147,7 @@ sub process {
length($line) > 75 &&
!($line =~ /^\s*[a-zA-Z0-9_\/\.]+\s+\|\s+\d+/ ||
# file delta changes
- $line =~ /^\s*(?:[\w\.\-]+\/)++[\w\.\-]+:/ ||
+ $line =~ /^\s*(?:[\w\.\-\+]*\/)++[\w\.\-\+]+:/ ||
# filename then :
$line =~ /^\s*(?:Fixes:|Link:|$signature_tags)/i ||
# A Fixes: or Link: line or signature tag line
--
2.30.2


2021-11-05 16:06:46

by Jerome Forissier

[permalink] [raw]
Subject: Re: [PATCH] checkpatch: relax regexp for COMMIT_LOG_LONG_LINE

On 9/23/21 16:38, Jerome Forissier wrote:
> One exceptions to the COMMIT_LOG_LONG_LINE rule is a file path followed
> by :. That is typically some sort diagnostic message from a compiler or
> a build tool, in which case we don't want to wrap the lines but keep the
> message unmodified.
> The regular expression used to match this pattern currently doesn't
> accept absolute paths or + characters. This can result in false
> positives as in the following (out-of-tree) example:
>
> ...
> /home/jerome/work/optee_repo_qemu/build/../toolchains/aarch32/bin/arm-linux-gnueabihf-ld.bfd: /home/jerome/work/toolchains-gcc10.2/aarch32/bin/../lib/gcc/arm-none-linux-gnueabihf/10.2.1/../../../../arm-none-linux-gnueabihf/lib/libstdc++.a(eh_alloc.o): in function `__cxa_allocate_exception':
> /tmp/dgboter/bbs/build03--cen7x86_64/buildbot/cen7x86_64--arm-none-linux-gnueabihf/build/src/gcc/libstdc++-v3/libsupc++/eh_alloc.cc:284: undefined reference to `malloc'
> ...
>
> Update the regular expression to match the above paths.
>
> Signed-off-by: Jerome Forissier <[email protected]>
> ---
> scripts/checkpatch.pl | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index c27d2312cfc3..bf2094cb4271 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -3147,7 +3147,7 @@ sub process {
> length($line) > 75 &&
> !($line =~ /^\s*[a-zA-Z0-9_\/\.]+\s+\|\s+\d+/ ||
> # file delta changes
> - $line =~ /^\s*(?:[\w\.\-]+\/)++[\w\.\-]+:/ ||
> + $line =~ /^\s*(?:[\w\.\-\+]*\/)++[\w\.\-\+]+:/ ||
> # filename then :
> $line =~ /^\s*(?:Fixes:|Link:|$signature_tags)/i ||
> # A Fixes: or Link: line or signature tag line
>

Ping?

Thanks,
--
Jerome

2021-11-06 14:44:13

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH] checkpatch: relax regexp for COMMIT_LOG_LONG_LINE

On Fri, 2021-11-05 at 11:31 +0100, Jerome Forissier wrote:
> On 9/23/21 16:38, Jerome Forissier wrote:
> > One exceptions to the COMMIT_LOG_LONG_LINE rule is a file path followed
> > by :. That is typically some sort diagnostic message from a compiler or
> > a build tool, in which case we don't want to wrap the lines but keep the
> > message unmodified.
> > The regular expression used to match this pattern currently doesn't
> > accept absolute paths or + characters. This can result in false
> > positives as in the following (out-of-tree) example:
[]
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> > @@ -3147,7 +3147,7 @@ sub process {
> > length($line) > 75 &&
> > !($line =~ /^\s*[a-zA-Z0-9_\/\.]+\s+\|\s+\d+/ ||
> > # file delta changes
> > - $line =~ /^\s*(?:[\w\.\-]+\/)++[\w\.\-]+:/ ||
> > + $line =~ /^\s*(?:[\w\.\-\+]*\/)++[\w\.\-\+]+:/ ||
> > # filename then :

I looked the number of new matches

$ git log --format=email -100000 | \
grep -P '^\s*(?:[\w\.\-]+\/)++[\w\.\-]+:' | wc -l
21160
$ git log --format=email -100000 | \
grep -P '^\s*(?:[\w\.\-\+]*\/)++[\w\.\-]+:' | wc -l
21627

OK, so around 3% more matches.

And then looked only at these new matches

$ git log --format=email -100000 | \
grep -P '^\s*(?:[\w\.\-\+]*\/)++[\w\.\-]+:' | \
grep -P -v '^\s*(?:[\w\.\-]+\/)++[\w\.\-]+:'

And all these new matches look OK to me to ignore for long lines.
Out of tree or not...

Acked-by: Joe Perches <[email protected]>