2019-07-10 23:20:08

by Matteo Croce

Subject: [PATCH] checkpatch.pl: warn on invalid commit hash

It can happen that a commit message refers to an invalid hash, because
the referenced hash changed following a rebase, or simply by mistake.
Add a check to checkpatch.pl which verifies that a hash referenced by a Fixes
tag or just cited in the commit message is a valid commit hash.

$ scripts/checkpatch.pl <<'EOF'
Subject: [PATCH] test commit

Sample test commit to test checkpatch.pl
Commit 1da177e4c3f4 ("Linux-2.6.12-rc2") really exists,
commit 0bba044c4ce7 ("tree") is valid but not a commit,
while commit b4cc0b1c0cca ("unknown") is invalid.

Fixes: f0cacc14cade ("unknown")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
EOF
WARNING: Invalid hash 0bba044c4ce7
WARNING: Invalid hash b4cc0b1c0cca
WARNING: Invalid hash f0cacc14cade
total: 0 errors, 3 warnings, 4 lines checked

Signed-off-by: Matteo Croce <[email protected]>
---
scripts/checkpatch.pl | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index a6d436809bf5..6fe15fbe876f 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2898,6 +2898,13 @@ sub process {
 			}
 		}

+# check for invalid hashes
+		if ($in_commit_log && $line =~ /(^fixes:|commit)\s+([0-9a-f]{6,40})\b/i) {
+			if (`git cat-file -t $2 2>/dev/null` ne "commit\n") {
+				WARN('INVALID_COMMIT_HASH', "Invalid commit hash $2");
+			}
+		}
+
 # ignore non-hunk lines and lines being removed
 		next if (!$hunk_line || $line =~ /^-/);

--
2.21.0


2019-07-10 23:53:20

by Joe Perches

Subject: Re: [PATCH] checkpatch.pl: warn on invalid commit hash

On Thu, 2019-07-11 at 01:19 +0200, Matteo Croce wrote:
> It can happen that a commit message refers to an invalid hash, because
> the referenced hash changed following a rebase, or simply by mistake.
> Add a check to checkpatch.pl which verifies that a hash referenced by a Fixes
> tag or just cited in the commit message is a valid commit hash.

Hi Matteo

> $ scripts/checkpatch.pl <<'EOF'
> Subject: [PATCH] test commit
>
> Sample test commit to test checkpatch.pl
> Commit 1da177e4c3f4 ("Linux-2.6.12-rc2") really exists,
> commit 0bba044c4ce7 ("tree") is valid but not a commit,
> while commit b4cc0b1c0cca ("unknown") is invalid.
>
> Fixes: f0cacc14cade ("unknown")
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> EOF
> WARNING: Invalid hash 0bba044c4ce7
> WARNING: Invalid hash b4cc0b1c0cca
> WARNING: Invalid hash f0cacc14cade
> total: 0 errors, 3 warnings, 4 lines checked

[]
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -2898,6 +2898,13 @@ sub process {
>  			}
>  		}
>
> +# check for invalid hashes
> +		if ($in_commit_log && $line =~ /(^fixes:|commit)\s+([0-9a-f]{6,40})\b/i) {
> +			if (`git cat-file -t $2 2>/dev/null` ne "commit\n") {
> +				WARN('INVALID_COMMIT_HASH', "Invalid commit hash $2");

This seems fine as a concept, but this should use a
'\n' and . $herecurr like:

> WARN('INVALID_COMMIT_HASH', "Invalid commit hash $2\n" . $herecurr);

And while a single quote around the identifier works, please
use the double quote style like all the other uses of WARN.

Maybe call it "UNKNOWN_COMMIT_ID" too as it might be valid
for someone else's tree that has not yet been pulled and all
other references in checkpatch use ID rather than hash.

			WARN("UNKNOWN_COMMIT_ID",
			     "Unknown commit id '$2', maybe rebased or not pulled?\n" . $herecurr);

Finally, why wouldn't the existing git_commit_info subroutine
be used instead of an independent 'git cat-file' which may not
even run if git is not available?

Perhaps use something like:

			my $id;
			my $description;
			($id, $description) = git_commit_info($2, undef, undef);
			if (!defined($id)) {
				WARN(etc...);
			}
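
For anyone who wants to try the matching logic outside checkpatch, here is a minimal standalone sketch. Only the regex is taken from the patch. Everything else is an assumption made for illustration: lookup_commit() is a hypothetical stand-in for git_commit_info() that consults a hard-coded table instead of git, check_commit_ids() is an invented helper, and the warning text is the wording proposed above without the $herecurr context.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical table of commits "known" to the local tree. The real
# git_commit_info() in checkpatch asks git; this is illustration only.
my %known_commits = (
	'1da177e4c3f4' => 'Linux-2.6.12-rc2',
);

# Stand-in for git_commit_info(): returns (id, description) for a known
# commit and (undef, undef) otherwise.
sub lookup_commit {
	my ($hash) = @_;
	return ($hash, $known_commits{$hash}) if (exists($known_commits{$hash}));
	return (undef, undef);
}

# Scan commit message lines and return one warning string per line that
# references a commit id the lookup cannot resolve.
sub check_commit_ids {
	my (@lines) = @_;
	my @warnings;
	foreach my $line (@lines) {
		# Pattern from the patch: a Fixes: tag at the start of the line,
		# or the word "commit", followed by 6 to 40 hex digits.
		next if ($line !~ /(^fixes:|commit)\s+([0-9a-f]{6,40})\b/i);
		my $hash = $2;
		my ($id, $description) = lookup_commit($hash);
		if (!defined($id)) {
			push(@warnings, "Unknown commit id '$hash', maybe rebased or not pulled?");
		}
	}
	return @warnings;
}

# The sample commit message from the patch description.
my @message = (
	'Subject: [PATCH] test commit',
	'',
	'Sample test commit to test checkpatch.pl',
	'Commit 1da177e4c3f4 ("Linux-2.6.12-rc2") really exists,',
	'commit 0bba044c4ce7 ("tree") is valid but not a commit,',
	'while commit b4cc0b1c0cca ("unknown") is invalid.',
	'',
	'Fixes: f0cacc14cade ("unknown")',
	'Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")',
);

print "WARNING: $_\n" foreach (check_commit_ids(@message));
```

Run against the sample message, the three ids missing from the table are reported and the two references to 1da177e4c3f4 pass silently. Note that the pattern only inspects the first match on each line, so a line citing two commits would have only the first one checked.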