2021-05-27 02:32:39

by Julius Werner

[permalink] [raw]
Subject: [PATCH v2 0/3] Detect suspicious indentation after conditional

This patch series is adding functionality to checkpatch.pl to test for
incorrect code indentation after a conditional statement, like this:

if (a)
b;
c;

(Indentation implies that `c;` was guarded by the conditional, but it
isn't.) The main part is re-sending a patch from Ivo Sieben that was
already proposed in 2014 [1]. I don't know why it was never merged --
it seems that there was no discussion on it. I hope that it was only
overlooked, because it works great, and I think this is a very important
class of common error to catch.

I have tested it extensively on the kernel tree and in the course of
that found a few more edge cases that get fixed by the other two
patches. With all these applied, the vast majority of hits I get from
this check on the kernel tree are actual indentation errors or other
code style violations (e.g. case label and statement on the same line).
The only significant remaining group of false positives I found are
cases of macros being defined within a function, which are overall very
rare. I think the benefit of adding this check would far outweigh the
remaining amount of noise.

[1]: https://lore.kernel.org/patchwork/patch/465116

Ivo Sieben (1):
Suspicious indentation detection after conditional statement

Julius Werner (2):
checkpatch: ctx_statement_block: Fix preprocessor guard tracking
checkpatch: Ignore labels when checking indentation

Changelog:
v2: Expanded fix to ctx_block_get, some minor simplifications

scripts/checkpatch.pl | 56 +++++++++++++++++++++++++++++++++++++++----
1 file changed, 52 insertions(+), 4 deletions(-)

--
2.29.2


2021-05-27 02:32:49

by Julius Werner

[permalink] [raw]
Subject: [PATCH v2 2/3] Suspicious indentation detection after conditional statement

From: Ivo Sieben <[email protected]>

Raise a SUSPICIOUS_CODE_INDENT warning when unexpected indentation is found
after a conditional statement. This can be used to find missing braces or
wrong indentation in/after a conditional statement.

For example the following error is caught;

if (foo)
bar();
return;

Which can be either intended by the programmer as:

if (foo)
bar();
return;

or
if (foo) {
bar();
return;
}

Signed-off-by: Ivo Sieben <[email protected]>
jwerner: Simplified some search patterns
Signed-off-by: Julius Werner <[email protected]>
---
scripts/checkpatch.pl | 41 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 4aab2450ad629e..624a23c05f5388 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -4183,6 +4183,47 @@ sub process {
WARN("SUSPECT_CODE_INDENT",
"suspect code indent for conditional statements ($indent, $sindent)\n" . $herecurr . "$stat_real\n");
}
+
+# Also check if the next statement after the previous condition has the same indent
+ my ($stat_next, undef, $line_nr_next_next) =
+ ctx_statement_block($line_nr_next, $remain_next, $off_next);
+ my $s_next = $stat_next;
+
+ # Remove line prefixes
+ $s_next =~ s/\n./\n/g;
+
+ # Remove any comments
+ $s_next =~ s/$;//g;
+
+ # Skip this check for in case next statement starts with 'else'
+ if ($s_next !~ /\belse\b/) {
+
+ # Remove while that belongs to a do {} while
+ if ($stat =~ /\bdo\b/) {
+ $s_next =~ s/^.*\bwhile\b\s*($balanced_parens)\s*?//;
+ }
+
+ # Remove blank lines
+ $s_next =~ s/\s*\\?\n//g;
+
+ # Get the real next lines
+ my $next_nof_lines = $line_nr_next_next - $line_nr_next;
+ my $stat_next_real = raw_line($line_nr_next, $next_nof_lines);
+ if (!defined($stat_next_real)) {
+ $stat_next_real = "";
+ } elsif ($next_nof_lines > 1) {
+ $stat_next_real = "[...]\n$stat_next_real";
+ }
+ my (undef, $nindent) = line_stats('+' . $s_next);
+
+ #print "stat_next<$stat_next> stat<$stat> indent<$indent> nindent<$nindent> s_next<$s_next> stat_next_real<$stat_next_real>\n";
+
+ if ($nindent > $indent) {
+ WARN("SUSPICIOUS_CODE_INDENT",
+ "suspicious code indentation after conditional statements\n" .
+ $herecurr . "$stat_real\n$stat_next_real\n");
+ }
+ }
}

# Track the 'values' across context and added lines.
--
2.29.2

2021-05-27 02:33:14

by Julius Werner

[permalink] [raw]
Subject: [PATCH v2 3/3] checkpatch: Ignore labels when checking indentation

Goto labels are commonly written in the leftmost column (sometimes with
one space in front), regardless of indentation level. Sometimes they're
on a line of their own, but sometimes the same line is shared with a
normal code statement that then starts at the expected indentation
level. When checking indentation, we should check where that normal
piece of code starts, not where the label starts (there's a separate
INDENTED_LABEL test to check the label itself). Therefore, the
line_stats() function that is used to get indentation level should treat
goto labels like whitespace. The SUSPICIOUS_CODE_INDENT test also needs
to explicitly ignore labels to make sure it doesn't get confused by
them.

Signed-off-by: Julius Werner <[email protected]>
---
scripts/checkpatch.pl | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 624a23c05f5388..6fd16111b52cc6 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1396,8 +1396,12 @@ sub copy_spacing {
sub line_stats {
my ($line) = @_;

- # Drop the diff line leader and expand tabs
+ # Drop the diff line leader
$line =~ s/^.//;
+
+ # Treat labels like whitespace when counting indentation
+ $line =~ s/^( ?$Ident:)/" " x length($1)/e;
+
$line = expand_tabs($line);

# Pick the indent from the front of the line.
@@ -4195,6 +4199,9 @@ sub process {
# Remove any comments
$s_next =~ s/$;//g;

+ # Remove any leading labels
+ $s_next =~ s/\n( ?$Ident:)/"\n" . " " x length($1)/eg;
+
# Skip this check for in case next statement starts with 'else'
if ($s_next !~ /\belse\b/) {

--
2.29.2

2021-05-27 05:12:36

by Julius Werner

[permalink] [raw]
Subject: [PATCH v2 1/3] checkpatch: Fix preprocessor guard handling in context tracker functions

The preprocessor guard tracking in ctx_statement_block() is (and seems
to have always been) subtly broken whenever tracking over an #else: the
code is supposed to restore state from the current top of the stack
(like and #endif just without removing it). However, it indexes the
stack at [$#stack - 1]. In Perl, $# does not give you the length of an
array, it gives you the highest valid index. Therefore, the correct
index should just be [$#stack].

The preprocessor guard tracking also gets confused when
ctx_statement_block() was called on a line that is already inside a
preprocessor guard, and steps out of it within the same statement. This
happens commonly with constructs like this:

#if CONFIG_XXX
for (a = first_a(); !a_finished(); a = next_a()) {
#else
for (b = first_b(); !b_finished(); b = next_b()) {
#endif
... loop body ...

The best course of action in this case is to not try to restore any
previous state (which we don't have) at all, so we should just keep our
current state if $#stack is already 0.

Also fix an analogous problem in ctx_block_get().

Signed-off-by: Julius Werner <[email protected]>
---
scripts/checkpatch.pl | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index df8b23dc1eb0af..4aab2450ad629e 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1530,15 +1530,13 @@ sub ctx_statement_block {

my $type = '';
my $level = 0;
- my @stack = ();
+ my @stack = (['', $level]);
my $p;
my $c;
my $len = 0;

my $remainder;
while (1) {
- @stack = (['', 0]) if ($#stack == -1);
-
#warn "CSB: blk<$blk> remain<$remain>\n";
# If we are about to drop off the end, pull in more
# context.
@@ -1572,9 +1570,9 @@ sub ctx_statement_block {
# Handle nested #if/#else.
if ($remainder =~ /^#\s*(?:ifndef|ifdef|if)\s/) {
push(@stack, [ $type, $level ]);
- } elsif ($remainder =~ /^#\s*(?:else|elif)\b/) {
- ($type, $level) = @{$stack[$#stack - 1]};
- } elsif ($remainder =~ /^#\s*endif\b/) {
+ } elsif ($remainder =~ /^#\s*(?:else|elif)\b/ && $#stack > 0) {
+ ($type, $level) = @{$stack[$#stack]};
+ } elsif ($remainder =~ /^#\s*endif\b/ && $#stack > 0) {
($type, $level) = @{pop(@stack)};
}

@@ -1744,9 +1742,9 @@ sub ctx_block_get {
# Handle nested #if/#else.
if ($lines[$line] =~ /^.\s*#\s*(?:ifndef|ifdef|if)\s/) {
push(@stack, $level);
- } elsif ($lines[$line] =~ /^.\s*#\s*(?:else|elif)\b/) {
- $level = $stack[$#stack - 1];
- } elsif ($lines[$line] =~ /^.\s*#\s*endif\b/) {
+ } elsif ($lines[$line] =~ /^.\s*#\s*(?:else|elif)\b/ && $#stack > 0) {
+ $level = $stack[$#stack];
+ } elsif ($lines[$line] =~ /^.\s*#\s*endif\b/ && $#stack > 0) {
$level = pop(@stack);
}

--
2.29.2