2007-05-27 17:43:00

by Andy Whitcroft

[permalink] [raw]
Subject: [PATCH] add a trivial patch style checker


We are seeing increasing levels of minor patch style violations in
submissions to the mailing lists as well as making it into the tree.
These detract from the quality of the submission and cause unnessary
work for reviewers.

As a first step package up the current state of the patch style
checker and include it in the kernel tree. Add instructions
suggesting running it on submissions. This adds version v0.01 of
the patch-trivia-detector.

Signed-off-by: Andy Whitcroft <[email protected]>
---

As discussed here is the updated version of the
patchstylecheckemail.pl tool. This version incorporates
all of Randy Dunlap and my updates. I have also cleaned
it up significantly to aid supporting it, overall indent
and the like.

I am proposing we handle the support of this tool overall
outside kernel tree so we can maintain the regression test
suite and surrounding infrastructure for the proposed
automated style robot. Obviously with updates to the
checker itself flowing from there to the kernel tree.
To this end I have added a version number to the checker.

Also if either Joel or Randy want to be on on the MAINTAINERS
entry yell and we'll get it updated, wouldn't want to list
anyone without permission.

I have the infrastructure in place to for the automated robot
if and when its deemed ready.

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index a417b25..23637e8 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -118,7 +118,21 @@ then only post say 15 or so at a time and wait for review and integration.



-4) Select e-mail destination.
+4) Style check your changes.
+
+Check your patch for basic style violations, details of which can be
+found in Documentation/CodingStyle. Failure to do so simply wastes
+the reviewers time and will get your patch rejected, probabally
+without even being read.
+
+At a minimum you should check your patches with the patch style
+checker prior to submission (scripts/patch-trivia-detector.pl).
+You should be able to justify all violations that remain in your
+patch.
+
+
+
+5) Select e-mail destination.

Look through the MAINTAINERS file and the source code, and determine
if your change applies to a specific subsystem of the kernel, with
@@ -146,7 +160,7 @@ discussed should the patch then be submitted to Linus.



-5) Select your CC (e-mail carbon copy) list.
+6) Select your CC (e-mail carbon copy) list.

Unless you have a reason NOT to do so, CC [email protected].

@@ -187,8 +201,7 @@ URL: <http://www.kernel.org/pub/linux/kernel/people/bunk/trivial/>



-
-6) No MIME, no links, no compression, no attachments. Just plain text.
+7) No MIME, no links, no compression, no attachments. Just plain text.

Linus and other kernel developers need to be able to read and comment
on the changes you are submitting. It is important for a kernel
@@ -223,7 +236,7 @@ pref("mailnews.display.disable_format_flowed_support", true);



-7) E-mail size.
+8) E-mail size.

When sending patches to Linus, always follow step #6.

@@ -234,7 +247,7 @@ server, and provide instead a URL (link) pointing to your patch.



-8) Name your kernel version.
+9) Name your kernel version.

It is important to note, either in the subject line or in the patch
description, the kernel version to which this patch applies.
@@ -244,7 +257,7 @@ Linus will not apply it.



-9) Don't get discouraged. Re-submit.
+10) Don't get discouraged. Re-submit.

After you have submitted your change, be patient and wait. If Linus
likes your change and applies it, it will appear in the next version
@@ -270,7 +283,7 @@ When in doubt, solicit comments on linux-kernel mailing list.



-10) Include PATCH in the subject
+11) Include PATCH in the subject

Due to high e-mail traffic to Linus, and to linux-kernel, it is common
convention to prefix your subject line with [PATCH]. This lets Linus
@@ -279,7 +292,7 @@ e-mail discussions.



-11) Sign your work
+12) Sign your work

To improve tracking of who did what, especially with patches that can
percolate to their final resting place in the kernel through several
@@ -328,7 +341,8 @@ now, but you can do this to mark internal company procedures or just
point out some special detail about the sign-off.


-12) The canonical patch format
+
+13) The canonical patch format

The canonical patch subject line is:

@@ -427,6 +441,10 @@ section Linus Computer Science 101.
Nuff said. If your code deviates too much from this, it is likely
to be rejected without further review, and without comment.

+Check your patches with the patch style checker prior to submission
+(scripts/patch-trivia-detector.pl). You should be able to justify
+all violations that remain in your patch.
+


2) #ifdefs are ugly
diff --git a/MAINTAINERS b/MAINTAINERS
index d7533dd..b8a5611 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -30,8 +30,11 @@ trivial patch so apply some common sense.
job the maintainers (and especially Linus) do is to keep things
looking the same. Sometimes this means that the clever hack in
your driver to get around a problem actually needs to become a
- generalized kernel feature ready for next time. See
- Documentation/CodingStyle for guidance here.
+ generalized kernel feature ready for next time.
+
+ PLEASE check your patch with the automated style checker
+ (scripts/patch-trivia-detector.pl) to catch trival style
+ violations. See Documentation/CodingStyle for guidance here.

PLEASE try to include any credit lines you want added with the
patch. It avoids people being missed off by mistake and makes
@@ -2724,6 +2727,11 @@ L: [email protected]
L: [email protected]
S: Supported

+PATCH TRIVIA DETECTOR
+P: Andy Whitcroft
+M: [email protected]
+S: Supported
+
PC87360 HARDWARE MONITORING DRIVER
P: Jim Cromie
M: [email protected]
diff --git a/scripts/patch-trivia-detector.pl b/scripts/patch-trivia-detector.pl
new file mode 100755
index 0000000..e71c4ef
--- /dev/null
+++ b/scripts/patch-trivia-detector.pl
@@ -0,0 +1,553 @@
+#!/usr/bin/perl -w
+# (c) 2001, Dave Jones. <[email protected]> (the file handling bit)
+# (c) 2005, Joel Scohpp <[email protected]> (the ugly bit)
+# (c) 2007, Andy Whitcroft <[email protected]> (new tests etc)
+# Licensed under the terms of the GNU GPL License version 2
+
+use strict;
+
+my $P = $0;
+
+my $V = '0.01';
+
+use Getopt::Long qw(:config no_auto_abbrev);
+
+my $quiet = 0;
+my $tree = 1;
+my $chk_signoff = 1;
+GetOptions(
+ 'q|quiet' => \$quiet,
+ 'tree!' => \$tree,
+ 'signoff!' => \$chk_signoff,
+) or exit;
+
+my $exit = 0;
+
+if ($#ARGV < 0) {
+ print "usage: patchstylecheckemail.pl [options] patchfile\n";
+ print "version: $V\n";
+ print "options: -q => quiet\n";
+ print " --no-tree => run without a kernel tree\n";
+ exit(1);
+}
+
+if ($tree && !top_of_kernel_tree()) {
+ print "Must be run from the top-level dir. of a kernel tree\n";
+ exit(2);
+}
+
+my @lines = ();
+while (<>) {
+ chomp;
+ push(@lines, $_);
+ if (eof(ARGV)) {
+ if (!process($ARGV, @lines)) {
+ $exit = 1;
+ }
+ @lines = ();
+ }
+}
+
+exit($exit);
+
+sub top_of_kernel_tree {
+ if ((-f "COPYING") && (-f "CREDITS") && (-f "Kbuild") &&
+ (-f "MAINTAINERS") && (-f "Makefile") && (-f "README") &&
+ (-d "Documentation") && (-d "arch") && (-d "include") &&
+ (-d "drivers") && (-d "fs") && (-d "init") && (-d "ipc") &&
+ (-d "kernel") && (-d "lib") && (-d "scripts")) {
+ return 1;
+ }
+ return 0;
+}
+
+sub expand_tabs {
+ my ($str) = @_;
+
+ my $res = '';
+ my $n = 0;
+ for my $c (split(//, $str)) {
+ if ($c eq "\t") {
+ $res .= ' ';
+ $n++;
+ for (; ($n % 8) != 0; $n++) {
+ $res .= ' ';
+ }
+ next;
+ }
+ $res .= $c;
+ $n++;
+ }
+
+ return $res;
+}
+
+sub cat_vet {
+ my ($vet) = @_;
+
+ $vet =~ s/\t/^I/;
+ $vet =~ s/$/\$/;
+
+ return $vet;
+}
+
+sub process {
+ my $filename = shift;
+ my @lines = @_;
+
+ my $linenr=0;
+ my $prevline="";
+ my $stashline="";
+
+ my $lineforcounting='';
+ my $indent;
+ my $previndent=0;
+ my $stashindent=0;
+
+ my $clean = 1;
+ my $signoff = 0;
+
+ # Trace the real file/line as we go.
+ my $realfile = '';
+ my $realline = 0;
+ my $realcnt = 0;
+ my $here = '';
+ my $in_comment = 0;
+ my $first_line = 0;
+
+ foreach my $line (@lines) {
+ $linenr++;
+
+#extract the filename as it passes
+ if($line=~/^\+\+\+\s+(\S+)/) {
+ $realfile=$1;
+ $in_comment = 0;
+ next;
+ }
+#extract the line range in the file after the patch is applied
+ if($line=~/^@@ -\d+,\d+ \+(\d+)(,(\d+))? @@/) {
+ $first_line = 1;
+ $in_comment = 0;
+ $realline=$1-1;
+ if (defined $2) {
+ $realcnt=$3+1;
+ } else {
+ $realcnt=1+1;
+ }
+ next;
+ }
+#check the patch for a signoff:
+ if ($line =~ /^[[:space:]]*Signed-off-by:/i) {
+ $signoff++;
+ }
+#track the line number as we move through the hunk
+ if($line=~/^( |\+)/) {
+ $realline++;
+ $realcnt-- if ($realcnt);
+
+ # track any sort of multi-line comment. Obviously if
+ # the added text or context do not include the whole
+ # comment we will not see it. Such is life.
+ #
+ # Guestimate if this is a continuing comment. If this
+ # is the start of a diff block and this line starts
+ # ' *' then it is very likely a comment.
+ if ($first_line and $line =~ m@^.\s*\*@) {
+ $in_comment = 1;
+ }
+ if ($line =~ m@/\*@) {
+ $in_comment = 1;
+ }
+ if ($line =~ m@\*/@) {
+ $in_comment = 0;
+ }
+
+ $lineforcounting = $line;
+ $lineforcounting =~ s/^\+//;
+ $lineforcounting = expand_tabs($lineforcounting);
+
+ my ($white) = ($lineforcounting =~ /^(\s*)/);
+ $indent = length($white);
+
+ # Track the previous line.
+ ($prevline, $stashline) = ($stashline, $line);
+ ($previndent, $stashindent) = ($stashindent, $indent);
+ $first_line = 0;
+ }
+
+#make up the handle for any error we report on this line
+ $here = "PATCH: $ARGV:$linenr:";
+ $here .= "\nFILE: $realfile:$realline:" if ($realcnt);
+
+ my $herecurr = "$here\n$line\n\n";
+ my $hereprev = "$here\n$prevline\n$line\n\n";
+
+#ignore lines not being added
+ if($line=~/^[^\+]/) {next;}
+
+# check we are in a valid source file *.[hc] if not then ignore this hunk
+ next if ($realfile !~ /\.[hc]$/);
+
+#trailing whitespace
+ if($line=~/\S\s+$/) {
+ my $herevet = "$here\n" . cat_vet($line) . "\n\n";
+ print ("trailing whitespace\n");
+ print "$herevet";
+ $clean = 0;
+ }
+#80 column limit
+ if(!($prevline=~/\/\*\*/) && length($lineforcounting) > 80){
+ print "line over 80 characters\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+# at the beginning of a line any tabs must come first and anything
+# more than 8 must use tabs.
+ if($line=~/^\+\s* \t\s*\S/ or $line=~/^\+\s* \s*/) {
+ my $herevet = "$here\n" . cat_vet($line) . "\n\n";
+ print ("use tabs not spaces\n");
+ print "$herevet";
+ $clean = 0;
+ }
+
+ #
+ # The rest of our checks refer specifically to C style
+ # only apply those _outside_ comments.
+ #
+ next if ($in_comment);
+
+# no C99 // comments
+ if ($line =~ qr|//|) {
+ print "do not use C99 // comments\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+ # Remove comments from the line before processing.
+ $line =~ s@/\*.*\*/@@g;
+ $line =~ s@/\*.*@@;
+ $line =~ s@.*\*/@@;
+ $line =~ s@//.*@@;
+
+#EXPORT_SYMBOL should immediately follow its function closing }.
+ if (($line =~ /EXPORT_SYMBOL.*\(.*\)/) ||
+ ($line =~ /EXPORT_UNUSED_SYMBOL.*\(.*\)/)) {
+ if (($prevline !~ /^}/) &&
+ ($prevline !~ /^\+}/) &&
+ ($prevline !~ /^ }/)) {
+ print "EXPORT_SYMBOL(func); should immediately follow its function\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+ }
+
+ # check for static initialisers.
+ if ($line=~/\s*static\s.*=\s+(0|NULL);/) {
+ print "do not initialise statics to 0 or NULL\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+ # check for new typedefs.
+ if ($line=~/\s*typedef\s/) {
+ print "do not add new typedefs\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+# * goes on variable not on type
+ if($line=~/[A-Za-z\d_]+\* [A-Za-z\d_]+/) {
+ print ("\"foo* bar\" should be \"foo *bar\"\n");
+ print "$herecurr";
+ $clean = 0;
+ }
+
+# no BUG() or BUG_ON()
+ if ($line =~ /\b(BUG|BUG_ON)\b/) {
+ print "Try to use WARN_ON & Recovery code rather than BUG() or BUG_ON()\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+# printk should use KERN_* levels
+ if (($line =~ /\bprintk\b/) &&
+ ($line !~ /KERN_/)) {
+ print "printk() should include KERN_ facility level\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+#function brace can't be on same line, except for #defines of do while, or if closed on same line
+ if(($line=~/[A-Za-z\d_]+\**\s+\**[A-Za-z\d_]+\(.*\).* {/) and
+ !($line=~/\#define.*do\s{/) and !($line=~/}/)){
+ print "braces following function declarations go on the next line\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+ my $opline = $line;
+ $opline =~ s/^.//;
+ if (!($line=~/\#\s*include/)) {
+ # Check operator spacing.
+ my @elements = split(/(<<=|>>=|<=|>=|==|!=|\+=|-=|\*=|\/=|%=|\^=|\|=|&=|->|<<|>>|<|>|=|!|~|&&|\|\||,|\^|\+\+|--|;|&|\||\+|-|\*|\/\/|\/)/, $opline);
+ for (my $n = 0; $n < $#elements; $n += 2) {
+ # $wN says we have white-space before or after
+ # $sN says we have a separator before or after
+ # $oN says we have another operator before or after
+ my $w1 = $elements[$n] =~ /\s$/;
+ my $s1 = $elements[$n] =~ /(\[|\(|\s)$/;
+ my $o1 = $elements[$n] eq '';
+ my $op = $elements[$n + 1];
+ my $w2 = 1;
+ my $s2 = 1;
+ my $o2 = 0;
+ # If we have something after the operator handle it.
+ if (defined $elements[$n + 2]) {
+ $w2 = $elements[$n + 2] =~ /^\s/;
+ $s2 = $elements[$n + 2] =~ /^(\s|\)|\]|;)/;
+ $o2 = $elements[$n + 2] eq '';
+ }
+
+ # Generate the context.
+ my $at = "here: ";
+ for (my $m = $n; $m >= 0; $m--) {
+ if ($elements[$m] ne '') {
+ $at .= $elements[$m];
+ last;
+ }
+ }
+ $at .= $op;
+ for (my $m = $n + 2; defined $elements[$m]; $m++) {
+ if ($elements[$m] ne '') {
+ $at .= $elements[$m];
+ last;
+ }
+ }
+
+ ##print "<$s1:$op:$s2> <$elements[$n]:$elements[$n + 1]:$elements[$n + 2]>\n";
+ # Skip things apparently in quotes.
+ next if ($line=~/\".*\Q$op\E.*\"/ or $line=~/\'\Q$op\E\'/);
+
+ # We need ; as an operator. // is a comment.
+ if ($op eq ';' or $op eq '//') {
+
+ # -> should have no spaces
+ } elsif ($op eq '->') {
+ if ($s1 or $s2) {
+ print "no spaces around that '$op' $at\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+ # , must have a space on the right.
+ } elsif ($op eq ',') {
+ if (!$s2) {
+ print "need space after that '$op' $at\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+ # unary ! and unary ~ are allowed no space on the right
+ } elsif ($op eq '!' or $op eq '~') {
+ if (!$s1 && !$o1) {
+ print "need space before that '$op' $at\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+ if ($s2) {
+ print "no space after that '$op' $at\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+ # unary ++ and unary -- are allowed no space on one side.
+ } elsif ($op eq '++' or $op eq '--') {
+ if (($s1 && $s2) || ((!$s1 && !$o1) && (!$s2 && !$o2))) {
+ print "need space one side of that '$op' $at\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+ # & is both unary and binary
+ # unary:
+ # a &b
+ # binary (consistent spacing):
+ # a&b OK
+ # a & b OK
+ #
+ # boiling down to: if there is a space on the right then there
+ # should be one on the left.
+ #
+ # - is the same
+ #
+ # * is the same only adding:
+ # type:
+ # (foo *)
+ # (foo **)
+ #
+ } elsif ($op eq '&' or $op eq '-' or $op eq '*') {
+ if ($w2 and !$w1) {
+ print "need space before that '$op' $at\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+ # << and >> may either have or not have spaces both sides
+ } elsif ($op eq '<<' or $op eq '>>' or $op eq '+' or $op eq '/' or
+ $op eq '^' or $op eq '|')
+ {
+ if ($s1 != $s2) {
+ print "need consistent spacing around '$op' $at\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+ # All the others need spaces both sides.
+ } elsif (!$s1 or !$s2) {
+ print "need spaces around that '$op' $at\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+ }
+ }
+
+#need space before brace following if, while, etc
+ if($line=~/\(.*\){/) {
+ print ("need a space before the brace\n");
+ print "$herecurr";
+ $clean = 0;
+ }
+
+#gotos aren't indented
+ if($line=~/^\s*[A-Za-z\d_]+:/ and !($line=~/^\s*default:/)){
+ print "Gotos should not be indented\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+# Need a space before open parenthesis after if, while etc
+ if($line=~/(if|while|for|switch)\(/) {
+ print "need a space before the open parenthesis\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+# Check for illegal assignment in if conditional.
+ if($line=~/(if|while)\s*\(.*[^<>!=]=[^=].*\)/) {
+ print "do not use assignment in if condition\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+ # Check for }<nl>else {, these must be at the same
+ # indent level to be relevant to each other.
+ if($prevline=~/}\s*$/ and $line=~/^.\s*else\s*/ and
+ $previndent == $indent) {
+ print "else should follow close brace\n";
+ print "$hereprev";
+ $clean = 0;
+ }
+
+ # Check for switch () {<nl>case, these must be at the
+ # same indent. We will only catch the first one, as our
+ # context is very small but people tend to be consistent
+ # so we will catch them out more often than not.
+ if($prevline=~/\s*switch\s*\(.*\)/ and $line=~/\s*case\s+/
+ and $previndent != $indent) {
+ print "switch and case should be at the same indent\n";
+ print "$hereprev";
+ $clean = 0;
+ }
+
+#studly caps, commented out until figure out how to distinguish between use of existing and adding new
+# if(($line=~/[\w_][a-z\d]+[A-Z]/) and !($line=~/print/)) {
+# print ("No studly caps, use _\n");
+# print "$herecurr";
+# $clean = 0;
+# }
+
+#no spaces allowed after \ in define
+ if($line=~/\#define.*\\\s$/){
+ print("Whitepspace after \\ makes next lines useless\n");
+ print "$herecurr";
+ $clean = 0;
+ }
+
+#warn if <asm/foo.h> is #included and <linux/foo.h> is available.
+ if ($tree && $line =~ qr|\s*\#\s*include\s*\<asm\/(.*)\.h\>|) {
+ my $checkfile = "include/linux/$1.h";
+ if (-f $checkfile) {
+ print "Use #include <linux/$1.h> instead of <asm/$1.h>\n";
+ print $herecurr;
+ $clean = 0;
+ }
+ }
+
+#if/while/etc brace do not go on next line, unless #defining a do while loop, or if that brace on the next line is for something else
+ if ($prevline=~/(if|while|for|switch)\s*\(/) {
+ my @opened = $prevline=~/\(/g;
+ my @closed = $prevline=~/\)/g;
+ my $nr_line = $linenr;
+ my $remaining = $realcnt;
+ my $next_line = $line;
+ my $extra_lines = 0;
+ my $display_segment = $prevline;
+
+ while ($remaining > 0 && scalar @opened > scalar @closed) {
+ $prevline .= $next_line;
+ $display_segment .= "\n" . $next_line;
+ $next_line = $lines[$nr_line];
+ $nr_line++;
+ $remaining--;
+
+ @opened = $prevline=~/\(/g;
+ @closed = $prevline=~/\)/g;
+ }
+
+ if(($prevline=~/(if|while|for|switch)\s*\(.*\)\s*$/) and ($next_line=~/{/) and
+ !($next_line=~/(if|while|for)/) and !($next_line=~/\#define.*do.*while/)) {
+ print "That { should be on the previous line\n";
+ print "$display_segment\n$next_line\n\n";
+ $clean = 0;
+ }
+ }
+
+#multiline macros should be enclosed in a do while loop
+ if(($prevline=~/\#define.*\\/) and !($prevline=~/do\s+{/) and
+ !($prevline=~/\(\{/) and ($line=~/;\s*\\/) and
+ !($line=~/do.*{/) and !($line=~/\(\{/)) {
+ print "Macros with multiple statements should be enclosed in a do - while loop\n";
+ print "$hereprev";
+ $clean = 0;
+ }
+
+# don't include <linux/video_decoder.h>
+ if ($line =~ qr|\#\s*include\s*\<linux/video_decoder\.h\>|) {
+ print "Don't use <linux/video_decoder.h>: see Documentation/feature-removal-schedule.txt\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+
+# don't use kernel_thread()
+ if ($line =~ /\bkernel_thread\b/) {
+ print "Don't use kernel_thread(), use kthread(): see Documentation/feature-removal-schedule.txt\n";
+ print "$herecurr";
+ $clean = 0;
+ }
+ }
+
+ if ($chk_signoff && $signoff == 0) {
+ $clean = 0;
+ print "Missing Signed-off-by: line(s)\n";
+ }
+
+ if($clean == 1 && $quiet == 0){
+ print "Your patch has no obvious style problems and is ready for submission.\n"
+ }
+ if($clean == 0 && $quiet == 0){
+ print "Your patch has style problems, please review. If any of these errors\n";
+ print "are false positives report them to the maintainer, see\n";
+ print "PATCH TRIVIA DETECTOR in MAINTAINERS.\n";
+ }
+ return $clean;
+}


2007-05-27 17:08:18

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

On Sun, 27 May 2007 18:11:25 +0100 Andy Whitcroft wrote:

> Also if either Joel or Randy want to be on on the MAINTAINERS
> entry yell and we'll get it updated, wouldn't want to list
> anyone without permission.

Yes, please.

> I have the infrastructure in place to for the automated robot
> if and when its deemed ready.
>
> diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
> index a417b25..23637e8 100644
> --- a/Documentation/SubmittingPatches
> +++ b/Documentation/SubmittingPatches

It needs to be added to Documentation/SubmitChecklist also...
(but I'm working in yard/house this holiday-not weekend).

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2007-05-27 17:49:55

by Andreas Schwab

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

Andy Whitcroft <[email protected]> writes:

> @@ -223,7 +236,7 @@ pref("mailnews.display.disable_format_flowed_support", true);
>
>
>
> -7) E-mail size.
> +8) E-mail size.
>
> When sending patches to Linus, always follow step #6.

That's step #7 now.

Andreas.

--
Andreas Schwab, SuSE Labs, [email protected]
SuSE Linux Products GmbH, Maxfeldstra?e 5, 90409 N?rnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

2007-05-27 21:50:36

by Dave Jones

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

On Sun, May 27, 2007 at 06:11:25PM +0100, Andy Whitcroft wrote:
> +# (c) 2001, Dave Jones. <[email protected]> (the file handling bit)

dead email address.

Dave

--
http://www.codemonkey.org.uk

2007-05-28 00:19:04

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

On Sun, 27 May 2007 18:11:25 +0100 Andy Whitcroft <[email protected]> wrote:

> +#gotos aren't indented
> + if($line=~/^\s*[A-Za-z\d_]+:/ and !($line=~/^\s*default:/)){
> + print "Gotos should not be indented\n";
> + print "$herecurr";
> + $clean = 0;
> + }

This should be "labels". Plus a lot of people do indent the labels by
a single space to avoid confusing `diff -p', which seems a reasonable
thing to not complain about.

2007-05-28 09:12:47

by Sam Ravnborg

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

> diff --git a/scripts/patch-trivia-detector.pl b/scripts/patch-trivia-detector.pl
> new file mode 100755
> index 0000000..e71c4ef
> --- /dev/null
> +++ b/scripts/patch-trivia-detector.pl

Could we please name this checkpatch.pl to follow naming style in scripts/
The namespace check* and clean* and mk* is well established so it is preferred
that we stick to it.

The patch-trivia-detector was a nice working name but as it enters the kernel
it is better that it follows the surroundings.

Sam

2007-05-28 09:48:35

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker


On May 27 2007 18:11, Andy Whitcroft wrote:
>+
>+my $P = $0;
>+
>+my $V = '0.01';
>+
>+use Getopt::Long qw(:config no_auto_abbrev);
>[...]
>+sub top_of_kernel_tree {
>+ if ((-f "COPYING") && (-f "CREDITS") && (-f "Kbuild") &&
>+ (-f "MAINTAINERS") && (-f "Makefile") && (-f "README") &&
>+ (-d "Documentation") && (-d "arch") && (-d "include") &&
>+ (-d "drivers") && (-d "fs") && (-d "init") && (-d "ipc") &&
>+ (-d "kernel") && (-d "lib") && (-d "scripts")) {
>+ return 1;
>+ }
>+ return 0;
>+}
>[...]

The consistent style quite looses after this point, esp. space around
operators ;-)

>+sub process {
>+ my $filename = shift;
>+ my @lines = @_;
>+
>+ my $linenr=0;
>+ my $prevline="";
>+ my $stashline="";
>+
>+ my $lineforcounting='';
>+ my $indent;
>+ my $previndent=0;
>+ my $stashindent=0;
>+
>+ my $clean = 1;
>+ my $signoff = 0;
>+
>+ # Trace the real file/line as we go.
>+ my $realfile = '';
>+ my $realline = 0;
>+ my $realcnt = 0;
>+ my $here = '';
>+ my $in_comment = 0;
>+ my $first_line = 0;

Like in the kernel/C (as far as .bss is concerned), they usually do not
need initialization. (They will be different from 0 though, namely
undef, but doing ++$x where x is undefined will also DTRT.)

>+
>+ foreach my $line (@lines) {
>+ $linenr++;
>+
>+#extract the filename as it passes
>+ if($line=~/^\+\+\+\s+(\S+)/) {

/^\+{3}\s+(\S+)/ "there's always more than one way to do it in Perl" - meh

>+ $realfile=$1;
>+ $in_comment = 0;
>+ next;
>+ }
>+#extract the line range in the file after the patch is applied
>+ if($line=~/^@@ -\d+,\d+ \+(\d+)(,(\d+))? @@/) {

The @ should be escaped afaicr
/^\@\@\s+-\d+,\d+\s+\+(\d+)(,(\d+))? \@\@/

>+ $first_line = 1;
>+ $in_comment = 0;
>+ $realline=$1-1;
>+ if (defined $2) {
>+ $realcnt=$3+1;
>+ } else {
>+ $realcnt=1+1;
>+ }
>+ next;
>+ }
>+#check the patch for a signoff:
>+ if ($line =~ /^[[:space:]]*Signed-off-by:/i) {

[[:space:]] = \s
Perhaps require a \s after : and get rid of /i? ;-)

>+ $signoff++;
>+ }
>+#track the line number as we move through the hunk
>+ if($line=~/^( |\+)/) {

Or ... /^([ \+])/ (mostly depends on taste)

>+ $realline++;
>+ $realcnt-- if ($realcnt);

That's a bit ambiguous at first, since it usually means (I think)
if(defined($realcnt) || $realcnt != 0).
Using if ($realcnt == 0) could help.

>+
>+ # track any sort of multi-line comment. Obviously if
>+ # the added text or context do not include the whole
>+ # comment we will not see it. Such is life.
>+ #
>+ # Guestimate if this is a continuing comment. If this
>+ # is the start of a diff block and this line starts
>+ # ' *' then it is very likely a comment.
>+ if ($first_line and $line =~ m@^.\s*\*@) {
>+ $in_comment = 1;
>+ }

m@@ and and needed? Could be just
if ($first_line == 0 && $line =~ /^.\s*\*/)

>+ if ($line =~ m@/\*@) {
>+ $in_comment = 1;
>+ }
>+ if ($line =~ m@\*/@) {
>+ $in_comment = 0;
>+ }

Although not necessary here, I'd like to point out that I've had
more luck using paired characters as boundaries, i.e. m{} or s{}{},
which means less escaping for the inner regex in some cases.

>+# check we are in a valid source file *.[hc] if not then ignore this hunk
>+ next if ($realfile !~ /\.[hc]$/);

Oh I think trailing whitespace and 80 column limit should also be
applied to .S and .s files.

>+
>+#trailing whitespace
>+ if($line=~/\S\s+$/) {
>+ my $herevet = "$here\n" . cat_vet($line) . "\n\n";
>+ print ("trailing whitespace\n");
>+ print "$herevet";
>+ $clean = 0;
>+ }
>+#80 column limit
>+ if(!($prevline=~/\/\*\*/) && length($lineforcounting) > 80){

Actually, I think this should be "> 79" (after stripping a .diff's
control column), since the cursor may move to the 81th column when
editing an 80-col line - which is what we want to avoid, no?

>+ print "line over 80 characters\n";
>+ print "$herecurr";
>+ $clean = 0;
>+ }
>+
>+# at the beginning of a line any tabs must come first and anything
>+# more than 8 must use tabs.
>+ if($line=~/^\+\s* \t\s*\S/ or $line=~/^\+\s* \s*/) {

|| will do instead of or.
Somehow this does not catch "Only spaces" lines (/^ +\S/), because there's
no \t in them. (Talk about regex fun.)

I'd say split and simplify:
if ($line =~ /^\s* {8,}/) {
print "Make that big space a \\t\n";
}
if ($line =~ /^ +\t/) {
print "Inadvertent space at line start, or make it \\t\n";
}

>+ my $herevet = "$here\n" . cat_vet($line) . "\n\n";
>+ print ("use tabs not spaces\n");
>+ print "$herevet";
>+ $clean = 0;
>+ }
>+
>+ #
>+ # The rest of our checks refer specifically to C style
>+ # only apply those _outside_ comments.
>+ #
>+ next if ($in_comment);
>+
>+# no C99 // comments
>+ if ($line =~ qr|//|) {
>+ print "do not use C99 // comments\n";
>+ print "$herecurr";
>+ $clean = 0;
>+ }

(Now we're using qr{}, what about being consistent with m{}?)
It may break on rare occassions:

printk("Hello // World\n");

>+ # Remove comments from the line before processing.
>+ $line =~ s@/\*.*\*/@@g;
>+ $line =~ s@/\*.*@@;
>+ $line =~ s@.*\*/@@;
>+ $line =~ s@//.*@@;
>+
>+#EXPORT_SYMBOL should immediately follow its function closing }.
>+ if (($line =~ /EXPORT_SYMBOL.*\(.*\)/) ||
>+ ($line =~ /EXPORT_UNUSED_SYMBOL.*\(.*\)/)) {
>+ if (($prevline !~ /^}/) &&
>+ ($prevline !~ /^\+}/) &&
>+ ($prevline !~ /^ }/)) {
>+ print "EXPORT_SYMBOL(func); should immediately follow its function\n";
>+ print "$herecurr";
>+ $clean = 0;
>+ }
>+ }
>+
>+ # check for static initialisers.
>+ if ($line=~/\s*static\s.*=\s+(0|NULL);/) {
>+ print "do not initialise statics to 0 or NULL\n";
>+ print "$herecurr";
>+ $clean = 0;
>+ }
>+
>+ # check for new typedefs.
>+ if ($line=~/\s*typedef\s/) {
>+ print "do not add new typedefs\n";
>+ print "$herecurr";
>+ $clean = 0;
>+ }
>+
>+# * goes on variable not on type
>+ if($line=~/[A-Za-z\d_]+\* [A-Za-z\d_]+/) {
>+ print ("\"foo* bar\" should be \"foo *bar\"\n");

print does not need parentheses in most cases.

>+ print "$herecurr";
>+ $clean = 0;
>+ }
>+
>+# no BUG() or BUG_ON()
>+ if ($line =~ /\b(BUG|BUG_ON)\b/) {
>+ print "Try to use WARN_ON & Recovery code rather than BUG() or BUG_ON()\n";
>+ print "$herecurr";
>+ $clean = 0;
>+ }
>+
>+# printk should use KERN_* levels
>+ if (($line =~ /\bprintk\b/) &&
>+ ($line !~ /KERN_/)) {
>+ print "printk() should include KERN_ facility level\n";
>+ print "$herecurr";
>+ $clean = 0;
>+ }

if ($line =~ /\bprintk\((?!KERN_)/)



Attention span ran out. :)

Jan
--

2007-05-28 10:49:04

by Andy Whitcroft

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

Randy Dunlap wrote:
> On Sun, 27 May 2007 18:11:25 +0100 Andy Whitcroft wrote:
>
>> Also if either Joel or Randy want to be on on the MAINTAINERS
>> entry yell and we'll get it updated, wouldn't want to list
>> anyone without permission.
>
> Yes, please.
>
>> I have the infrastructure in place to for the automated robot
>> if and when its deemed ready.
>>
>> diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
>> index a417b25..23637e8 100644
>> --- a/Documentation/SubmittingPatches
>> +++ b/Documentation/SubmittingPatches
>
> It needs to be added to Documentation/SubmitChecklist also...
> (but I'm working in yard/house this holiday-not weekend).

So true.

I've sorted this one out, but there seems to be a number of little
changes trickling in so I will roll them all up and send up a V2 pretty
soon.

-apw

2007-05-28 12:10:27

by Andy Whitcroft

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

Andrew Morton wrote:
> On Sun, 27 May 2007 18:11:25 +0100 Andy Whitcroft <[email protected]> wrote:
>
>> +#gotos aren't indented
>> + if($line=~/^\s*[A-Za-z\d_]+:/ and !($line=~/^\s*default:/)){
>> + print "Gotos should not be indented\n";
>> + print "$herecurr";
>> + $clean = 0;
>> + }
>
> This should be "labels". Plus a lot of people do indent the labels by
> a single space to avoid confusing `diff -p', which seems a reasonable
> thing to not complain about.

Yep makes sense. Done.

-apw

2007-05-29 01:27:35

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

Andy Whitcroft <[email protected]> writes:
> +
> +# no BUG() or BUG_ON()
> + if ($line =~ /\b(BUG|BUG_ON)\b/) {
> + print "Try to use WARN_ON & Recovery code rather than BUG() or BUG_ON()\n";

Just outlawing BUG_ON doesn't seem like a good idea to me. We'll just end
up with lots of untested and likely buggy recovery code or no asserts. Both
would be bad.

> +#need space before brace following if, while, etc
> + if($line=~/\(.*\){/) {
> + print ("need a space before the brace\n");
> + print "$herecurr";
> + $clean = 0;
> + }
> +
> +#gotos aren't indented

You mean goto labels? Surely goto statements are to be indented.
Confusing message

> + if($line=~/^\s*[A-Za-z\d_]+:/ and !($line=~/^\s*default:/)){
> + print "Gotos should not be indented\n";
> + print "$herecurr";
> + $clean = 0;
> + }

emacs generates one space label in front of a goto label. I wouldn't
outlaw this.

> +# don't include <linux/video_decoder.h>

It would be probably better to define some syntax that makes it possible
to auto extract those from feature-removal-schedule.txt. Otherwise
long term this will become messy.

Possible further checks that might make sense:
- panic() anywhere in drivers/*
- externs in .c files without asmlinkage
- general checking that everything in a fully visible {} block is the right
indentation

-Andi

2007-05-29 01:52:08

by Qi Yong

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

On 28/05/07, Andy Whitcroft <[email protected]> wrote:

...

> diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
> index a417b25..23637e8 100644
> --- a/Documentation/SubmittingPatches
> +++ b/Documentation/SubmittingPatches
> @@ -118,7 +118,21 @@ then only post say 15 or so at a time and wait for review and integration.
>

...

> -8) Name your kernel version.
> +9) Name your kernel version.
>
> It is important to note, either in the subject line or in the patch
> description, the kernel version to which this patch applies.

If omitted, we could assume it default to the latest linus git tree.

> -10) Include PATCH in the subject
> +11) Include PATCH in the subject
>
> Due to high e-mail traffic to Linus, and to linux-kernel, it is common
> convention to prefix your subject line with [PATCH]. This lets Linus

or use "patch", to be easier to read.

--
Qi Yong

2007-05-29 09:01:37

by Andy Whitcroft

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

Jan Engelhardt wrote:
> On May 27 2007 18:11, Andy Whitcroft wrote:
>> +
>> +my $P = $0;
>> +
>> +my $V = '0.01';
>> +
>> +use Getopt::Long qw(:config no_auto_abbrev);
>> [...]
>> +sub top_of_kernel_tree {
>> + if ((-f "COPYING") && (-f "CREDITS") && (-f "Kbuild") &&
>> + (-f "MAINTAINERS") && (-f "Makefile") && (-f "README") &&
>> + (-d "Documentation") && (-d "arch") && (-d "include") &&
>> + (-d "drivers") && (-d "fs") && (-d "init") && (-d "ipc") &&
>> + (-d "kernel") && (-d "lib") && (-d "scripts")) {
>> + return 1;
>> + }
>> + return 0;
>> +}
>> [...]
>
> The consistent style quite looses after this point, esp. space around
> operators ;-)

Yep thats on my list for cleanup ...

>> +sub process {
>> + my $filename = shift;
>> + my @lines = @_;
>> +
>> + my $linenr=0;
>> + my $prevline="";
>> + my $stashline="";
>> +
>> + my $lineforcounting='';
>> + my $indent;
>> + my $previndent=0;
>> + my $stashindent=0;
>> +
>> + my $clean = 1;
>> + my $signoff = 0;
>> +
>> + # Trace the real file/line as we go.
>> + my $realfile = '';
>> + my $realline = 0;
>> + my $realcnt = 0;
>> + my $here = '';
>> + my $in_comment = 0;
>> + my $first_line = 0;
>
> Like in the kernel/C (as far as .bss is concerned), they usually do not
> need initialization. (They will be different from 0 though, namely
> undef, but doing ++$x where x is undefined will also DTRT.)

Well I tend to prefer initialisers as they are the only up front hint as
to the notional type of the thing.

>> +
>> + foreach my $line (@lines) {
>> + $linenr++;
>> +
>> +#extract the filename as it passes
>> + if($line=~/^\+\+\+\s+(\S+)/) {
>
> /^\+{3}\s+(\S+)/ "there's always more than one way to do it in Perl" - meh
>
>> + $realfile=$1;
>> + $in_comment = 0;
>> + next;
>> + }
>> +#extract the line range in the file after the patch is applied
>> + if($line=~/^@@ -\d+,\d+ \+(\d+)(,(\d+))? @@/) {
>
> The @ should be escaped afaicr
> /^\@\@\s+-\d+,\d+\s+\+(\d+)(,(\d+))? \@\@/

The @'s need quoting yes. The spaces are officially spaces though.

>> + $first_line = 1;
>> + $in_comment = 0;
>> + $realline=$1-1;
>> + if (defined $2) {
>> + $realcnt=$3+1;
>> + } else {
>> + $realcnt=1+1;
>> + }
>> + next;
>> + }
>> +#check the patch for a signoff:
>> + if ($line =~ /^[[:space:]]*Signed-off-by:/i) {
>
> [[:space:]] = \s
> Perhaps require a \s after : and get rid of /i? ;-)

Yeah, split this into a couple of checks implementing the recommended
capitalisation and checking for a space after the : etc.

>> + $signoff++;
>> + }
>> +#track the line number as we move through the hunk
>> + if($line=~/^( |\+)/) {
>
> Or ... /^([ \+])/ (mostly depends on taste)

We generally use [] so switched to that.

>> + $realline++;
>> + $realcnt-- if ($realcnt);
>
> That's a bit ambiguous at first, since it usually means (I think)
> if(defined($realcnt) || $realcnt != 0).
> Using if ($realcnt == 0) could help.

if ($realcnt != 0); but yes

>> +
>> + # track any sort of multi-line comment. Obviously if
>> + # the added text or context do not include the whole
>> + # comment we will not see it. Such is life.
>> + #
>> + # Guestimate if this is a continuing comment. If this
>> + # is the start of a diff block and this line starts
>> + # ' *' then it is very likely a comment.
>> + if ($first_line and $line =~ m@^.\s*\*@) {
>> + $in_comment = 1;
>> + }
>
> m@@ and and needed? Could be just
> if ($first_line == 0 && $line =~ /^.\s*\*/)
>

This one is being consistent with the ones around it.

>> + if ($line =~ m@/\*@) {
>> + $in_comment = 1;
>> + }
>> + if ($line =~ m@\*/@) {
>> + $in_comment = 0;
>> + }
>
> Although not necessary here, I'd like to point out that I've had
> more luck using paired characters as boundaries, i.e. m{} or s{}{},
> which means less escaping for the inner regex in some cases.

Some consistancy would help for sure. Will try and move to some common
form.

>> +# check we are in a valid source file *.[hc] if not then ignore this hunk
>> + next if ($realfile !~ /\.[hc]$/);
>
> Oh I think trailing whitespace and 80 column limit should also be
> applied to .S and .s files.

Seems sane.

>> +
>> +#trailing whitespace
>> + if($line=~/\S\s+$/) {
>> + my $herevet = "$here\n" . cat_vet($line) . "\n\n";
>> + print ("trailing whitespace\n");
>> + print "$herevet";
>> + $clean = 0;
>> + }
>> +#80 column limit
>> + if(!($prevline=~/\/\*\*/) && length($lineforcounting) > 80){
>
> Actually, I think this should be "> 79" (after stripping a .diff's
> control column), since the cursor may move to the 81th column when
> editing an 80-col line - which is what we want to avoid, no?

80 tends to work for me because of that "if on 80 then don't wrap until
there is another character" behaviour of most terminals. Anyone else
with a firm opinion.

>> + print "line over 80 characters\n";
>> + print "$herecurr";
>> + $clean = 0;
>> + }
>> +
>> +# at the beginning of a line any tabs must come first and anything
>> +# more than 8 must use tabs.
>> + if($line=~/^\+\s* \t\s*\S/ or $line=~/^\+\s* \s*/) {
>
> || will do instead of or.
> Somehow this does not catch "Only spaces" lines (/^ +\S/), because there's
> no \t in them. (Talk about regex fun.)

Well this is designed to pick up spaces before tabs and any sets of
spaces which are long enough to be a tab. The only thing it would not
pick up as bad is things indented less than 8 which would be possibly
frowned upon but would be correct with spaces.

> I'd say split and simplify:
> if ($line =~ /^\s* {8,}/) {
> print "Make that big space a \\t\n";
> }
> if ($line =~ /^ +\t/) {
> print "Inadvertent space at line start, or make it \\t\n";
> }
>
>> + my $herevet = "$here\n" . cat_vet($line) . "\n\n";
>> + print ("use tabs not spaces\n");
>> + print "$herevet";
>> + $clean = 0;
>> + }
>> +
>> + #
>> + # The rest of our checks refer specifically to C style
>> + # only apply those _outside_ comments.
>> + #
>> + next if ($in_comment);
>> +
>> +# no C99 // comments
>> + if ($line =~ qr|//|) {
>> + print "do not use C99 // comments\n";
>> + print "$herecurr";
>> + $clean = 0;
>> + }
>
> (Now we're using qr{}, what about being consistent with m{}?)
> It may break on rare occassions:
>
> printk("Hello // World\n");

Yep, tighened this up to catch this one.

>> + # Remove comments from the line before processing.
>> + $line =~ s@/\*.*\*/@@g;
>> + $line =~ s@/\*.*@@;
>> + $line =~ s@.*\*/@@;
>> + $line =~ s@//.*@@;
>> +
>> +#EXPORT_SYMBOL should immediately follow its function closing }.
>> + if (($line =~ /EXPORT_SYMBOL.*\(.*\)/) ||
>> + ($line =~ /EXPORT_UNUSED_SYMBOL.*\(.*\)/)) {
>> + if (($prevline !~ /^}/) &&
>> + ($prevline !~ /^\+}/) &&
>> + ($prevline !~ /^ }/)) {
>> + print "EXPORT_SYMBOL(func); should immediately follow its function\n";
>> + print "$herecurr";
>> + $clean = 0;
>> + }
>> + }
>> +
>> + # check for static initialisers.
>> + if ($line=~/\s*static\s.*=\s+(0|NULL);/) {
>> + print "do not initialise statics to 0 or NULL\n";
>> + print "$herecurr";
>> + $clean = 0;
>> + }
>> +
>> + # check for new typedefs.
>> + if ($line=~/\s*typedef\s/) {
>> + print "do not add new typedefs\n";
>> + print "$herecurr";
>> + $clean = 0;
>> + }
>> +
>> +# * goes on variable not on type
>> + if($line=~/[A-Za-z\d_]+\* [A-Za-z\d_]+/) {
>> + print ("\"foo* bar\" should be \"foo *bar\"\n");
>
> print does not need parentheses in most cases.

Indeed, removed.

>> + print "$herecurr";
>> + $clean = 0;
>> + }
>> +
>> +# no BUG() or BUG_ON()
>> + if ($line =~ /\b(BUG|BUG_ON)\b/) {
>> + print "Try to use WARN_ON & Recovery code rather than BUG() or BUG_ON()\n";
>> + print "$herecurr";
>> + $clean = 0;
>> + }
>> +
>> +# printk should use KERN_* levels
>> + if (($line =~ /\bprintk\b/) &&
>> + ($line !~ /KERN_/)) {
>> + print "printk() should include KERN_ facility level\n";
>> + print "$herecurr";
>> + $clean = 0;
>> + }
>
> if ($line =~ /\bprintk\((?!KERN_)/)
>

Yep, nicer, changed.

> Attention span ran out. :)

Thanks for bothering :) Any review is appreciated.

-apw

2007-05-29 09:05:38

by Andy Whitcroft

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

Andi Kleen wrote:
> Andy Whitcroft <[email protected]> writes:
>> +
>> +# no BUG() or BUG_ON()
>> + if ($line =~ /\b(BUG|BUG_ON)\b/) {
>> + print "Try to use WARN_ON & Recovery code rather than BUG() or BUG_ON()\n";
>
> Just outlawing BUG_ON doesn't seem like a good idea to me. We'll just end
> up with lots of untested and likely buggy recovery code or no asserts. Both
> would be bad.

Thats not an unreasonable position. And I tend to agree with it.
Either we try and have two levels Warnings and Errors, or we just drop
this one for now.

Anyone got an oppinion, so we can get a consensus.

>> +#need space before brace following if, while, etc
>> + if($line=~/\(.*\){/) {
>> + print ("need a space before the brace\n");
>> + print "$herecurr";
>> + $clean = 0;
>> + }
>> +
>> +#gotos aren't indented
>
> You mean goto labels? Surely goto statements are to be indented.
> Confusing message

Yes, changed to labels

>> + if($line=~/^\s*[A-Za-z\d_]+:/ and !($line=~/^\s*default:/)){
>> + print "Gotos should not be indented\n";
>> + print "$herecurr";
>> + $clean = 0;
>> + }
>
> emacs generates one space label in front of a goto label. I wouldn't
> outlaw this.

Yep, we also now allow one space something to do with diff -p not
getting confused...

>> +# don't include <linux/video_decoder.h>
>
> It would be probably better to define some syntax that makes it possible
> to auto extract those from feature-removal-schedule.txt. Otherwise
> long term this will become messy.

Yeah, that is a very sensible idea.

> Possible further checks that might make sense:
> - panic() anywhere in drivers/*
> - externs in .c files without asmlinkage
> - general checking that everything in a fully visible {} block is the right
> indentation
>
> -Andi

Thanks.

-apw

2007-05-29 11:53:59

by Heiko Carstens

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

On Tue, May 29, 2007 at 04:23:45AM +0200, Andi Kleen wrote:
> Possible further checks that might make sense:
> - panic() anywhere in drivers/*

A driver should be allowed to panic. E.g. if it detects that due to a
firmware or driver bug memory corruption happened. IMHO the best thing
to do then is panic.

2007-05-29 13:19:49

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

On Tue, May 29, 2007 at 01:53:24PM +0200, Heiko Carstens wrote:
> On Tue, May 29, 2007 at 04:23:45AM +0200, Andi Kleen wrote:
> > Possible further checks that might make sense:
> > - panic() anywhere in drivers/*
>
> A driver should be allowed to panic. E.g. if it detects that due to a
> firmware or driver bug memory corruption happened. IMHO the best thing
> to do then is panic.

That is not how Linux normally operates. A BUG() doesn't panic() by
default either.

And on systems with IOMMU that is exactly the wrong thing to do.

Besides the problem is that bad drivers tend to badly abuse it
(e.g. see some particular BSD derviced SCSI drivers). We definitely
don't want any more of such code.

-Andi

2007-05-29 14:22:50

by Heiko Carstens

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

> > > Possible further checks that might make sense:
> > > - panic() anywhere in drivers/*
> >
> > A driver should be allowed to panic. E.g. if it detects that due to a
> > firmware or driver bug memory corruption happened. IMHO the best thing
> > to do then is panic.
>
> That is not how Linux normally operates. A BUG() doesn't panic() by
> default either.
>
> And on systems with IOMMU that is exactly the wrong thing to do.
>
> Besides the problem is that bad drivers tend to badly abuse it
> (e.g. see some particular BSD derviced SCSI drivers). We definitely
> don't want any more of such code.

So you prefer random data corruption over an emergency stop?
That doesn't make much sense to me...

2007-05-29 15:00:28

by Joel Schopp

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

Randy Dunlap wrote:
> On Sun, 27 May 2007 18:11:25 +0100 Andy Whitcroft wrote:
>
>> Also if either Joel or Randy want to be on on the MAINTAINERS
>> entry yell and we'll get it updated, wouldn't want to list
>> anyone without permission.
>
> Yes, please.

Yes. Add me as well.

2007-05-29 16:13:10

by Joel Schopp

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

>>> + if(!($prevline=~/\/\*\*/) && length($lineforcounting) > 80){
>> Actually, I think this should be "> 79" (after stripping a .diff's
>> control column), since the cursor may move to the 81th column when
>> editing an 80-col line - which is what we want to avoid, no?
>
> 80 tends to work for me because of that "if on 80 then don't wrap until
> there is another character" behaviour of most terminals. Anyone else
> with a firm opinion.

I think 80 is good. What the specific number is does not matter much, we all have
screens wider than 80 characters. The point is just to have a number that prevents
really long lines and prevents people from indenting too many levels past our minds
ability to keep up. We've already all been coding to 80, and it happens to be a nice
round number we can all remember and love. The only reason I see to select 79 is
that prime numbers are generally cooler than other numbers.

-Joel

2007-05-29 16:21:23

by Julio M. Merino Vidal

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

On 29/05/2007, at 18:12, Joel Schopp wrote:

>>>> + if(!($prevline=~/\/\*\*/) && length($lineforcounting) > 80){
>>> Actually, I think this should be "> 79" (after stripping a .diff's
>>> control column), since the cursor may move to the 81th column when
>>> editing an 80-col line - which is what we want to avoid, no?
>> 80 tends to work for me because of that "if on 80 then don't wrap
>> until
>> there is another character" behaviour of most terminals. Anyone else
>> with a firm opinion.
>
> I think 80 is good. What the specific number is does not matter
> much, we all have screens wider than 80 characters. The point is
> just to have a number that prevents really long lines and prevents
> people from indenting too many levels past our minds ability to
> keep up. We've already all been coding to 80, and it happens to be
> a nice round number we can all remember and love. The only reason
> I see to select 79 is that prime numbers are generally cooler than
> other numbers.

We do have screens wider than 80 characters, but almost all the time
I spend in terminal windows, they are set to 24x80 (the default
size). It is a matter of habit, and I bet I'm not alone. Hence, 80
is "annoying" not only because patches will wrap, but also because in
some editors the 80th character will also wrap.

Just my 2 cents,

--
Julio M. Merino Vidal <[email protected]>


2007-05-29 16:46:56

by Heiko Carstens

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

On Tue, May 29, 2007 at 04:58:18PM +0200, Andi Kleen wrote:
> > So you prefer random data corruption over an emergency stop?
>
> With an oops you can at least recover the system and actually
> look at the problem. On a machine with a panic you're just dead
> and the probability of actually being able to do something about the problem
> is much lower. On x86 systems you typically don't even get
> any message out.

Ok, that's the different approach of analyzing problems. On s390 we
prefer dumps of a crashed system, since that is much easier to
debug than a kernel that just printed some lines and then went on
as if nothing happened. Besides the nice side effects that a BUG()
statement has of course.

> And I'm not convinced drivers are in a good position to decide
> if memory was likely corrupted or not anyways. At least the
> panics I see in driver sources seem to be just random logic
> bugs from someone not familiar with BUG().
>
> Also they typically don't make much attempt to figure out
> if there might have been data corruption.

I'm talking of a specific problem where we just added a panic to the
zfcp device driver. If that panic ever triggers we know for sure that
memory corruption happened.
So I'm just asking to not say in general that panic() in a device driver
is a bad thing.

> If you're really worried about memory corruption in drivers
> you should just use an IOMMU.

IOMMU and s390 don't fit together.

> > That doesn't make much sense to me...
> So you're always setting panic_on_oops?

On our internal tests, yes. Otherwise it's of course up to the distros.
Btw. the default implementation for BUG() is panic(). See asm-generic/bug.h.

2007-05-29 18:56:43

by Andy Whitcroft

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

Andi Kleen wrote:

>> +# don't include <linux/video_decoder.h>
>
> It would be probably better to define some syntax that makes it possible
> to auto extract those from feature-removal-schedule.txt. Otherwise
> long term this will become messy.

Turns out that we already have Files: entries in f-r-s.txt and so I've
enhanced things to pick them out if they start include/ and check them.
Works well for the two that exists as of today.

-apw

2007-05-29 20:03:26

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker


On May 29 2007 10:01, Andy Whitcroft wrote:
>>
>> The consistent style quite looses after this point, esp. space around
>> operators ;-)
>
>Yep thats on my list for cleanup ...

Hey, you might even try running patch-trivia-detector.pl through itself. :)


>> Attention span ran out. :)
>
>Thanks for bothering. Any review is appreciated.

Followup comes..


Jan
--

2007-05-29 20:27:24

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker


On May 29 2007 10:05, Andy Whitcroft wrote:
>
>>> + if($line=~/^\s*[A-Za-z\d_]+:/ and !($line=~/^\s*default:/)){
>>> + print "Gotos should not be indented\n";
>>> + print "$herecurr";
>>> + $clean = 0;
>>> + }
>>
>> emacs generates one space label in front of a goto label. I wouldn't
>> outlaw this.
>
>Yep, we also now allow one space something to do with diff -p not
>getting confused...

Not only diff and emacs. At least joe (joe.sf.net) is another editor
whose %y tag or so also takes lines with

no space at the front and properly paired parens,braces,etc.

as key lines for %y. Since usually, functions are the only thing at
the beginning of a line, this works well. That's why labels should
be allowed to be indented by one.


Jan
--

2007-05-29 22:26:32

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

Heiko Carstens <[email protected]> writes:
>
> I'm talking of a specific problem where we just added a panic to the
> zfcp device driver. If that panic ever triggers we know for sure that
> memory corruption happened.
> So I'm just asking to not say in general that panic() in a device driver
> is a bad thing.

Ok there might be exceptions, but in general I still think it's true.
It probably fits the "you must be able to justify any warnings" rule.
I would still like this check to be added.

-Andi

2007-05-29 22:34:37

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

On Tue, 29 May 2007 22:22:43 +0200 (MEST) Jan Engelhardt wrote:

>
> On May 29 2007 10:05, Andy Whitcroft wrote:
> >
> >>> + if($line=~/^\s*[A-Za-z\d_]+:/ and !($line=~/^\s*default:/)){
> >>> + print "Gotos should not be indented\n";
> >>> + print "$herecurr";
> >>> + $clean = 0;
> >>> + }
> >>
> >> emacs generates one space label in front of a goto label. I wouldn't
> >> outlaw this.
> >
> >Yep, we also now allow one space something to do with diff -p not
> >getting confused...
>
> Not only diff and emacs. At least joe (joe.sf.net) is another editor
> whose %y tag or so also takes lines with
>
> no space at the front and properly paired parens,braces,etc.
>
> as key lines for %y. Since usually, functions are the only thing at
> the beginning of a line, this works well. That's why labels should
> be allowed to be indented by one.

That's just a joe bug then.

diff no longer requires a beginning space, at least in the testing
that I did it worked with no leading space on a label:.


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2007-05-30 08:41:11

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker


On May 29 2007 15:36, Randy Dunlap wrote:
>> >>> + if($line=~/^\s*[A-Za-z\d_]+:/ and !($line=~/^\s*default:/)){
>> >>> + print "Gotos should not be indented\n";
>> >>> + print "$herecurr";
>> >>> + $clean = 0;
>> >>> + }
>> >>
>> >> emacs generates one space label in front of a goto label. I wouldn't
>> >> outlaw this.
>> >
>> >Yep, we also now allow one space something to do with diff -p not
>> >getting confused...
>>
>> Not only diff and emacs. At least joe (joe.sf.net) is another editor
[...]
>
>That's just a joe bug then.
>
>diff no longer requires a beginning space, at least in the testing
>that I did it worked with no leading space on a label:.

But not everyone runs the latest and greatest. diffutils 2.8.7
takes /^label:/ as a marker for the @@ line.


Jan
--

2007-05-30 09:39:32

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

> So you prefer random data corruption over an emergency stop?

With an oops you can at least recover the system and actually
look at the problem. On a machine with a panic you're just dead
and the probability of actually being able to do something about the problem
is much lower. On x86 systems you typically don't even get
any message out.

And I'm not convinced drivers are in a good position to decide
if memory was likely corrupted or not anyways. At least the
panics I see in driver sources seem to be just random logic
bugs from someone not familiar with BUG().

Also they typically don't make much attempt to figure out
if there might have been data corruption.

If you're really worried about memory corruption in drivers
you should just use an IOMMU.

> That doesn't make much sense to me...

So you're always setting panic_on_oops?

-Andi

2007-05-30 15:32:56

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker

Jan Engelhardt wrote:
> On May 29 2007 15:36, Randy Dunlap wrote:
>>>>>> + if($line=~/^\s*[A-Za-z\d_]+:/ and !($line=~/^\s*default:/)){
>>>>>> + print "Gotos should not be indented\n";
>>>>>> + print "$herecurr";
>>>>>> + $clean = 0;
>>>>>> + }
>>>>> emacs generates one space label in front of a goto label. I wouldn't
>>>>> outlaw this.
>>>> Yep, we also now allow one space something to do with diff -p not
>>>> getting confused...
>>> Not only diff and emacs. At least joe (joe.sf.net) is another editor
> [...]
>> That's just a joe bug then.
>>
>> diff no longer requires a beginning space, at least in the testing
>> that I did it worked with no leading space on a label:.
>
> But not everyone runs the latest and greatest. diffutils 2.8.7
> takes /^label:/ as a marker for the @@ line.

OK. What does "marker for the @@ line" mean?

--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2007-05-30 16:11:24

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker


On May 30 2007 08:33, Randy Dunlap wrote:
>> > That's just a joe bug then.
>> >
>> > diff no longer requires a beginning space, at least in the testing
>> > that I did it worked with no leading space on a label:.
>>
>> But not everyone runs the latest and greatest. diffutils 2.8.7
>> takes /^label:/ as a marker for the @@ line.
>
> OK. What does "marker for the @@ line" mean?

if ($_ !~ /^[\(\[\{]/ && $_ =~ /^(\S+)/) {
print "@@ -a,b +c,d @@ $1\n"
}

So
label:
will be the 'mark[er]' for the @@ line

Just try yourself:
<<< foo.c <<<
int main(void)
{
x
y
z
label:
a
b
c
d
e
f
}
>>>
<<< foo2.c <<<
int main(void)
{
x
y
z
label:
a
b
c
d
d+
e
f
}
>>>
<<< diff -dpru foo.c foo2.c <<<
--- foo.c 2007-05-30 10:33:45.714222656 +0200
+++ foo2.c 2007-05-30 18:03:43.061471495 +0200
@@ -8,6 +8,7 @@ label:
b
c
d
+ d+
e
f
}
>>>

Jan
--

2007-05-31 12:08:08

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker II

> Yeah, that is a very sensible idea.
>
> > Possible further checks that might make sense:
> > - panic() anywhere in drivers/*
> > - externs in .c files without asmlinkage
> > - general checking that everything in a fully visible {} block is the right
> > indentation
> >

Here are some more warnings I would like to see:

- Warning for any spinlock/mutex definition that doesn't have a comment
nearby (all locks ought to be documented)
- Keep an ifdef count and give minus points for too many
- Warn about any architecture ifdefs (__i386__ etc.)
While not 100% illegal this is definitely something that needs to be
justified

-Andi

2007-05-31 19:30:18

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker


>+sub top_of_kernel_tree {
>+ if ((-f "COPYING") && (-f "CREDITS") && (-f "Kbuild") &&
>+ (-f "MAINTAINERS") && (-f "Makefile") && (-f "README") &&
>+ (-d "Documentation") && (-d "arch") && (-d "include") &&
>+ (-d "drivers") && (-d "fs") && (-d "init") && (-d "ipc") &&
>+ (-d "kernel") && (-d "lib") && (-d "scripts")) {
>+ return 1;
>+ }
>+ return 0;
>+}

Looks like some redundant parentheses.

>+#gotos aren't indented
>+ if($line=~/^\s*[A-Za-z\d_]+:/ and !($line=~/^\s*default:/)){
>+ print "Gotos should not be indented\n";
>+ print "$herecurr";
>+ $clean = 0;
>+ }

I think this was discussed already - some indent (1 space?) should be allowed.

>+#studly caps, commented out until figure out how to distinguish between use
>+#of existing and adding new

Yeah that's a big problem, given that linux's mm/ directory has
quite a lot of camel case function names.

>+# if(($line=~/[\w_][a-z\d]+[A-Z]/) and !($line=~/print/)) {
>+# print ("No studly caps, use _\n");
>+# print "$herecurr";
>+# $clean = 0;
>+# }
>+
>+#no spaces allowed after \ in define
>+ if($line=~/\#define.*\\\s$/){

Usually, #s do _not_ need to be quoted (in contrast to @).
I am at stake to be wrong, anyone know more? :)

>+#if/while/etc brace do not go on next line, unless #defining a do while loop, or if that brace on the next line is for something else
>+ if ($prevline=~/(if|while|for|switch)\s*\(/) {

/(?:if|while|.../

I don't see $1 being captured/used, so..

>+ my @opened = $prevline=~/\(/g;
>+ my @closed = $prevline=~/\)/g;
>+ my $nr_line = $linenr;
>+ my $remaining = $realcnt;
>+ my $next_line = $line;
>+ my $extra_lines = 0;
>+ my $display_segment = $prevline;
>+
>+ while ($remaining > 0 && scalar @opened > scalar @closed) {
>+ $prevline .= $next_line;
>+ $display_segment .= "\n" . $next_line;
>+ $next_line = $lines[$nr_line];
>+ $nr_line++;
>+ $remaining--;
>+
>+ @opened = $prevline=~/\(/g;
>+ @closed = $prevline=~/\)/g;
>+ }
>+
>+ if(($prevline=~/(if|while|for|switch)\s*\(.*\)\s*$/) and ($next_line=~/{/) and
>+ !($next_line=~/(if|while|for)/) and !($next_line=~/\#define.*do.*while/)) {

Same.

>+# don't include <linux/video_decoder.h>

Who does that?



Jan
--

2007-05-31 20:00:47

by Dave Jones

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker II

On Thu, May 31, 2007 at 02:07:53PM +0200, Andi Kleen wrote:
> > Yeah, that is a very sensible idea.
> >
> > > Possible further checks that might make sense:
> > > - panic() anywhere in drivers/*
> > > - externs in .c files without asmlinkage
> > > - general checking that everything in a fully visible {} block is the right
> > > indentation
> > >
>
> Here are some more warnings I would like to see:
>
> - Warning for any spinlock/mutex definition that doesn't have a comment
> nearby (all locks ought to be documented)

Also barriers. (Probably even moreso).

Dave

--
http://www.codemonkey.org.uk

2007-06-01 14:19:26

by Andy Whitcroft

[permalink] [raw]
Subject: Re: [PATCH] add a trivial patch style checker II

Dave Jones wrote:
> On Thu, May 31, 2007 at 02:07:53PM +0200, Andi Kleen wrote:
> > > Yeah, that is a very sensible idea.
> > >
> > > > Possible further checks that might make sense:
> > > > - panic() anywhere in drivers/*
> > > > - externs in .c files without asmlinkage
> > > > - general checking that everything in a fully visible {} block is the right
> > > > indentation
> > > >
> >
> > Here are some more warnings I would like to see:
> >
> > - Warning for any spinlock/mutex definition that doesn't have a comment
> > nearby (all locks ought to be documented)
>
> Also barriers. (Probably even moreso).

Both of these seem a pretty good idea. Should be in version 0.03 which
I'll try and get to Andrew over the weekend. Example reports from files
in 2.6.22-rc2-mm1 below.

-apw

spinlock_t definition without comment
FILE: lib/statistic.c:243:
+ spinlock_t lock;

struct mutex definition without comment
FILE: include/linux/kernelcapi.h:67:
+ struct mutex recv_mtx;

memory barrier without comment
FILE: fs/ext2/balloc.c:1250:
+ smp_rmb();