Drop 80-character limit in checkpatch.pl
Serious issues:
===============
(1) The code is hard to edit with common text editors
-----------------------------------------------------
For example, let's have this function
struct some_structure *some_function(struct susystem *s, struct s_buffer *b,
unsigned some_number, unsigned some_flags,
unsigned long *extra_return_value)
Now, try to make this function static. Or try to add/remove/change some
argument. You end up manually realigning all the arguments. The same problem
happens when editing longer expressions.
This is the most annoying problem with imposed text alignment. If there were
no alignment, you just type "I static SPACE ESC" in vi and get it there
(similarly HOME static SPACE in notepad). If you have to maintain alignment,
the number of keystokes becomes much higher:
(insert the keyword)
I static SPACE ESC
(split the second argument)
f , l s ENTER
(align the second argument)
TAB TAB TAB TAB TAB SPACE SPACE SPACE SPACE ESC
(see if the third argument fits --- it doesn't, revert it)
J u
(align the third argument)
j ^ XXXXXX i TAB TAB SPACE SPACE SPACE SPACE ESC
(split the fourth argument --- it is aligned well)
f , l s ENTER ESC
(align the fifth argument)
j ^ XXXXXX i TAB TAB SPACE SPACE SPACE SPACE ESC
There are shorter possible sequences using vi keystrokes but the person doing
the editing doesn't always find the optimal way quickly enough, so the actual
editing will look somehow like this.
For emacs, there is a script in CodingStyle to make it conform to the Linux
kernel coding style, but the script only changes tabbing, it does absolutely
nothing with respect to the line width --- with the script loaded, emacs still
doesn't realign the arguments if you add a "static" keyword there.
[ I'm wondering, which editor does the person who invented and supports this
line-wrapping actually use. I suppose none :-) --- it looks like that person
mostly reads the code and tries to make it look aesthetical. ]
(2) Grepping the code is unreliable
-----------------------------------
For example, you see a message "device: some error happened" in syslog.
Now, you grep the kernel with a command (grep -r "some error happened" .) to
find out where it came from. You miss it if the code is formatted like this:
if (some_error_happened) {
printk(KERN_ERR "%s: some error "
"happened", dev->name);
}
In this case, you usually narrow the search to "error happened" and "some error"
and eventually find it.
Now, the real problem comes, when there are two instance of "some error
happened" message, one is wrapped and the other is not. Here, grep misleads you
to the non-wrapped instance and you are never going to find out that there is
another place where this error could be reported.
Similar grepping problems also exist (less frequently) when grepping the code
--- i.e. Where is the function "bla" called with argument BLA_SOME_FLAG?).
Grep and regular expressions give unreliable results because of 80-line
wrapping.
Less serious issues:
====================
(3) Wrapping wastes space on wide-screen displays
-------------------------------------------------
Obvious. The screen area past column 80 is not used at all.
(4) Wrapping wastes space on small 80x25 display
------------------------------------------------
It is less obvious, but let's see how artifical 80-column wrapping wastes space
on a real 80-column display.
With wrapping: 4 lines:
struc = some_function(s, buffer, variable +
get_something(123), FLAG_ONE |
FLAG_TWO | FLAG_THREE | FLAG_FOUR,
&extra_ret);
Without wrapping (note that the editor/viewer wraps it automatically at
column 80): 2 lines:
struc = some_function(sub, buffer, variable + ge
t_something(123), FLAG_ONE | FLAG_TWO | FLAG_THREE | FLAG_FOUR, &extra_ret);
(5) Wrapping makes long expressions harder to understand
--------------------------------------------------------
If I have a complex expression, I do not try to wrap it at predefined
80-column boundaries, but at logical boundaries within the expression to make
it more readable (the brain can't find matching parentheses fast, so we can
help it by aligning the code according to topmost terms in the expression).
Example:
if (unlikely(call_some_function(s, value) != RET
_SUCCESS) ||
(var_1 == prev_var_1 && var_2 == prev_var_2)
||
flags & (FLAG_1 | FLAG_2) ||
some_other_condition) {
}
Now, if we impose 80-column limit, we get this. One may argue that is looks
aesthetically better, but it is also less intelligible than the previous
version:
if (unlikely(call_some_function(s, value) !=
RET_SUCCESS) || (var_1 == prev_var_1 &&
var_2 == prev_var_2) || flags & (FLAG_1 |
FLAG_2) || some_other_condition) {
}
Counter-arguments:
==================
Some people say that 80-column wrapping is imposed so that kernel code is
readable and editable in 80-column text mode.
(for example http://lkml.indiana.edu/hypermail/linux/kernel/0807.2/0010.html)
I personally use 80-column text mode a lot of time, yet I find the artifical
wrapping annoying. The key point is that the editor and viewer wrap the text
automatically at column 80, so there is never any text invisible and there is
no need to wrap it manually.
vi, less, mcview, emacs, grep, pine (*), mutt --- all these will automatically
skip to the next line if some line is longer than terminal width.
(*) pine does this auto-wrapping only in viewer mode, but no one writes code in
pine anyway.
If you rely on this auto-wrapping, you get split words (such as call_some_functi
on()), but I find these split words less annoying than having to manually
realign lines with tabs and spaces each time I edit something.
---
If this 80-column wrapping was only taken as a guidance --- use it or not, as
you like --- there would be little problem with it, because this manual
realigning doesn't annoy anyone but the person who writes the code.
But some maintainers take output of the script checkpatch.pl dogmatically,
requiring that every new work must pass the script without a warning. This is
counterproductive --- if I write a driver and I will be doing most maintenance
work on it in the future, it is viable that the driver is formatted in such
a way that is best editable for me, not for anyone else. And as shown in example
(1), this 80-column requirement makes even simple changes much harder.
So: I am submitting this patch for the checkpatch.pl script.
Signed-off-by: Mikulas Patocka <[email protected]>
---
scripts/checkpatch.pl | 9 ---------
1 file changed, 9 deletions(-)
Index: linux-2.6.32-devel/scripts/checkpatch.pl
===================================================================
--- linux-2.6.32-devel.orig/scripts/checkpatch.pl 2009-12-07 19:57:31.000000000 +0100
+++ linux-2.6.32-devel/scripts/checkpatch.pl 2009-12-07 19:57:44.000000000 +0100
@@ -1374,15 +1374,6 @@ sub process {
# check we are in a valid source file if not then ignore this hunk
next if ($realfile !~ /\.(h|c|s|S|pl|sh)$/);
-#80 column limit
- if ($line =~ /^\+/ && $prevrawline !~ /\/\*\*/ &&
- $rawline !~ /^.\s*\*\s*\@$Ident\s/ &&
- $line !~ /^\+\s*printk\s*\(\s*(?:KERN_\S+\s*)?"[X\t]*"\s*(?:,|\)\s*;)\s*$/ &&
- $length > 80)
- {
- WARN("line over 80 characters\n" . $herecurr);
- }
-
# check for adding lines without a newline.
if ($line =~ /^\+/ && defined $lines[$linenr] && $lines[$linenr] =~ /^\\ No newline at end of file/) {
WARN("adding a line without newline at end of file\n" . $herecurr);
On Tuesday 15 December 2009 10:57:49 pm Mikulas Patocka wrote:
> But some maintainers take output of the script checkpatch.pl dogmatically,
> requiring that every new work must pass the script without a warning. This is
> counterproductive --- if I write a driver and I will be doing most maintenance
> work on it in the future, it is viable that the driver is formatted in such
> a way that is best editable for me, not for anyone else. And as shown in example
> (1), this 80-column requirement makes even simple changes much harder.
It has been agreed in the past that 80-column warnings shouldn't result
automatically in the rejection of the new hardware support so the above
argumentation is a bit weak.
> So: I am submitting this patch for the checkpatch.pl script.
>
>
> Signed-off-by: Mikulas Patocka <[email protected]>
Having this limitation makes people at least think about designing their
code properly and making it more compact.
If you don't want to see such warnings just apply the following patch that
I've been keeping locally (it helped me to work on staging drivers):
From: Bartlomiej Zolnierkiewicz <[email protected]>
Subject: [PATCH] checkpatch.pl: add option to disable 80 column limit check
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
---
scripts/checkpatch.pl | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
Index: b/scripts/checkpatch.pl
===================================================================
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -17,6 +17,7 @@ use Getopt::Long qw(:config no_auto_abbr
my $quiet = 0;
my $tree = 1;
my $chk_signoff = 1;
+my $chk_cl = 1;
my $chk_patch = 1;
my $tst_only;
my $emacs = 0;
@@ -41,6 +42,7 @@ Options:
-q, --quiet quiet
--no-tree run without a kernel tree
--no-signoff do not check for 'Signed-off-by' line
+ --no-cl do not apply 80 column limit
--patch treat FILE as patchfile (default)
--emacs emacs compile window format
--terse one line per report
@@ -67,6 +69,7 @@ GetOptions(
'q|quiet+' => \$quiet,
'tree!' => \$tree,
'signoff!' => \$chk_signoff,
+ 'cl!' => \$chk_cl,
'patch!' => \$chk_patch,
'emacs!' => \$emacs,
'terse!' => \$terse,
@@ -1365,7 +1368,7 @@ sub process {
next if ($realfile !~ /\.(h|c|s|S|pl|sh)$/);
#80 column limit
- if ($line =~ /^\+/ && $prevrawline !~ /\/\*\*/ &&
+ if ($chk_cl && $line =~ /^\+/ && $prevrawline !~ /\/\*\*/ &&
$rawline !~ /^.\s*\*\s*\@$Ident\s/ &&
$line !~ /^\+\s*printk\s*\(\s*(?:KERN_\S+\s*)?"[X\t]*"\s*(?:,|\)\s*;)\s*$/ &&
$length > 80)
Mikulas Patocka <[email protected]> writes:
>
> Signed-off-by: Mikulas Patocka <[email protected]>
Acked-by: Andi Kleen <[email protected]>
-Andi
--
[email protected] -- Speaking for myself only.
I've no comments on the actual patch but...
* Mikulas Patocka <[email protected]>:
> Drop 80-character limit in checkpatch.pl
>
> Serious issues:
> ===============
>
> (1) The code is hard to edit with common text editors
You mention vi; most distros provide vim these days.
In my .vimrc, I have:
set ai
set si
That helps a bit.
> (2) Grepping the code is unreliable
Reviewers (at least in my part of the kernel) have been
requesting that user-visible (and thus developer greppable)
output be combined on one line precisely for this reason.
We've been rejecting patches that unnecessarily break up these
strings at 80 cols.
> But some maintainers take output of the script checkpatch.pl
> dogmatically, requiring that every new work must pass the
> script without a warning.
This is the real problem. And I think the solution is to lobby
your maintainer.
fwiw,
/ac
On Tue, Dec 15, 2009 at 04:57:49PM -0500, Mikulas Patocka wrote:
> (5) Wrapping makes long expressions harder to understand
> --------------------------------------------------------
>
> If I have a complex expression, I do not try to wrap it at predefined
> 80-column boundaries, but at logical boundaries within the expression to make
> it more readable (the brain can't find matching parentheses fast, so we can
> help it by aligning the code according to topmost terms in the expression).
>
> Example:
> if (unlikely(call_some_function(s, value) != RET
> _SUCCESS) ||
> (var_1 == prev_var_1 && var_2 == prev_var_2)
> ||
> flags & (FLAG_1 | FLAG_2) ||
> some_other_condition) {
> }
>
> Now, if we impose 80-column limit, we get this. One may argue that is looks
> aesthetically better, but it is also less intelligible than the previous
> version:
> if (unlikely(call_some_function(s, value) !=
> RET_SUCCESS) || (var_1 == prev_var_1 &&
> var_2 == prev_var_2) || flags & (FLAG_1 |
> FLAG_2) || some_other_condition) {
> }
>
For starters, this is just crap. If you're writing code like this, then
line wrapping is really the least of your concerns. Take your function
return value and assign it to a variable before testing it in unlikely()
as per existing conventions and most of this goes away in this example.
If you're testing an absurd amount of conditions in a single block with
needlessly verbose variable names, then yes, you will go over 80
characters. Consequently, your "clean" example doesn't look much better
than your purposely obfuscated one.
The 80 character limit isn't a hard limit, but it's still an excellent
guideline largely because it stops people from writing code like in your
above example. Some amount of common sense is necessary, and most
people in a position of applying patches can work out when to ignore
checkpatch and when not to. The exception that you also noted is that
some people run checkpatch on inbound patches before applying them, so
having a flag to ignore the line size (or at least make it non-fatal) is
probably not the worst idea ever -- although I disagree with killing off
any notice about the line size completely.
Aiming to keep things around 80 characters has served us well over the
years, so I don't quite agree with suddenly adopting an anything goes
policy in checkpatch. People that don't care about the limit can disable
it in their scripts while folks running their patches through prior to
list submission are still better off being reminded that they should keep
things reasonably close. One only has to take a look through some of the
staging/ drivers pre-cleanup efforts for examples that clearly benefit
from triggering these warnings rather than implicitly supporting insane
tab depths.
Paul Mundt <[email protected]> writes:
> For starters, this is just crap. If you're writing code like this, then
> line wrapping is really the least of your concerns. Take your function
> return value and assign it to a variable before testing it in unlikely()
> as per existing conventions and most of this goes away in this example.
> If you're testing an absurd amount of conditions in a single block with
> needlessly verbose variable names, then yes, you will go over 80
> characters. Consequently, your "clean" example doesn't look much better
> than your purposely obfuscated one.
Then perhaps checkpatch should warn about too complex expressions
instead? Line length is rather weak indication of complexity.
The typical problem being printk() with the printed info approaching
80 chars (but there are others, of course).
BTW I remember we have already agreed that the 80-chars limit is a bad
thing, though apparently nobody bothered to fix the docs and checkpatch.
> The 80 character limit isn't a hard limit, but it's still an excellent
> guideline largely because it stops people from writing code like in your
> above example.
What is more important is it stops people from writing a perfectly
readable code, forcing them to make the code (much) worse.
> Some amount of common sense is necessary, and most
> people in a position of applying patches can work out when to ignore
> checkpatch and when not to.
Precisely - we need common sense instead of artificial limits which are
almost unrelated to the actual problem.
> Aiming to keep things around 80 characters has served us well over the
> years,
I don't think so. I think it made more harm than good. Some people
routinely ignore this "!!!!! ERROR !!!!!" but a large part of developers
make the code (sometimes much) worse for this very reason.
Fortunately the patches are reviewed.
> One only has to take a look through some of the
> staging/ drivers pre-cleanup efforts for examples that clearly benefit
> from triggering these warnings rather than implicitly supporting insane
> tab depths.
Checkpatch doesn't check for sane/insane tab depths. It's not that
trivial. Unfortunately, checking line length is a very poor substitute.
Yes, checkpatch probably could warn about having many long lines (though
I wouldn't mention the "80" number). But it should be clear it's not
about the length but about (possible) complexity.
--
Krzysztof Halasa
On Wed, Dec 16, 2009 at 6:26 AM, Bartlomiej Zolnierkiewicz
<[email protected]> wrote:
> On Tuesday 15 December 2009 10:57:49 pm Mikulas Patocka wrote:
>
>> But some maintainers take output of the script checkpatch.pl dogmatically,
>> requiring that every new work must pass the script without a warning. This is
>> counterproductive --- if I write a driver and I will be doing most maintenance
>> work on it in the future, it is viable that the driver is formatted in such
>> a way that is best editable for me, not for anyone else. And as shown in example
>> (1), this 80-column requirement makes even simple changes much harder.
>
> It has been agreed in the past that 80-column warnings shouldn't result
> automatically in the rejection of the new hardware support so the above
> argumentation is a bit weak.
>
>> So: I am submitting this patch for the checkpatch.pl script.
>>
>>
>> Signed-off-by: Mikulas Patocka <[email protected]>
>
> Having this limitation makes people at least think about designing their
> code properly and making it more compact.
>
> If you don't want to see such warnings just apply the following patch that
> I've been keeping locally (it helped me to work on staging drivers):
>
> From: Bartlomiej Zolnierkiewicz <[email protected]>
> Subject: [PATCH] checkpatch.pl: add option to disable 80 column limit check
>
> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
I like this patch, this is actually what I wanted to do.
Acked-by: WANG Cong <[email protected]>
Thanks!
> ---
> scripts/checkpatch.pl | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> Index: b/scripts/checkpatch.pl
> ===================================================================
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -17,6 +17,7 @@ use Getopt::Long qw(:config no_auto_abbr
> my $quiet = 0;
> my $tree = 1;
> my $chk_signoff = 1;
> +my $chk_cl = 1;
> my $chk_patch = 1;
> my $tst_only;
> my $emacs = 0;
> @@ -41,6 +42,7 @@ Options:
> -q, --quiet quiet
> --no-tree run without a kernel tree
> --no-signoff do not check for 'Signed-off-by' line
> + --no-cl do not apply 80 column limit
> --patch treat FILE as patchfile (default)
> --emacs emacs compile window format
> --terse one line per report
> @@ -67,6 +69,7 @@ GetOptions(
> 'q|quiet+' => \$quiet,
> 'tree!' => \$tree,
> 'signoff!' => \$chk_signoff,
> + 'cl!' => \$chk_cl,
> 'patch!' => \$chk_patch,
> 'emacs!' => \$emacs,
> 'terse!' => \$terse,
> @@ -1365,7 +1368,7 @@ sub process {
> next if ($realfile !~ /\.(h|c|s|S|pl|sh)$/);
>
> #80 column limit
> - if ($line =~ /^\+/ && $prevrawline !~ /\/\*\*/ &&
> + if ($chk_cl && $line =~ /^\+/ && $prevrawline !~ /\/\*\*/ &&
> $rawline !~ /^.\s*\*\s*\@$Ident\s/ &&
> $line !~ /^\+\s*printk\s*\(\s*(?:KERN_\S+\s*)?"[X\t]*"\s*(?:,|\)\s*;)\s*$/ &&
> $length > 80)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
On Thu, 17 Dec 2009, Américo Wang wrote:
>
> I like this patch, this is actually what I wanted to do.
I have nothing against a switch, but it had better default to off.
The whole 80-char limit is insane. It results in insane "fixes". Just
about every time somebody "improves" a patch due to the warning, the
result is worse than the original patch.
Linus
On Thursday 17 December 2009 04:14:37 pm Linus Torvalds wrote:
>
> On Thu, 17 Dec 2009, Américo Wang wrote:
> >
> > I like this patch, this is actually what I wanted to do.
>
> I have nothing against a switch, but it had better default to off.
>
> The whole 80-char limit is insane. It results in insane "fixes". Just
> about every time somebody "improves" a patch due to the warning, the
> result is worse than the original patch.
Examples? :)
--
Bartlomiej Zolnierkiewicz
On Thu, 17 Dec 2009, Bartlomiej Zolnierkiewicz wrote:
>
> Examples? :)
They're typically things like this:
- ret = sscanf (buf, "0x%lx - 0x%lx", &start_addr, &end_addr);
+ ret = sscanf(buf, "0x%lx - 0x%lx", &start_addr,
+ &end_addr);
ie a line-break that is just annoying and makes the code harder to read
rather than easier.
Linus
On Thursday 17 December 2009 04:37:41 pm Linus Torvalds wrote:
>
> On Thu, 17 Dec 2009, Bartlomiej Zolnierkiewicz wrote:
> >
> > Examples? :)
>
> They're typically things like this:
>
> - ret = sscanf (buf, "0x%lx - 0x%lx", &start_addr, &end_addr);
> + ret = sscanf(buf, "0x%lx - 0x%lx", &start_addr,
> + &end_addr);
>
> ie a line-break that is just annoying and makes the code harder to read
> rather than easier.
Well, it could have been done in the other way:
- ret = sscanf (buf, "0x%lx - 0x%lx", &start_addr, &end_addr);
+ ret = sscanf(buf, "0x%lx - 0x%lx",
+ &start_addr, &end_addr);
Just an example that the limit itself is usually not a problem
but its literal interpretation is..
I don't feel strongly about it the either way so how's about just
adding the switch and than changing the default (Alasdair?) on top
of it?
--
Bartlomiej Zolnierkiewicz
On Thu, 17 Dec 2009, Bartlomiej Zolnierkiewicz wrote:
>
> Well, it could have been done in the other way:
>
> - ret = sscanf (buf, "0x%lx - 0x%lx", &start_addr, &end_addr);
> + ret = sscanf(buf, "0x%lx - 0x%lx",
> + &start_addr, &end_addr);
>
> Just an example that the limit itself is usually not a problem
> but its literal interpretation is..
What? Your version is no better.
In the above case it doesn't matter, but I've had grep's that fail due to
people splitting the actual string etc, which just drives me wild. We
fixed that to allow checkpatch to skip those warnings, but the fact is,
the fundamnetal problem has always been the "80 character" part.
I don't think any kernel developers use a vt100 any more. And even if they
do, I bet they curse the "24 lines" more than they curse the occasional
80+ character lines.
I'd be ok with changing the warning to 132 characters, which is another
perfectly fine historical limit. Or we can split the difference, and say
"ok, 106 characters is too much". I don't care. But 80 characters is
causing too many idiotic changes.
There are way worse problems in many patches than long lines. Too complex
expressions. Too deep indentation. Pure crap code. People seem to get way
too hung up on ".. but at least it passes checkpatch".
Linus
On Thu, Dec 17, 2009 at 9:51 PM, Linus Torvalds
<[email protected]> wrote:
>
>
> On Thu, 17 Dec 2009, Bartlomiej Zolnierkiewicz wrote:
>>
>> Well, it could have been done in the other way:
>>
>> - ? ? ? ? ? ? ? ? ? ? ret = sscanf (buf, "0x%lx - 0x%lx", &start_addr, &end_addr);
>> + ? ? ? ? ? ? ? ? ? ? ret = sscanf(buf, "0x%lx - 0x%lx",
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?&start_addr, &end_addr);
>>
>> Just an example that the limit itself is usually not a problem
>> but its literal interpretation is..
>
> What? Your version is no better.
>
> In the above case it doesn't matter, but I've had grep's that fail due to
> people splitting the actual string etc, which just drives me wild. We
> fixed that to allow checkpatch to skip those warnings, but the fact is,
> the fundamnetal problem has always been the "80 character" part.
>
> I don't think any kernel developers use a vt100 any more. And even if they
> do, I bet they curse the "24 lines" more than they curse the occasional
> 80+ character lines.
>
> I'd be ok with changing the warning to 132 characters, which is another
> perfectly fine historical limit. Or we can split the difference, and say
> "ok, 106 characters is too much". I don't care. But 80 characters is
> causing too many idiotic changes.
>
> There are way worse problems in many patches than long lines. Too complex
> expressions. Too deep indentation. Pure crap code. People seem to get way
> too hung up on ".. but at least it passes checkpatch".
>
I truely agree on this.It will better if we can change the warning for
100+ as suggested.This cleans the code alot infact.
-Ram
Linus Torvalds <[email protected]> writes:
> fixed that to allow checkpatch to skip those warnings, but the fact is,
> the fundamnetal problem has always been the "80 character" part.
AFAIK it still warns for it. At least I see those warnings
regularly and I always ignore them for printk/pr_*
-Andi
--
[email protected] -- Speaking for myself only.
On Thu, 17 Dec 2009, Paul Mundt wrote:
> On Tue, Dec 15, 2009 at 04:57:49PM -0500, Mikulas Patocka wrote:
> > (5) Wrapping makes long expressions harder to understand
> > --------------------------------------------------------
> >
> > If I have a complex expression, I do not try to wrap it at predefined
> > 80-column boundaries, but at logical boundaries within the expression to make
> > it more readable (the brain can't find matching parentheses fast, so we can
> > help it by aligning the code according to topmost terms in the expression).
> >
> > Example:
> > if (unlikely(call_some_function(s, value) != RET
> > _SUCCESS) ||
> > (var_1 == prev_var_1 && var_2 == prev_var_2)
> > ||
> > flags & (FLAG_1 | FLAG_2) ||
> > some_other_condition) {
> > }
> >
> > Now, if we impose 80-column limit, we get this. One may argue that is looks
> > aesthetically better, but it is also less intelligible than the previous
> > version:
> > if (unlikely(call_some_function(s, value) !=
> > RET_SUCCESS) || (var_1 == prev_var_1 &&
> > var_2 == prev_var_2) || flags & (FLAG_1 |
> > FLAG_2) || some_other_condition) {
> > }
> >
> For starters, this is just crap. If you're writing code like this, then
> line wrapping is really the least of your concerns. Take your function
> return value and assign it to a variable before testing it in unlikely()
> as per existing conventions and most of this goes away in this example.
I wouldn't say that this is better:
int csf_failed, vars_equal, flags_12;
...
csf_failed = call_some_function(s, value) != RET_SUCCESS;
vars_equal = var_1 == prev_var_1 && var_2 == prev_var_2;
flags_12 = flags & (FLAG_1 | FLAG_2);
if (unlikely(csf_failed) || vars_equal ||
flags_12 || some_other_conditions) {
}
If you think that it is better, it's OK, just write your code like that.
And don't force it to everyone.
> If you're testing an absurd amount of conditions in a single block with
> needlessly verbose variable names, then yes, you will go over 80
> characters. Consequently, your "clean" example doesn't look much better
> than your purposely obfuscated one.
For me, breaking the expression into variables is worse because:
- adding/removing conditions must be done at 3 places (vs. my original
example, where it would be done only on 1 place)
- when reading the conditions, your eyes must skip to two places (vs. my
original example, where you only read it at one place)
But everyone has different brain, so it may be that for you, the extra
variables are really more understandable. So write your code like that and
don't preach.
Mikulas
On Thu, Dec 17, 2009 at 05:37:57PM -0500, Mikulas Patocka wrote:
> On Thu, 17 Dec 2009, Paul Mundt wrote:
> > On Tue, Dec 15, 2009 at 04:57:49PM -0500, Mikulas Patocka wrote:
> > > (5) Wrapping makes long expressions harder to understand
> > > --------------------------------------------------------
> > >
> > > If I have a complex expression, I do not try to wrap it at predefined
> > > 80-column boundaries, but at logical boundaries within the expression to make
> > > it more readable (the brain can't find matching parentheses fast, so we can
> > > help it by aligning the code according to topmost terms in the expression).
> > >
> > > Example:
> > > if (unlikely(call_some_function(s, value) != RET
> > > _SUCCESS) ||
> > > (var_1 == prev_var_1 && var_2 == prev_var_2)
> > > ||
> > > flags & (FLAG_1 | FLAG_2) ||
> > > some_other_condition) {
> > > }
> > >
> > > Now, if we impose 80-column limit, we get this. One may argue that is looks
> > > aesthetically better, but it is also less intelligible than the previous
> > > version:
> > > if (unlikely(call_some_function(s, value) !=
> > > RET_SUCCESS) || (var_1 == prev_var_1 &&
> > > var_2 == prev_var_2) || flags & (FLAG_1 |
> > > FLAG_2) || some_other_condition) {
> > > }
> > >
> > For starters, this is just crap. If you're writing code like this, then
> > line wrapping is really the least of your concerns. Take your function
> > return value and assign it to a variable before testing it in unlikely()
> > as per existing conventions and most of this goes away in this example.
>
> I wouldn't say that this is better:
> int csf_failed, vars_equal, flags_12;
>
> ...
>
> csf_failed = call_some_function(s, value) != RET_SUCCESS;
> vars_equal = var_1 == prev_var_1 && var_2 == prev_var_2;
> flags_12 = flags & (FLAG_1 | FLAG_2);
> if (unlikely(csf_failed) || vars_equal ||
> flags_12 || some_other_conditions) {
> }
>
> If you think that it is better, it's OK, just write your code like that.
> And don't force it to everyone.
>
No, I wouldn't say that that's better either, but that's also not how I
suggested cleaning reworking it. We have existing conventions for how
complex blocks are broken down in to more readable forms which you seem
to have issues grasping. My point is that you are purposely obfuscating
things, and therefore your entire rationale is suspect at best.
> For me, breaking the expression into variables is worse because:
> - adding/removing conditions must be done at 3 places (vs. my original
> example, where it would be done only on 1 place)
> - when reading the conditions, your eyes must skip to two places (vs. my
> original example, where you only read it at one place)
>
> But everyone has different brain, so it may be that for you, the extra
> variables are really more understandable. So write your code like that and
> don't preach.
>
This isn't a subjective thing as you are now trying to spin it. There are
existing conventions that the majority of kernel code follows and any of
your above obfuscated blocks are unlikely to show up in any code outside
of drivers/. How about actually reading through the coding style document
and existing code before attempting to cripple checkpatch.
You've also ignored everything that was stated about the reasons for why
we have this limit and why it is useful, instead focusing purely on the
fact that I've taken issue with your bogus examples that would never have
been applied in the first place.
If you simply wanted a way to get around the limit, then something like
Bart's patch is quite reasonable and I don't believe anyone has any
objections to it. This half-baked rationale for tossing it out completely
using examples that don't even show up in the kernel though is simply
nonsense. Perhaps if you spent half as much time auditing vendor patches
you would have some idea of why having these warnings is not just a nice
thing to have, but also the only sensible default.
On Thu, 17 Dec 2009, Krzysztof Halasa wrote:
> Paul Mundt <[email protected]> writes:
>
> > For starters, this is just crap. If you're writing code like this, then
> > line wrapping is really the least of your concerns. Take your function
> > return value and assign it to a variable before testing it in unlikely()
> > as per existing conventions and most of this goes away in this example.
> > If you're testing an absurd amount of conditions in a single block with
> > needlessly verbose variable names, then yes, you will go over 80
> > characters. Consequently, your "clean" example doesn't look much better
> > than your purposely obfuscated one.
>
> Then perhaps checkpatch should warn about too complex expressions
> instead? Line length is rather weak indication of complexity.
> The typical problem being printk() with the printed info approaching
> 80 chars (but there are others, of course).
No. If someone wrote an expression too long to be understood, he will not
understand it when maintaining his code, he burns himself and he will
eventually learn to not write such long expressions. Or ... if he
understands it, there is no problem with it.
There is no need to make a script for it. The script isn't so smart to
tell what is understandable and what is nto.
All these coding-style discussions are just nit-picking. If you move the
code up or down a few lines, it won't help anything.
The real problems of hard-to-understand code are totally different than
"is this line too long" or "is this expression too complex".
Examples (what I personally found hard to understand):
* excessive inheritance in C++ --- if you have object->method(params), it
is impossible to determine what is really called if the person is using
inheritance trees deeper than two levels.
* excessively deep calls --- for example, python's module loader is
written partially in python and partially in C and it is impossible to
determine what is the control flow between the parts (see also the
previous problem about inheritance). In Linux, an example is waking a
process waiting on a bit, that calls 8 nested functions.
* excessive abstractions --- find that these two drivers have some common
part and blast an abstraction layer with method tables in an effort to
share these common parts. I have seen when two drivers were merged this
way, the resulting driver was as big as those two separate drivers. What
the sharing saved, the abstraction layer added.
* repeating the same logic again and again and not making it a macro or a
function --- it's not seen in Linux kernel, but in other project I have
seen the code for allocating and initializing a structure being repeated
38 times. It's not hard to read, it's hard to modify. Linux community
tends to be obsessive the opposite way (which leads to problems too, if
done excessively), see the previous paragraph.
* code diffuse --- one thing at different places. An example is the Linux
cdrom stack (ide-cd, sr, drivers/cdrom).
* constructing function names by concatenating macro arguments --- i.e.
try to find with grep, where is "outb_p" defined.
--- and rules for code formatting fix really nothing.
Mikulas
On Fri, 18 Dec 2009, Paul Mundt wrote:
> On Thu, Dec 17, 2009 at 05:37:57PM -0500, Mikulas Patocka wrote:
> > On Thu, 17 Dec 2009, Paul Mundt wrote:
> > > On Tue, Dec 15, 2009 at 04:57:49PM -0500, Mikulas Patocka wrote:
> > > > (5) Wrapping makes long expressions harder to understand
> > > > --------------------------------------------------------
> > > >
> > > > If I have a complex expression, I do not try to wrap it at predefined
> > > > 80-column boundaries, but at logical boundaries within the expression to make
> > > > it more readable (the brain can't find matching parentheses fast, so we can
> > > > help it by aligning the code according to topmost terms in the expression).
> > > >
> > > > Example:
> > > > if (unlikely(call_some_function(s, value) != RET
> > > > _SUCCESS) ||
> > > > (var_1 == prev_var_1 && var_2 == prev_var_2)
> > > > ||
> > > > flags & (FLAG_1 | FLAG_2) ||
> > > > some_other_condition) {
> > > > }
> > > >
> > > > Now, if we impose 80-column limit, we get this. One may argue that is looks
> > > > aesthetically better, but it is also less intelligible than the previous
> > > > version:
> > > > if (unlikely(call_some_function(s, value) !=
> > > > RET_SUCCESS) || (var_1 == prev_var_1 &&
> > > > var_2 == prev_var_2) || flags & (FLAG_1 |
> > > > FLAG_2) || some_other_condition) {
> > > > }
> > > >
> > > For starters, this is just crap. If you're writing code like this, then
> > > line wrapping is really the least of your concerns. Take your function
> > > return value and assign it to a variable before testing it in unlikely()
> > > as per existing conventions and most of this goes away in this example.
> >
> > I wouldn't say that this is better:
> > int csf_failed, vars_equal, flags_12;
> >
> > ...
> >
> > csf_failed = call_some_function(s, value) != RET_SUCCESS;
> > vars_equal = var_1 == prev_var_1 && var_2 == prev_var_2;
> > flags_12 = flags & (FLAG_1 | FLAG_2);
> > if (unlikely(csf_failed) || vars_equal ||
> > flags_12 || some_other_conditions) {
> > }
> >
> > If you think that it is better, it's OK, just write your code like that.
> > And don't force it to everyone.
> >
> No, I wouldn't say that that's better either, but that's also not how I
> suggested cleaning reworking it. We have existing conventions for how
> complex blocks are broken down in to more readable forms which you seem
> to have issues grasping. My point is that you are purposely obfuscating
> things, and therefore your entire rationale is suspect at best.
So, how would you clean this complex code block?
Mikulas
On Thu, Dec 17, 2009 at 06:29:25PM -0500, Mikulas Patocka wrote:
> No. If someone wrote an expression too long to be understood, he will not
> understand it when maintaining his code, he burns himself and he will
> eventually learn to not write such long expressions. Or ... if he
> understands it, there is no problem with it.
Yes, there is. Somebody else trying to find a bug. And having to
figure out which of five unfamiliar pieces of code is likelier
candidate.
> There is no need to make a script for it. The script isn't so smart to
> tell what is understandable and what is nto.
Oh, definitely. No arguments on that one...
On Thu, 17 Dec 2009 18:29:25 EST, Mikulas Patocka said:
> There is no need to make a script for it. The script isn't so smart to
> tell what is understandable and what is nto.
Yeah, but I respectfully submit that if the regexp '^\t{6}' matches a non-
continuation line, it's probably in its rights to whinge.
grep -r -P '^\t{7}(?!(.*,$|.*\);$))' . | more
produces a whole lot of "this can't end well" output. My favorite so far:
fs/reisersfs/do_balan.c, lines 460-477 (note: 3 leading tabs elided)
leaf_paste_entries(&bi,
n +
item_pos
-
ret_val,
l_pos_in_item,
1,
(struct
reiserfs_de_head
*)
body,
body
+
DEH_SIZE,
tb->
insert_size
[0]
);
Yes, that used to be 24 more columns to the right. Gaak.
On Thu, 2009-12-17 at 23:29 -0500, [email protected] wrote:
> Yeah, but I respectfully submit that if the regexp '^\t{6}' matches a non-
> continuation line, it's probably in its rights to whinge.
> grep -r -P '^\t{7:}(?!(.*,$|.*\);$))' . | more
> produces a whole lot of "this can't end well" output.
I think this is a good test to add to checkpatch.
Add 105 character long line WARN test
Add 80 character long line STRICT test
Add 6+ leading indent tabs test, consider restructuring
Signed-off-by: Joe Perches <[email protected]>
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index bc4114f..c358251 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1374,13 +1374,20 @@ sub process {
# check we are in a valid source file if not then ignore this hunk
next if ($realfile !~ /\.(h|c|s|S|pl|sh)$/);
-#80 column limit
+#Line too long
if ($line =~ /^\+/ && $prevrawline !~ /\/\*\*/ &&
$rawline !~ /^.\s*\*\s*\@$Ident\s/ &&
- $line !~ /^\+\s*printk\s*\(\s*(?:KERN_\S+\s*)?"[X\t]*"\s*(?:,|\)\s*;)\s*$/ &&
- $length > 80)
- {
- WARN("line over 80 characters\n" . $herecurr);
+ $line !~ /^\+\s*printk\s*\(\s*(?:KERN_\S+\s*)?"[X\t]*"\s*(?:,|\)\s*;)\s*$/) {
+ if ($length > 105) {
+ WARN("line over 105 characters\n" . $herecurr);
+ } elsif ($length > 80) {
+ CHK("line over 80 characters\n" . $herecurr);
+ }
+ }
+
+#too many leading tabs - deep leading indent
+ if ($line =~ /^\+ {6,}(?!(.*,$|.*\);$))/) {
+ CHK("Too many leading tabs. Consider restructuring code\n" . $herecurr);
}
# check for adding lines without a newline.
On Thu, Dec 17, 2009 at 09:12:24PM -0800, Joe Perches wrote:
> On Thu, 2009-12-17 at 23:29 -0500, [email protected] wrote:
> > Yeah, but I respectfully submit that if the regexp '^\t{6}' matches a non-
> > continuation line, it's probably in its rights to whinge.
> > grep -r -P '^\t{7:}(?!(.*,$|.*\);$))' . | more
> > produces a whole lot of "this can't end well" output.
>
> I think this is a good test to add to checkpatch.
>
> Add 105 character long line WARN test
> Add 80 character long line STRICT test
> Add 6+ leading indent tabs test, consider restructuring
>
This looks like a reasonable compromise.
On Thu, 17 Dec 2009, Bartlomiej Zolnierkiewicz wrote:
> > > I like this patch, this is actually what I wanted to do.
> >
> > I have nothing against a switch, but it had better default to off.
> >
> > The whole 80-char limit is insane. It results in insane "fixes". Just
> > about every time somebody "improves" a patch due to the warning, the
> > result is worse than the original patch.
>
> Examples? :)
balance_leaf() in fs/reiserfs/do_balan.c
Example picked totally at random:
set_le_ih_k_offset(ih,
le_ih_k_offset(ih) +
(tb->
lbytes <<
(is_indirect_le_ih
(ih) ? tb->tb_sb->
s_blocksize_bits -
UNFM_P_SHIFT :
0)));
See how everything is nicely aligned to 80 cols?
Generally, don't look at this function after having a good lunch you want
to keep. You have been warned.
--
Jiri Kosina
SUSE Labs, Novell Inc.
On 17/12/09 16:21, Linus Torvalds wrote:
> I'd be ok with changing the warning to 132 characters, which is another
> perfectly fine historical limit. Or we can split the difference, and say
> "ok, 106 characters is too much". I don't care. But 80 characters is
> causing too many idiotic changes.
I find smaller limits help when you want to do side-by-side diffs,
or `git blame` for example, but I agree there needs to be taste applied,
so as to not mangle the code.
cheers,
P?draig.
On Friday 18 December 2009 02:04:37 pm Jiri Kosina wrote:
> On Thu, 17 Dec 2009, Bartlomiej Zolnierkiewicz wrote:
>
> > > > I like this patch, this is actually what I wanted to do.
> > >
> > > I have nothing against a switch, but it had better default to off.
> > >
> > > The whole 80-char limit is insane. It results in insane "fixes". Just
> > > about every time somebody "improves" a patch due to the warning, the
> > > result is worse than the original patch.
> >
> > Examples? :)
>
> balance_leaf() in fs/reiserfs/do_balan.c
>
> Example picked totally at random:
>
> set_le_ih_k_offset(ih,
> le_ih_k_offset(ih) +
> (tb->
> lbytes <<
> (is_indirect_le_ih
> (ih) ? tb->tb_sb->
> s_blocksize_bits -
> UNFM_P_SHIFT :
> 0)));
>
> See how everything is nicely aligned to 80 cols?
I see but the above code is an utter crap anyway.
Firstly what kind of a function parameter is that:
le_ih_k_offset(ih) + (tb->lbytes << (is_indirect_le_ih(ih) ? tb->tb_sb->s_blocksize_bits - UNFM_P_SHIFT : 0))
?
[ BTW 'tb->tb_sb->s_blocksize_bits - UNFM_P_SHIFT' construct is used five
times in balance_leaf() and is a likely candidate for helper / macro. ]
More importantly the whole balance_leaf() function is almost 1400 LOC (!)
big and impossible to read: code for handling particular 'switch' blocks
should be factored out into separate functions etc.
The point I was making is that the once we remove the limit we don't have
other tool to _automatically_ point suspicious code areas (yes, I would
also prefer intelligent static code checker over dumb limit but it simply
not here as things are today and dumb limit works surprisingly well most
of the time -- please note how that the structural problem with the code
example given is immediately visible with the current limit).
> Generally, don't look at this function after having a good lunch you want
> to keep. You have been warned.
No worries, I visit dark places (ide, staging, ..) and come back alive.. ;)
--
Bartlomiej Zolnierkiewicz
[email protected] writes:
> Yeah, but I respectfully submit that if the regexp '^\t{6}' matches a non-
> continuation line, it's probably in its rights to whinge.
Yes, but don't make it a hard error, only a suggestion that something is
probably really wrong.
> fs/reisersfs/do_balan.c, lines 460-477 (note: 3 leading tabs elided)
>
> leaf_paste_entries(&bi,
> n +
> item_pos
> -
> ret_val,
> l_pos_in_item,
> 1,
> (struct
> reiserfs_de_head
> *)
> body,
> body
> +
> DEH_SIZE,
> tb->
> insert_size
> [0]
> );
>
> Yes, that used to be 24 more columns to the right. Gaak.
Precisely. It's a clear show of the damage hard rules like that do.
I can't even tell how the code should be fixed and if the simple merging
would do, since I can't really imagine how it would look like :-)
--
Krzysztof Halasa
Joe Perches <[email protected]> writes:
> I think this is a good test to add to checkpatch.
>
> Add 105 character long line WARN test
Not sure if that's long enough (maybe it is).
132 looks like the next "natural" number.
The problematic case like printk(KERN_DEBUG "%s:%s:%s", var ? "yes" :
"no" and so on. If we split the code like that then better use 132 than
105 - even with 132 chars we may need to split, remember it's the printk
output which has to fit in 80 chars, or maybe usually has.
The main problem is checkpatch can't estimate "complexity". Fortunately
the reviewers can.
> Add 80 character long line STRICT test
Not sure what do you mean.
> Add 6+ leading indent tabs test, consider restructuring
This makes perfect sense, at least until shown otherwise (though being
a warning instead of an error means it's ok to trigger in perfectly
valid code).
--
Krzysztof Halasa
Bartlomiej Zolnierkiewicz <[email protected]> writes:
>> set_le_ih_k_offset(ih,
>> le_ih_k_offset(ih) +
>> (tb->
>> lbytes <<
>> (is_indirect_le_ih
>> (ih) ? tb->tb_sb->
>> s_blocksize_bits -
>> UNFM_P_SHIFT :
>> 0)));
> Firstly what kind of a function parameter is that:
It's hard to tell without unmangling the line first :-)
--
Krzysztof Halasa
On Fri, Dec 18, 2009 at 8:28 AM, Krzysztof Halasa <[email protected]> wrote:
> [email protected] writes:
>
>> Yeah, but I respectfully submit that if the regexp '^\t{6}' matches a non-
>> continuation line, it's probably in its rights to whinge.
>
> Yes, but don't make it a hard error, only a suggestion that something is
> probably really wrong.
>
>> fs/reisersfs/do_balan.c, lines 460-477 (note: 3 leading tabs elided)
>>
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? leaf_paste_entries(&bi,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?n +
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?item_pos
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?-
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?ret_val,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?l_pos_in_item,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?1,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?(struct
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? reiserfs_de_head
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? *)
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?body,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?body
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?+
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?DEH_SIZE,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?tb->
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?insert_size
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?[0]
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? );
>>
>> Yes, that used to be 24 more columns to the right. ?Gaak.
>
> Precisely. It's a clear show of the damage hard rules like that do.
> I can't even tell how the code should be fixed and if the simple merging
> would do, since I can't really imagine how it would look like :-)
I think you're missing the point, there is no simple fix for this sort
of formatting problem. The 80 column hard limit didn't cause this
mangling, someone applying an obviously misguided quick fix to the
too-long line caused the problem. The appropriate thing to do when
checkpatch tells you that a line is too long isn't necessarily to chop
the line into pieces. The appropriate thing to do is look at the
surrounding context and determine what the underlying problem is. In
this case Bart pointed out that quite a lot of the overly-complex
parameter could be reasonably moved to a helper macro, and the
too-long surrounding function should be broken up into manageable
pieces.
The fundamental disconnect here is that some people seem to think that
style guidelines like "no lines over 80 columns" are goals, they
aren't, they are just heuristics that have been found to point out bad
coding practices.
Note: I'm not specifically arguing for keeping the 80-column rule, the
project I work on uses 100 columns, and that's quite workable, but I
haven't had any problem working with 80 columns as a limit either. I
do however think that just removing the limit without replacing it
with something better is a bad idea.
-Kevin Granade
> --
> Krzysztof Halasa
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at ?http://www.tux.org/lkml/
>
On Friday 18 December 2009 03:28:12 pm Krzysztof Halasa wrote:
> [email protected] writes:
>
> > Yeah, but I respectfully submit that if the regexp '^\t{6}' matches a non-
> > continuation line, it's probably in its rights to whinge.
>
> Yes, but don't make it a hard error, only a suggestion that something is
> probably really wrong.
>
> > fs/reisersfs/do_balan.c, lines 460-477 (note: 3 leading tabs elided)
> >
> > leaf_paste_entries(&bi,
> > n +
> > item_pos
> > -
> > ret_val,
> > l_pos_in_item,
> > 1,
> > (struct
> > reiserfs_de_head
> > *)
> > body,
> > body
> > +
> > DEH_SIZE,
> > tb->
> > insert_size
> > [0]
> > );
> >
> > Yes, that used to be 24 more columns to the right. Gaak.
>
> Precisely. It's a clear show of the damage hard rules like that do.
It is not a hard rule and the above damage was done in 2005 when some
random developer (Linus Torvalds? Is this even a real name? ;-) decided
to mindlessly pass do_balan.c file through Lindent script and later
didn't finish the task of cleaning the code properly (see commit bd4c625
for details).
We didn't even have checkpatch.pl back then (it was added in 2007).
> I can't even tell how the code should be fixed and if the simple merging
> would do, since I can't really imagine how it would look like :-)
The old code [ no tabs removed ]:
- leaf_paste_entries (bi.bi_bh, n + item_pos - ret_val, l_pos_in_item, 1,
- (struct reiserfs_de_head *)body,
- body + DEH_SIZE, tb->insert_size[0]
- );
--
Bartlomiej Zolnierkiewicz
On Fri, Dec 18, 2009 at 03:37:00PM +0100, Krzysztof Halasa wrote:
> Not sure if that's long enough (maybe it is).
> 132 looks like the next "natural" number.
Well it's good that this thread has produced more examples where it's
reasonable and acceptable to exceed 80 characters.
What do people feel about files where the policy is to place all the
parameters passed into a function on the same line, regardless of its
consequent length?
(What kicked this all off was a patch Mikulas submitted containing many
long lines, one of which hits column 264. Personally, I dislike reading
code with lines that wrap, but using a 132-column terminal width is
fine.)
Alasdair
On Thu, 17 Dec 2009, Linus Torvalds wrote:
> On Thu, 17 Dec 2009, Bartlomiej Zolnierkiewicz wrote:
> >
> > Well, it could have been done in the other way:
> >
> > - ret = sscanf (buf, "0x%lx - 0x%lx", &start_addr, &end_addr);
> > + ret = sscanf(buf, "0x%lx - 0x%lx",
> > + &start_addr, &end_addr);
> >
> > Just an example that the limit itself is usually not a problem
> > but its literal interpretation is..
>
> What? Your version is no better.
>
> In the above case it doesn't matter, but I've had grep's that fail due to
> people splitting the actual string etc, which just drives me wild. We
> fixed that to allow checkpatch to skip those warnings, but the fact is,
> the fundamnetal problem has always been the "80 character" part.
>
> I don't think any kernel developers use a vt100 any more. And even if they
> do, I bet they curse the "24 lines" more than they curse the occasional
> 80+ character lines.
>
> I'd be ok with changing the warning to 132 characters, which is another
> perfectly fine historical limit. Or we can split the difference, and say
> "ok, 106 characters is too much". I don't care. But 80 characters is
> causing too many idiotic changes.
The problem with 106 character --- I have 100-character display on one
computer. If the code were be formatted for 106 characters, it would look
like
bla - bla - bla - bla - bla -
bla -
bla - bla - bla - bla - bla -
bla;
--- i.e. every second line wasted.
On the other hand, if you go without limit, it will be
bla - bla - bla - bla -
bla - bla - bla - bla - bla - bla - bla - bla;
--- much better.
The same problem comes for 132 chatacters --- many people have
128-character display (1024x768 framebuffer) and they see spurious
wrapping with 132 character text.
In general, if you select "n" as a line width, you find someone with
"n - delta" column display who will have problems.
It would be best to keep line length is unlimited and say that it is the
responsibility of the viewer to wrap it.
And make limit of 80 characters only for block comments --- the reader
doesn't want to read wrapped words in comments.
Mikulas
> There are way worse problems in many patches than long lines. Too complex
> expressions. Too deep indentation. Pure crap code. People seem to get way
> too hung up on ".. but at least it passes checkpatch".
>
> Linus
>
> Note: I'm not specifically arguing for keeping the 80-column rule, the
> project I work on uses 100 columns, and that's quite workable, but I
> haven't had any problem working with 80 columns as a limit either. I
> do however think that just removing the limit without replacing it
> with something better is a bad idea.
>
> -Kevin Granade
But think what happens when someone views that 100-char code on 80-char
terminal (or for example 94-char, that I used for some times too) ---
every second line will be wasted with just 20 characters on the left. On
the other hand, if you have unlimited line length, it will look better on
80-char terminal.
Mikulas
On Fri, 18 Dec 2009, Mikulas Patocka wrote:
>
> But think what happens when someone views that 100-char code on 80-char
> terminal (or for example 94-char, that I used for some times too) ---
> every second line will be wasted with just 20 characters on the left.
What kind of CRAZY crap argument is that?
The current rule is 80 characters - but how many lines are actually even
close to 80 characters long? Very few.
So that "every second line" statement is pure and utter idiocy. Instead,
what you get is totally the reverse: instead of wasting two lines _anyway_
due to a hard-newline, you often get a single line instead (since most of
us don't work in 24x80 windows to begin with).
Linus
On Fri, 18 Dec 2009 15:12:41 +0000 Alasdair G Kergon wrote:
> On Fri, Dec 18, 2009 at 03:37:00PM +0100, Krzysztof Halasa wrote:
> > Not sure if that's long enough (maybe it is).
> > 132 looks like the next "natural" number.
>
> Well it's good that this thread has produced more examples where it's
> reasonable and acceptable to exceed 80 characters.
>
> What do people feel about files where the policy is to place all the
> parameters passed into a function on the same line, regardless of its
> consequent length?
Not good IMO. It's much easier to read something that is restricted in
width than to have to scroll/pan side to side.
(e.g., newspaper columns. Oops, what's a newspaper?)
> (What kicked this all off was a patch Mikulas submitted containing many
> long lines, one of which hits column 264. Personally, I dislike reading
> code with lines that wrap, but using a 132-column terminal width is
> fine.)
---
~Randy
On Fri, 18 Dec 2009, Linus Torvalds wrote:
> On Fri, 18 Dec 2009, Mikulas Patocka wrote:
> >
> > But think what happens when someone views that 100-char code on 80-char
> > terminal (or for example 94-char, that I used for some times too) ---
> > every second line will be wasted with just 20 characters on the left.
>
> What kind of CRAZY crap argument is that?
>
> The current rule is 80 characters - but how many lines are actually even
> close to 80 characters long? Very few.
>
> So that "every second line" statement is pure and utter idiocy. Instead,
> what you get is totally the reverse: instead of wasting two lines _anyway_
> due to a hard-newline, you often get a single line instead (since most of
> us don't work in 24x80 windows to begin with).
>
> Linus
Function declarations are often larger than 80 characters. And if you wrap
them at 100, you are wasting every second line on 80-character display.
If you make line length unlimited, the space will be used optimally on all
displays --- a function with 200 character declaration will use 3 lines on
80-character display and 2 lines on 100-character display.
And besides --- wrapping at 100 doesn't fix the initial problem (why I
posted this thread) --- that making any modification to the function
header or long expression requires the user to manually realign the
arguments.
Mikulas
On Fri, 18 Dec 2009, Randy Dunlap wrote:
> On Fri, 18 Dec 2009 15:12:41 +0000 Alasdair G Kergon wrote:
>
> > On Fri, Dec 18, 2009 at 03:37:00PM +0100, Krzysztof Halasa wrote:
> > > Not sure if that's long enough (maybe it is).
> > > 132 looks like the next "natural" number.
> >
> > Well it's good that this thread has produced more examples where it's
> > reasonable and acceptable to exceed 80 characters.
> >
> > What do people feel about files where the policy is to place all the
> > parameters passed into a function on the same line, regardless of its
> > consequent length?
>
> Not good IMO. It's much easier to read something that is restricted in
> width than to have to scroll/pan side to side.
> (e.g., newspaper columns. Oops, what's a newspaper?)
I don't like scrolling too but most viewers and editors wrap it
automatically to screen width. For example, restrict your terminal to 50
columnts and try using "less".
Mikulas
On Fri, 18 Dec 2009, Mikulas Patocka wrote:
>
> Function declarations are often larger than 80 characters. And if you wrap
> them at 100, you are wasting every second line on 80-character display.
If you care so much, use a non-wrapping editor. That's what I do.
> If you make line length unlimited, the space will be used optimally on all
> displays --- a function with 200 character declaration will use 3 lines on
> 80-character display and 2 lines on 100-character display.
Yeah, and that's just crazy.
We all use good hardware these days, but we certainly don't have unlimited
line length. And wrapping is ugly. So sane people (definition: "me") use
editors that don't wrap (marking long lines at the end instead), and for
the very rare case when I use a small terminal, I'll need to go look if I
care (which is seldom).
> And besides --- wrapping at 100 doesn't fix the initial problem (why I
> posted this thread) --- that making any modification to the function
> header or long expression requires the user to manually realign the
> arguments.
Sure. Nothing fixes the problem that you need to _occasionally_ wrap.
But the real problem is that crazy people consider checkpatch.pl to be so
important that they wrap whether it makes sense or not.
Sense. It's what some people have. Too rare, though.
I'll happily remove the checkpatch.pl limit entirely, and ask people to
try to use common sense, though.
Linus
On Fri, 2009-12-18 at 15:37 +0100, Krzysztof Halasa wrote:
> > Add 80 character long line STRICT test
> Not sure what do you mean.
CHK messages are only printed when --strict is on the command line
> This makes perfect sense, at least until shown otherwise (though being
> a warning instead of an error means it's ok to trigger in perfectly
> valid code).
It's a CHK, only printed with --strict.
On Fri, 18 Dec 2009, Paul Mundt wrote:
> On Thu, Dec 17, 2009 at 09:12:24PM -0800, Joe Perches wrote:
> > On Thu, 2009-12-17 at 23:29 -0500, [email protected] wrote:
> > > Yeah, but I respectfully submit that if the regexp '^\t{6}' matches a non-
> > > continuation line, it's probably in its rights to whinge.
> > > grep -r -P '^\t{7:}(?!(.*,$|.*\);$))' . | more
> > > produces a whole lot of "this can't end well" output.
> >
> > I think this is a good test to add to checkpatch.
> >
> > Add 105 character long line WARN test
> > Add 80 character long line STRICT test
> > Add 6+ leading indent tabs test, consider restructuring
> >
> This looks like a reasonable compromise.
I like this. Except I think the indent test should count spaces too some
way. Or do we already warn about excessive space that should be tabs?
Linus
On Fri, 2009-12-18 at 09:43 -0800, Linus Torvalds wrote:
> I like this. Except I think the indent test should count spaces too some
> way. Or do we already warn about excessive space that should be tabs?
Andy's the checkpatch guy to verify this but:
There's a separate test for spaces that should be tabs.
If you had a 64 space indent and ran the silly script,
it'd complain about the spaces.
If you converted the spaces to tabs and ran the silly
script again with --strict, it'd complain about the
indent depth.
Linus Torvalds <[email protected]> writes:
>
> I like this. Except I think the indent test should count spaces too some
> way. Or do we already warn about excessive space that should be tabs?
checkpatch warns about those.
(it triggers for me all the time because those happen when you
copy patch reject blocks from hand and remove the leading character
without removing the space afterwards)
-Andi
--
[email protected] -- Speaking for myself only.
On Fri, Dec 18, 2009 at 10:43 AM, Mikulas Patocka <[email protected]> wrote:
>> Note: I'm not specifically arguing for keeping the 80-column rule, the
>> project I work on uses 100 columns, and that's quite workable, but I
>> haven't had any problem working with 80 columns as a limit either. ?I
>> do however think that just removing the limit without replacing it
>> with something better is a bad idea.
>>
>> -Kevin Granade
>
> But think what happens when someone views that 100-char code on 80-char
> terminal (or for example 94-char, that I used for some times too) ---
> every second line will be wasted with just 20 characters on the left. On
> the other hand, if you have unlimited line length, it will look better on
> 80-char terminal.
1. I think that is why the limit is at 80 characters, so it's at the
lowest common denominator of screen and will not wrap at all.
(actually wasn't the original limit an email mangling issue?)
2. Aside from choosing the specific limit of 80, I don't think line
wrapping is a major goal of the line length limitation, as I said
before it is a heuristic that is indicative of BAD CODE. The wrapping
issue is somewhat incidental.
3. I don't trust automated line-wrapping to provide readable code
anyway. People I think will be much better at doing this, for
example, a common issue is with formatted string functions like scanf.
A person will (ok, should) know to preferentially wrap after the
format string, and if they are really being nice can generalize about
other things, like name-value pairs:
scanf("%d,%d;%d,%d;%d,%d",
key1, value1,
key2, value2,
key3, value3);
whereas automated wrapping will only know about a limited number of
rules, generally purely syntax-based: (yes, this is somewhat
contrived, but I think it illustrates the kind of limitation I'm
talking about.)
scanf("%d,%d;%d,%d;%d,%d", key1, value1, key2,
value2, key3, value3);
But this is all really beside the point, if you really want to you can
write a text editor that will unwrap the text for you, and then
re-wrap it exactly how you want it. I don't think this would be all
that much harder than one that could intelligently wrap the lines in
the first place.
The point I keep coming back to is to follow the *important* coding
guidelines, like avoiding excessive nesting and properly factoring
your code into sub-modules (macros, functions, inlined functions, etc.
as appropriate.) The line limit acts as a warning that you are
probably doing something wrong, just shortening the line in question
is not the answer.
Other incidental issues:
"wasting lines" - Don't care, factor properly and a single unit of
code should fit in a screenfull or two.
"effort wasted manually wrapping lines" - Also don't care, if you
think the text editor should be smart enough to wrap lines
intelligently, just use it to do it for you. (hint: they're generally
not actually that smart). Also consider the alternative there, if
multi-hundred character lines are the norm, just how easy are the
contents of those lines to manipulate?
>
> Mikulas
>
Mikulas Patocka <[email protected]> writes:
> It would be best to keep line length is unlimited and say that it is the
> responsibility of the viewer to wrap it.
It would mean no continuation lines due to length at all.
Perhaps it's the way to go.
--
Krzysztof Halasa
Mikulas Patocka <[email protected]> writes:
> I don't like scrolling too but most viewers and editors wrap it
> automatically to screen width. For example, restrict your terminal to 50
> columnts and try using "less".
It's the tools.
I think it's much easier to fix tools ("smart wrap") than to rewrite the
source.
--
Krzysztof Halasa
On Fri, Dec 18, 2009 at 02:04:37PM +0100, Jiri Kosina wrote:
> On Thu, 17 Dec 2009, Bartlomiej Zolnierkiewicz wrote:
>
> > > > I like this patch, this is actually what I wanted to do.
> > >
> > > I have nothing against a switch, but it had better default to off.
> > >
> > > The whole 80-char limit is insane. It results in insane "fixes". Just
> > > about every time somebody "improves" a patch due to the warning, the
> > > result is worse than the original patch.
> >
> > Examples? :)
>
> balance_leaf() in fs/reiserfs/do_balan.c
>
> Example picked totally at random:
>
> set_le_ih_k_offset(ih,
> le_ih_k_offset(ih) +
> (tb->
> lbytes <<
> (is_indirect_le_ih
> (ih) ? tb->tb_sb->
> s_blocksize_bits -
> UNFM_P_SHIFT :
> 0)));
>
> See how everything is nicely aligned to 80 cols?
>
>
> Generally, don't look at this function after having a good lunch you want
> to keep. You have been warned.
>
This isn't a valid example, as it wasn't written by a human. This is the
result of Lindent being run blindly on the file and nothing more.
Try again.
On Mon, 21 Dec 2009, Paul Mundt wrote:
> > Example picked totally at random:
> >
> > set_le_ih_k_offset(ih,
> > le_ih_k_offset(ih) +
> > (tb->
> > lbytes <<
> > (is_indirect_le_ih
> > (ih) ? tb->tb_sb->
> > s_blocksize_bits -
> > UNFM_P_SHIFT :
> > 0)));
> >
> > See how everything is nicely aligned to 80 cols?
> >
> >
> > Generally, don't look at this function after having a good lunch you want
> > to keep. You have been warned.
> >
> This isn't a valid example, as it wasn't written by a human. This is the
> result of Lindent being run blindly on the file and nothing more.
>
> Try again.
But having hard rule imposed on everyone to stick with 80-char rule makes
people to do exactly this kind of things (blindly running indent -l80 and
believing that it makes the code more readable because it is
CodingStyle-compliant(tm)).
--
Jiri Kosina
SUSE Labs, Novell Inc.
On Fri, Dec 18, 2009 at 8:55 AM, Bartlomiej Zolnierkiewicz
<[email protected]> wrote:
> On Friday 18 December 2009 02:04:37 pm Jiri Kosina wrote:
>> On Thu, 17 Dec 2009, Bartlomiej Zolnierkiewicz wrote:
>>
>> > > > I like this patch, this is actually what I wanted to do.
>> > >
>> > > I have nothing against a switch, but it had better default to off.
>> > >
>> > > The whole 80-char limit is insane. It results in insane "fixes". Just
>> > > about every time somebody "improves" a patch due to the warning, the
>> > > result is worse than the original patch.
>> >
>> > Examples? :)
>>
>> balance_leaf() in fs/reiserfs/do_balan.c
>>
>> Example picked totally at random:
>>
>> ? ? ? set_le_ih_k_offset(ih,
>> ? ? ? ? ? ? ? ? ? ? ? ? ?le_ih_k_offset(ih) +
>> ? ? ? ? ? ? ? ? ? ? ? ? ?(tb->
>> ? ? ? ? ? ? ? ? ? ? ? ? ? lbytes <<
>> ? ? ? ? ? ? ? ? ? ? ? ? ? (is_indirect_le_ih
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ?(ih) ? tb->tb_sb->
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ?s_blocksize_bits -
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ?UNFM_P_SHIFT :
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ?0)));
>>
>> See how everything is nicely aligned to 80 cols?
>
> I see but the above code is an utter crap anyway.
>
> Firstly what kind of a function parameter is that:
>
> ? le_ih_k_offset(ih) + (tb->lbytes << (is_indirect_le_ih(ih) ? tb->tb_sb->s_blocksize_bits - UNFM_P_SHIFT : 0))
>
> ?
>
> [ BTW 'tb->tb_sb->s_blocksize_bits - UNFM_P_SHIFT' construct is used five
> ?times in balance_leaf() and is a likely candidate for helper / macro. ]
>
> More importantly the whole balance_leaf() function is almost 1400 LOC (!)
> big and impossible to read: code for handling particular 'switch' blocks
> should be factored out into separate functions etc.
>
> The point I was making is that the once we remove the limit we don't have
> other tool to _automatically_ point suspicious code areas (yes, I would
> also prefer intelligent static code checker over dumb limit but it simply
A static code checker (like sparse) could do much more intelligent
wrapping analysis. Maybe someone will take this up as a thesis
project and provide a replacement for indent. Post example
reformattings to LKML until the formatting gods are satisfied.
The problem I see is that C statements that over over 50 chars or so
should be flagged if they aren't declarations or strings. That's an
independent problem from excessive nesting. Tools like ident aren't
smart enough to separate these problems and flags the combination of
statement length and nesting.
I'd be a lot happier if someone simply patched out all of the trailing
whitespace that has gotten in to the kernel tree over the years.
jonsmirl@terra:~/fs$ grep -rI [[:space:]]$ * | wc
47804 231898 2888009
> not here as things are today and dumb limit works surprisingly well most
> of the time -- please note how that the structural problem with the code
> example given is immediately visible with the current limit).
>
>> Generally, don't look at this function after having a good lunch you want
>> to keep. You have been warned.
>
> No worries, I visit dark places (ide, staging, ..) and come back alive.. ;)
>
> --
> Bartlomiej Zolnierkiewicz
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at ?http://www.tux.org/lkml/
>
--
Jon Smirl
[email protected]