2006-05-27 02:27:20

by Jeff Garzik

Subject: [SCRIPT] chomp: trim trailing whitespace


Attached to this email is chomp.pl, a Perl script which removes trailing
whitespace from several files. I've had this for years, as trailing
whitespace is one of my pet peeves.

Now that git-applymbox complains loudly whenever a patch adds trailing
whitespace, I figured this script may be useful to others.

Jeff




Attachments:
chomp.pl (1.02 kB)

2006-05-27 04:17:58

by H. Peter Anvin

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

#!/usr/bin/perl
#
# Clean a text file of stealth whitespace
#

use bytes;

$name = 'cleanfile';

foreach $f ( @ARGV ) {
    print STDERR "$name: $f\n";

    if (! -f $f) {
        print STDERR "$f: not a file\n";
        next;
    }

    if (!open(FILE, '+<', $f)) {
        print STDERR "$name: Cannot open file: $f: $!\n";
        next;
    }

    binmode FILE;

    # First, verify that it is not a binary file
    $is_binary = 0;

    while (read(FILE, $data, 65536) > 0) {
        if ($data =~ /\0/) {
            $is_binary = 1;
            last;
        }
    }

    if ($is_binary) {
        print STDERR "$name: $f: binary file\n";
        next;
    }

    seek(FILE, 0, 0);

    @blanks = ();
    @lines = ();

    while ( defined($line = <FILE>) ) {
        $line =~ s/[ \t\r\n]*$/\n/;

        if ( $line eq "\n" ) {
            push(@blanks, $line);
        } else {
            push(@lines, @blanks);
            push(@lines, $line);
            @blanks = ();
        }
    }

    # Any blanks at the end of the file are discarded

    seek(FILE, 0, 0);
    print FILE @lines;

    if ( !defined($where = tell(FILE)) ||
         !truncate(FILE, $where) ) {
        die "$name: Failed to truncate modified file: $f: $!\n";
    }
    close(FILE);
}


Attachments:
cleanfile (1.10 kB)

2006-05-27 10:18:43

by Jan Engelhardt

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

> Attached to this email is chomp.pl, a Perl script which removes trailing
> whitespace from several files. I've had this for years, as trailing whitespace
> is one of my pet peeves.
>
> Now that git-applymbox complains loudly whenever a patch adds trailing
> whitespace, I figured this script may be useful to others.
>

Pretty long script. How about this two-liner? It does not show 'bytes
chomped' but it also trims trailing whitespace.

#!/usr/bin/perl -i -p
s/[ \t\r\n]+$//



Jan Engelhardt
--

2006-05-27 10:24:42

by Thomas Glanzmann

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

Hello,

> #!/usr/bin/perl -i -p
> s/[ \t\r\n]+$//

perl -p -i -e 's/\s+$//' file1 file2 file3 ...

Thomas

2006-05-27 10:36:28

by NeilBrown

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

On Saturday May 27, Thomas Glanzmann wrote:
> Hello,
>
> > #!/usr/bin/perl -i -p
> > s/[ \t\r\n]+$//
>
> perl -p -i -e 's/\s+$//' file1 file2 file3 ...
>

Uhm... have either of you actually tried those? When I tried them, I lost
all the '\n' characters :-(

perl -pi -e 's/[ \t\r]+$//' *.[ch]

seems to actually work.

NeilBrown

2006-05-27 11:33:04

by Jeff Garzik

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

Jan Engelhardt wrote:
>> Attached to this email is chomp.pl, a Perl script which removes trailing
>> whitespace from several files. I've had this for years, as trailing whitespace
>> is one of my pet peeves.
>>
>> Now that git-applymbox complains loudly whenever a patch adds trailing
>> whitespace, I figured this script may be useful to others.
>>
>
> Pretty long script. How about this two-liner? It does not show 'bytes
> chomped' but it also trims trailing whitespace.
>
> #!/usr/bin/perl -i -p
> s/[ \t\r\n]+$//

Yes, it does, but a bit too aggressive for what we need :)

Jeff



2006-05-27 11:42:06

by Jeff Garzik

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

H. Peter Anvin wrote:
> Jeff Garzik wrote:
>>
>> Attached to this email is chomp.pl, a Perl script which removes
>> trailing whitespace from several files. I've had this for years, as
>> trailing whitespace is one of my pet peeves.
>>
>> Now that git-applymbox complains loudly whenever a patch adds trailing
>> whitespace, I figured this script may be useful to others.
>>
>
> This is the script I use for the same purpose. It's a bit more
> sophisticated, in that it detects and avoids binary files, and doesn't
> throw an error if it encounters a directory (which can happen if you
> give it a wildcard.)

Chewing the EOF blanks is nice. The only nit I have is that your script
rewrites the file even if nothing was changed.

Jeff



2006-05-27 11:48:33

by Dmitry Fedorov

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

Jan Engelhardt wrote:
>> Attached to this email is chomp.pl, a Perl script which removes trailing
>> whitespace from several files. I've had this for years, as trailing
>> whitespace is one of my pet peeves.

And my scripts.


Attachments:
(No filename) (226.00 B)
find-text-files (7.65 kB)
truncate-eol-whitespace (4.80 kB)

2006-05-27 12:42:13

by Jan Engelhardt

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

>> Pretty long script. How about this two-liner? It does not show 'bytes
>> chomped' but it also trims trailing whitespace.
>>
>> #!/usr/bin/perl -i -p
>> s/[ \t\r\n]+$//
>
> Yes, it does, but a bit too aggressive for what we need :)
>
Whoops, should have been s/[ \t\r]+$//
And the CL form is
perl -i -pe '...'

Somehow, you can't group it to -ipe, but who cares.


Jan Engelhardt
--

2006-05-27 15:29:00

by Martin Langhoff

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

I love perl golf for this kind of stuff... but git-stripspace is part
of git already. Even then, I tend to do it with perl -pi -e ''
constructs ;-)

cheers,


m

2006-05-27 16:13:23

by Linus Torvalds

Subject: Re: [SCRIPT] chomp: trim trailing whitespace



On Sun, 28 May 2006, Martin Langhoff wrote:
>
> I love perl golf for this kind of stuff... but git-stripspace is part
> of git already. Even then, I tend to do it with perl -pi -e ''
> constructs ;-)

Well, git-stripspace actually does something slightly different, in that
it also removes extraneous all-whitespace lines from the beginning, the
end, and the middle (in the middle, the rule is: two or more empty lines
are collapsed into one).

Ie it's a total hack for parsing just commit messages (and it is in C,
because I can personally write 25 lines of C in about a millionth of the
time I can write 3 lines of perl).

Linus

2006-05-28 08:33:29

by Keith Owens

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

Jan Engelhardt (on Sat, 27 May 2006 14:42:02 +0200 (MEST)) wrote:
>And the CL form is
> perl -i -pe '...'
>Somehow, you can't group it to -ipe, but who cares.

-i takes an optional extension which is used to optionally create
backup files. As such, -i must be followed by space if you want no
extension (and no backup).

2006-05-28 09:25:07

by H. Peter Anvin

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

#!/usr/bin/perl
#
# Clean a text file of stealth whitespace
#

use bytes;

$name = 'cleanfile';

foreach $f ( @ARGV ) {
    print STDERR "$name: $f\n";

    if (! -f $f) {
        print STDERR "$f: not a file\n";
        next;
    }

    if (!open(FILE, '+<', $f)) {
        print STDERR "$name: Cannot open file: $f: $!\n";
        next;
    }

    binmode FILE;

    # First, verify that it is not a binary file
    $is_binary = 0;

    while (read(FILE, $data, 65536) > 0) {
        if ($data =~ /\0/) {
            $is_binary = 1;
            last;
        }
    }

    if ($is_binary) {
        print STDERR "$name: $f: binary file\n";
        next;
    }

    seek(FILE, 0, 0);

    $in_bytes = 0;
    $out_bytes = 0;
    $blank_bytes = 0;

    @blanks = ();
    @lines = ();

    while ( defined($line = <FILE>) ) {
        $in_bytes += length($line);
        $line =~ s/[ \t\r\n]*$/\n/;

        if ( $line eq "\n" ) {
            push(@blanks, $line);
            $blank_bytes += length($line);
        } else {
            push(@lines, @blanks);
            $out_bytes += $blank_bytes;
            push(@lines, $line);
            $out_bytes += length($line);
            @blanks = ();
            $blank_bytes = 0;
        }
    }

    # Any blanks at the end of the file are discarded

    if ($in_bytes != $out_bytes) {
        # Only write to the file if changed
        seek(FILE, 0, 0);
        print FILE @lines;

        if ( !defined($where = tell(FILE)) ||
             !truncate(FILE, $where) ) {
            die "$name: Failed to truncate modified file: $f: $!\n";
        }
    }

    close(FILE);
}


Attachments:
cleanfile (1.38 kB)

2006-05-28 10:00:08

by Johannes Schindelin

Subject: Re: [SCRIPT] chomp: trim trailing whitespace

Hi,

On Sat, 27 May 2006, Linus Torvalds wrote:

> Well, git-stripspace actually does something slightly differently, in that
> it also removes extraneous all-whitespace lines from the beginning, the
> end, and the middle (in the middle, the rule is: two or more empty lines
> are collapsed into one).
>
> Ie it's a total hack for parsing just commit messages (and it is in C,
> because I can personally write 25 lines of C in about a millionth of the
> time I can write 3 lines of perl).

But there is no good reason not to add some code and a command line
switch, so that this tool with a very generic name actually performs what
a normal person would expect from that name.

Ciao,
Dscho