Attached to this email is chomp.pl, a Perl script which removes trailing
whitespace from several files. I've had this for years, as trailing
whitespace is one of my pet peeves.
Now that git-applymbox complains loudly whenever a patch adds trailing
whitespace, I figured this script may be useful to others.
Jeff
#!/usr/bin/perl
#
# Clean a text file of stealth whitespace
#

use bytes;

$name = 'cleanfile';

foreach $f ( @ARGV ) {
    print STDERR "$name: $f\n";

    if (! -f $f) {
        print STDERR "$f: not a file\n";
        next;
    }

    if (!open(FILE, '+<', $f)) {
        print STDERR "$name: Cannot open file: $f: $!\n";
        next;
    }

    binmode FILE;

    # First, verify that it is not a binary file
    $is_binary = 0;

    while (read(FILE, $data, 65536) > 0) {
        if ($data =~ /\0/) {
            $is_binary = 1;
            last;
        }
    }

    if ($is_binary) {
        print STDERR "$name: $f: binary file\n";
        next;
    }

    seek(FILE, 0, 0);

    @blanks = ();
    @lines  = ();

    while ( defined($line = <FILE>) ) {
        $line =~ s/[ \t\r\n]*$/\n/;

        if ( $line eq "\n" ) {
            push(@blanks, $line);
        } else {
            push(@lines, @blanks);
            push(@lines, $line);
            @blanks = ();
        }
    }

    # Any blanks at the end of the file are discarded

    seek(FILE, 0, 0);
    print FILE @lines;

    if ( !defined($where = tell(FILE)) ||
         !truncate(FILE, $where) ) {
        die "$name: Failed to truncate modified file: $f: $!\n";
    }

    close(FILE);
}
> Attached to this email is chomp.pl, a Perl script which removes trailing
> whitespace from several files. I've had this for years, as trailing whitespace
> is one of my pet peeves.
>
> Now that git-applymbox complains loudly whenever a patch adds trailing
> whitespace, I figured this script may be useful to others.
>
Pretty long script. How about this two-liner? It does not show 'bytes
chomped' but it also trims trailing whitespace.
#!/usr/bin/perl -i -p
s/[ \t\r\n]+$//
Jan Engelhardt
--
Hello,
> #!/usr/bin/perl -i -p
> s/[ \t\r\n]+$//
perl -p -i -e 's/\s+$//' file1 file2 file3 ...
Thomas
On Saturday May 27, Thomas wrote:
> Hello,
>
> > #!/usr/bin/perl -i -p
> > s/[ \t\r\n]+$//
>
> perl -p -i -e 's/\s+$//' file1 file2 file3 ...
>
Uhm... have either of you actually tried those? When I tried, I lost
all the '\n' characters :-(
perl -pi -e 's/[ \t\r]+$//' *.[ch]
seems to actually work.
NeilBrown
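
A small sketch of why the two substitutions behave differently under -p, where each line still carries its "\n". \s matches the newline, so s/\s+$// eats it; [ \t\r] does not, and "$" can still match just before the final newline. The sub names below are made up for illustration and are not part of any script in this thread.

```perl
use strict;
use warnings;

# Hypothetical helpers: each applies one of the substitutions from the
# thread to a copy of the line, much as "perl -p" would apply it to $_.
sub strip_with_s {
    my ($line) = @_;
    $line =~ s/\s+$//;          # \s includes "\n", so the newline goes too
    return $line;
}

sub strip_keep_newline {
    my ($line) = @_;
    $line =~ s/[ \t\r]+$//;     # "$" also matches just before a final "\n"
    return $line;
}

# Report, for each variant, whether the line terminator survived.
for my $fn (\&strip_with_s, \&strip_keep_newline) {
    my $out = $fn->("int x; \t\n");
    print $out =~ /\n\z/ ? "newline kept\n" : "newline lost\n";
}
```

With the first variant every line of the file ends up joined to the next one, which matches the lost '\n' characters reported above.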
Jan Engelhardt wrote:
>> Attached to this email is chomp.pl, a Perl script which removes trailing
>> whitespace from several files. I've had this for years, as trailing whitespace
>> is one of my pet peeves.
>>
>> Now that git-applymbox complains loudly whenever a patch adds trailing
>> whitespace, I figured this script may be useful to others.
>>
>
> Pretty long script. How about this two-liner? It does not show 'bytes
> chomped' but it also trims trailing whitespace.
>
> #!/usr/bin/perl -i -p
> s/[ \t\r\n]+$//
Yes, it does, but a bit too aggressive for what we need :)
Jeff
H. Peter Anvin wrote:
> Jeff Garzik wrote:
>>
>> Attached to this email is chomp.pl, a Perl script which removes
>> trailing whitespace from several files. I've had this for years, as
>> trailing whitespace is one of my pet peeves.
>>
>> Now that git-applymbox complains loudly whenever a patch adds trailing
>> whitespace, I figured this script may be useful to others.
>>
>
> This is the script I use for the same purpose. It's a bit more
> sophisticated, in that it detects and avoids binary files, and doesn't
> throw an error if it encounters a directory (which can happen if you
> give it a wildcard.)
Chewing the EOF blanks is nice. The only nit I have is that your script
rewrites the file even if nothing was changed.
Jeff
Jan Engelhardt wrote:
>> Attached to this email is chomp.pl, a Perl script which removes trailing
>> whitespace from several files. I've had this for years, as trailing whitespace
>> is one of my pet peeves.
And my scripts.
>> Pretty long script. How about this two-liner? It does not show 'bytes
>> chomped' but it also trims trailing whitespace.
>>
>> #!/usr/bin/perl -i -p
>> s/[ \t\r\n]+$//
>
> Yes, it does, but a bit too aggressive for what we need :)
>
Whoops, should have been s/[ \t\r]+$//
And the CL form is
perl -i -pe '...'
Somehow, you can't group it to -ipe, but who cares.
Jan Engelhardt
--
I love perl golf for this kind of stuff... but git-stripspace is part
of git already. Even then, I tend to do it with perl -pi -e ''
constructs ;-)
cheers,
m
On Sun, 28 May 2006, Martin Langhoff wrote:
>
> I love perl golf for this kind of stuff... but git-stripspace is part
> of git already. Even then, I tend to do it with perl -pi -e ''
> constructs ;-)
Well, git-stripspace actually does something slightly different, in that
it also removes extraneous all-whitespace lines from the beginning, the
end, and the middle (in the middle, the rule is: two or more empty lines
are collapsed into one).
Ie it's a total hack for parsing just commit messages (and it is in C,
because I can personally write 25 lines of C in about a millionth of the
time I can write 3 lines of perl).
Linus
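
The blank-line rule described above can be sketched in Perl roughly as follows. This is only an illustration of the behaviour as stated in this message (drop leading and trailing blank lines, collapse runs of blank lines in the middle into one), not git's actual C code; the sub name stripspace_sketch is hypothetical, and the trailing-whitespace trim is borrowed from the cleanfile script in this thread.

```perl
use strict;
use warnings;

# Hypothetical sketch: takes a list of lines, returns the cleaned list.
sub stripspace_sketch {
    my @in = @_;
    my @out;
    my $pending_blank = 0;      # saw blank line(s) after some real content

    for my $line (@in) {
        # Trim trailing whitespace and normalise the terminator to "\n".
        (my $l = $line) =~ s/[ \t\r\n]*$/\n/;

        if ($l eq "\n") {
            # Leading blanks are ignored; later runs are remembered once.
            $pending_blank = 1 if @out;
            next;
        }

        # A run of blanks in the middle collapses into a single empty line.
        push @out, "\n" if $pending_blank;
        $pending_blank = 0;
        push @out, $l;
    }

    # A pending blank at the end is never flushed, so trailing blanks vanish.
    return @out;
}

print stripspace_sketch("\n", "subject \n", "\n", "\n", "body\t\n", "\n");
```

Unlike the cleanfile script, which keeps blank lines in the middle of a file as they are, this collapses them, which is why it suits commit messages better than source files.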
Jan Engelhardt (on Sat, 27 May 2006 14:42:02 +0200 (MEST)) wrote:
>And the CL form is
> perl -i -pe '...'
>Somehow, you can't group it to -ipe, but who cares.
-i takes an optional extension, which is used to create backup files.
As such, -i must be followed by a space (or be the last switch in its
group, as in -pi) if you want no extension (and no backup); in -ipe,
the "pe" is taken as the extension.
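
Inside a script, the same setting is available through the special variable $^I, which may make the extension rule easier to see. The following is a minimal sketch; the sub name strip_in_place and the ".orig" extension in the usage note are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: strip trailing blanks in place, roughly like
#   perl -i<ext> -pe 's/[ \t\r]+$//' files...
# $ext is the backup extension; '' means edit in place with no backup.
sub strip_in_place {
    my ($ext, @files) = @_;

    local $^I   = $ext;     # the value the -i switch would set
    local @ARGV = @files;   # <> reads (and rewrites) these files

    while (<>) {
        s/[ \t\r]+$//;
        print;              # goes to the rewritten file, not the terminal
    }
    return;
}
```

Calling strip_in_place('', 'foo.c') corresponds to a bare -i, while strip_in_place('.orig', 'foo.c') corresponds to -i.orig and leaves the original contents in foo.c.orig.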
#!/usr/bin/perl
#
# Clean a text file of stealth whitespace
#

use bytes;

$name = 'cleanfile';

foreach $f ( @ARGV ) {
    print STDERR "$name: $f\n";

    if (! -f $f) {
        print STDERR "$f: not a file\n";
        next;
    }

    if (!open(FILE, '+<', $f)) {
        print STDERR "$name: Cannot open file: $f: $!\n";
        next;
    }

    binmode FILE;

    # First, verify that it is not a binary file
    $is_binary = 0;

    while (read(FILE, $data, 65536) > 0) {
        if ($data =~ /\0/) {
            $is_binary = 1;
            last;
        }
    }

    if ($is_binary) {
        print STDERR "$name: $f: binary file\n";
        next;
    }

    seek(FILE, 0, 0);

    $in_bytes    = 0;
    $out_bytes   = 0;
    $blank_bytes = 0;

    @blanks = ();
    @lines  = ();

    while ( defined($line = <FILE>) ) {
        $in_bytes += length($line);
        $line =~ s/[ \t\r\n]*$/\n/;

        if ( $line eq "\n" ) {
            push(@blanks, $line);
            $blank_bytes += length($line);
        } else {
            push(@lines, @blanks);
            $out_bytes += $blank_bytes;
            push(@lines, $line);
            $out_bytes += length($line);
            @blanks = ();
            $blank_bytes = 0;
        }
    }

    # Any blanks at the end of the file are discarded

    if ($in_bytes != $out_bytes) {
        # Only write to the file if changed
        seek(FILE, 0, 0);
        print FILE @lines;

        if ( !defined($where = tell(FILE)) ||
             !truncate(FILE, $where) ) {
            die "$name: Failed to truncate modified file: $f: $!\n";
        }
    }

    close(FILE);
}
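
The script above decides whether to rewrite by comparing byte counts. As far as I can tell, the counts can agree even though the content differs, for example when the last line ends in a tab and has no newline ("a\t" becomes "a\n", two bytes either way). One alternative, sketched here under that assumption and not taken from the script's author, is to compare the cleaned text with the original directly. The sub names clean_text and needs_rewrite are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the cleaned form of a whole file's contents,
# using the same per-line rule and blank-line buffering as the script.
sub clean_text {
    my ($text) = @_;
    my (@lines, @blanks);

    for my $line (split /^/, $text) {       # split into lines, keeping "\n"
        (my $l = $line) =~ s/[ \t\r\n]*$/\n/;

        if ($l eq "\n") {
            push @blanks, $l;               # held back until real content follows
        } else {
            push @lines, @blanks, $l;
            @blanks = ();
        }
    }

    # Blanks still buffered at end of input are discarded.
    return join '', @lines;
}

# True if cleaning would change the text at all.
sub needs_rewrite {
    my ($text) = @_;
    return clean_text($text) ne $text;
}

print needs_rewrite("a\t") ? "rewrite\n" : "leave alone\n";
```

This holds the whole file in memory twice, which the script already does in effect by collecting @lines, so the extra cost is probably acceptable for source files.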
Hi,
On Sat, 27 May 2006, Linus Torvalds wrote:
> Well, git-stripspace actually does something slightly different, in that
> it also removes extraneous all-whitespace lines from the beginning, the
> end, and the middle (in the middle, the rule is: two or more empty lines
> are collapsed into one).
>
> Ie it's a total hack for parsing just commit messages (and it is in C,
> because I can personally write 25 lines of C in about a millionth of the
> time I can write 3 lines of perl).
But there is no good reason not to add some code and a command line
switch, so that this tool with a very generic name actually performs what
a normal person would expect from that name.
Ciao,
Dscho