Hello Linus,
you can either use bk receive to patch this mail, you can pull from
bk://krusty.dt.e-technik.uni-dortmund.de (NOTE: no trailing slash)
or you can apply the patch below.
Patch description:
This patch updates BK-kernel-tools/shortlog to a new upstream version,
adding more address -> name translations.
Matthias
##### DIFFSTAT #####
# shortlog | 27 ++++++++++++++++++++++++++-
# 1 files changed, 26 insertions(+), 1 deletion(-)
##### GNUPATCH #####
# This is a BitKeeper generated patch for the following project:
# Project Name: BK kernel tools
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.42 -> 1.43
# shortlog 1.18 -> 1.19
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/03/26 [email protected] 1.43
# Translate another 18 addresses to names.
# Update to upstream 0.85 (was 0.83).
# --------------------------------------------
#
diff -Nru a/shortlog b/shortlog
--- a/shortlog Wed Mar 26 11:30:36 2003
+++ b/shortlog Wed Mar 26 11:30:36 2003
@@ -8,7 +8,7 @@
# Tomas Szepe <[email protected]>
# Vitezslav Samel <[email protected]>
#
-# $Id: lk-changelog.pl,v 0.83 2003/03/19 08:19:48 vita Exp $
+# $Id: lk-changelog.pl,v 0.85 2003/03/26 08:22:11 vita Exp $
# ----------------------------------------------------------------------
# Distribution of this script is permitted under the terms of the
# GNU General Public License (GNU GPL) v2.
@@ -77,6 +77,7 @@
'[email protected]' => 'Arnaldo Carvalho de Melo',
'[email protected]' => 'Arnaldo Carvalho de Melo',
'[email protected]' => 'Arnaldo Carvalho de Melo',
+'[email protected]' => 'Allen Curtis',
'[email protected]' => 'Adam Radford', # google
'[email protected]' => 'Adam Radford', # google
'[email protected]' => 'Adam J. Richter',
@@ -100,6 +101,7 @@
'[email protected]' => 'Andrew Morton',
'[email protected]' => 'Andrew Morton',
'[email protected]' => 'Adam Kropelin', # lbdb
+'[email protected]' => 'Alan Cox',
'[email protected]' => 'Alan Cox',
'[email protected]' => 'Alan Cox',
'[email protected]' => 'Alan Cox',
@@ -143,6 +145,7 @@
'[email protected]' => 'Benjamin LaHaise',
'[email protected]' => 'Benjamin LaHaise',
'[email protected]' => 'Benjamin LaHaise', # guessed
+'[email protected]' => 'Bruce D. Elliott', # it's typo IMHO
'[email protected]' => 'Bruce D. Elliott',
'[email protected]' => 'Bart De Schuymer',
'[email protected]' => 'Brian Beattie', # from david.nelson
@@ -158,6 +161,7 @@
'[email protected]' => 'Bj?rn Andersson', # google, guessed ?
'[email protected]' => 'Bjorn Wesen',
'[email protected]' => 'Bjorn Helgaas',
+'[email protected]' => 'Dave Blaschke',
'[email protected]' => 'Oskar Andreasson',
'[email protected]' => 'Blake Matheny', # google
'[email protected]' => 'Boris Itkis', # by Kristian Peters
@@ -195,21 +199,25 @@
'[email protected]' => 'Kevin Brosius',
'[email protected]' => 'Colin Gibbs',
'[email protected]' => 'Matthew Dobson',
+'[email protected]' => 'Jonathan Corbet',
'[email protected]' => 'Kevin Corry',
'[email protected]' => 'Cort Dougan',
'[email protected]' => 'Tom Coughlan',
'[email protected]' => 'Chris Hanson',
'[email protected]' => 'Christoph Rohland',
+'[email protected]' => 'Jeb J. Cramer',
'[email protected]' => 'Charles-Edouard Ruault',
'[email protected]' => 'Chad N. Tindel',
'[email protected]' => 'Ruslan U. Zakirov',
'[email protected]' => 'Chris Wedgwood',
+'[email protected]' => 'Charles Fumuso',
'[email protected]' => 'Christopher Yeoh',
'[email protected]' => 'David M?ller',
'[email protected]' => 'Bj?rn Augustsson',
'[email protected]' => 'Dan Aloni',
'[email protected]' => 'Daisy Chang', # from shortlog
'[email protected]' => 'Dale Farnsworth',
+'[email protected]' => 'Dale Farnsworth',
'[email protected]' => 'Martin Dalecki',
'[email protected]' => 'Martin Dalecki',
'[email protected]' => 'Dan Zink',
@@ -257,6 +265,7 @@
'[email protected]' => 'Doug Ledford',
'[email protected]' => 'Doug Ledford',
'[email protected]' => 'Doug Ledford',
+'[email protected]' => 'David Stevens',
'[email protected]' => 'Dave McCracken',
'[email protected]' => 'Denis Oliver Kropp',
'[email protected]' => 'Douglas Gilbert',
@@ -340,6 +349,7 @@
'[email protected]' => 'Michael Grundy',
'[email protected]' => 'Guillermo S. Romero',
'[email protected]' => 'Ghozlane Toumi',
+'[email protected]' => 'Geoffrey Wehrman',
'[email protected]' => 'Jamal Hadi Salim',
'[email protected]' => 'Hanna Linder',
'[email protected]' => 'Harald Welte',
@@ -366,6 +376,8 @@
'[email protected]' => 'Thiemo Seufer', # google
'[email protected]' => 'Adams IT Services',
'[email protected]' => 'Ivan Kokshaysky',
+'[email protected].(none)' => 'Ivan Kokshaysky',
+'ink@undisclosed.(none)' => 'Ivan Kokshaysky',
'[email protected]' => 'Ion Badulescu',
'[email protected]' => 'Ion Badulescu',
'[email protected]' => 'Ishan O. Jayawardena',
@@ -419,6 +431,7 @@
'[email protected]' => 'John Hesterberg',
'[email protected]' => 'Jack Hammer',
'[email protected]' => 'Jack Thomasson',
+'[email protected]' => 'Jason McMullan',
'[email protected]' => 'James Morris',
'[email protected]' => 'James Morris', # it's typo IMHO
'[email protected]' => 'Jochen Suckfuell',
@@ -468,6 +481,7 @@
'[email protected]' => 'Kasper Dupont',
'[email protected]' => 'Keith Underwood',
'[email protected]' => 'Kenneth W. Chen',
+'[email protected]' => 'Jeffrey S. Laing',
'[email protected]' => 'Paul Clements',
'[email protected]' => 'Kent Yoder',
'[email protected]' => 'Ari Juhani H?meenaho',
@@ -501,6 +515,7 @@
'[email protected]' => 'Lee Nash', # lbdb
'[email protected]' => 'Leigh Brown', # lbdb
'[email protected]' => 'John Levon',
+'[email protected]' => 'Luis F. Ortiz',
'[email protected]' => 'Dominik Brodowski',
'[email protected]' => 'Jan Marek',
'[email protected]' => 'Lionel Bouton',
@@ -514,6 +529,7 @@
'[email protected]' => 'Luben Tuikov',
'[email protected]' => 'Luc Van Oostenryck', # lbdb
'[email protected]' => 'Lucas Correia Villa Real', # google
+'[email protected]' => 'Jason Lunz',
'[email protected]' => 'Marc-Christian Petersen',
'[email protected]' => 'Marcus Alanen',
'[email protected]' => 'Armin Schindler',
@@ -693,6 +709,7 @@
'[email protected]' => 'Robert Love',
'[email protected]' => 'Rob Radez',
'[email protected]' => 'Robert Olsson',
+'[email protected]' => 'Dean Roehrich',
'[email protected]' => 'Rohit Seth',
'[email protected]' => 'Paul Rolland',
'[email protected]' => 'Roland McGrath',
@@ -834,6 +851,7 @@
'[email protected]' => 'Thomas Wahrenbruch',
'[email protected]' => 'Derek Atkins',
'[email protected]' => 'Wolfgang Denk',
+'[email protected]' => 'Ulrich Weigand',
'[email protected]' => 'Wes Schreiner',
'[email protected]' => 'Wolfram Gloger', # lbdb
'[email protected]' => 'Wayne Whitney',
@@ -854,6 +872,7 @@
'[email protected]' => 'Yaacov Akiba Slama',
'[email protected]' => 'Yokota Hiroshi',
'[email protected]' => 'Hideaki Yoshifuji', # lbdb
+'[email protected]' => 'Hideaki Yoshifuji',
'[email protected]' => 'Yuri Per', # lbdb
'[email protected]' => 'Pete Zaitcev',
'[email protected]' => 'Zinx Verituse',
@@ -1387,6 +1406,12 @@
__END__
# --------------------------------------------------------------------
# $Log: lk-changelog.pl,v $
+# Revision 0.85 2003/03/26 08:22:11 vita
+# Added 6 names for new addresses.
+#
+# Revision 0.84 2003/03/24 08:45:20 vita
+# Added 12 names for new addresses from current 2.5 BK tree.
+#
# Revision 0.83 2003/03/19 08:19:48 vita
# Added 4 new names for addresses from current linux-2.5 BK tree.
#
##### BKPATCH #####
This BitKeeper patch contains the following changesets:
1.42
## Wrapped with gzip_uu ##
begin 600 bkpatch22406
M'XL(`$R!@3X``]U6;6_;-A#^'/V*`U(@+5HQE&19MH`4RDO;=&[6HED'#-@7
M6KI8K"72("D[*?R']R]VHM(XZ;9B7;<OLP39U#W/'>_NT<G[\,&BR?=:X5PM
MA65"508QV(=S;5V^MVBO6=4OWVM-RT/;63Q<HE'8')[,Z`R'1>BT;FQ`P'?"
ME36LT=A\+V+)W1UWL\)\[_V+5Q_>'+\/@J,C.*V%6N`E.C@Z"IPV:]%4MEBA
M6G12,6>$LBTZP4K=;N^PVYCSF(XH3O@XG6[CZ3A-MQACFI:C2,RS289EO'-7
MZQ:_YBNA(XWHX*-MFF:347`&$1O%P)-#.N,(^#1/QCG/GO(HYQR^J%0Q5`B>
M1A#RX`3^Y3Q.@Q(^K"KAD#Q#M[+.H&A]>:56P-DD@<=71K?T,YL^>0:BJJ1:
M@*6T0>&F7QNT%BTC3\>?%U!U"]"=@U)WQJ&]`7T%/TN'GVPCUG`I6FQ8,(,T
MI6T&[W:M"L)O_`0!%SQX#E]KB*VU<8U>##5)HPG/1EDTV291-DVW5S@55V7&
MIX)C)>;57W3@@1?J:ASQ23+FV39.IJ.Q%]QGQ`.]??=^_K:K!U*;D$LOM6CR
MS5)+.(31_TYK0Y_>0F@VU_T97I/P/M?O'^CN+(H@"E[[ZSX\>EWET"S#TA>%
M/+)5\VP])-7WIN]`-`4^R:-I/IK`6CH!+ZY7\(A<\"GY.!`-7N--X;"LE6[%
M`JU495_I`SAZ#@?'9(:?R&`/GO5A!T[?.38W*!LTA>J:)J0F:K.XY9#Y-S@9
MS)X6IYYF5%7TESDU_!9**SA!LVB%4AY*0XJ@\PH+M6FD6NZV<F*Z$N&,P8NF
MD=HY#Y^,>WC98(O*%HVHI"WK._>GPWUX,]SWC.G(,W2WJ!NA"H-5+=PN"N4*
MI[?&'I^0+@F_,+A8UL7P7F#SY5PZRQ2Z@?2*K#`S6M3AN3"NO:6F/NUZ)8H'
M&AXXYPS>H4,#QVHM!_S8)_,QI$9T1A0ELH\K"G*O'3]TZM<#6=82?O28GC:*
M$D]KM3':%E*1TU)7V+.8Z&Z))$@+%P21U,F]?9#NP/:O+PVO+\[?DI?4Y[D4
MLE@+J=ECI14^&<@S(>$5&A)!6>/&+F4?-N6>T.A.VF516F2=LAN&57<7]$UO
M@U^Z<"8[F&V&JJ1\TO.H1B2CHK-,SMM=@A>B$501,GELZBO2RAOQ26Q$08+H
MKD.Y6H]W:IN)3YW21L+%+<H3QT,06V';H"S*AG8GPQIEA<V<]'8GD0MAEIV%
M,X]S:A!LEF8]W3I<XU4Q?+DXHL2LHXGT8,>7O15>&E2#P+*Q5[#M:%#4;,ZL
MK$AAOBW-/98WPPF#2V_WTDPRHL;]_Q)<R]UX^M-'V3_+!*6)A!6,_*A2OLE7
MVNR&%OC15G:&]N=@J%_,4CB9`0U"9,'^%^'B>^&R/EP\SND1>!@NWH7[HX?H
MG@>22)HGY"3[SS;,=^$X=2W)DRA/OPR7?4>XNS][)/YR:;OV*(Y1<'KA!;\#
(B+NW/VD*````
`
end
On Wed, 26 Mar 2003, Matthias Andree wrote:
>
> you can either use bk receive to patch this mail, you can pull from
> bk://krusty.dt.e-technik.uni-dortmund.de (NOTE: no trailing slash)
> or you can apply the patch below.
Btw, one feature I'd like to see in shortlog is the ability to use
regexps for email address matching, ie something like
'torvalds@.*transmeta.com' => 'Linus Torvalds'
...
'alan@.*swansea.linux.org.uk' => 'Alan Cox'
...
'[email protected]' => 'Benjamin LaHaise',
'bcrl@.*' => '?? Benjamin LaHaise',
..
I don't know whether you can force perl to do something like this, but if
somebody were to try...
Linus
Linus Torvalds <[email protected]> writes:
> On Wed, 26 Mar 2003, Matthias Andree wrote:
>>
>> you can either use bk receive to patch this mail, you can pull from
>> bk://krusty.dt.e-technik.uni-dortmund.de (NOTE: no trailing slash)
>> or you can apply the patch below.
>
> Btw, one feature I'd like to see in shortlog is the ability to use
> regexps for email address matching, ie something like
>
> 'torvalds@.*transmeta.com' => 'Linus Torvalds'
> ...
> 'alan@.*swansea.linux.org.uk' => 'Alan Cox'
> ...
> '[email protected]' => 'Benjamin LaHaise',
> 'bcrl@.*' => '?? Benjamin LaHaise',
> ..
>
> I don't know whether you can force perl to do something like this, but if
> somebody were to try...
if you change your list to:
@email_name_map = (
['torvalds@.*transmeta.com' => 'Linus Torvalds'],
...
['alan@.*swansea.linux.org.uk' => 'Alan Cox'],
...
['[email protected]' => 'Benjamin LaHaise'],
['bcrl@.*' => '?? Benjamin LaHaise'],
...);
something along these (untested) lines should do the trick:
sub email2name
{
my($email) = @_;
for my $i (@email_name_map) {
my($pattern, $name) = @$i;
return $name if ($email =~ m/$pattern/i);
}
return '??';
}
Regards, Olaf.
On Wed, 26 Mar 2003, Linus Torvalds wrote:
> Btw, one feature I'd like to see in shortlog is the ability to use
> regexps for email address matching, ie something like
>
> 'torvalds@.*transmeta.com' => 'Linus Torvalds'
> ...
> 'alan@.*swansea.linux.org.uk' => 'Alan Cox'
> ...
> '[email protected]' => 'Benjamin LaHaise',
> 'bcrl@.*' => '?? Benjamin LaHaise',
> ..
>
> I don't know whether you can force perl to do something like this, but if
> somebody were to try...
I'd like to keep the hash for all those addresses that aren't wildcards
and that aren't regexps -- we have fast, that is O(1) to O(log n),
access to the hash (depending on Perl's implementation) and we have
worse than O(n) for regexp, where n is the count of address strings or
regexps.
Would you agree to a version that has a set of fixed addresses and a
separate list of regexps, tries the hash first and then a list of
regexps? That sounds like a) easy addition, b) good performance to me
(before implementing it). If so, I could add some code for that feature.
--
Matthias Andree
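
A minimal sketch of the two-tier lookup proposed above (exact hash first, regexp list only on a miss). The names %fixed, @patterns and lookup_name, and all addresses, are placeholders invented for illustration; they are not taken from shortlog.

```perl
use strict;
use warnings;

# Exact addresses: keys are lowercased e-mail addresses.
# (Placeholder entries, not from shortlog.)
my %fixed = (
    'alice@example.org' => 'Alice Example',
);

# Regexp fallbacks, tried in listed order; the first match wins.
my @patterns = (
    [ qr/^bob\@.*\.example\.com$/i, 'Bob Example' ],
);

# Return the real name for an address, or the address itself if unknown.
sub lookup_name {
    my ($addr) = @_;
    my $key = lc $addr;
    return $fixed{$key} if exists $fixed{$key};   # cheap hash lookup first
    for my $entry (@patterns) {
        my ($re, $name) = @$entry;
        return $name if $key =~ $re;              # linear scan only on a miss
    }
    return $addr;
}

print lookup_name('Alice@Example.ORG'), "\n";
print lookup_name('bob@mail.example.com'), "\n";
print lookup_name('carol@example.net'), "\n";
```

The regexp list is only walked for addresses the hash does not know, so the common case stays a single hash access.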
On Wed, Mar 26, 2003 at 09:21:22AM -0800, Linus Torvalds wrote:
...
> Btw, one feature I'd like to see in shortlog is the ability to use
> regexps for email address matching, ie something like
>
> 'torvalds@.*transmeta.com' => 'Linus Torvalds'
> ...
> 'alan@.*swansea.linux.org.uk' => 'Alan Cox'
> ...
...
> I don't know whether you can force perl to do something like this, but if
> somebody were to try...
My perl-incantation wizard friend (I consider myself a mere journeyman)
usually uses convoluted 'map' constructs to do things like this.
It is an amazingly high-powered way to do things -- and it often helps you
produce a "write only" script.
Wrapping the converting regular expressions into a map {} construct gives
something like:
@summary = map {
s/torvalds\@.*transmeta.com/Linus Torvalds/;
s/alan\@.*swansea.linux.org.uk/Alan Cox/;
# ... etc ...
$_    # return the (modified) line, not the result of s///
} @summary;
See "man perlfunc" and look for " map ".
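
A hedged variant of the same idea that keeps the substitutions in a table and works on copies, so the input list is not modified through map's aliasing of $_. The table @subst, the function translate_lines and the addresses are hypothetical names made up for this sketch.

```perl
use strict;
use warnings;

# Hypothetical substitution table: [pattern, replacement] pairs.
my @subst = (
    [ qr/alice\@\S*example\.org/i, 'Alice Example' ],
    [ qr/bob\@\S*example\.com/i,   'Bob Example'   ],
);

# Apply every substitution to a copy of each line and return the copies.
sub translate_lines {
    my @lines = @_;
    return map {
        my $line = $_;                            # copy; caller's data stays untouched
        $line =~ s/$_->[0]/$_->[1]/g for @subst;  # $_ is a table entry inside this loop
        $line;
    } @lines;
}

print "$_\n" for translate_lines('fix by alice@mail.example.org');
```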
> Linus
/Matti Aarnio
On Wed, Mar 26, 2003 at 09:10:31PM +0100, Matthias Andree wrote:
> On Wed, 26 Mar 2003, Linus Torvalds wrote:
>
> > Btw, one feature I'd like to see in shortlog is the ability to use
> > regexps for email address matching, ie something like
> >
> > 'torvalds@.*transmeta.com' => 'Linus Torvalds'
> > ...
> > 'alan@.*swansea.linux.org.uk' => 'Alan Cox'
> > ...
> > '[email protected]' => 'Benjamin LaHaise',
> > 'bcrl@.*' => '?? Benjamin LaHaise',
> > ..
> >
> > I don't know whether you can force perl to do something like this, but if
> > somebody were to try...
Perl is very regex-friendly. Sure it can do this :)
> I'd like to keep the hash for all those addresses that aren't wildcards
> and that aren't regexps -- we have fast, that is O(1) to O(log n),
> access to the hash (depending on Perl's implementation) and we have
> worse than O(n) for regexp, where n is the count of address strings or
> regexps.
>
> Would you agree to a version that has a set of fixed addresses and a
> separate list of regexps, tries the hash first and then a list of
> regexps? That sounds like a) easy addition, b) good performance to me
> (before implementing it). If so, I could add some code for that feature.
Do we really care about performance here?
I think maintainability is probably more important.
In any case, splitting the lists into "fixed" and "regex" doesn't seem
like a bad idea, provided that the change was fairly easy and
self-contained.
Jeff
On Wed, 26 Mar 2003, Linus Torvalds wrote:
> I don't know whether you can force perl to do something like this, but if
> somebody were to try...
How about this (search for 'Alan Cox' to see the syntax):
Index: lk-changelog.pl
===================================================================
RCS file: /var/CVS/lk-changelog/lk-changelog.pl,v
retrieving revision 0.85
retrieving revision 0.88
diff -u -r0.85 -r0.88
--- lk-changelog.pl 26 Mar 2003 08:22:11 -0000 0.85
+++ lk-changelog.pl 26 Mar 2003 21:12:23 -0000 0.88
@@ -8,7 +8,7 @@
# Tomas Szepe <[email protected]>
# Vitezslav Samel <[email protected]>
#
-# $Id: lk-changelog.pl,v 0.85 2003/03/26 08:22:11 vita Exp $
+# $Id: lk-changelog.pl,v 0.88 2003/03/26 21:12:23 emma Exp $
# ----------------------------------------------------------------------
# Distribution of this script is permitted under the terms of the
# GNU General Public License (GNU GPL) v2.
@@ -53,6 +53,8 @@
use Text::Tabs;
use Text::Wrap;
+sub selftest();
+
# --------------------------------------------------------------------
# customize the following line to change the indentation of the change
# lines, $indent1 is used for the first line of an entry, $indent for
@@ -63,6 +65,11 @@
my $debug = 0;
# --------------------------------------------------------------------
+# Perl syntax magic here, "=>" is equivalent to ","
+my @addrregexps = (
+[ 'alan@.*\.swansea\.linux\.org\.uk' => 'Alan Cox', ],
+[ '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~' => '~~~~~~~~' ]);
+
# the key is the email address in ALL LOWER CAPS!
# the value is the real name of the person
#
@@ -101,8 +108,6 @@
'[email protected]' => 'Andrew Morton',
'[email protected]' => 'Andrew Morton',
'[email protected]' => 'Adam Kropelin', # lbdb
-'[email protected]' => 'Alan Cox',
-'[email protected]' => 'Alan Cox',
'[email protected]' => 'Alan Cox',
'[email protected]' => 'Alan Cox',
'[email protected]' => 'Alexander Atanasov',
@@ -889,12 +894,27 @@
my %address_unknown;
-# get name associated to an email address
-sub rmap_address {
- my @o = map {defined $addresses{$_} ? $addresses{$_} :
- scalar (($address_unknown{$_} = 1), $_); }
- map { lc; } @_;
- return wantarray ? @o : $o[0];
+# get name associated with an "email address" formatted
+# BK_USER,BK_HOST tuple
+sub rmap_address($) {
+ my $in = shift;
+ my $key = lc $in;
+ # try hash lookup first, return result if any
+ if (defined $addresses{$key}) {
+ return $addresses{$key};
+ }
+ # try matching against all regexps in listed order
+ # return result if any
+ foreach my $ar (@addrregexps) {
+ if ($in =~ m/$ar->[0]/) {
+ return $ar->[1];
+ }
+ }
+ # when the address is unknown, return the unchanged input
+ # and mark the address as unknown (so it can be printed in --warn
+ # mode).
+ $address_unknown{$key} = 1;
+ return $in;
}
# case insensitive string comparison
@@ -1274,12 +1294,26 @@
return print $opt{width} ? expand(wrap("", "", ($a))) : $a, "\n";
}
+sub selftest() {
+ my $rc = 0;
+ foreach my $address (keys %addresses) {
+ foreach my $ar (@addrregexps) {
+ if ($address =~ m/$ar->[0]/) {
+ print STDERR "Warning: address '$address'\n";
+ print STDERR " shadows regexp '$ar->[0]'\n";
+ $rc = 1;
+ }
+ }
+ }
+ return $rc;
+}
+
# === MAIN PROGRAM ===============================================
# Command line arguments
# What options do we support?
my @opts = ("help|?|h", "man", "mode=s", "compress!", "count!", "width:i",
"swap!", "merge!", "warn!", "multi!", "abbreviate-names!",
- "by-surname!");
+ "by-surname!", "selftest");
# "bitkeeper|bk!");
# How do we parse them?
@@ -1311,7 +1345,8 @@
unless defined $table{$opt{mode}};
pod2usage(-verbose => 0,
-message => "$0: No files given, refusing to read from a TTY.")
- if (not $opt{bitkeeper} and (@ARGV == 0) and (-t STDIN));
+ if (not $opt{selftest} and not $opt{bitkeeper}
+ and (@ARGV == 0) and (-t STDIN));
pod2usage(-verbose => 0,
-message => "$0: Must have one or two arguments in --bitkeeper mode.")
if ($opt{bitkeeper} && (@ARGV < 1 || @ARGV > 2));
@@ -1358,6 +1393,10 @@
foreach (@ARGV) { print STDERR "DEBUG: '$_'\n"; }
}
+if ($opt{selftest}) {
+ exit selftest;
+}
+
# Main program
my @prolog;
my %log;
@@ -1406,6 +1445,18 @@
__END__
# --------------------------------------------------------------------
# $Log: lk-changelog.pl,v $
+# Revision 0.88 2003/03/26 21:12:23 emma
+# Add selftest mode check:
+# * check all addresses against all regexps to find addresses shadowing
+# regular expressions.
+#
+# Revision 0.87 2003/03/26 21:02:53 emma
+# Fix broken regexp for Alan's swansea.linux.org.uk addresses. Add some comments.
+#
+# Revision 0.86 2003/03/26 20:57:49 emma
+# Support regexp queries (but try hash lookups first for efficiency).
+# Requested by Linus Torvalds.
+#
# Revision 0.85 2003/03/26 08:22:11 vita
# Added 6 names for new addresses.
#
@@ -1737,6 +1788,8 @@
--width[=WIDTH] specify the line length, if omitted: $COLUMNS or 80.
text lines will not exceed this length.
+ --selftest perform some self tests (for developers of this script)
+
Warning: Neither --compress nor --count are currently functional with
--mode=full.
@@ -1825,6 +1878,8 @@
=head1 TODO
=over
+
+=item * OBFUSCATE ADDRESSES (requested by Solar Designer)
=item * --compress-me-harder
--
Matthias Andree
One suggestion I would make would be to break out the regexes into a
separate file; that way you will no longer need to touch the script once
the core logic is correct.
I got this working nicely to count users based upon patterns.
--jauder
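
A minimal sketch of keeping the patterns in a separate file, as suggested above. The file format ("regexp<TAB>name", with # comments) and the name load_patterns are assumptions made for this example; the demo reads from an in-memory string so that it runs without any file on disk.

```perl
use strict;
use warnings;

# Read "regexp<TAB>name" lines from a filehandle.
# Blank lines and lines starting with # are skipped.
# (The file format is an assumption made for this sketch.)
sub load_patterns {
    my ($fh) = @_;
    my @patterns;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(?:#.*)?$/;
        my ($re, $name) = split /\t/, $line, 2;
        next unless defined $name;               # ignore malformed lines
        push @patterns, [ qr/$re/i, $name ];
    }
    return @patterns;
}

# Demo with an in-memory "file"; a real script would open a path instead.
my $data = "# pattern\tname\nalice\@.*\\.example\\.org\tAlice Example\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my @patterns = load_patterns($fh);
close $fh;

print scalar(@patterns), " pattern(s) loaded\n";
```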
On Wed, 26 Mar 2003, Jeff Garzik wrote:
> On Wed, Mar 26, 2003 at 09:10:31PM +0100, Matthias Andree wrote:
> > On Wed, 26 Mar 2003, Linus Torvalds wrote:
> >
> > > Btw, one feature I'd like to see in shortlog is the ability to use
> > > regexps for email address matching, ie something like
> > >
> > > 'torvalds@.*transmeta.com' => 'Linus Torvalds'
> > > ...
> > > 'alan@.*swansea.linux.org.uk' => 'Alan Cox'
> > > ...
> > > '[email protected]' => 'Benjamin LaHaise',
> > > 'bcrl@.*' => '?? Benjamin LaHaise',
> > > ..
> > >
> > > I don't know whether you can force perl to do something like this, but if
> > > somebody were to try...
>
> Perl is very regex-friendly. Sure it can do this :)
>
>
> > I'd like to keep the hash for all those addresses that aren't wildcards
> > and that aren't regexps -- we have fast, that is O(1) to O(log n),
> > access to the hash (depending on Perl's implementation) and we have
> > worse than O(n) for regexp, where n is the count of address strings or
> > regexps.
> >
> > Would you agree to a version that has a set of fixed addresses and a
> > separate list of regexps, tries the hash first and then a list of
> > regexps? That sounds like a) easy addition, b) good performance to me
> > (before implementing it). If so, I could add some code for that feature.
>
> Do we really care about performance here?
>
> I think maintainability is probably more important.
>
> In any case, splitting the lists into "fixed" and "regex" doesn't seem
> like a bad idea, provided that the change was fairly easy and
> self-contained.
>
> Jeff
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>
On Wed, Mar 26, 2003 at 01:33:45PM -0800, Jauder Ho wrote:
>
> one suggestion I would make would be to break out the regex into a
> separate file, that way you will no longer need to touch the script once
> the core logic is correct.
>
> got this working nicely to count users based upon patterns.
>
> --jauder
Second that.
Should definitely be split between fixed (quick hash lookup)
and regex. Regexes for DNS names get ugly fast (an unescaped "." acts
as a wildcard). The split could be done while slurping the file
(check for /[*\[\]+]/).
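
A rough sketch of that split, assuming the test is a character class over the metacharacters *, [, ] and +. An address without any of them goes into the hash, where its dots stay literal; anything else is compiled as an anchored, case-insensitive regexp. classify_entry and the sample addresses are made up for illustration.

```perl
use strict;
use warnings;

# Sort one address/name pair into either the exact-match hash or the
# regexp list, depending on whether the address contains *, [, ] or +.
sub classify_entry {
    my ($fixed, $regexps, $addr, $name) = @_;
    if ($addr =~ /[*\[\]+]/) {
        push @$regexps, [ qr/\A(?:$addr)\z/i, $name ];  # treated as a pattern
    } else {
        $fixed->{lc $addr} = $name;                     # plain address, dots are literal
    }
    return;
}

my (%fixed, @regexps);
classify_entry(\%fixed, \@regexps, 'alice@example.org',    'Alice Example');
classify_entry(\%fixed, \@regexps, 'bob@.*\.example\.com', 'Bob Example');

print scalar(keys %fixed), " fixed, ", scalar(@regexps), " regexp\n";
```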
>
> On Wed, 26 Mar 2003, Jeff Garzik wrote:
>
> > On Wed, Mar 26, 2003 at 09:10:31PM +0100, Matthias Andree wrote:
> > > On Wed, 26 Mar 2003, Linus Torvalds wrote:
> > >
> > > > Btw, one feature I'd like to see in shortlog is the ability to use
> > > > regexps for email address matching, ie something like
> > > >
> > > > 'torvalds@.*transmeta.com' => 'Linus Torvalds'
> > > > ...
> > > > 'alan@.*swansea.linux.org.uk' => 'Alan Cox'
> > > > ...
> > > > '[email protected]' => 'Benjamin LaHaise',
> > > > 'bcrl@.*' => '?? Benjamin LaHaise',
> > > > ..
> > > >
> > > > I don't know whether you can force perl to do something like this, but if
> > > > somebody were to try...
> >
> > Perl is very regex-friendly. Sure it can do this :)
> >
> >
> > > I'd like to keep the hash for all those addresses that aren't wildcards
> > > and that aren't regexps -- we have fast, that is O(1) to O(log n),
> > > access to the hash (depending on Perl's implementation) and we have
> > > worse than O(n) for regexp, where n is the count of address strings or
> > > regexps.
> > >
> > > Would you agree to a version that has a set of fixed addresses and a
> > > separate list of regexps, tries the hash first and then a list of
> > > regexps? That sounds like a) easy addition, b) good performance to me
> > > (before implementing it). If so, I could add some code for that feature.
> >
> > Do we really care about performance here?
> >
> > I think maintainability is probably more important.
> >
> > In any case, splitting the lists into "fixed" and "regex" doesn't seem
> > like a bad idea, provided that the change was fairly easy and
> > self-contained.
> >
> > Jeff
--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: [email protected]
Remember Cernan and Schmitt