Embarrassing, huh? But I just found a bug in my code cause by
"if (strcmp(a,b))" instead of "if (!strcmp(a,b))".
This bites me about once a year, so Linus, please apply.
Preparing to have my "3l33t k3rN31 h@ck3r" card revoked,
Rusty.
Name: streq implementation
Author: Rusty Russell
Status: Trivial
D: I can't believe that after all these years I still make the "sense
D: of strcmp" mistake. So it's time to reintroduce my favorite macro.
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.38/include/linux/string.h working-2.5.38-streq/include/linux/string.h
--- linux-2.5.38/include/linux/string.h 2002-06-06 21:38:40.000000000 +1000
+++ working-2.5.38-streq/include/linux/string.h 2002-09-24 14:43:30.000000000 +1000
@@ -15,7 +15,7 @@ extern "C" {
extern char * strpbrk(const char *,const char *);
extern char * strsep(char **,const char *);
extern __kernel_size_t strspn(const char *,const char *);
-
+#define streq(a,b) (strcmp((a),(b)) == 0)
/*
* Include machine specific inline routines
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
On Tue, 24 Sep 2002, Rusty Russell wrote:
> Embarrassing, huh? But I just found a bug in my code cause by "if
> (strcmp(a,b))" instead of "if (!strcmp(a,b))".
there's a few more places that tend to cause wasted time, no matter what:
- list_add(elem, list) order of arguments. It can be mixed up easily, and
while i know all the consequences every few months i waste a few hours
on such a thing.
- kmalloc(size, flags)/gfp(order, flags) argument ordering. A few months
ago i wasted two days on such a bug - since 'size' was very small
usually, it never showed up that the allocated buffer was short, until
some rare load-test increased the 'size'.
we should do something about these. list_add() is hard, while we could
introduce a separate type for list heads, there are some valid uses of
non-head list_add(). But perhaps those could be separated out.
handling most of the gfp() mixups should be a bit easier, perhaps by
detecting invalid flags in the inline section, which is optimized away at
runtime in like 95% of the cases?
Ingo
In message <[email protected]> you
write:
> - kmalloc(size, flags)/gfp(order, flags) argument ordering. A few months
> ago i wasted two days on such a bug - since 'size' was very small
> usually, it never showed up that the allocated buffer was short, until
> some rare load-test increased the 'size'.
>
> we should do something about these. list_add() is hard, while we could
> introduce a separate type for list heads, there are some valid uses of
> non-head list_add(). But perhaps those could be separated out.
>
> handling most of the gfp() mixups should be a bit easier, perhaps by
> detecting invalid flags in the inline section, which is optimized away at
> runtime in like 95% of the cases?
I had a gcc patch which made enum typing strict for C
(-Wstrict-enums), which was designed for the GFP_xxx case and things
like it (you make them enums). It was against 2.95.2, RSN I should
update it, and (as Tridge suggested) make it an __attribute__.
For runtime checks (which are never as good) you could change the GFP_
defined to set the high bit.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
From: Rusty Russell <[email protected]>
Date: Tue, 24 Sep 2002 16:04:33 +1000
For runtime checks (which are never as good) you could change the GFP_
defined to set the high bit.
Another idea is to make a gfp_flags_t, that worked very well
for things like mm_segment_t.
In message <[email protected]> you write:
> From: Rusty Russell <[email protected]>
> Date: Tue, 24 Sep 2002 16:04:33 +1000
>
> For runtime checks (which are never as good) you could change the GFP_
> defined to set the high bit.
>
> Another idea is to make a gfp_flags_t, that worked very well
> for things like mm_segment_t.
But you can't or them together without some icky macro, unless they're
typedef to an integer type in which case gcc doesn't warn you if you
just stuck a "sizeof(x)" in there.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
i think the high gfp bit idea sounds good. 95% of the gfp()/kmalloc()
users use a static flag, so it's not like there is any widespread runtime
cost.
Ingo
On 24 September 2002 03:40, Ingo Molnar wrote:
> On Tue, 24 Sep 2002, Rusty Russell wrote:
> there's a few more places that tend to cause wasted time, no matter what:
> - kmalloc(size, flags)/gfp(order, flags) argument ordering. A few months
> ago i wasted two days on such a bug - since 'size' was very small
> usually, it never showed up that the allocated buffer was short, until
> some rare load-test increased the 'size'.
Aughment kmalloc(size, GFP_XXXXX) with kmalloc_XXXXX(size) (inline of course)?
You may use this form in those 90% cases where flags are constant.
This also gets rid of GFP_ prefixes (shorter).
Ingo, Rusty?
--
vda
--- linux-2.5.36/include/linux/slab.h.orig Tue Sep 24 11:00:42 2002
+++ linux-2.5.36/include/linux/slab.h Tue Sep 24 11:06:32 2002
@@ -61,6 +61,15 @@
extern void *kmalloc(size_t, int);
extern void kfree(const void *);
+extern inline void *kmalloc_NOHIGHIO(size_t sz) { return kmalloc(sz, GFP_NOHIGHIO); }
+extern inline void *kmalloc_NOIO (size_t sz) { return kmalloc(sz, GFP_NOIO ); }
+extern inline void *kmalloc_NOFS (size_t sz) { return kmalloc(sz, GFP_NOFS ); }
+extern inline void *kmalloc_ATOMIC (size_t sz) { return kmalloc(sz, GFP_ATOMIC ); }
+extern inline void *kmalloc_USER (size_t sz) { return kmalloc(sz, GFP_USER ); }
+extern inline void *kmalloc_HIGHUSER(size_t sz) { return kmalloc(sz, GFP_HIGHUSER); }
+extern inline void *kmalloc_KERNEL (size_t sz) { return kmalloc(sz, GFP_KERNEL ); }
+extern inline void *kmalloc_NFS (size_t sz) { return kmalloc(sz, GFP_NFS ); }
+extern inline void *kmalloc_KSWAPD (size_t sz) { return kmalloc(sz, GFP_KSWAPD ); }
extern int FASTCALL(kmem_cache_reap(int));
From: Rusty Russell <[email protected]>
Date: Tue, 24 Sep 2002 17:28:00 +1000
In message <[email protected]> you write:
> Another idea is to make a gfp_flags_t, that worked very well
> for things like mm_segment_t.
But you can't or them together without some icky macro, unless they're
typedef to an integer type in which case gcc doesn't warn you if you
just stuck a "sizeof(x)" in there.
Maybe if you make it unsigned char or something.
If there aren't too many spots which need to "or" stuff,
all the "icky macro" stuff could be contained to gfp.h
Followup to: <[email protected]>
By author: Ingo Molnar <[email protected]>
In newsgroup: linux.dev.kernel
>
> we should do something about these. list_add() is hard, while we could
> introduce a separate type for list heads, there are some valid uses of
> non-head list_add(). But perhaps those could be separated out.
>
A very, very long time ago, back in the 0.99.14 days, we actually
tried to switch to using a C++ compiler to compile the kernel, so we
could guarantee type-safe linkage. g++ sucked back then, so the
experiment was abandoned, but I don't know if there would be interest
in trying again.
-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>
Am Die, 2002-09-24 um 06.49 schrieb Rusty Russell:
> Embarrassing, huh? But I just found a bug in my code cause by
> "if (strcmp(a,b))" instead of "if (!strcmp(a,b))".
> diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.38/include/linux/string.h working-2.5.38-streq/include/linux/string.h
> --- linux-2.5.38/include/linux/string.h 2002-06-06 21:38:40.000000000 +1000
> +++ working-2.5.38-streq/include/linux/string.h 2002-09-24 14:43:30.000000000 +1000
> @@ -15,7 +15,7 @@ extern "C" {
> extern char * strpbrk(const char *,const char *);
> extern char * strsep(char **,const char *);
> extern __kernel_size_t strspn(const char *,const char *);
> -
> +#define streq(a,b) (strcmp((a),(b)) == 0)
Considering most compares will only care for equality/non-equality and
not about the type of unequality a strcmp usually returns, wouldn't it
be more wise and faster to use an approach like memcpy for comparison
instead of that stupid compare each character approach?
Something along the lines of:
Start comparying by granules with the biggest type the architecture has
to offer which will fit into the length of the string and going down
to the size of 1 char bailing out as soon as the granules don't match.
--
Servus,
Daniel
On Wed, Sep 25, 2002 at 01:27:32PM +0200, Daniel Egger wrote:
> Something along the lines of:
> Start comparying by granules with the biggest type the architecture has
> to offer which will fit into the length of the string and going down
> to the size of 1 char bailing out as soon as the granules don't match.
Small problem - you need to find the length first. Which means you need
to scan each string, and use the smaller.
So, instead of iterating in some fashion over the strings once, you
iterate over one string calculating its length. You iterate over the
other string to find its length. Then you iterate over both strings
comparing them.
Wouldn't it be faster just to iterate over both strings at the same time
only once?
--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
Daniel Egger wrote:
> Am Die, 2002-09-24 um 06.49 schrieb Rusty Russell:
>
>
>>Embarrassing, huh? But I just found a bug in my code cause by
>>"if (strcmp(a,b))" instead of "if (!strcmp(a,b))".
>
>
>>diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.38/include/linux/string.h working-2.5.38-streq/include/linux/string.h
>>--- linux-2.5.38/include/linux/string.h 2002-06-06 21:38:40.000000000 +1000
>>+++ working-2.5.38-streq/include/linux/string.h 2002-09-24 14:43:30.000000000 +1000
>>@@ -15,7 +15,7 @@ extern "C" {
>> extern char * strpbrk(const char *,const char *);
>> extern char * strsep(char **,const char *);
>> extern __kernel_size_t strspn(const char *,const char *);
>>-
>>+#define streq(a,b) (strcmp((a),(b)) == 0)
>
>
> Considering most compares will only care for equality/non-equality and
> not about the type of unequality a strcmp usually returns, wouldn't it
> be more wise and faster to use an approach like memcpy for comparison
> instead of that stupid compare each character approach?
>
> Something along the lines of:
> Start comparying by granules with the biggest type the architecture has
> to offer which will fit into the length of the string and going down
> to the size of 1 char bailing out as soon as the granules don't match.
I see two problems with this - first is that strings are not and can not
be assumed to be on nice word boundaries. While x86 and some (many)
CPUs can actually read words (longs/etc) at non-natural boundaries,
they suffer in performance. Other architectures can not even do the
reads and will cause a trap/exception that *may* be handled in software.
(Or may not)
Second, unless you know for sure how long the string buffer is you can
not easily bound down the string in larger chunks. (It may hurt
reading that byte just beyond the end of the string since it is on
another page, etc)
There is also the added overhead of noticing the '\0' byte at the
end of the string since it may be in the middle of your 32-bit
64-bit) data value.
Given all of this, it is rather unlikely that for strings it is worth
doing anything different than the (usually) highly tuned assembly of
the strcmp() code.
(memcpy() has, by its nature and API at least the size which can
significantly help, along with architecture knowledge that it can
then use to pick the right)
A side note: While memcmp() could use the same logic of memcpy(),
the greater/less than result code specifically is documented at
byte level. This was a real PITA for Alpha CPUs to make fast,
especially the earlier ones that did not even have natural byte
operators. (Alpha was designed with UNICODE characters in mind
and performance as its top concern.)
--
Michael Sinz -- Director, Systems Engineering -- Worldgate Communications
A master's secrets are only as good as
the master's ability to explain them to others.
Am Mit, 2002-09-25 um 13.45 schrieb Russell King:
> Small problem - you need to find the length first. Which means you need
> to scan each string, and use the smaller.
In most cases one of the strings is constant and thus the length known
at compiletime. I have no idea whether we are allowed (in kernel space)
to read beyond the end of a string, iff we are we could simply use the
fixed length.
> Wouldn't it be faster just to iterate over both strings at the same time
> only once?
Time has shown that the obvious solution is often not the fastest one.
:) Though in this case it really might be.
--
Servus,
Daniel