While profiling something completely unrelated, I noticed
that on the workloads I used memchr for, I saw a 30%-40% improvement
in performance, with the following trivial changes...
(basically, it saves 3 operations for each call)
$ diff -Nurp linux/lib/string.c{-old,}
--- linux/lib/string.c-old 2007-08-22 11:18:54.000000000 +0200
+++ linux/lib/string.c 2007-08-22 11:19:20.000000000 +0200
@@ -623,10 +623,12 @@ EXPORT_SYMBOL(strstr);
void *memchr(const void *s, int c, size_t n)
{
const unsigned char *p = s;
- while (n-- != 0) {
- if ((unsigned char)c == *p++) {
- return (void *)(p - 1);
+ while (n != 0) {
+ if ((unsigned char)c == *p) {
+ return (void *)p;
}
+ n--;
+ p++;
}
return NULL;
}
_________________________________________________________________
A lot of passions? Collect all your personal info on one central location ,
for free! http://get.live.com/live/features
"lode leroy" <[email protected]> writes:
> While profiling something completely unrelated, I noticed
> that on the workloads I used memchr for, I saw a 30%-40% improvement
> in performance, with the following trivial changes...
> (basically, it saves 3 operations for each call)
What kind of workload? I didn't think anything in tree
was memchr intensive.
-Andi
On Wednesday 22 August 2007, lode leroy wrote:
> While profiling something completely unrelated, I noticed
> that on the workloads I used memchr for, I saw a 30%-40% improvement
> in performance, with the following trivial changes...
> (basically, it saves 3 operations for each call)
Yes, but then you could be a bit more explicit to the compiler
on what you are doing here:
void *memchr(const void *s, int c, size_t n)
{
const unsigned char *p = s;
for (; n != 0; n--, p++) {
if ((unsigned char)c == *p) {
return (void *)p;
}
return NULL;
}
Now the compiler should see the loop more clearly.
Best Regards
Ingo Oeser
On Thu, Aug 23, 2007 at 02:13:20AM +0200, Ingo Oeser wrote:
> On Wednesday 22 August 2007, lode leroy wrote:
> > While profiling something completely unrelated, I noticed
> > that on the workloads I used memchr for, I saw a 30%-40% improvement
> > in performance, with the following trivial changes...
> > (basically, it saves 3 operations for each call)
>
> Yes, but then you could be a bit more explicit to the compiler
> on what you are doing here:
>
> void *memchr(const void *s, int c, size_t n)
> {
> const unsigned char *p = s;
>
> for (; n != 0; n--, p++) {
> if ((unsigned char)c == *p) {
> return (void *)p;
> }
> return NULL;
> }
>
> Now the compiler should see the loop more clearly.
And you can do even better with this:
void *memchr(const void *s, int c, size_t n)
{
const unsigned char *p = s, *e = s + n;
const unsigned char *e = p + n;
for (; p < e ; p++)
if ((unsigned char)c == *p)
return (void *)p;
return NULL;
}
which changes the inner loop from:
50: 38 08 cmp %cl,(%eax)
52: 74 08 je 5c <memchr2+0x1a>
54: 4a dec %edx
55: 40 inc %eax
56: 85 d2 test %edx,%edx
58: 75 f6 jne 50 <memchr2+0xe>
to:
6e: 38 08 cmp %cl,(%eax)
70: 74 07 je 79 <memchr3+0x1b>
72: 40 inc %eax
73: 39 d0 cmp %edx,%eax
75: 72 f7 jb 6e <memchr3+0x10>
--
Mathematics is the supreme nostalgia of our time.
Matt Mackall wrote:
> 6e: 38 08 cmp %cl,(%eax)
> 70: 74 07 je 79 <memchr3+0x1b>
> 72: 40 inc %eax
>
It's a bit gross that the compiler is using inc here rather than lea or
add, but still...
Er, something's spending 30% of its time in memchr? This is not the
code to fix.
J
On Thu, Aug 23, 2007 at 06:03:29PM -0700, Jeremy Fitzhardinge wrote:
> Matt Mackall wrote:
> > 6e: 38 08 cmp %cl,(%eax)
> > 70: 74 07 je 79 <memchr3+0x1b>
> > 72: 40 inc %eax
> >
> It's a bit gross that the compiler is using inc here rather than lea or
> add, but still...
>
> Er, something's spending 30% of its time in memchr? This is not the
> code to fix.
Indeed. I'm just pointing out the general optimization of walking
one counter instead of two.
Hmm, perhaps the culprit is validate_nla?
--
Mathematics is the supreme nostalgia of our time.
On Aug 23 2007 19:13, Matt Mackall wrote:
>
>And you can do even better with this:
>
>void *memchr(const void *s, int c, size_t n)
>{
> const unsigned char *p = s, *e = s + n;
> const unsigned char *e = p + n;
Uhm, you have two "e"s in there.
> for (; p < e ; p++)
> if ((unsigned char)c == *p)
> return (void *)p;
>
> return NULL;
>}
Or do it glibc-style
void *memchr(const void *s, unsigned char c, size_t n)
{
...
for (; p + 3 < e; p += 4) {
if (c == p[0])
return (void *)&p[0];
if (c == p[1])
return (void *)&p[1];
if (c == p[2])
return (void *)&p[2];
if (c == p[3])
return (void *)&p[3];
}
... /* check the rest */
}
Jan
--
On Fri, Aug 24, 2007 at 02:54:48PM +0200, Jan Engelhardt wrote:
>
> On Aug 23 2007 19:13, Matt Mackall wrote:
> >
> >And you can do even better with this:
> >
> >void *memchr(const void *s, int c, size_t n)
> >{
> > const unsigned char *p = s, *e = s + n;
> > const unsigned char *e = p + n;
>
> Uhm, you have two "e"s in there.
Yep, that's what I get for editing in email.
> Or do it glibc-style
>
> void *memchr(const void *s, unsigned char c, size_t n)
> {
> ...
> for (; p + 3 < e; p += 4) {
> if (c == p[0])
> return (void *)&p[0];
> if (c == p[1])
> return (void *)&p[1];
> if (c == p[2])
> return (void *)&p[2];
> if (c == p[3])
> return (void *)&p[3];
> }
> ... /* check the rest */
> }
Yes, very funny.
--
Mathematics is the supreme nostalgia of our time.