2006-12-10 20:52:33

by folkert

[permalink] [raw]
Subject: strncpy optimalisation? (lib/string.c)

Hi,

In lib/string.c we have:

char *strncpy(char *dest, const char *src, size_t count)
{
char *tmp = dest;

while (count) {
if ((*tmp = *src) != 0)
src++;
tmp++;
count--;
}
return dest;
}

now I wonder isn't this ineffecient when strlen(src) < count? It would
then, if I'm correct, iterate count-strlen(src) times doing useless
increment/decrement. And since there are aprox. 580 instances in the
2.6.18.2 source, maybe some efficency can be won here.
Wouldn't it be better to do:
if ((*tmp = *src) == 0x00)
break;

So that would be:
--- lib/string.c 2006-11-04 02:33:58.000000000 +0100
+++ string-new.c 2006-12-10 21:50:05.000000000 +0100
@@ -97,8 +97,8 @@
char *tmp = dest;

while (count) {
- if ((*tmp = *src) != 0)
- src++;
+ if ((*tmp = *src) == 0x00)
+ break;
tmp++;
count--;
}


Folkert van Heusden

--
http://www.vanheusden.com/multitail - win een vlaai van multivlaai! zorg
ervoor dat multitail opgenomen wordt in Fedora Core, AIX, Solaris of
HP/UX en win een vlaai naar keuze
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com


2006-12-10 21:06:31

by Willy Tarreau

[permalink] [raw]
Subject: Re: strncpy optimalisation? (lib/string.c)

On Sun, Dec 10, 2006 at 09:52:30PM +0100, Folkert van Heusden wrote:
> Hi,
>
> In lib/string.c we have:
>
> char *strncpy(char *dest, const char *src, size_t count)
> {
> char *tmp = dest;
>
> while (count) {
> if ((*tmp = *src) != 0)
> src++;
> tmp++;
> count--;
> }
> return dest;
> }
>
> now I wonder isn't this ineffecient when strlen(src) < count? It would
> then, if I'm correct, iterate count-strlen(src) times doing useless
> increment/decrement. And since there are aprox. 580 instances in the
> 2.6.18.2 source, maybe some efficency can be won here.
> Wouldn't it be better to do:
> if ((*tmp = *src) == 0x00)
> break;
>
> So that would be:
> --- lib/string.c 2006-11-04 02:33:58.000000000 +0100
> +++ string-new.c 2006-12-10 21:50:05.000000000 +0100
> @@ -97,8 +97,8 @@
> char *tmp = dest;
>
> while (count) {
> - if ((*tmp = *src) != 0)
> - src++;
> + if ((*tmp = *src) == 0x00)
> + break;
> tmp++;
> count--;
> }

While your code is faster, it does not do exactly the same.
Original code completely pads the destination with zeroes,
while yours only adds the last zero. Your code does what
strncpy() is said to do, but maybe there's a particular
reason for it to behave differently in the kernel (helping
during debugging, or filling specific structs).

Just out of curiosity, have you tried to do a general
benchmark to check if original code eats much CPU ?

Regards,
Willy

2006-12-10 21:31:15

by Mitchell Blank Jr

[permalink] [raw]
Subject: Re: strncpy optimalisation? (lib/string.c)

Willy Tarreau wrote:
> Original code completely pads the destination with zeroes,
> while yours only adds the last zero. Your code does what
> strncpy() is said to do, but maybe there's a particular
> reason for it to behave differently in the kernel

No, the kernel's strncpy() behaves the same as the one in libc. Run
"man strncpy" if you don't believe me.

In the common case where you just want to copy a string and avoid
overflow use strlcpy() instead

-Mitch

2006-12-10 21:35:25

by folkert

[permalink] [raw]
Subject: Re: strncpy optimalisation? (lib/string.c)

...
> > now I wonder isn't this ineffecient when strlen(src) < count? It would
> > then, if I'm correct, iterate count-strlen(src) times doing useless
> > increment/decrement. And since there are aprox. 580 instances in the
> > 2.6.18.2 source, maybe some efficency can be won here.
> > Wouldn't it be better to do:
> > if ((*tmp = *src) == 0x00)
> > break;
> > So that would be:
> > --- lib/string.c 2006-11-04 02:33:58.000000000 +0100
> > +++ string-new.c 2006-12-10 21:50:05.000000000 +0100
> > @@ -97,8 +97,8 @@
> > char *tmp = dest;
> >
> > while (count) {
> > - if ((*tmp = *src) != 0)
> > - src++;
> > + if ((*tmp = *src) == 0x00)
> > + break;
> > tmp++;
> > count--;
> > }
> While your code is faster, it does not do exactly the same.
> Original code completely pads the destination with zeroes,
> while yours only adds the last zero. Your code does what
> strncpy() is said to do, but maybe there's a particular
> reason for it to behave differently in the kernel (helping
> during debugging, or filling specific structs).
> Just out of curiosity, have you tried to do a general
> benchmark to check if original code eats much CPU ?

My patch was incorrect; it would only repeatingly copy the first
character from the source.
This one (tested in test-code seperate from kernel) works:
diff -uNrBbd lib/string.c string-new.c
--- lib/string.c 2006-11-04 02:33:58.000000000 +0100
+++ string-new.c 2006-12-10 22:34:39.000000000 +0100
@@ -97,9 +97,10 @@
char *tmp = dest;

while (count) {
- if ((*tmp = *src) != 0)
- src++;
+ if (unlikely((*tmp = *src) == 0x00))
+ break;
tmp++;
+ src++;
count--;
}
return dest;

The improvement in speed depends on the size of the source and
destination. Maybe i did something wrong but it seems that in all cases
the new version is faster.

Test can be found here:
http://www.vanheusden.com/misc/kernel-strncpy-opt-test.c


Signed-off by: Folkert van Heusden <[email protected]>

Folkert van Heusden

--
http://www.vanheusden.com/multitail - win een vlaai van multivlaai! zorg
ervoor dat multitail opgenomen wordt in Fedora Core, AIX, Solaris of
HP/UX en win een vlaai naar keuze
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com

2006-12-10 21:39:40

by folkert

[permalink] [raw]
Subject: Re: strncpy optimalisation? (lib/string.c)

> > Original code completely pads the destination with zeroes,
> > while yours only adds the last zero. Your code does what
> > strncpy() is said to do, but maybe there's a particular
> > reason for it to behave differently in the kernel
> No, the kernel's strncpy() behaves the same as the one in libc. Run
> "man strncpy" if you don't believe me.
> In the common case where you just want to copy a string and avoid
> overflow use strlcpy() instead

Oops you're right! Maybe someone should take a look if the strncpy's
should be replaced by strlcpy's then because it is (ought to be) faster.


Folkert van Heusden

--
Ever wonder what is out there? Any alien races? Then please support
the seti@home project: setiathome.ssl.berkeley.edu
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com

2006-12-10 21:46:45

by Mitchell Blank Jr

[permalink] [raw]
Subject: Re: strncpy optimalisation? (lib/string.c)

Folkert van Heusden wrote:
> This one (tested in test-code seperate from kernel) works:

No it doesn't!

strncpy() guarantees that the entire destination buffer is written to.
If you call
strncpy(dest, "foo", 10000)
then you MUST write to 10000 bytes of memory, or your strncpy() is buggy.

Your patches basically turn strncpy() into strlcpy(). Don't do that.
They're separate functions for a reason.

-Mitch

2006-12-10 21:50:04

by folkert

[permalink] [raw]
Subject: Re: strncpy optimalisation? (lib/string.c)

> > This one (tested in test-code seperate from kernel) works:
> No it doesn't!
> strncpy() guarantees that the entire destination buffer is written to.
> If you call
> strncpy(dest, "foo", 10000)
> then you MUST write to 10000 bytes of memory, or your strncpy() is buggy.
> Your patches basically turn strncpy() into strlcpy(). Don't do that.
> They're separate functions for a reason.

Yes I saw that, didn't read your e-mail before I read Willy's message.

Can you please also have a look at my strlcpy patch?


Folkert van Heusden

--
http://www.vanheusden.com/recoverdm/ - got an unreadable cd with scratches?
recoverdm might help you recovering data
--------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com

2006-12-10 23:29:47

by Bernd Petrovitsch

[permalink] [raw]
Subject: Re: strncpy optimalisation? (lib/string.c)

On Sun, 2006-12-10 at 22:39 +0100, Folkert van Heusden wrote:
> > > Original code completely pads the destination with zeroes,
> > > while yours only adds the last zero. Your code does what
> > > strncpy() is said to do, but maybe there's a particular
> > > reason for it to behave differently in the kernel
> > No, the kernel's strncpy() behaves the same as the one in libc. Run
> > "man strncpy" if you don't believe me.
> > In the common case where you just want to copy a string and avoid
> > overflow use strlcpy() instead
>
> Oops you're right! Maybe someone should take a look if the strncpy's
> should be replaced by strlcpy's then because it is (ought to be) faster.

The last time some folks did this (quite automatically) ended in newly
introduced bugs leaking old/stale data from kernel top user space. Alan
Cox went over it (again) and fixed those broken "optimizations".

So whoever wants to do this, look for such issues too.

Bernd
--
Firmix Software GmbH http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
Embedded Linux Development and Services