Hi,
Like the other patch (by that other person), I think it is faster to not
do a strlen first.
E.g. replace this:
ize_t strlcpy(char *dest, const char *src, size_t size)
{
size_t ret = strlen(src);
if (size) {
size_t len = (ret >= size) ? size - 1 : ret;
memcpy(dest, src, len);
dest[len] = '\0';
}
return ret;
}
by this:
size_t strlcpy(char *dest, const char *src, size_t size)
{
char *tmp = dest;
for(;;)
{
*dest = *src;
if (!*src)
break;
if (--size == 0)
break;
dest++;
src++;
}
*dest = 0x00;
return dest - tmp;
}
patch:
diff -uNrBbd lib/string.c string-new.c
--- lib/string.c 2006-11-04 02:33:58.000000000 +0100
+++ string-new.c 2006-12-10 22:22:08.000000000 +0100
@@ -121,14 +121,24 @@
*/
size_t strlcpy(char *dest, const char *src, size_t size)
{
- size_t ret = strlen(src);
+ char *tmp = dest;
- if (size) {
- size_t len = (ret >= size) ? size - 1 : ret;
- memcpy(dest, src, len);
- dest[len] = '\0';
+ for(;;)
+ {
+ *dest = *src;
+ if (!*src)
+ break;
+
+ if (--size == 0)
+ break;
+
+ dest++;
+ src++;
}
- return ret;
+
+ *dest = 0x00;
+
+ return dest - tmp;
}
EXPORT_SYMBOL(strlcpy);
#endif
I've tested the speed difference with this:
http://www.vanheusden.com/misc/kernel-strlcpy-opt-test.c
and the speed difference is quite a bit on a P4: 28% faster.
Signed-off by: Folkert van Heusden <[email protected]>
Folkert van Heusden
--
http://www.vanheusden.com/multitail - multitail is tail on steroids. multiple
windows, filtering, coloring, anything you can think of
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com
Folkert van Heusden wrote:
> Like the other patch (by that other person), I think it is faster to not
> do a strlen first.
Debatable. You've replaced a call to the highly-optimized memcpy function
with a byte-by-byte copy. It's probably a wash performance wise (have you
benchmarked?) and is certainly less clear.
-Mitch
> > Like the other patch (by that other person), I think it is faster to not
> > do a strlen first.
> Debatable. You've replaced a call to the highly-optimized memcpy function
> with a byte-by-byte copy. It's probably a wash performance wise (have you
> benchmarked?) and is certainly less clear.
I tested it outside the kernel.
The test source can be found here:
http://www.vanheusden.com/misc/kernel-strlcpy-opt-test.c
Folkert van Heusden
--
Feeling generous? -> http://www.vanheusden.com/wishlist.php
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com
Folkert van Heusden wrote:
> Hi,
>
> Like the other patch (by that other person), I think it is faster to not
> do a strlen first.
> E.g. replace this:
> ize_t strlcpy(char *dest, const char *src, size_t size)
> {
> size_t ret = strlen(src);
>
> if (size) {
> size_t len = (ret >= size) ? size - 1 : ret;
> memcpy(dest, src, len);
> dest[len] = '\0';
> }
> return ret;
> }
> by this:
> size_t strlcpy(char *dest, const char *src, size_t size)
> {
> char *tmp = dest;
>
> for(;;)
> {
> *dest = *src;
> if (!*src)
> break;
>
> if (--size == 0)
> break;
>
> dest++;
> src++;
> }
>
> *dest = 0x00;
>
> return dest - tmp;
> }
> patch:
> diff -uNrBbd lib/string.c string-new.c
> --- lib/string.c 2006-11-04 02:33:58.000000000 +0100
> +++ string-new.c 2006-12-10 22:22:08.000000000 +0100
> @@ -121,14 +121,24 @@
> */
> size_t strlcpy(char *dest, const char *src, size_t size)
> {
> - size_t ret = strlen(src);
> + char *tmp = dest;
>
> - if (size) {
> - size_t len = (ret >= size) ? size - 1 : ret;
> - memcpy(dest, src, len);
> - dest[len] = '\0';
> + for(;;)
> + {
> + *dest = *src;
> + if (!*src)
> + break;
> +
> + if (--size == 0)
> + break;
> +
> + dest++;
> + src++;
> }
> - return ret;
> +
> + *dest = 0x00;
> +
> + return dest - tmp;
> }
> EXPORT_SYMBOL(strlcpy);
> #endif
>
>
> I've tested the speed difference with this:
> http://www.vanheusden.com/misc/kernel-strlcpy-opt-test.c
> and the speed difference is quite a bit on a P4: 28% faster.
>
>
> Signed-off by: Folkert van Heusden <[email protected]>
>
>
> Folkert van Heusden
The two do not do exactly the same. The first one handles 'size == 0'
(should not happen?) safely while the other does not.
--
Eyal Lebedinsky ([email protected]) <http://samba.org/eyal/>
attach .zip as .dat
> > by this:
> > size_t strlcpy(char *dest, const char *src, size_t size)
> > {
> > char *tmp = dest;
> >
> > for(;;)
> > {
> > *dest = *src;
> > if (!*src)
> > break;
> >
> > if (--size == 0)
> > break;
> >
> > dest++;
> > src++;
> > }
> >
> > *dest = 0x00;
> >
> > return dest - tmp;
> > }
> >
> > I've tested the speed difference with this:
> > http://www.vanheusden.com/misc/kernel-strlcpy-opt-test.c
> > and the speed difference is quite a bit on a P4: 28% faster.
>
> The two do not do exactly the same. The first one handles 'size == 0'
> (should not happen?) safely while the other does not.
Ok, small change:
diff -uNrBbd lib/string.c string-new.c
--- lib/string.c 2006-11-04 02:33:58.000000000 +0100
+++ string-new.c 2006-12-10 23:02:26.000000000 +0100
@@ -121,14 +122,27 @@
*/
size_t strlcpy(char *dest, const char *src, size_t size)
{
- size_t ret = strlen(src);
+ char *tmp = dest;
- if (size) {
- size_t len = (ret >= size) ? size - 1 : ret;
- memcpy(dest, src, len);
- dest[len] = '\0';
+ if (likeley(size > 0))
+ {
+ for(;;)
+ {
+ *dest = *src;
+ if (unlikely(!*src))
+ break;
+
+ if (unlikely(--size == 0))
+ break;
+
+ dest++;
+ src++;
}
- return ret;
+ }
+
+ *dest = 0x00;
+
+ return dest - tmp;
}
EXPORT_SYMBOL(strlcpy);
#endif
Folkert van Heusden
--
Ever wonder what is out there? Any alien races? Then please support
the seti@home project: setiathome.ssl.berkeley.edu
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com
On Sun, Dec 10, 2006 at 10:23:51PM +0100, Folkert van Heusden wrote:
> Hi,
>
> Like the other patch (by that other person), I think it is faster to not
> do a strlen first.
>...
> --- lib/string.c 2006-11-04 02:33:58.000000000 +0100
> +++ string-new.c 2006-12-10 22:22:08.000000000 +0100
> @@ -121,14 +121,24 @@
> */
> size_t strlcpy(char *dest, const char *src, size_t size)
> {
> - size_t ret = strlen(src);
> + char *tmp = dest;
>
> - if (size) {
> - size_t len = (ret >= size) ? size - 1 : ret;
> - memcpy(dest, src, len);
> - dest[len] = '\0';
> + for(;;)
> + {
> + *dest = *src;
> + if (!*src)
> + break;
> +
> + if (--size == 0)
> + break;
> +
> + dest++;
> + src++;
> }
> - return ret;
> +
> + *dest = 0x00;
> +
> + return dest - tmp;
>...
Two bugs in your code:
- you copy a maximum of size bytes _plus_ \0
- size == 0 is no longer handled correctly
> I've tested the speed difference with this:
> http://www.vanheusden.com/misc/kernel-strlcpy-opt-test.c
> and the speed difference is quite a bit on a P4: 28% faster.
>...
My Athlon says:
org: 2.400000
new: 6.710000
IOW, your version is much slower.
But the main question is actually:
Does the performance of this function matter anywhere inside the kernel?
Is strlcpy() used in any fast path?
If not, there's no point in trying to optimize it.
> Folkert van Heusden
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Hi, Adrian & Friends!
>> Like the other patch (by that other person), I think it is faster to not
>> do a strlen first.
There are PowerPC architectures, where a strlen is a matter of a single
instruction of the CPU.
Quote: "PowerPC 405 and 440 instruction sets offer the powerful
Determine Left-Most Zero Byte (DLMZB) instruction."
That's propably hard to beat.
So, if we really want to speed up these things, we should write
code which helps the compiler get that optimized instructions for
us - by improving the code of strlen() i.e.
Reference:
http://www-03.ibm.com/chips/power/powerpc/newsletter/sep2003/ppc_process_at_work2.html
Greets,
Clemens Koller
_______________________________
R&D Imaging Devices
Anagramm GmbH
Rupert-Mayer-Str. 45/1
81379 Muenchen
Germany
http://www.anagramm-technology.com
Phone: +49-89-741518-50
Fax: +49-89-741518-19