2024-01-29 14:15:58

by Rodrigo Campos

[permalink] [raw]
Subject: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

The return code should always be strlen(src) + strlen(dst), but dst is
considered shorter if size is less than strlen(dst).

While we are there, make sure to copy at most size-1 bytes and
null-terminate the dst buffer.

Signed-off-by: Rodrigo Campos <[email protected]>
---
tools/include/nolibc/string.h | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
index ed15c22b1b2a..b2149e1342a8 100644
--- a/tools/include/nolibc/string.h
+++ b/tools/include/nolibc/string.h
@@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen)
static __attribute__((unused))
size_t strlcat(char *dst, const char *src, size_t size)
{
- size_t len;
char c;
+ size_t len = strlen(dst);
+ size_t ret = strlen(src) + (size < len? size: len);

- for (len = 0; dst[len]; len++)
- ;
-
- for (;;) {
+ for (;len < size;) {
c = *src;
- if (len < size)
+ if (len < size - 1)
dst[len] = c;
+ if (len == size - 1)
+ dst[len] = '\0';
if (!c)
break;
len++;
src++;
}

- return len;
+ return ret;
}

static __attribute__((unused))
--
2.43.0



2024-02-11 10:52:46

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

Hi Rodrigo,

first, thanks for the series!

On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote:
> The return code should always be strlen(src) + strlen(dst), but dst is
> considered shorter if size is less than strlen(dst).
>
> While we are there, make sure to copy at most size-1 bytes and
> null-terminate the dst buffer.
>
> Signed-off-by: Rodrigo Campos <[email protected]>
> ---
> tools/include/nolibc/string.h | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
> index ed15c22b1b2a..b2149e1342a8 100644
> --- a/tools/include/nolibc/string.h
> +++ b/tools/include/nolibc/string.h
> @@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen)
> static __attribute__((unused))
> size_t strlcat(char *dst, const char *src, size_t size)
> {
> - size_t len;
> char c;
> + size_t len = strlen(dst);
> + size_t ret = strlen(src) + (size < len? size: len);

From what I'm reading in the man page, ret should *always* be the sum
of the two string lengths. I guess it helps for reallocation. It's even
explicitly mentioned:

"While this may seem somewhat confusing, it was done to make truncation
detection simple."

Above ret will be bound to the existing size so a realloc wouldn't work.
Thus I think the correct solution is:

size_t ret = strlen(src) + len;

> - for (len = 0; dst[len]; len++)
> - ;
> -
> - for (;;) {
> + for (;len < size;) {
> c = *src;
> - if (len < size)
> + if (len < size - 1)
> dst[len] = c;
> + if (len == size - 1)
> + dst[len] = '\0';
> if (!c)
> break;
> len++;
> src++;
> }
>
> - return len;
> + return ret;
> }

The test inside the loop is going to make this not very efficient. Same
for the fact that we're measuring the length of src twice (once via
strlen, a second time through the loop). I've just had a look and it
compiles to 77 bytes at -Os. A simpler variant would consist in trying
to copy what fits in <size> and once reached, go on just for trailing
zero and the size measurement:

size_t strlcat(char *dst, const char *src, size_t size)
{
size_t len = strlen(dst);

while (len < size) {
if (!(dst[len] = *src))
return len;
len++;
src++;
}

/* end of src not found before size */

if (size)
dst[size - 1] = '\0';

while (*src++)
len++;

return len;
}

For me it's 58 bytes, or 19 less / 25% smaller, and at first glance it
should do the right thing as well.

What do you think ?

Thanks!
Willy

2024-02-12 23:18:36

by Rodrigo Campos

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

On 2/11/24 11:48, Willy Tarreau wrote:
> Hi Rodrigo,
>
> first, thanks for the series!

Thank you, for your time and review! :)

> On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote:
>> The return code should always be strlen(src) + strlen(dst), but dst is
>> considered shorter if size is less than strlen(dst).
>>
>> While we are there, make sure to copy at most size-1 bytes and
>> null-terminate the dst buffer.
>>
>> Signed-off-by: Rodrigo Campos <[email protected]>
>> ---
>> tools/include/nolibc/string.h | 14 +++++++-------
>> 1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
>> index ed15c22b1b2a..b2149e1342a8 100644
>> --- a/tools/include/nolibc/string.h
>> +++ b/tools/include/nolibc/string.h
>> @@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen)
>> static __attribute__((unused))
>> size_t strlcat(char *dst, const char *src, size_t size)
>> {
>> - size_t len;
>> char c;
>> + size_t len = strlen(dst);
>> + size_t ret = strlen(src) + (size < len? size: len);
>
> From what I'm reading in the man page, ret should *always* be the sum
> of the two string lengths. I guess it helps for reallocation. It's even
> explicitly mentioned:
>
> "While this may seem somewhat confusing, it was done to make truncation
> detection simple."

Yes, that was my *first* understanding of the manpage too. But it
doesn't seem to be the correct interpretation.

Also, this is exactly what I tried to say in the cover letter, with:

I thought the manpage was clear, but when checking against that,
I noted a few twists (like the manpage says the return code of
strlcat is strlen(src) + strlen(dst), but it was not clear it is
not that if size < strlen(dst). When looking at the libbsd
implementation and re-reading the manpage, I understood what it
really meant).


Sorry if I wasn't clear. Let me try again.

My first interpretation of the manpage was also that, and I think it
would make sense to be that one. However, it is wrong, they also say
this, that is what made me add the ternary operator:

Note, however, that if strlcat() traverses size characters
without finding a NUL, the length of the string is considered
to be size and the destination string will not be NUL
terminated (since there was no space for the NUL)

So, my interpretation is that it is the sum of both, except when we
can't find the NUL in that size, in which case we conside "size" to be
the len of dst.

If you compare it with the output of libbsd, the return code seems to be
exactly that. I was surprised too, as the manpage seem so clear... :-/

> Above ret will be bound to the existing size so a realloc wouldn't work.
> Thus I think the correct solution is:

Note that this implementation fails the tests added on patch 4. I've
tested them (output and return code) to match the libbsd implementation.


> The test inside the loop is going to make this not very efficient. Same
> for the fact that we're measuring the length of src twice (once via
> strlen, a second time through the loop). I've just had a look and it
> compiles to 77 bytes at -Os. A simpler variant would consist in trying

How are you measuring the size?

I've added noinline to strlcat to the version I sent, so now it is shown
in nm, but I have this as output:

$ nm --size -t x test.o
0000000000000004 V errno
0000000000000006 t strlcat.constprop.0
0000000000000008 V _auxv
0000000000000008 V environ
000000000000000e W strlen
000000000000000f W _start
0000000000000018 W raise
000000000000001b W abort
000000000000004c T main
000000000000005a t u64toa_r.isra.0
0000000000000095 W _start_c
00000000000002a8 t printf

How are you measuring it there?

Sorry, I'm not familiar with this :)


> to copy what fits in <size> and once reached, go on just for trailing
> zero and the size measurement:
>
> size_t strlcat(char *dst, const char *src, size_t size)
> {
> size_t len = strlen(dst);

The thing is, we need to return always at least strlen(src). Maybe plus
something else. Even if size==0 and we don't copy anything.

Maybe we can special case that, so we simplify the loop, and it's smaller?

I've been playing, but as I can't measure the size, I'm not sure what is
really better. I'd like to play a little once I know how to measure it :)



Best,
Rodrigo

2024-02-13 05:28:06

by Rodrigo Campos

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

On 2/13/24 00:16, Rodrigo Campos wrote:
> On 2/11/24 11:48, Willy Tarreau wrote:
>> The test inside the loop is going to make this not very efficient. Same
>> for the fact that we're measuring the length of src twice (once via
>> strlen, a second time through the loop). I've just had a look and it
>> compiles to 77 bytes at -Os. A simpler variant would consist in trying

What code are you compiling that uses strlcat?

I've tried dropping almost all the flags to leave only:
-Os -nostdlib -lgcc

And I still don't see the 77 bytes.


> $ nm --size -t x test.o
>     0000000000000004 V errno
>     0000000000000006 t strlcat.constprop.0
>     0000000000000008 V _auxv
>     0000000000000008 V environ
>     000000000000000e W strlen
>     000000000000000f W _start
>     0000000000000018 W raise
>     000000000000001b W abort
>     000000000000004c T main
>     000000000000005a t u64toa_r.isra.0
>     0000000000000095 W _start_c
>     00000000000002a8 t printf
>
> How are you measuring it there?

btw, sorry for the late reply. I was on a flight that was rebooked, it
took me some days to arrive. But I finally arrived and have internet now :)

The constprop seems to be some gcc optimization to simplify things.

I see what happens. The example I was compiling was left with size=0 (a
leftover from when I was testing the code). That is very easy for gcc to
optimize it all out. That is why it can make it so small.

If I compile the example from my original email, though, I still don't
see the 77 bytes you mention. I see 31 in the nm output, that in decimal
is 49 bytes.

Maybe tomorrow with some sleep after the long flight, I see what else is
happening :)



Best,
Rodrigo

2024-02-13 06:06:31

by Rodrigo Campos

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

On 2/11/24 11:48, Willy Tarreau wrote:
> For me it's 58 bytes, or 19 less / 25% smaller, and at first glance it
> should do the right thing as well.

Oh, here I get almost the same (57 bytes), so it must be the example we
are compiling what is different.

I'll check this after some sleep :)

2024-02-13 06:20:32

by Rodrigo Campos

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

On 2/13/24 00:16, Rodrigo Campos wrote:
> On 2/11/24 11:48, Willy Tarreau wrote:
>> Hi Rodrigo,
>>
>> first, thanks for the series!
>
> Thank you, for your time and review! :)
>
>> On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote:
>>> The return code should always be strlen(src) + strlen(dst), but dst is
>>> considered shorter if size is less than strlen(dst).
>>>
>>> While we are there, make sure to copy at most size-1 bytes and
>>> null-terminate the dst buffer.
>>>
>>> Signed-off-by: Rodrigo Campos <[email protected]>
>>> ---
>>>   tools/include/nolibc/string.h | 14 +++++++-------
>>>   1 file changed, 7 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/tools/include/nolibc/string.h
>>> b/tools/include/nolibc/string.h
>>> index ed15c22b1b2a..b2149e1342a8 100644
>>> --- a/tools/include/nolibc/string.h
>>> +++ b/tools/include/nolibc/string.h
>>> @@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen)
>>>   static __attribute__((unused))
>>>   size_t strlcat(char *dst, const char *src, size_t size)
>>>   {
>>> -    size_t len;
>>>       char c;
>>> +    size_t len = strlen(dst);
>>> +    size_t ret = strlen(src) + (size < len? size: len);
>>
>>  From what I'm reading in the man page, ret should *always* be the sum
>> of the two string lengths. I guess it helps for reallocation. It's even
>> explicitly mentioned:
>>
>>    "While this may seem somewhat confusing, it was done to make
>> truncation
>>     detection simple."
>
> Yes, that was my *first* understanding of the manpage too. But it
> doesn't seem to be the correct interpretation.
>
> Also, this is exactly what I tried to say in the cover letter, with:
>
>     I thought the manpage was clear, but when checking against that,
>     I noted a few twists (like the manpage says the return code of
>     strlcat is strlen(src) + strlen(dst), but it was not clear it is
>     not that if size < strlen(dst). When looking at the libbsd
>     implementation and re-reading the manpage, I understood what it
>     really meant).
>
>
> Sorry if I wasn't clear. Let me try again.
>
> My first interpretation of the manpage was also that, and I think it
> would make sense to be that one. However, it is wrong, they also say
> this, that is what made me add the ternary operator:
>
>     Note, however, that if strlcat() traverses size characters
>     without finding a NUL, the length of the string is considered
>     to be  size and the destination string will not be NUL
>     terminated (since there was no space for the NUL)
>
> So, my interpretation is that it is the sum of both, except when we
> can't find the NUL in that size, in which case we conside "size" to be
> the len of dst.
>
> If you compare it with the output of libbsd, the return code seems to be
> exactly that. I was surprised too, as the manpage seem so clear... :-/


I'm not convinced now if that is the right interpretation. I'll check
the libbsd code, I don't remember it now, it's been a few days already.

My memory is that it was not something so sane.

2024-02-13 07:03:08

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

Hi Rodrigo,

On Tue, Feb 13, 2024 at 12:16:06AM +0100, Rodrigo Campos wrote:
> On 2/11/24 11:48, Willy Tarreau wrote:
> > Hi Rodrigo,
> >
> > first, thanks for the series!
>
> Thank you, for your time and review! :)

You're welcome. I'm sorry I couldn't respond earlier.

> > On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote:
> > > The return code should always be strlen(src) + strlen(dst), but dst is
> > > considered shorter if size is less than strlen(dst).
> > >
> > > While we are there, make sure to copy at most size-1 bytes and
> > > null-terminate the dst buffer.
> > >
> > > Signed-off-by: Rodrigo Campos <[email protected]>
> > > ---
> > > tools/include/nolibc/string.h | 14 +++++++-------
> > > 1 file changed, 7 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
> > > index ed15c22b1b2a..b2149e1342a8 100644
> > > --- a/tools/include/nolibc/string.h
> > > +++ b/tools/include/nolibc/string.h
> > > @@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen)
> > > static __attribute__((unused))
> > > size_t strlcat(char *dst, const char *src, size_t size)
> > > {
> > > - size_t len;
> > > char c;
> > > + size_t len = strlen(dst);
> > > + size_t ret = strlen(src) + (size < len? size: len);
> >
> > From what I'm reading in the man page, ret should *always* be the sum
> > of the two string lengths. I guess it helps for reallocation. It's even
> > explicitly mentioned:
> >
> > "While this may seem somewhat confusing, it was done to make truncation
> > detection simple."
>
> Yes, that was my *first* understanding of the manpage too. But it doesn't
> seem to be the correct interpretation.
>
> Also, this is exactly what I tried to say in the cover letter, with:
>
> I thought the manpage was clear, but when checking against that,
> I noted a few twists (like the manpage says the return code of
> strlcat is strlen(src) + strlen(dst), but it was not clear it is
> not that if size < strlen(dst). When looking at the libbsd
> implementation and re-reading the manpage, I understood what it
> really meant).
>
>
> Sorry if I wasn't clear. Let me try again.
>
> My first interpretation of the manpage was also that, and I think it would
> make sense to be that one. However, it is wrong, they also say this, that is
> what made me add the ternary operator:
>
> Note, however, that if strlcat() traverses size characters
> without finding a NUL, the length of the string is considered
> to be size and the destination string will not be NUL
> terminated (since there was no space for the NUL)
>
> So, my interpretation is that it is the sum of both, except when we can't
> find the NUL in that size, in which case we conside "size" to be the len of
> dst.

I've read it as well and I don't interpret it the same way. I'm reading it
as "if dst doesn't contain a NUL char before size, its length is considered
to be size", and the reason is explicitly explained just after:

This keeps strlcat() from running off the end of a string. In practice
this should not happen (as it means that either size is incorrect or
that dst is not a proper ``C'' string).

So this explicitly means that supporting this specific use case is by
definition incompatible with the use of strlen(). Thus it's a matter
of choice for us, either we explicitly want to support invalid strings
in the destination and we need the check but we're not allowed to use
strlen() on dst, or we're not interested in such bogus cases and we can
stick to strlen() and the test is not needed. But the test combined with
strlen() is not logical.

> If you compare it with the output of libbsd, the return code seems to be
> exactly that. I was surprised too, as the manpage seem so clear... :-/

I trust you since I have not tried. I understand the rationale, i.e. still
being able to realloc() the new string, except that the caller must be
extremely prudent here since there's no way for it to know that it faces
a non-terminated string and that it must resort to a special code path
that relies on two strlcpy() instead of strlcat().

> > Above ret will be bound to the existing size so a realloc wouldn't work.
> > Thus I think the correct solution is:
>
> Note that this implementation fails the tests added on patch 4. I've tested
> them (output and return code) to match the libbsd implementation.

OK.

> > The test inside the loop is going to make this not very efficient. Same
> > for the fact that we're measuring the length of src twice (once via
> > strlen, a second time through the loop). I've just had a look and it
> > compiles to 77 bytes at -Os. A simpler variant would consist in trying
>
> How are you measuring the size?
>
> I've added noinline to strlcat to the version I sent, so now it is shown in
> nm, but I have this as output:
>
> $ nm --size -t x test.o
> 0000000000000004 V errno
> 0000000000000006 t strlcat.constprop.0
> 0000000000000008 V _auxv
> 0000000000000008 V environ
> 000000000000000e W strlen
> 000000000000000f W _start
> 0000000000000018 W raise
> 000000000000001b W abort
> 000000000000004c T main
> 000000000000005a t u64toa_r.isra.0
> 0000000000000095 W _start_c
> 00000000000002a8 t printf
>
> How are you measuring it there?
>
> Sorry, I'm not familiar with this :)

Oh, sorry for not providing the detalis! I just remove "static" in front
of the function. The reason for this is that if the function is used only
once in a program, and it is inlined, you never know in practice how it
will be inlined, depending on the opportunities the compiler will face
in the calling function. However, when it compiles it into its own
function, you get a better picture at the emitted code.

A second option, that's sometimes more convenient when you're hacking
directly in the include files themselves, is to write wrappers around
these functions and look at them. For example:

$ cd tools/testing/selftests/nolibc/
$ printf "size_t test_strlcat(char *dst, const char *src, size_t size) { return strlcat(dst, src, size); }\n" | gcc-9.5 -xc -c -Os -I sysroot/x86/include -include nolibc.h - -o test.o
$ nm --size test.o
0000000000000004 V errno
0000000000000008 V _auxv
0000000000000008 V environ
000000000000000f W _start
0000000000000018 W raise
000000000000001b W abort
0000000000000025 T test_strlcat
000000000000008d W _start_c
$ objdump --disassemble=test_strlcat test.o
...

I can't say I'm having a preference, it depends how I'm proceeding.
When comparing multiple variants of a same function, I generally like
to just copy them into a new file under different names so that I can
compare them all at once.

> > to copy what fits in <size> and once reached, go on just for trailing
> > zero and the size measurement:
> >
> > size_t strlcat(char *dst, const char *src, size_t size)
> > {
> > size_t len = strlen(dst);
>
> The thing is, we need to return always at least strlen(src). Maybe plus
> something else. Even if size==0 and we don't copy anything.

Yes, but that's exactly what the function does, look at the end:

while (*src++)
len++;

return len;

> Maybe we can special case that, so we simplify the loop, and it's smaller?

As a general rule, to keep code small it's best to avoid special cases
and to make them part of the general case as much as possible. And if
some special cases need to be made, as much as possible we need to
arrange them around existing jump or return points, i.e. preferably
just before a return statement that uses a value that was already
computed, or sometimes around a break or continue in a loop, but
conditions inside loops should be avoided as much as possible because
compilers often maintain multiple indexes inside loops (e.g. both target
pointer and a counter), while conditions also tend to consume registers
that force the compiler to perform some permutations to keep all of its
variables available, the worst being calling functions from loops since
the ABI indicates that a number of registers are clobbered and need to
be saved and restored. But these are essentially hints and need to be
verified on a case-by-case basis. Similarly, some architectures provide
convenient addressing mechanisms such as a=b[c+d] that we have on x86
and not necessarily elsewhere, and which favors the usage of offsets
instead of pointers. But again it depends on a lot of variables.

> I've been playing, but as I can't measure the size, I'm not sure what is
> really better. I'd like to play a little once I know how to measure it :)

Be careful not to spend too much time on it, nm --size and objdump are
quickly addictive ;-)

Willy

2024-02-14 15:19:23

by Rodrigo Campos

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

On 2/13/24 08:02, Willy Tarreau wrote:
> Hi Rodrigo,
>
> On Tue, Feb 13, 2024 at 12:16:06AM +0100, Rodrigo Campos wrote:
>> On 2/11/24 11:48, Willy Tarreau wrote:
>>> Hi Rodrigo,
> I've read it as well and I don't interpret it the same way. I'm reading it
> as "if dst doesn't contain a NUL char before size, its length is considered
> to be size", and the reason is explicitly explained just after:

Very true, I mixed dst and source. Thanks!

>> How are you measuring it there?
>>
>> Sorry, I'm not familiar with this :)
>
> Oh, sorry for not providing the detalis! I just remove "static" in front
> of the function.

Thanks for the examples and heuristics to keep code small :)

>> I've been playing, but as I can't measure the size, I'm not sure what is
>> really better. I'd like to play a little once I know how to measure it :)
>
> Be careful not to spend too much time on it, nm --size and objdump are
> quickly addictive ;-)

Haha, it is definitely addictive.


I'll share some new versions in a minute.

2024-02-14 16:00:02

by Rodrigo Campos

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

On 2/11/24 11:48, Willy Tarreau wrote:
> On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote:
> The test inside the loop is going to make this not very efficient. Same
> for the fact that we're measuring the length of src twice (once via
> strlen, a second time through the loop). I've just had a look and it
> compiles to 77 bytes at -Os. A simpler variant would consist in trying

The sizes here don't match that, even using gcc 9.5.0 from debian sid
(your example in the other email was calling gcc 9.5).

Here I see bigger sizes, even using the same line you share to compile.

> For me it's 58 bytes, or 19 less / 25% smaller, and at first glance it
> should do the right thing as well.

I see 69 bytes for that func (nm --size says 45, that is in hex).
The function I posted in the patchset I see it as 101 bytes, so that is
here is 32 bytes less here.

Here are two versions that are significantly shorter than the 101 bytes,
that pass the tests (I added more to account for the src vs dst mismatch
that was easy to pass tests when both buffers have the same size as they
did before).

size_t strlcat_rata(char *dst, const char *src, size_t size)
{
const char *orig_src = src;
size_t len = 0;
for (;len < size; len++) {
if (dst[len] == '\0')
break;
}

/* Let's copy len < n < size-1 times from src.
* size is unsigned, so instead of size-1, that can wrap around,
* let's use len + 1 */
while (len + 1 < size) {
dst[len] = *src;
if (*src == '\0')
break;
len++;
src++;
}

if (src != orig_src)
dst[len] = '\0';

while (*src++)
len++;

return len;
}

This one compiles to 81 bytes here using gcc 13.2.0 and to 83 using gcc
9.5.0. Compared to the one posted in the patchset, it is significantly
smaller.


One based in the version you posted (uses strlen(dst) instead), is this one:

size_t strlcat_willy_fixed(char *dst, const char *src, size_t size)
{
const char *orig_src = src;
size_t len = strlen(dst);
if (size < len)
len = size;

for (;len + 1 < size; len++, src++) {
dst[len] = *src;
if (*src == '\0')
break;
}

if (orig_src != src)
dst[len] = '\0';

while (*src++)
len++;

return len;
}


Note the "if size < len, then len=size", I couldn't get rid of it
because we need to account for the smaller size of dst if we don't get
passed it for the return code.

This one is 90 bytes here.


What do you think? Can you make them shorter?

If you like one of these, I can repost with the improved tests too.

2024-02-14 22:19:24

by Rodrigo Campos

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

On 2/14/24 16:34, Rodrigo Campos wrote:
> size_t strlcat_rata(char *dst, const char *src, size_t size)
> {
>         const char *orig_src = src;
>         size_t len = 0;
>         for (;len < size; len++) {
>                 if (dst[len] == '\0')
>                         break;
>         }

If you think about it, this is strnlen() and what follows is strncat().

> size_t strlcat_willy_fixed(char *dst, const char *src, size_t size)
> {
>         const char *orig_src = src;
>         size_t len = strlen(dst);
>         if (size < len)
>                 len = size;

Same here.

We can simplify the code by using them, but the size doesn't seem to be
smaller. Let me see what I can do.

2024-02-14 22:47:40

by Rodrigo Campos

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

On 2/14/24 23:03, Rodrigo Campos wrote:
> On 2/14/24 16:34, Rodrigo Campos wrote:
>> size_t strlcat_rata(char *dst, const char *src, size_t size)
>> {
>>          const char *orig_src = src;
>>          size_t len = 0;
>>          for (;len < size; len++) {
>>                  if (dst[len] == '\0')
>>                          break;
>>          }
>
> If you think about it, this is strnlen() and what follows is strncat().
>
>> size_t strlcat_willy_fixed(char *dst, const char *src, size_t size)
>> {
>>          const char *orig_src = src;
>>          size_t len = strlen(dst);
>>          if (size < len)
>>                  len = size;
>
> Same here.
>
> We can simplify the code by using them, but the size doesn't seem to be
> smaller. Let me see what I can do.

Yeap, third option would be this:

size_t strlcat(char *dst, const char *src, size_t size)
{
size_t len = strnlen(dst, size);
if(size > len) {
strncpy(dst + len, src, size - len - 1);
dst[size - 1] = '\0';
}
return len + strlen(src);
}

I've tried to use strncpy return code to avoid doing the strlen(src),
but that is only useful if we copy all of src. Otherwise, we still need
to go till the end. And the function code ends-up being bigger because
we can only do it inside the if (and return from there), so we have to
add code there and keep the strlen(src) outside.

I'm not sure about the size of this variant, as different programs give
me different sizes.

For example, if I use it in this program:

int main(void) {
char buf[10] = "test123456";
buf[4] = '\0';
char *src = "bar";
size_t ret = strlcat(buf, src, 9);

printf("dst is: %s\n", buf);
printf("ret is: %d\n", ret);

printf("strlen(buf) is: %d\n", strlen(buf));
printf("strlen(src) is: %d\n", strlen(src));

return 0;
}

It is compiled to 74 bytes. This is smaller than the other two options I
posted.

But if I just use it in a wrapper function, as you suggested, like:

size_t test_strlcat_nlen(char *dst, const char *src, size_t size)
{
return strlcat(dst, src, size);
}

And compile like this:

gcc -xc -c -Os -fno-asynchronous-unwind-tables -fno-ident -s -Os -I
sysroot/x86/include -include nolibc.h strlcat_sizes.c -o test.o

or even like this, the result is the same:

gcc -xc -c -Os -I sysroot/x86/include -include nolibc.h strlcat_sizes.c
-o test.o

The result is 86 bytes, which is similar to the previous versions (81
and 90 bytes they were).

I haven't checked the assembler generated, I bet it is being smart and
taking advantage of the set size param when it makes it so much smaller.
I've tried calling it with more numbers than just "9", in lot of cases
it is smaller than 86 bytes.

I have the gut-feeling that for the types of programs that us nolibc,
all params might be known (dst, src, size) and we won't need to keep a
generic version of the function, the compiler will optimize it. Even
calling it several times in the compiled program, forcing it out and
inside the "if size > len)" case, the optimized version is still in the
76 bytes or so.

So, the code is so simple, the size is almost the same and it seems the
compiler has a better time optimizing it. I lean towards this version here.


What do you think?


Best,
Rodrigo

2024-02-18 10:21:16

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

Hi Rodrigo,

On Wed, Feb 14, 2024 at 12:34:46PM -0300, Rodrigo Campos wrote:
> Here are two versions that are significantly shorter than the 101 bytes,
> that pass the tests (I added more to account for the src vs dst mismatch
> that was easy to pass tests when both buffers have the same size as they did
> before).
>
> size_t strlcat_rata(char *dst, const char *src, size_t size)
> {
> const char *orig_src = src;
> size_t len = 0;
> for (;len < size; len++) {
> if (dst[len] == '\0')
> break;
> }
>
> /* Let's copy len < n < size-1 times from src.
> * size is unsigned, so instead of size-1, that can wrap around,
> * let's use len + 1 */
> while (len + 1 < size) {
> dst[len] = *src;
> if (*src == '\0')
> break;
> len++;
> src++;
> }
>
> if (src != orig_src)
> dst[len] = '\0';
>
> while (*src++)
> len++;
>
> return len;
> }
>
> This one compiles to 81 bytes here using gcc 13.2.0 and to 83 using gcc
> 9.5.0. Compared to the one posted in the patchset, it is significantly
> smaller.

OK this looks good to me. I think your test on src != orig_src is not
trivial and that testing instead if (len < size) would be better, and
possibly even shorter.

> One based in the version you posted (uses strlen(dst) instead), is this one:
>
> size_t strlcat_willy_fixed(char *dst, const char *src, size_t size)
> {
> const char *orig_src = src;
> size_t len = strlen(dst);
> if (size < len)
> len = size;
>
> for (;len + 1 < size; len++, src++) {
> dst[len] = *src;
> if (*src == '\0')
> break;
> }
>
> if (orig_src != src)
> dst[len] = '\0';
>
> while (*src++)
> len++;
>
> return len;
> }
>
>
> Note the "if size < len, then len=size", I couldn't get rid of it because we
> need to account for the smaller size of dst if we don't get passed it for
> the return code.

Please no, again as I mentioned earlier, it's wrong to use strlen(dst) in
this case: the only situation where we'd accept size < len is if dst is
already not zero-terminated, which means that strlen() cannot be used, or
you'd need strnlen() for that, I'm just seeing that we have it, I thought
we didn't.

> This one is 90 bytes here.

See the point I was making? Sometimes the amount of fat added around a
function call is important compared to just an increment and a comparison.
Of course that's not always true but that's one such example.

> What do you think? Can you make them shorter?

I don't want to enter that activity again this week-end ;-) It's sufficient
here.

> If you like one of these, I can repost with the improved tests too.

Yeah please check the few points above for your version, make sure to
clean it up according to the kernel's coding style (empty line after
variable declaration, */ on the next line for the multi-line comment
etc) and that's fine.

Thanks,
Willy

2024-02-18 10:22:52

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

On Wed, Feb 14, 2024 at 07:03:10PM -0300, Rodrigo Campos wrote:
> On 2/14/24 16:34, Rodrigo Campos wrote:
> > size_t strlcat_rata(char *dst, const char *src, size_t size)
> > {
> > const char *orig_src = src;
> > size_t len = 0;
> > for (;len < size; len++) {
> > if (dst[len] == '\0')
> > break;
> > }
>
> If you think about it, this is strnlen() and what follows is strncat().

I agree, I just didn't remember we had strnlen() nor strncat().

Willy

2024-02-18 19:34:19

by Rodrigo Campos

[permalink] [raw]
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

On 2/18/24 07:20, Willy Tarreau wrote:
> Hi Rodrigo,
>
>
> OK this looks good to me.

I'll use that version in the next send then, thanks!

> I think your test on src != orig_src is not
> trivial and that testing instead if (len < size) would be better, and
> possibly even shorter.

Fixed that, thanks.

>> Note the "if size < len, then len=size", I couldn't get rid of it because we
>> need to account for the smaller size of dst if we don't get passed it for
>> the return code.
>
> Please no, again as I mentioned earlier, it's wrong to use strlen(dst) in
> this case: the only situation where we'd accept size < len is if dst is
> already not zero-terminated, which means that strlen() cannot be used, or
> you'd need strnlen() for that, I'm just seeing that we have it, I thought
> we didn't.

Right, sorry.

>> What do you think? Can you make them shorter?
>
> I don't want to enter that activity again this week-end ;-) It's sufficient
> here.

haha, fair :)

>> If you like one of these, I can repost with the improved tests too.
>
> Yeah please check the few points above for your version, make sure to
> clean it up according to the kernel's coding style (empty line after
> variable declaration, */ on the next line for the multi-line comment
> etc) and that's fine.

Will do



Best,
Rodrigo