2022-10-10 12:48:36

by Rasmus Villemoes

Subject: [PATCH] tools/nolibc/string: fix memcmp() implementation

The C standard says that memcmp() must treat the buffers as consisting
of "unsigned chars". If char happens to be unsigned, the casts are ok,
but then obviously the c1 variable can never contain a negative
value. And when char is signed, the casts are wrong, and there's still
a problem with using an 8-bit quantity to hold the difference, because
that can range from -255 to +255.

For example, assuming char is signed, comparing two 1-byte buffers,
one containing 0x00 and another 0x80, the current implementation would
return -128 for both memcmp(a, b, 1) and memcmp(b, a, 1), whereas one
of those should of course return something positive.

Signed-off-by: Rasmus Villemoes <[email protected]>
---
tools/include/nolibc/string.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
index bef35bee9c44..cc1bddcb5927 100644
--- a/tools/include/nolibc/string.h
+++ b/tools/include/nolibc/string.h
@@ -19,9 +19,9 @@ static __attribute__((unused))
 int memcmp(const void *s1, const void *s2, size_t n)
 {
 	size_t ofs = 0;
-	char c1 = 0;
+	int c1 = 0;

-	while (ofs < n && !(c1 = ((char *)s1)[ofs] - ((char *)s2)[ofs])) {
+	while (ofs < n && !(c1 = ((unsigned char *)s1)[ofs] - ((unsigned char *)s2)[ofs])) {
 		ofs++;
 	}
 	return c1;
--
2.37.2
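
The 0x00 vs 0x80 case from the commit message can be reproduced outside
nolibc with a small sketch. The names memcmp_old and memcmp_fixed are
hypothetical and exist only for this illustration. The old variant spells
out "signed char" so that it behaves the same on targets where plain char
is unsigned. Converting an out-of-range int to signed char is
implementation-defined; the sketch assumes the wrapping behaviour that GCC
and Clang provide.

```c
#include <stddef.h>

/* Pre-patch logic. "signed char" is an assumption made explicit here so the
 * result does not depend on the signedness of plain char on the build target. */
static int memcmp_old(const void *s1, const void *s2, size_t n)
{
	size_t ofs = 0;
	signed char c1 = 0;	/* 8 bits: cannot hold the full -255..+255 range */

	while (ofs < n && !(c1 = ((const signed char *)s1)[ofs] - ((const signed char *)s2)[ofs]))
		ofs++;
	return c1;
}

/* Post-patch logic: bytes are read as unsigned char and the difference is
 * kept in an int, so its sign survives. */
static int memcmp_fixed(const void *s1, const void *s2, size_t n)
{
	size_t ofs = 0;
	int c1 = 0;

	while (ofs < n && !(c1 = ((const unsigned char *)s1)[ofs] - ((const unsigned char *)s2)[ofs]))
		ofs++;
	return c1;
}
```

With a[1] = { 0x00 } and b[1] = { 0x80 }, memcmp_old() is expected to
return a negative value for both argument orders under the assumptions
above, while memcmp_fixed(a, b, 1) is negative and memcmp_fixed(b, a, 1)
is positive.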


2022-10-13 07:05:10

by Willy Tarreau

Subject: Re: [PATCH] tools/nolibc/string: fix memcmp() implementation

Hi Rasmus,

On Mon, Oct 10, 2022 at 01:36:06PM +0200, Rasmus Villemoes wrote:
> The C standard says that memcmp() must treat the buffers as consisting
> of "unsigned chars". If char happens to be unsigned, the casts are ok,
> but then obviously the c1 variable can never contain a negative
> value. And when char is signed, the casts are wrong, and there's still
> a problem with using an 8-bit quantity to hold the difference, because
> that can range from -255 to +255.
>
> For example, assuming char is signed, comparing two 1-byte buffers,
> one containing 0x00 and another 0x80, the current implementation would
> return -128 for both memcmp(a, b, 1) and memcmp(b, a, 1), whereas one
> of those should of course return something positive.

You're totally right of course, thank you for spotting this one! I'm
queuing it now.

Regards,
Willy