2023-09-04 01:45:26

by David Laight

Subject: RE: [RFC PATCH v3 0/4] nolibc x86-64 string functions

From: Ammar Faizi
> Sent: 02 September 2023 14:35
>
> This is an RFC patchset v3 for nolibc x86-64 string functions.
>
> There are 4 patches in this series:
>
> ## Patch 1-2: Use `rep movsb`, `rep stosb` for:
> - memcpy() and memmove()
> - memset()
> respectively. They can simplify the generated ASM code.

It is worth pointing out that while the code size for 'rep xxxb'
is smaller, the performance is terrible.
The only time it is ever good is for the optimised forwards
copies on CPUs that support them.

Reverse copies, stos and scas are always horrid.

David



2023-09-04 12:06:32

by Willy Tarreau

Subject: Re: [RFC PATCH v3 0/4] nolibc x86-64 string functions

On Sun, Sep 03, 2023 at 08:38:22PM +0000, David Laight wrote:
> From: Ammar Faizi
> > Sent: 02 September 2023 14:35
> >
> > This is an RFC patchset v3 for nolibc x86-64 string functions.
> >
> > There are 4 patches in this series:
> >
> > ## Patch 1-2: Use `rep movsb`, `rep stosb` for:
> > - memcpy() and memmove()
> > - memset()
> > respectively. They can simplify the generated ASM code.
>
> It is worth pointing out that while the code size for 'rep xxxb'
> is smaller, the performance is terrible.
> The only time it is ever good is for the optimised forwards
> copies on CPUs that support them.
>
> Reverse copies, stos and scas are always horrid.

It's terrible compared to other approaches but not *that* bad. Also, we
absolutely don't care about performance here; we care about correctness
and compact size.

Regards,
Willy
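
For readers unfamiliar with the instructions under discussion, here is a
minimal sketch of what a forward byte copy and a byte fill built on
`rep movsb` and `rep stosb` might look like. It is not the code from the
patch series: the names sketch_memcpy() and sketch_memset() are
hypothetical, the inline asm assumes GCC or Clang on x86-64 with the
direction flag clear (as the System V ABI guarantees on function entry),
and a plain byte loop is used on other architectures. Overlapping
(memmove-style) copies are deliberately left out.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the nolibc patches. */
static void *sketch_memcpy(void *dst, const void *src, size_t len)
{
	void *ret = dst;
#if defined(__x86_64__)
	/*
	 * rep movsb copies RCX bytes from [RSI] to [RDI], incrementing
	 * both pointers when the direction flag is clear. All three
	 * registers are modified, hence the "+" constraints.
	 */
	__asm__ volatile ("rep movsb"
			  : "+D"(dst), "+S"(src), "+c"(len)
			  :
			  : "memory");
#else
	/* Portable fallback so the sketch builds on other targets. */
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
#endif
	return ret;
}

/* Hypothetical helper, not taken from the nolibc patches. */
static void *sketch_memset(void *dst, int c, size_t len)
{
	void *ret = dst;
#if defined(__x86_64__)
	/* rep stosb stores AL into [RDI], RCX times. */
	__asm__ volatile ("rep stosb"
			  : "+D"(dst), "+c"(len)
			  : "a"(c)
			  : "memory");
#else
	/* Portable fallback so the sketch builds on other targets. */
	unsigned char *d = dst;

	while (len--)
		*d++ = (unsigned char)c;
#endif
	return ret;
}
```

On x86-64 each function body reduces to a single string instruction plus
the return value setup, which is the size advantage the thread refers to;
the sketch says nothing about how fast either instruction is on a given CPU.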