2018-11-24 12:37:12

by David CARLIER

Subject: [PATCH] Little memset_explicit optimisation



Attachments:
0001-memzero_explicit-optimisation-for-size.patch (720.00 B)

2018-11-26 11:31:14

by Andy Shevchenko

Subject: Re: [PATCH] Little memset_explicit optimisation

On Sat, Nov 24, 2018 at 12:35:43PM +0000, David CARLIER wrote:
>

Hmm... Can we see the difference in assembly generation?


--
With Best Regards,
Andy Shevchenko



2018-11-26 11:39:24

by Joey Pabalinas

Subject: Re: [PATCH] Little memset_explicit optimisation

On Sat, Nov 24, 2018 at 12:35:43PM +0000, David CARLIER wrote:
> Using the return value of memset for save/load sake.
>
> Signed-off-by: David Carlier <[email protected]>
> ---
> lib/string.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/string.c b/lib/string.c
> index 38e4ca08e757..92da04a0213b 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -720,7 +720,7 @@ EXPORT_SYMBOL(memset);
>   */
>  void memzero_explicit(void *s, size_t count)
>  {
> -	memset(s, 0, count);
> +	s = memset(s, 0, count);
>  	barrier_data(s);
>  }
>  EXPORT_SYMBOL(memzero_explicit);

Could you elaborate on the optimization that this patch performs?

--
Cheers,
Joey Pabalinas


Attachments:
signature.asc (849.00 B)

2018-11-26 19:37:54

by David CARLIER

Subject: Re: [PATCH] Little memset_explicit optimisation

Sorry, I'm not used to the LKML rules at all yet.

So here is the slight difference in the assembly generated by the two
versions (amd64):
`
 	.loc 1 7 7
 	leaq	-12(%rbp), %rax
 	movq	%rax, -8(%rbp)
-	.loc 1 11 2
+	.loc 1 9 6
 	movq	-8(%rbp), %rax
 	movl	$4, %edx
 	movl	$0, %esi
 	movq	%rax, %rdi
 	call	memset@PLT
+	movq	%rax, -8(%rbp)
 	.loc 1 13 23
 	movq	-8(%rbp), %rax
 	movl	(%rax), %eax
`

2018-11-26 21:21:32

by Joey Pabalinas

Subject: Re: Re: [PATCH] Little memset_explicit optimisation

On Mon, Nov 26, 2018 at 07:36:19PM +0000, David CARLIER wrote:
> Sorry, I'm not used to the LKML rules at all yet.
>
> So here is the slight difference in the assembly generated by the two
> versions (amd64):
> `
>  	.loc 1 7 7
>  	leaq	-12(%rbp), %rax
>  	movq	%rax, -8(%rbp)
> -	.loc 1 11 2
> +	.loc 1 9 6
>  	movq	-8(%rbp), %rax
>  	movl	$4, %edx
>  	movl	$0, %esi
>  	movq	%rax, %rdi
>  	call	memset@PLT
> +	movq	%rax, -8(%rbp)
>  	.loc 1 13 23
>  	movq	-8(%rbp), %rax
>  	movl	(%rax), %eax

What is the advantage of having the added `movq %rax, -8(%rbp)` here?

The next instruction is `movq -8(%rbp), %rax` and nothing afterwards
uses the value stored in `-8(%rbp)`.

Also, is this compiled without optimization? Take a look at the
assembly in a small test case with -O1 (making sure to use the target
variable so it isn't optimized out) and compare the assembly generated
with and without that assignment.
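
A minimal sketch of such a test case follows. It is a userspace
approximation, not the kernel code itself: the barrier_data() macro is an
assumed stand-in for the kernel's definition, and the names
memzero_explicit_old(), memzero_explicit_new() and check_variants() are
hypothetical, chosen only so that both versions can sit in one file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Assumption: userspace stand-in for the kernel's barrier_data(). The empty
 * asm takes the pointer as an input and clobbers memory, so the compiler
 * cannot drop the preceding memset() as a dead store.
 */
#define barrier_data(ptr) __asm__ __volatile__("" : : "r"(ptr) : "memory")

/* Hypothetical name: current code, return value of memset() ignored. */
void memzero_explicit_old(void *s, size_t count)
{
	memset(s, 0, count);
	barrier_data(s);
}

/* Hypothetical name: patched code, return value assigned back to s. */
void memzero_explicit_new(void *s, size_t count)
{
	s = memset(s, 0, count);
	barrier_data(s);
}

/* Sanity check: both variants must leave the buffer fully zeroed. */
void check_variants(void)
{
	unsigned char a[16], b[16];
	size_t i;

	memset(a, 0xAA, sizeof(a));
	memset(b, 0xAA, sizeof(b));
	memzero_explicit_old(a, sizeof(a));
	memzero_explicit_new(b, sizeof(b));

	for (i = 0; i < sizeof(a); i++)
		assert(a[i] == 0 && b[i] == 0);

	/* memset() returns its first argument, so "s = memset(s, ...)" keeps s. */
	assert(memset(a, 0, sizeof(a)) == (void *)a);
}
```

Building this with something like `gcc -O1 -S` and comparing the bodies of
the two functions should show whether the assignment changes the generated
code at all. Since memset() returns its first argument, the two may well
come out identical, but that is worth checking rather than assuming.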

--
Cheers,
Joey Pabalinas


Attachments:
signature.asc (849.00 B)