2017-09-22 14:59:49

by Colin King

[permalink] [raw]
Subject: [PATCH] lib: zstd: make const array rtbTable static, reduces object code size

From: Colin Ian King <[email protected]>

Don't populate const array rtbTable on the stack, instead make it
static. Also split overly long line to clean a chechkpach warning.
Makes the object code smaller by nearly 500 bytes:

Before:
text data bss dec hex filename
13297 104 0 13401 3459 lib/zstd/fse_compress.o

After:
text data bss dec hex filename
12742 160 0 12902 3266 lib/zstd/fse_compress.o

(gcc 6.3.0, x86-64)

Signed-off-by: Colin Ian King <[email protected]>
---
lib/zstd/fse_compress.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/zstd/fse_compress.c b/lib/zstd/fse_compress.c
index ef3d1741d532..f41c13128ed2 100644
--- a/lib/zstd/fse_compress.c
+++ b/lib/zstd/fse_compress.c
@@ -618,7 +618,10 @@ size_t FSE_normalizeCount(short *normalizedCounter, unsigned tableLog, const uns
return ERROR(GENERIC); /* Too small tableLog, compression potentially impossible */

{
- U32 const rtbTable[] = {0, 473195, 504333, 520860, 550000, 700000, 750000, 830000};
+ U32 static const rtbTable[] = {
+ 0, 473195, 504333, 520860,
+ 550000, 700000, 750000, 830000
+ };
U64 const scale = 62 - tableLog;
U64 const step = div_u64((U64)1 << 62, (U32)total); /* <== here, one division ! */
U64 const vStep = 1ULL << (scale - 20);
--
2.14.1


2017-09-22 19:14:13

by Nick Terrell

[permalink] [raw]
Subject: Re: [PATCH] lib: zstd: make const array rtbTable static, reduces object code size

On 9/22/17, 8:00 AM, "[email protected] on behalf of Colin King" <[email protected] on behalf of [email protected]> wrote:
> From: Colin Ian King <[email protected]>
>
> Don't populate const array rtbTable on the stack, instead make it
> static. Also split overly long line to clean a chechkpach warning.
> Makes the object code smaller by nearly 500 bytes:
>
> Before:
> text data bss dec hex filename
> 13297 104 0 13401 3459 lib/zstd/fse_compress.o
>
> After:
> text data bss dec hex filename
> 12742 160 0 12902 3266 lib/zstd/fse_compress.o
>
> (gcc 6.3.0, x86-64)
>
> Signed-off-by: Colin Ian King <[email protected]>

I tested your patch with gcc-7.1 on x86, and benchmarked the speed on
upstream zstd. There isn't a noticeable speed difference, since it isn't a
particularly hot piece of code. Would you be able to submit the same patch
upstream [1], or would you be okay with me porting it back upstream, so it
doesn't get lost on an update?

I didn't expect gcc to leave constant arrays on the stack, that seems
silly. Clang makes it static, but gcc loads it onto the stack, and in 6.3+
it saves the data statically, and then uses vector instructions to load it
onto the stack [2].

Tested-by: Nick Terrell <[email protected]>

[1] https://github.com/facebook/zstd
[2] https://godbolt.org/g/fvTcED



2017-09-22 21:35:02

by Colin King

[permalink] [raw]
Subject: Re: [PATCH] lib: zstd: make const array rtbTable static, reduces object code size

On 22/09/17 20:14, Nick Terrell wrote:
> On 9/22/17, 8:00 AM, "[email protected] on behalf of Colin King" <[email protected] on behalf of [email protected]> wrote:
>> From: Colin Ian King <[email protected]>
>>
>> Don't populate const array rtbTable on the stack, instead make it
>> static. Also split overly long line to clean a chechkpach warning.
>> Makes the object code smaller by nearly 500 bytes:
>>
>> Before:
>> text data bss dec hex filename
>> 13297 104 0 13401 3459 lib/zstd/fse_compress.o
>>
>> After:
>> text data bss dec hex filename
>> 12742 160 0 12902 3266 lib/zstd/fse_compress.o
>>
>> (gcc 6.3.0, x86-64)
>>
>> Signed-off-by: Colin Ian King <[email protected]>
>
> I tested your patch with gcc-7.1 on x86, and benchmarked the speed on
> upstream zstd. There isn't a noticeable speed difference, since it isn't a
> particularly hot piece of code. Would you be able to submit the same patch
> upstream [1], or would you be okay with me porting it back upstream, so it
> doesn't get lost on an update?

Since you have more contact with the zstd codebase it may be preferable
for me to pass the port back to upstream over to you Nick (if that's
OK). As it stands, I did a quick check by building the original zstd
codebase with gcc 7.2 and didn't observe any object code size changes
between the original and the patched code. I see that it's optimized
with -O3 and its not using the same gcc flags as the kernel, so I
suspect that may need exploring further to see why there is such a large
change on the kernel version and that of the native user space library
build.

>
> I didn't expect gcc to leave constant arrays on the stack, that seems
> silly. Clang makes it static, but gcc loads it onto the stack, and in 6.3+
> it saves the data statically, and then uses vector instructions to load it
> onto the stack [2].

Oh, that latter information is interesting, didn't know that.

Thanks for testing.

Colin
>
> Tested-by: Nick Terrell <[email protected]>
>
> [1] https://github.com/facebook/zstd
> [2] https://godbolt.org/g/fvTcED
>
>
> N�����r��y���b�X��ǧv�^�)޺{.n�+���z�ޖ6���+�)���w*jg��������ݢj/���z�ޖ��2�ޙ���&�)ߡ�a�����G���h��j:+v���w�٥
>