2014-03-24 16:10:48

by Mathias Krause

Subject: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes

The recent addition of the AVX2 variant of the SHA1 hash function wrongly
disabled the AVX variant by introducing a flaw in the feature test. Fixed
in patch 1.

The alignment calculations of the AVX2 assembler implementation are
questionable, too. Especially the page alignment of the stack pointer is
broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
code alignment is fixed.

Please apply!

Mathias Krause (3):
crypto: x86/sha1 - re-enable the AVX variant
crypto: x86/sha1 - fix stack alignment of AVX2 variant
crypto: x86/sha1 - reduce size of the AVX2 asm implementation

arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
2 files changed, 18 insertions(+), 16 deletions(-)

--
1.7.10.4


2014-03-24 16:10:50

by Mathias Krause

Subject: [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant

The AVX2 implementation might waste up to a page of stack memory because
of a wrong alignment calculation. This will, in the worst case, increase
the stack usage of sha1_transform_avx2() alone to 5.4 kB -- way too big
for a kernel function. Even worse, it might also allocate *fewer* bytes
than needed if the stack pointer is already aligned because in that case
the 'sub %rbx, %rsp' effectively moves the stack pointer upwards,
not downwards.

Fix those issues by changing and simplifying the alignment calculation
to use a 32 byte alignment, which is the alignment actually needed.
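
For illustration only (not part of the patch), the following is a minimal
C model of the two alignment schemes, mirroring the register arithmetic of
the hunk below. The stack pointer values are made up, and the model covers
only the align step itself, not the save/restore of the old %rsp.

#include <stdint.h>
#include <stdio.h>

/* Old scheme: align relative to a 4 KiB page boundary.
 * Computes rbx = (rsp & 0xfff) - 40 and subtracts it from rsp, which
 * wastes up to ~4 kB and, for page offsets below 40, moves rsp upwards. */
static uintptr_t old_align(uintptr_t rsp)
{
	uintptr_t rbx = rsp & (0x1000 - 1);
	rbx -= 8 + 32;		/* wraps when the page offset is < 40 */
	return rsp - rbx;	/* modular, just like the CPU registers */
}

/* New scheme: simply round rsp down to the next 32 byte boundary. */
static uintptr_t new_align(uintptr_t rsp)
{
	return rsp & ~(uintptr_t)(0x20 - 1);
}

int main(void)
{
	uintptr_t samples[] = { 0x7fff0000, 0x7fff0020, 0x7fff0fff };

	for (unsigned int i = 0; i < 3; i++)
		printf("rsp=%#lx  old=%#lx  new=%#lx\n",
		       (unsigned long)samples[i],
		       (unsigned long)old_align(samples[i]),
		       (unsigned long)new_align(samples[i]));
	return 0;
}

For the already aligned sample 0x7fff0000 the old scheme yields
0x7fff0028, i.e. *above* the original stack pointer, while for 0x7fff0fff
it drops almost a full page. The new scheme never moves more than 31
bytes.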

Cc: Chandramouli Narayanan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Marek Vasut <[email protected]>
Signed-off-by: Mathias Krause <[email protected]>
---
arch/x86/crypto/sha1_avx2_x86_64_asm.S | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index 4f348544d1..bacac22b20 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -636,9 +636,7 @@ _loop3:

/* Align stack */
mov %rsp, %rbx
- and $(0x1000-1), %rbx
- sub $(8+32), %rbx
- sub %rbx, %rsp
+ and $~(0x20-1), %rsp
push %rbx
sub $RESERVE_STACK, %rsp

@@ -665,8 +663,7 @@ _loop3:
avx2_zeroupper

add $RESERVE_STACK, %rsp
- pop %rbx
- add %rbx, %rsp
+ pop %rsp

pop %r15
pop %r14
--
1.7.10.4

2014-03-24 16:10:49

by Mathias Krause

Subject: [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant

Commit 7c1da8d0d0 "crypto: sha - SHA1 transform x86_64 AVX2"
accidentally disabled the AVX variant by making the avx_usable() test
fail not only when the CPU doesn't support AVX or OSXSAVE but also
when it doesn't support AVX2.

Fix that regression by splitting up the AVX/AVX2 test into two
functions. Also test for the BMI1 extension in the avx2_usable() test
as the AVX2 implementation makes use not only of BMI2 but also of BMI1
instructions.
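
For illustration only (not part of the patch), here is a tiny C model of
the feature test on a hypothetical CPU that supports AVX and OSXSAVE but
not AVX2; the xgetbv()/XCR0 check of the real avx_usable() is omitted.
The old combined test disables the AVX path on such a CPU, the split
tests do not.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical CPU feature bits: AVX and OSXSAVE present,
 * AVX2, BMI1 and BMI2 missing. */
static const bool has_avx = true, has_osxsave = true;
static const bool has_avx2 = false, has_bmi1 = false, has_bmi2 = false;

/* Old, combined test (CONFIG_AS_AVX2 case): a missing AVX2 bit makes
 * the whole test fail and thereby disables AVX as well. */
static bool old_avx_usable(void)
{
	return has_avx && has_avx2 && has_osxsave;
}

/* Split tests as in the patch: AVX no longer depends on AVX2, and
 * AVX2 additionally requires BMI1 and BMI2. */
static bool new_avx_usable(void)
{
	return has_avx && has_osxsave;
}

static bool new_avx2_usable(void)
{
	return new_avx_usable() && has_avx2 && has_bmi1 && has_bmi2;
}

int main(void)
{
	printf("old avx_usable:  %d\n", old_avx_usable());	/* 0 */
	printf("new avx_usable:  %d\n", new_avx_usable());	/* 1 */
	printf("new avx2_usable: %d\n", new_avx2_usable());	/* 0 */
	return 0;
}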

Cc: Chandramouli Narayanan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Marek Vasut <[email protected]>
Signed-off-by: Mathias Krause <[email protected]>
---
arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 139a55c04d..74d16ef707 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -208,11 +208,7 @@ static bool __init avx_usable(void)
{
u64 xcr0;

-#if defined(CONFIG_AS_AVX2)
- if (!cpu_has_avx || !cpu_has_avx2 || !cpu_has_osxsave)
-#else
if (!cpu_has_avx || !cpu_has_osxsave)
-#endif
return false;

xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
@@ -224,11 +220,23 @@ static bool __init avx_usable(void)

return true;
}
+
+#ifdef CONFIG_AS_AVX2
+static bool __init avx2_usable(void)
+{
+ if (avx_usable() && cpu_has_avx2 && boot_cpu_has(X86_FEATURE_BMI1) &&
+ boot_cpu_has(X86_FEATURE_BMI2))
+ return true;
+
+ return false;
+}
+#endif
#endif

static int __init sha1_ssse3_mod_init(void)
{
char *algo_name;
+
/* test for SSSE3 first */
if (cpu_has_ssse3) {
sha1_transform_asm = sha1_transform_ssse3;
@@ -238,13 +246,11 @@ static int __init sha1_ssse3_mod_init(void)
#ifdef CONFIG_AS_AVX
/* allow AVX to override SSSE3, it's a little faster */
if (avx_usable()) {
- if (cpu_has_avx) {
- sha1_transform_asm = sha1_transform_avx;
- algo_name = "AVX";
- }
+ sha1_transform_asm = sha1_transform_avx;
+ algo_name = "AVX";
#ifdef CONFIG_AS_AVX2
- if (cpu_has_avx2 && boot_cpu_has(X86_FEATURE_BMI2)) {
- /* allow AVX2 to override AVX, it's a little faster */
+ /* allow AVX2 to override AVX, it's a little faster */
+ if (avx2_usable()) {
sha1_transform_asm = sha1_apply_transform_avx2;
algo_name = "AVX2";
}
--
1.7.10.4

2014-03-24 16:10:51

by Mathias Krause

Subject: [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation

There is really no need to page align sha1_transform_avx2. The default
alignment is just fine. This is not the hot code but only the entry
point, after all.

Cc: Chandramouli Narayanan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Marek Vasut <[email protected]>
Signed-off-by: Mathias Krause <[email protected]>
---
arch/x86/crypto/sha1_avx2_x86_64_asm.S | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index bacac22b20..1cd792db15 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -623,7 +623,6 @@ _loop3:
*/
.macro SHA1_VECTOR_ASM name
ENTRY(\name)
- .align 4096

push %rbx
push %rbp
--
1.7.10.4

2014-03-24 17:05:22

by H. Peter Anvin

Subject: Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes

On 03/24/2014 09:10 AM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
>

Yikes. Yes... the alignment calculation is confused in the extreme.

-hpa

2014-03-24 17:28:48

by chandramouli narayanan

Subject: Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes

On Mon, 2014-03-24 at 17:10 +0100, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
>
> Mathias Krause (3):
> crypto: x86/sha1 - re-enable the AVX variant
> crypto: x86/sha1 - fix stack alignment of AVX2 variant
> crypto: x86/sha1 - reduce size of the AVX2 asm implementation
>
> arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
> arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
> 2 files changed, 18 insertions(+), 16 deletions(-)
>
Your fixes are right on the mark. I went through your patches, tested
them, and found them to be correct.

Sorry for causing the regression and for missing the alignment issues in
the patches I submitted.

- mouli

2014-03-24 17:31:50

by H. Peter Anvin

Subject: Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes

On 03/24/2014 09:10 AM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
>
> Mathias Krause (3):
> crypto: x86/sha1 - re-enable the AVX variant
> crypto: x86/sha1 - fix stack alignment of AVX2 variant
> crypto: x86/sha1 - reduce size of the AVX2 asm implementation
>
> arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
> arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
> 2 files changed, 18 insertions(+), 16 deletions(-)
>

For all these patches:

Reviewed-by: H. Peter Anvin <[email protected]>

Thank you for doing these.

-hpa

2014-03-24 20:36:25

by Marek Vasut

Subject: Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes

On Monday, March 24, 2014 at 05:10:36 PM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!

Nice,

Reviewed-by: Marek Vasut <[email protected]>

Best regards,
Marek Vasut

2014-03-25 07:55:42

by Mathias Krause

Subject: Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes

On 24 March 2014 18:29, chandramouli narayanan <[email protected]> wrote:
> On Mon, 2014-03-24 at 17:10 +0100, Mathias Krause wrote:
>> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
>> disabled the AVX variant by introducing a flaw in the feature test. Fixed
>> in patch 1.
>>
>> The alignment calculations of the AVX2 assembler implementation are
>> questionable, too. Especially the page alignment of the stack pointer is
>> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
>> code alignment is fixed.
>>
>> Please apply!
>>
>> Mathias Krause (3):
>> crypto: x86/sha1 - re-enable the AVX variant
>> crypto: x86/sha1 - fix stack alignment of AVX2 variant
>> crypto: x86/sha1 - reduce size of the AVX2 asm implementation
>>
>> arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
>> arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
>> 2 files changed, 18 insertions(+), 16 deletions(-)
>>
> Your fixes are the right on mark. I went through your patches and tested
> them and found to be correct.

Thanks for double-checking!

> Sorry for causing regression and missing alignment issues in the patches
> I submitted.

No problem with that. But as I'm not subscribed to the linux-crypto
mailing list, I hadn't seen your earlier submissions. Otherwise I
would have objected earlier. ;)

Thanks,
Mathias

2014-03-25 12:44:44

by Herbert Xu

Subject: Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes

On Mon, Mar 24, 2014 at 09:19:44PM +0100, Marek Vasut wrote:
> On Monday, March 24, 2014 at 05:10:36 PM, Mathias Krause wrote:
> > The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> > disabled the AVX variant by introducing a flaw in the feature test. Fixed
> > in patch 1.
> >
> > The alignment calculations of the AVX2 assembler implementation are
> > questionable, too. Especially the page alignment of the stack pointer is
> > broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> > code alignment is fixed.
> >
> > Please apply!
>
> Nice,
>
> Reviewed-by: Marek Vasut <[email protected]>

All applied. Thanks Mathias!
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt