2023-04-19 21:56:58

by Scott Mayhew

[permalink] [raw]
Subject: RPCSEC GSS krb5 KUnit test fails on arm64 with h/w accelerated ciphers enabled

Chuck's recently-added RPCSEC GSS krb5 KUnit test
(net/sunrpc/auth_gss/gss_krb5_test.c) is failing on arm64, specifically
the RFC 3962 test cases (I'm just pasting the output of 1 case, but all
6 cases fail):

---8<---
[ 237.255197] # Subtest: RFC 3962 encryption
[ 237.255588] # RFC 3962 encryption: EXPECTATION FAILED at net/sunrpc/auth_gss/gss_krb5_test.c:772
Expected memcmp(param->next_iv->data, iv, param->next_iv->len) == 0, but
memcmp(param->next_iv->data, iv, param->next_iv->len) == 1 (0x1)

IV mismatch
---8<---

If I disable the hardware accelerated ciphers
(CONFIG_CRYPTO_AES_ARM64_CE_BLK and CONFIG_CRYPTO_AES_ARM64_NEON_BLK),
then the test works.

Likewise, if I modify Chuck's test to explicitly request
"cts(cbc(aes-generic))", then the test works.

The problem is that the asm helper aes_cbc_cts_encrypt in
arch/arm64/crypto/aes-modes.S doesn't return the next IV.

If I make the following change, then the test works:

diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
index 0e834a2c062c..477605fad76b 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -268,6 +268,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
add x4, x0, x4
st1 {v0.16b}, [x4] /* overlapping stores */
st1 {v1.16b}, [x0]
+ st1 {v1.16b}, [x5]
ret
AES_FUNC_END(aes_cbc_cts_encrypt)

But I don't know if that change is at all correct! (I've never even
looked at arm64 asm before). If someone who's knowledgeable about this
code could chime in, I'd appreciate it.

-Scott


2023-04-28 09:48:43

by Herbert Xu

[permalink] [raw]
Subject: Re: RPCSEC GSS krb5 KUnit test fails on arm64 with h/w accelerated ciphers enabled

Scott Mayhew <[email protected]> wrote:
>
> diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
> index 0e834a2c062c..477605fad76b 100644
> --- a/arch/arm64/crypto/aes-modes.S
> +++ b/arch/arm64/crypto/aes-modes.S
> @@ -268,6 +268,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
> add x4, x0, x4
> st1 {v0.16b}, [x4] /* overlapping stores */
> st1 {v1.16b}, [x0]
> + st1 {v1.16b}, [x5]
> ret
> AES_FUNC_END(aes_cbc_cts_encrypt)
>
> But I don't know if that change is at all correct! (I've never even
> looked at arm64 asm before). If someone who's knowledgeable about this
> code could chime in, I'd appreciate it.

Ard, could you please take a look at this?

Thanks,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2023-04-28 13:00:49

by Chuck Lever III

[permalink] [raw]
Subject: Re: RPCSEC GSS krb5 KUnit test fails on arm64 with h/w accelerated ciphers enabled



> On Apr 28, 2023, at 5:57 AM, Ard Biesheuvel <[email protected]> wrote:
>
> On Fri, 28 Apr 2023 at 10:44, Herbert Xu <[email protected]> wrote:
>>
>> Scott Mayhew <[email protected]> wrote:
>>>
>>> diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
>>> index 0e834a2c062c..477605fad76b 100644
>>> --- a/arch/arm64/crypto/aes-modes.S
>>> +++ b/arch/arm64/crypto/aes-modes.S
>>> @@ -268,6 +268,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
>>> add x4, x0, x4
>>> st1 {v0.16b}, [x4] /* overlapping stores */
>>> st1 {v1.16b}, [x0]
>>> + st1 {v1.16b}, [x5]
>>> ret
>>> AES_FUNC_END(aes_cbc_cts_encrypt)
>>>
>>> But I don't know if that change is at all correct! (I've never even
>>> looked at arm64 asm before). If someone who's knowledgeable about this
>>> code could chime in, I'd appreciate it.
>>
>> Ard, could you please take a look at this?
>>
>
> The issue seems to be that the caller expects iv_out to have been
> populated even when doing ciphertext stealing.
>
> This is meaningless, because the output IV can only be used to chain
> additional encrypted data if the split is at a block boundary. The
> ciphertext stealing fundamentally terminates the encryption, and
> produces a block of ciphertext that is shorter than the block size, so
> what the output IV should be is actually unspecified.
>
> IOW, test cases having plain/ciphertext vectors whose size is not a
> multiple of the cipher block size should not specify an expected value
> for the output IV.

The test cases are extracted from RFC 3962 Appendix B. The
standard clearly expects there to be a non-zero next IV for
plaintext sizes that are not block-aligned.

Also, these test cases pass on other hardware platforms.


--
Chuck Lever


2023-04-29 00:00:38

by Eric Biggers

[permalink] [raw]
Subject: Re: RPCSEC GSS krb5 KUnit test fails on arm64 with h/w accelerated ciphers enabled

On Fri, Apr 28, 2023 at 05:48:30PM +0100, Ard Biesheuvel wrote:
> On Fri, 28 Apr 2023 at 17:18, Chuck Lever III <[email protected]> wrote:
> >
> >
> >
> > > On Apr 28, 2023, at 12:09 PM, Ard Biesheuvel <[email protected]> wrote:
> > >
> > > On Fri, 28 Apr 2023 at 13:59, Chuck Lever III <[email protected]> wrote:
> > >>
> > >>
> > >>
> > >>> On Apr 28, 2023, at 5:57 AM, Ard Biesheuvel <[email protected]> wrote:
> > >>>
> > >>> On Fri, 28 Apr 2023 at 10:44, Herbert Xu <[email protected]> wrote:
> > >>>>
> > >>>> Scott Mayhew <[email protected]> wrote:
> > >>>>>
> > >>>>> diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
> > >>>>> index 0e834a2c062c..477605fad76b 100644
> > >>>>> --- a/arch/arm64/crypto/aes-modes.S
> > >>>>> +++ b/arch/arm64/crypto/aes-modes.S
> > >>>>> @@ -268,6 +268,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
> > >>>>> add x4, x0, x4
> > >>>>> st1 {v0.16b}, [x4] /* overlapping stores */
> > >>>>> st1 {v1.16b}, [x0]
> > >>>>> + st1 {v1.16b}, [x5]
> > >>>>> ret
> > >>>>> AES_FUNC_END(aes_cbc_cts_encrypt)
> > >>>>>
> > >>>>> But I don't know if that change is at all correct! (I've never even
> > >>>>> looked at arm64 asm before). If someone who's knowledgeable about this
> > >>>>> code could chime in, I'd appreciate it.
> > >>>>
> > >>>> Ard, could you please take a look at this?
> > >>>>
> > >>>
> > >>> The issue seems to be that the caller expects iv_out to have been
> > >>> populated even when doing ciphertext stealing.
> > >>>
> > >>> This is meaningless, because the output IV can only be used to chain
> > >>> additional encrypted data if the split is at a block boundary. The
> > >>> ciphertext stealing fundamentally terminates the encryption, and
> > >>> produces a block of ciphertext that is shorter than the block size, so
> > >>> what the output IV should be is actually unspecified.
> > >>>
> > >>> IOW, test cases having plain/ciphertext vectors whose size is not a
> > >>> multiple of the cipher block size should not specify an expected value
> > >>> for the output IV.
> > >>
> > >> The test cases are extracted from RFC 3962 Appendix B. The
> > >> standard clearly expects there to be a non-zero next IV for
> > >> plaintext sizes that are not block-aligned.
> > >>
> > >
> > > OK, so this is the Kerberos V specific spec on how to use AES in CBC
> > > mode, which appears to describe how to chain multiple CBC encryptions
> > > together.
> > >
> > > CBC-CTS itself does not define this: the IV is an input only, and the
> > > reason we happen to return the IV is because a single CBC operation
> > > may be broken up into several ones, which can only be done on block
> > > boundaries. This is why CBC-CTS itself passes all its tests: a single
> > > CBC-CTS encryption only performs ciphertext stealing at the very end,
> > > and the next IV is never used in that case. (This is why the CTS mode
> > > tests in crypto/testmgr.h don't have iv_out vectors)
> > >
> > > Note that RFC3962 defines that the penultimate block of CBC-CTS
> > > ciphertext is used as the next IV. CBC returns the last ciphertext
> > > block as the output IV. It is a happy coincidence that the generic CTS
> > > template encapsulates CBC in a way where its output IV ends up in the
> > > right place.
> > >
> > >> Also, these test cases pass on other hardware platforms.
> > >>
> > >
> > > Fair enough.
> > >
> > > I am not opposed to fixing this, but given that it is the RFC3962 spec
> > > that defines that the next IV is the penultimate full block of the
> > > previous CBC-CTS ciphertext, we might consider doing the memcpy() in
> > > the Kerberos code not in the CBC-CTS implementations.
> >
> > Interesting thought. I'm all about proper layering, so I think this
> > is worth considering. Can you send an RFC patch?
> >
>
> I'm not that familiar with kunit so this is just an off hand
> suggestion, but I imagine something like the below would suffice
>
> --- a/net/sunrpc/auth_gss/gss_krb5_crypto.c
> +++ b/net/sunrpc/auth_gss/gss_krb5_crypto.c
> @@ -639,6 +639,13 @@ gss_krb5_cts_crypt(struct crypto_sync_skcipher
> *cipher, struct xdr_buf *buf,
>
> ret = write_bytes_to_xdr_buf(buf, offset, data, len);
>
> + /*
> + * CBC-CTS does not define an output IV but RFC 3962 defines it as the
> + * penultimate block of ciphertext, so copy that into the IV buffer
> + * before returning.
> + */
> + if (encrypt)
> + memcpy(iv, data, crypto_sync_skcipher_ivsize(cipher));

The whole "update the IV" thing that the crypto API does for some algorithms is
very weird and nonstandard, and most algorithms don't implement it. So I'd
definitely support handling this in the caller here.

- Eric

2023-05-01 13:13:23

by Scott Mayhew

[permalink] [raw]
Subject: Re: RPCSEC GSS krb5 KUnit test fails on arm64 with h/w accelerated ciphers enabled

On Fri, 28 Apr 2023, Ard Biesheuvel wrote:

> On Fri, 28 Apr 2023 at 17:18, Chuck Lever III <[email protected]> wrote:
> >
> >
> >
> > > On Apr 28, 2023, at 12:09 PM, Ard Biesheuvel <[email protected]> wrote:
> > >
> > > On Fri, 28 Apr 2023 at 13:59, Chuck Lever III <[email protected]> wrote:
> > >>
> > >>
> > >>
> > >>> On Apr 28, 2023, at 5:57 AM, Ard Biesheuvel <[email protected]> wrote:
> > >>>
> > >>> On Fri, 28 Apr 2023 at 10:44, Herbert Xu <[email protected]> wrote:
> > >>>>
> > >>>> Scott Mayhew <[email protected]> wrote:
> > >>>>>
> > >>>>> diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
> > >>>>> index 0e834a2c062c..477605fad76b 100644
> > >>>>> --- a/arch/arm64/crypto/aes-modes.S
> > >>>>> +++ b/arch/arm64/crypto/aes-modes.S
> > >>>>> @@ -268,6 +268,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
> > >>>>> add x4, x0, x4
> > >>>>> st1 {v0.16b}, [x4] /* overlapping stores */
> > >>>>> st1 {v1.16b}, [x0]
> > >>>>> + st1 {v1.16b}, [x5]
> > >>>>> ret
> > >>>>> AES_FUNC_END(aes_cbc_cts_encrypt)
> > >>>>>
> > >>>>> But I don't know if that change is at all correct! (I've never even
> > >>>>> looked at arm64 asm before). If someone who's knowledgeable about this
> > >>>>> code could chime in, I'd appreciate it.
> > >>>>
> > >>>> Ard, could you please take a look at this?
> > >>>>
> > >>>
> > >>> The issue seems to be that the caller expects iv_out to have been
> > >>> populated even when doing ciphertext stealing.
> > >>>
> > >>> This is meaningless, because the output IV can only be used to chain
> > >>> additional encrypted data if the split is at a block boundary. The
> > >>> ciphertext stealing fundamentally terminates the encryption, and
> > >>> produces a block of ciphertext that is shorter than the block size, so
> > >>> what the output IV should be is actually unspecified.
> > >>>
> > >>> IOW, test cases having plain/ciphertext vectors whose size is not a
> > >>> multiple of the cipher block size should not specify an expected value
> > >>> for the output IV.
> > >>
> > >> The test cases are extracted from RFC 3962 Appendix B. The
> > >> standard clearly expects there to be a non-zero next IV for
> > >> plaintext sizes that are not block-aligned.
> > >>
> > >
> > > OK, so this is the Kerberos V specific spec on how to use AES in CBC
> > > mode, which appears to describe how to chain multiple CBC encryptions
> > > together.
> > >
> > > CBC-CTS itself does not define this: the IV is an input only, and the
> > > reason we happen to return the IV is because a single CBC operation
> > > may be broken up into several ones, which can only be done on block
> > > boundaries. This is why CBC-CTS itself passes all its tests: a single
> > > CBC-CTS encryption only performs ciphertext stealing at the very end,
> > > and the next IV is never used in that case. (This is why the CTS mode
> > > tests in crypto/testmgr.h don't have iv_out vectors)
> > >
> > > Note that RFC3962 defines that the penultimate block of CBC-CTS
> > > ciphertext is used as the next IV. CBC returns the last ciphertext
> > > block as the output IV. It is a happy coincidence that the generic CTS
> > > template encapsulates CBC in a way where its output IV ends up in the
> > > right place.
> > >
> > >> Also, these test cases pass on other hardware platforms.
> > >>
> > >
> > > Fair enough.
> > >
> > > I am not opposed to fixing this, but given that it is the RFC3962 spec
> > > that defines that the next IV is the penultimate full block of the
> > > previous CBC-CTS ciphertext, we might consider doing the memcpy() in
> > > the Kerberos code not in the CBC-CTS implementations.
> >
> > Interesting thought. I'm all about proper layering, so I think this
> > is worth considering. Can you send an RFC patch?
> >
>
> I'm not that familiar with kunit so this is just an off hand
> suggestion, but I imagine something like the below would suffice
>
> --- a/net/sunrpc/auth_gss/gss_krb5_crypto.c
> +++ b/net/sunrpc/auth_gss/gss_krb5_crypto.c
> @@ -639,6 +639,13 @@ gss_krb5_cts_crypt(struct crypto_sync_skcipher
> *cipher, struct xdr_buf *buf,
>
> ret = write_bytes_to_xdr_buf(buf, offset, data, len);
>
> + /*
> + * CBC-CTS does not define an output IV but RFC 3962 defines it as the
> + * penultimate block of ciphertext, so copy that into the IV buffer
> + * before returning.
> + */
> + if (encrypt)
> + memcpy(iv, data, crypto_sync_skcipher_ivsize(cipher));
> out:
> kfree(data);
> return ret;
>
Thanks, Ard. That fixes it on aarch64 (also ran it on x86_64, ppc64le,
s390x, and aarch64 w/ 64k pages). Could you send it as an official
patch and add

Tested-by: Scott Mayhew <[email protected]>

-Scott