2023-04-28 16:14:25

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: RPCSEC GSS krb5 KUnit test fails on arm64 with h/w accelerated ciphers enabled

On Fri, 28 Apr 2023 at 13:59, Chuck Lever III <[email protected]> wrote:
>
>
>
> > On Apr 28, 2023, at 5:57 AM, Ard Biesheuvel <[email protected]> wrote:
> >
> > On Fri, 28 Apr 2023 at 10:44, Herbert Xu <[email protected]> wrote:
> >>
> >> Scott Mayhew <[email protected]> wrote:
> >>>
> >>> diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
> >>> index 0e834a2c062c..477605fad76b 100644
> >>> --- a/arch/arm64/crypto/aes-modes.S
> >>> +++ b/arch/arm64/crypto/aes-modes.S
> >>> @@ -268,6 +268,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
> >>> add x4, x0, x4
> >>> st1 {v0.16b}, [x4] /* overlapping stores */
> >>> st1 {v1.16b}, [x0]
> >>> + st1 {v1.16b}, [x5]
> >>> ret
> >>> AES_FUNC_END(aes_cbc_cts_encrypt)
> >>>
> >>> But I don't know if that change is at all correct! (I've never even
> >>> looked at arm64 asm before). If someone who's knowledgeable about this
> >>> code could chime in, I'd appreciate it.
> >>
> >> Ard, could you please take a look at this?
> >>
> >
> > The issue seems to be that the caller expects iv_out to have been
> > populated even when doing ciphertext stealing.
> >
> > This is meaningless, because the output IV can only be used to chain
> > additional encrypted data if the split is at a block boundary. The
> > ciphertext stealing fundamentally terminates the encryption, and
> > produces a block of ciphertext that is shorter than the block size, so
> > what the output IV should be is actually unspecified.
> >
> > IOW, test cases having plain/ciphertext vectors whose size is not a
> > multiple of the cipher block size should not specify an expected value
> > for the output IV.
>
> The test cases are extracted from RFC 3962 Appendix B. The
> standard clearly expects there to be a non-zero next IV for
> plaintext sizes that are not block-aligned.
>

OK, so this is the Kerberos V specific spec on how to use AES in CBC
mode, which appears to describe how to chain multiple CBC encryptions
together.

CBC-CTS itself does not define this: the IV is an input only, and the
reason we happen to return the IV is because a single CBC operation
may be broken up into several ones, which can only be done on block
boundaries. This is why CBC-CTS itself passes all its tests: a single
CBC-CTS encryption only performs ciphertext stealing at the very end,
and the next IV is never used in that case. (This is why the CTS mode
tests in crypto/testmgr.h don't have iv_out vectors)

Note that RFC3962 defines that the penultimate block of CBC-CTS
ciphertext is used as the next IV. CBC returns the last ciphertext
block as the output IV. It is a happy coincidence that the generic CTS
template encapsulates CBC in a way where its output IV ends up in the
right place.

> Also, these test cases pass on other hardware platforms.
>

Fair enough.

I am not opposed to fixing this, but given that it is the RFC3962 spec
that defines that the next IV is the penultimate full block of the
previous CBC-CTS ciphertext, we might consider doing the memcpy() in
the Kerberos code not in the CBC-CTS implementations. (The 32-bit ARM
code will be broken in the same way, and potentially other
implementations as well)


2023-04-28 16:40:31

by Chuck Lever III

[permalink] [raw]
Subject: Re: RPCSEC GSS krb5 KUnit test fails on arm64 with h/w accelerated ciphers enabled



> On Apr 28, 2023, at 12:09 PM, Ard Biesheuvel <[email protected]> wrote:
>
> On Fri, 28 Apr 2023 at 13:59, Chuck Lever III <[email protected]> wrote:
>>
>>
>>
>>> On Apr 28, 2023, at 5:57 AM, Ard Biesheuvel <[email protected]> wrote:
>>>
>>> On Fri, 28 Apr 2023 at 10:44, Herbert Xu <[email protected]> wrote:
>>>>
>>>> Scott Mayhew <[email protected]> wrote:
>>>>>
>>>>> diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
>>>>> index 0e834a2c062c..477605fad76b 100644
>>>>> --- a/arch/arm64/crypto/aes-modes.S
>>>>> +++ b/arch/arm64/crypto/aes-modes.S
>>>>> @@ -268,6 +268,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
>>>>> add x4, x0, x4
>>>>> st1 {v0.16b}, [x4] /* overlapping stores */
>>>>> st1 {v1.16b}, [x0]
>>>>> + st1 {v1.16b}, [x5]
>>>>> ret
>>>>> AES_FUNC_END(aes_cbc_cts_encrypt)
>>>>>
>>>>> But I don't know if that change is at all correct! (I've never even
>>>>> looked at arm64 asm before). If someone who's knowledgeable about this
>>>>> code could chime in, I'd appreciate it.
>>>>
>>>> Ard, could you please take a look at this?
>>>>
>>>
>>> The issue seems to be that the caller expects iv_out to have been
>>> populated even when doing ciphertext stealing.
>>>
>>> This is meaningless, because the output IV can only be used to chain
>>> additional encrypted data if the split is at a block boundary. The
>>> ciphertext stealing fundamentally terminates the encryption, and
>>> produces a block of ciphertext that is shorter than the block size, so
>>> what the output IV should be is actually unspecified.
>>>
>>> IOW, test cases having plain/ciphertext vectors whose size is not a
>>> multiple of the cipher block size should not specify an expected value
>>> for the output IV.
>>
>> The test cases are extracted from RFC 3962 Appendix B. The
>> standard clearly expects there to be a non-zero next IV for
>> plaintext sizes that are not block-aligned.
>>
>
> OK, so this is the Kerberos V specific spec on how to use AES in CBC
> mode, which appears to describe how to chain multiple CBC encryptions
> together.
>
> CBC-CTS itself does not define this: the IV is an input only, and the
> reason we happen to return the IV is because a single CBC operation
> may be broken up into several ones, which can only be done on block
> boundaries. This is why CBC-CTS itself passes all its tests: a single
> CBC-CTS encryption only performs ciphertext stealing at the very end,
> and the next IV is never used in that case. (This is why the CTS mode
> tests in crypto/testmgr.h don't have iv_out vectors)
>
> Note that RFC3962 defines that the penultimate block of CBC-CTS
> ciphertext is used as the next IV. CBC returns the last ciphertext
> block as the output IV. It is a happy coincidence that the generic CTS
> template encapsulates CBC in a way where its output IV ends up in the
> right place.
>
>> Also, these test cases pass on other hardware platforms.
>>
>
> Fair enough.
>
> I am not opposed to fixing this, but given that it is the RFC3962 spec
> that defines that the next IV is the penultimate full block of the
> previous CBC-CTS ciphertext, we might consider doing the memcpy() in
> the Kerberos code not in the CBC-CTS implementations.

Interesting thought. I'm all about proper layering, so I think this
is worth considering. Can you send an RFC patch?

David, any opinion about this for crypto/krb5?


> (The 32-bit ARM
> code will be broken in the same way, and potentially other
> implementations as well)


--
Chuck Lever


2023-04-28 16:53:41

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: RPCSEC GSS krb5 KUnit test fails on arm64 with h/w accelerated ciphers enabled

On Fri, 28 Apr 2023 at 17:18, Chuck Lever III <[email protected]> wrote:
>
>
>
> > On Apr 28, 2023, at 12:09 PM, Ard Biesheuvel <[email protected]> wrote:
> >
> > On Fri, 28 Apr 2023 at 13:59, Chuck Lever III <[email protected]> wrote:
> >>
> >>
> >>
> >>> On Apr 28, 2023, at 5:57 AM, Ard Biesheuvel <[email protected]> wrote:
> >>>
> >>> On Fri, 28 Apr 2023 at 10:44, Herbert Xu <[email protected]> wrote:
> >>>>
> >>>> Scott Mayhew <[email protected]> wrote:
> >>>>>
> >>>>> diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
> >>>>> index 0e834a2c062c..477605fad76b 100644
> >>>>> --- a/arch/arm64/crypto/aes-modes.S
> >>>>> +++ b/arch/arm64/crypto/aes-modes.S
> >>>>> @@ -268,6 +268,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
> >>>>> add x4, x0, x4
> >>>>> st1 {v0.16b}, [x4] /* overlapping stores */
> >>>>> st1 {v1.16b}, [x0]
> >>>>> + st1 {v1.16b}, [x5]
> >>>>> ret
> >>>>> AES_FUNC_END(aes_cbc_cts_encrypt)
> >>>>>
> >>>>> But I don't know if that change is at all correct! (I've never even
> >>>>> looked at arm64 asm before). If someone who's knowledgeable about this
> >>>>> code could chime in, I'd appreciate it.
> >>>>
> >>>> Ard, could you please take a look at this?
> >>>>
> >>>
> >>> The issue seems to be that the caller expects iv_out to have been
> >>> populated even when doing ciphertext stealing.
> >>>
> >>> This is meaningless, because the output IV can only be used to chain
> >>> additional encrypted data if the split is at a block boundary. The
> >>> ciphertext stealing fundamentally terminates the encryption, and
> >>> produces a block of ciphertext that is shorter than the block size, so
> >>> what the output IV should be is actually unspecified.
> >>>
> >>> IOW, test cases having plain/ciphertext vectors whose size is not a
> >>> multiple of the cipher block size should not specify an expected value
> >>> for the output IV.
> >>
> >> The test cases are extracted from RFC 3962 Appendix B. The
> >> standard clearly expects there to be a non-zero next IV for
> >> plaintext sizes that are not block-aligned.
> >>
> >
> > OK, so this is the Kerberos V specific spec on how to use AES in CBC
> > mode, which appears to describe how to chain multiple CBC encryptions
> > together.
> >
> > CBC-CTS itself does not define this: the IV is an input only, and the
> > reason we happen to return the IV is because a single CBC operation
> > may be broken up into several ones, which can only be done on block
> > boundaries. This is why CBC-CTS itself passes all its tests: a single
> > CBC-CTS encryption only performs ciphertext stealing at the very end,
> > and the next IV is never used in that case. (This is why the CTS mode
> > tests in crypto/testmgr.h don't have iv_out vectors)
> >
> > Note that RFC3962 defines that the penultimate block of CBC-CTS
> > ciphertext is used as the next IV. CBC returns the last ciphertext
> > block as the output IV. It is a happy coincidence that the generic CTS
> > template encapsulates CBC in a way where its output IV ends up in the
> > right place.
> >
> >> Also, these test cases pass on other hardware platforms.
> >>
> >
> > Fair enough.
> >
> > I am not opposed to fixing this, but given that it is the RFC3962 spec
> > that defines that the next IV is the penultimate full block of the
> > previous CBC-CTS ciphertext, we might consider doing the memcpy() in
> > the Kerberos code not in the CBC-CTS implementations.
>
> Interesting thought. I'm all about proper layering, so I think this
> is worth considering. Can you send an RFC patch?
>

I'm not that familiar with kunit so this is just an off hand
suggestion, but I imagine something like the below would suffice

--- a/net/sunrpc/auth_gss/gss_krb5_crypto.c
+++ b/net/sunrpc/auth_gss/gss_krb5_crypto.c
@@ -639,6 +639,13 @@ gss_krb5_cts_crypt(struct crypto_sync_skcipher
*cipher, struct xdr_buf *buf,

ret = write_bytes_to_xdr_buf(buf, offset, data, len);

+ /*
+ * CBC-CTS does not define an output IV but RFC 3962 defines it as the
+ * penultimate block of ciphertext, so copy that into the IV buffer
+ * before returning.
+ */
+ if (encrypt)
+ memcpy(iv, data, crypto_sync_skcipher_ivsize(cipher));
out:
kfree(data);
return ret;