2020-05-19 15:34:31

by Stephan Müller

Subject: ARM CE: CTS IV handling

Hi Ard,

The following report applies to kernel 5.3 as I am currently unable to test
the latest upstream version.

The CTS IV handling for cts-cbc-aes-ce and cts-cbc-aes-neon is not consistent
with the C implementation for CTS such as cts(cbc-aes-ce) and
cts(cbc-aes-neon).

For example, assume an encryption operation with the following data:

iv "6CDD928D19C56A2255D1EC16CAA2CCCB"
pt "2D6BFE335F45EED1C3C404CAA5CA4D41FF2B8C6DE94C706B10F1D207972DE6599C92E117E3CBF61F"
key "930E9D4E65DB121E05F11A16E408AE82"

When you perform one encryption operation, all 4 ciphers return:

022edfa38975b09b295e1958efde2104be1e8e70c81340adfbdf431d5c80e77b89df5997aa96af72

Now, when you leave the TFM untouched (i.e. retain the IV state) and simply
set the following new pt:

6cdd928d19c56a2255d1ec16caa2cccb022edfa38975b09b295e1958efde2104be1e8e70c81340ad

the C CTS implementations return

35d54eb425afe7438c5e96b61b061f04df85a322942210568c20a5e78856c79c0af021f3e0650863

But the cts-cbc-aes-ce and cts-cbc-aes-neon return

a62f57efbe9d815aaf1b6c62f78a31da8ef46e5d401eaf48c261bcf889e6910abbc65c2bf26add9f


My hunch is that the internal IV handling is different. I am aware that CTS
does not exactly specify what the IV should look like after the encryption
operation, but using the NIST reference implementation of ACVP, the C CTS
implementation is considered to be OK whereas the ARM CE assembler
implementation is considered to be not OK.

Bottom line, feeding plaintext in chunks into the ARM CE assembler
implementation will yield a different output than the C implementation.

Ciao
Stephan




2020-05-19 16:23:52

by Ard Biesheuvel

Subject: Re: ARM CE: CTS IV handling

(+ Eric)

Hi Stephan,

On Tue, 19 May 2020 at 17:31, Stephan Mueller wrote:
>
> Hi Ard,
>
> The following report applies to kernel 5.3 as I am currently unable to test
> the latest upstream version.
>
> The CTS IV handling for cts-cbc-aes-ce and cts-cbc-aes-neon is not consistent
> with the C implementation for CTS such as cts(cbc-aes-ce) and
> cts(cbc-aes-neon).
>
> For example, assume an encryption operation with the following data:
>
> iv "6CDD928D19C56A2255D1EC16CAA2CCCB"
> pt "2D6BFE335F45EED1C3C404CAA5CA4D41FF2B8C6DE94C706B10F1D207972DE6599C92E117E3CBF61F"
> key "930E9D4E65DB121E05F11A16E408AE82"
>
> When you perform one encryption operation, all 4 ciphers return:
>
> 022edfa38975b09b295e1958efde2104be1e8e70c81340adfbdf431d5c80e77b89df5997aa96af72
>
> Now, when you leave the TFM untouched (i.e. retain the IV state) and simply
> set the following new pt:
>
> 6cdd928d19c56a2255d1ec16caa2cccb022edfa38975b09b295e1958efde2104be1e8e70c81340ad
>
> the C CTS implementations return
>
> 35d54eb425afe7438c5e96b61b061f04df85a322942210568c20a5e78856c79c0af021f3e0650863
>
> But the cts-cbc-aes-ce and cts-cbc-aes-neon return
>
> a62f57efbe9d815aaf1b6c62f78a31da8ef46e5d401eaf48c261bcf889e6910abbc65c2bf26add9f
>
>
> My hunch is that the internal IV handling is different. I am aware that CTS
> does not exactly specify what the IV should look like after the encryption
> operation, but using the NIST reference implementation of ACVP, the C CTS
> implementation is considered to be OK whereas the ARM CE assembler
> implementation is considered to be not OK.
>
> Bottom line, feeding plaintext in chunks into the ARM CE assembler
> implementation will yield a different output than the C implementation.
>

To be honest, this looks like the API is being used incorrectly. Is
this a similar issue to the one Herbert spotted recently with the CTR
code?

When you say 'leaving the TFM untouched' do you mean the skcipher
request? The TFM should not retain any per-request state in the first
place.

The skcipher request struct is not meant to retain any state either -
the API simply does not support incremental encryption if the input is
not a multiple of the chunksize.

Could you give some sample code on how you are using the API in this case?

2020-05-19 17:36:00

by Stephan Müller

Subject: Re: ARM CE: CTS IV handling

On Tuesday, 19 May 2020, 18:21:01 CEST, Ard Biesheuvel wrote:

Hi Ard,

>
> To be honest, this looks like the API is being used incorrectly. Is
> this a similar issue to the one Herbert spotted recently with the CTR
> code?
>
> When you say 'leaving the TFM untouched' do you mean the skcipher
> request? The TFM should not retain any per-request state in the first
> place.
>
> The skcipher request struct is not meant to retain any state either -
> the API simply does not support incremental encryption if the input is
> not a multiple of the chunksize.
>
> Could you give some sample code on how you are using the API in this case?

Technically, I allocate a new TFM and request at the beginning and then reuse
both. In that sense, I think I violate that constraint.

To implement such repetition I can certainly clear or allocate a new TFM, but
to get that right I need the resulting IV after the cipher operation.

The IV that I get after the cipher operation completes differs between C and
CE.

Ciao
Stephan


2020-05-19 17:52:01

by Ard Biesheuvel

Subject: Re: ARM CE: CTS IV handling

On Tue, 19 May 2020 at 19:35, Stephan Mueller wrote:
>
> On Tuesday, 19 May 2020, 18:21:01 CEST, Ard Biesheuvel wrote:
>
> Hi Ard,
>
> >
> > To be honest, this looks like the API is being used incorrectly. Is
> > this a similar issue to the one Herbert spotted recently with the CTR
> > code?
> >
> > When you say 'leaving the TFM untouched' do you mean the skcipher
> > request? The TFM should not retain any per-request state in the first
> > place.
> >
> > The skcipher request struct is not meant to retain any state either -
> > the API simply does not support incremental encryption if the input is
> > not a multiple of the chunksize.
> >
> > Could you give some sample code on how you are using the API in this case?
>
> What I am doing technically is to allocate a new tfm and request at the
> beginning and then reuse the TFM and request. In that sense, I think I violate
> that constraint.
>
> But in order to implement such repetition, I can surely clear / allocate a new
> TFM. But in order to get that right, I need the resulting IV after the cipher
> operation.
>
> This IV that I get after the cipher operation completes is different between C
> and CE.
>

So is the expected output IV simply the last block of ciphertext that
was generated (as usual), but located before the truncated block in
the output?
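
To make the question concrete, here is a minimal userspace sketch (not kernel
code) of where that block would sit in the output. It assumes the layout in
which the final two ciphertext blocks are swapped and the last one is
truncated, and that a single-block input is handled as plain CBC. The helper
name cts_iv_offset is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define AES_BLOCK_SIZE 16

/*
 * Hypothetical helper: byte offset, within a CTS ciphertext of 'len' bytes,
 * of the last full ciphertext block that was generated. Assumes the final
 * two blocks are swapped in the output, so that block sits directly before
 * the (possibly truncated) final block.
 */
static size_t cts_iv_offset(size_t len)
{
	size_t nblocks;

	/* CTS needs at least one full block of input. */
	assert(len >= AES_BLOCK_SIZE);

	/* Assumption: one block is plain CBC, so the block itself is the IV. */
	if (len == AES_BLOCK_SIZE)
		return 0;

	/* Number of blocks, counting a partial final block as one. */
	nblocks = (len + AES_BLOCK_SIZE - 1) / AES_BLOCK_SIZE;

	/* Second-to-last slot of the output holds the last generated block. */
	return (nblocks - 2) * AES_BLOCK_SIZE;
}
```

For the 40-byte example in this thread, cts_iv_offset(40) is 16, i.e. the
candidate output IV would be bytes 16..31 of the ciphertext.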

2020-05-19 17:54:58

by Ard Biesheuvel

Subject: Re: ARM CE: CTS IV handling

On Tue, 19 May 2020 at 19:50, Ard Biesheuvel wrote:
>
> On Tue, 19 May 2020 at 19:35, Stephan Mueller wrote:
> >
> > On Tuesday, 19 May 2020, 18:21:01 CEST, Ard Biesheuvel wrote:
> >
> > Hi Ard,
> >
> > >
> > > To be honest, this looks like the API is being used incorrectly. Is
> > > this a similar issue to the one Herbert spotted recently with the CTR
> > > code?
> > >
> > > When you say 'leaving the TFM untouched' do you mean the skcipher
> > > request? The TFM should not retain any per-request state in the first
> > > place.
> > >
> > > The skcipher request struct is not meant to retain any state either -
> > > the API simply does not support incremental encryption if the input is
> > > not a multiple of the chunksize.
> > >
> > > Could you give some sample code on how you are using the API in this case?
> >
> > What I am doing technically is to allocate a new tfm and request at the
> > beginning and then reuse the TFM and request. In that sense, I think I violate
> > that constraint.
> >
> > But in order to implement such repetition, I can surely clear / allocate a new
> > TFM. But in order to get that right, I need the resulting IV after the cipher
> > operation.
> >
> > This IV that I get after the cipher operation completes is different between C
> > and CE.
> >
>
> So is the expected output IV simply the last block of ciphertext that
> was generated (as usual), but located before the truncated block in
> the output?

If so, does the below fix the encrypt case?

index cf618d8f6cec..22f190a44689 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -275,6 +275,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
 	add		x4, x0, x4
 	st1		{v0.16b}, [x4]		/* overlapping stores */
 	st1		{v1.16b}, [x0]
+	st1		{v1.16b}, [x5]
 	ret
 AES_FUNC_END(aes_cbc_cts_encrypt)
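
Read as C, the tail of the patched function would behave roughly like the
sketch below. This is a userspace model only, not the kernel code. The name
cts_store_tail and its parameters are illustrative, and it assumes that x4
holds the length of the truncated block at this point, that v0 carries the
truncated block in its trailing bytes, that v1 is the last full ciphertext
block, and that x5 points at the IV buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AES_BLOCK_SIZE 16

/*
 * Hypothetical C model of the three stores at the end of
 * aes_cbc_cts_encrypt with the patch applied.
 *
 * out  - destination for the final 16 + tail bytes of ciphertext
 * tail - length of the truncated block, 1..16 (assumed to be in x4)
 * v0   - register whose last 'tail' bytes hold the truncated block
 * v1   - last full ciphertext block that was generated
 * iv   - IV buffer handed in by the caller (assumed to be in x5)
 */
static void cts_store_tail(unsigned char *out, size_t tail,
			   const unsigned char v0[AES_BLOCK_SIZE],
			   const unsigned char v1[AES_BLOCK_SIZE],
			   unsigned char iv[AES_BLOCK_SIZE])
{
	assert(tail >= 1 && tail <= AES_BLOCK_SIZE);

	/* st1 {v0.16b}, [x4]: lands at out + tail, overlapping the next store */
	memcpy(out + tail, v0, AES_BLOCK_SIZE);

	/* st1 {v1.16b}, [x0]: overwrites the leading bytes of the v0 store */
	memcpy(out, v1, AES_BLOCK_SIZE);

	/* st1 {v1.16b}, [x5]: the added line, returns v1 as the output IV */
	memcpy(iv, v1, AES_BLOCK_SIZE);
}
```

After the call, out holds v1 followed by the last 'tail' bytes of v0, and iv
holds a copy of v1.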

2020-05-19 18:12:45

by Stephan Müller

Subject: Re: ARM CE: CTS IV handling

On Tuesday, 19 May 2020, 19:53:57 CEST, Ard Biesheuvel wrote:

Hi Ard,

> On Tue, 19 May 2020 at 19:50, Ard Biesheuvel wrote:
> > On Tue, 19 May 2020 at 19:35, Stephan Mueller wrote:
> > > On Tuesday, 19 May 2020, 18:21:01 CEST, Ard Biesheuvel wrote:
> > >
> > > Hi Ard,
> > >
> > > > To be honest, this looks like the API is being used incorrectly. Is
> > > > this a similar issue to the one Herbert spotted recently with the CTR
> > > > code?
> > > >
> > > > When you say 'leaving the TFM untouched' do you mean the skcipher
> > > > request? The TFM should not retain any per-request state in the first
> > > > place.
> > > >
> > > > The skcipher request struct is not meant to retain any state either -
> > > > the API simply does not support incremental encryption if the input is
> > > > not a multiple of the chunksize.
> > > >
> > > > Could you give some sample code on how you are using the API in this
> > > > case?
> > >
> > > What I am doing technically is to allocate a new tfm and request at the
> > > beginning and then reuse the TFM and request. In that sense, I think I
> > > violate that constraint.
> > >
> > > But in order to implement such repetition, I can surely clear / allocate
> > > a new TFM. But in order to get that right, I need the resulting IV
> > > after the cipher operation.
> > >
> > > This IV that I get after the cipher operation completes is different
> > > between C and CE.
> >
> > So is the expected output IV simply the last block of ciphertext that
> > was generated (as usual), but located before the truncated block in
> > the output?
>
> If so, does the below fix the encrypt case?

I think it does.

But allow me to take that patch to my test system for verification.
>
> index cf618d8f6cec..22f190a44689 100644
> --- a/arch/arm64/crypto/aes-modes.S
> +++ b/arch/arm64/crypto/aes-modes.S
> @@ -275,6 +275,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
>  	add		x4, x0, x4
>  	st1		{v0.16b}, [x4]		/* overlapping stores */
>  	st1		{v1.16b}, [x0]
> +	st1		{v1.16b}, [x5]
>  	ret
>  AES_FUNC_END(aes_cbc_cts_encrypt)


Ciao
Stephan