2022-07-01 13:36:23

by Corentin Labbe

Subject: [RFC PATCH] crypto: flush poison data

On my Allwinner D1 Nezha, the sun8i-ce driver fails its self-tests with:
alg: skcipher: cbc-des3-sun8i-ce encryption overran dst buffer on test vector 0

In fact, the buffer is overrun not by the device but by the dma_map_single() operation.

To prevent any corruption of the poisoned data, simply flush it before
giving the buffer to the tested driver.

Signed-off-by: Corentin Labbe <[email protected]>
---

Hello

I am sending this patch as an RFC, since this behaviour happens only with not-yet-merged RISC-V code.
(Mostly riscv: implement Zicbom-based CMO instructions + the t-head variant)

Regards

crypto/testmgr.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index c59bd9e07978..187163e2e593 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -19,6 +19,7 @@
#include <crypto/aead.h>
#include <crypto/hash.h>
#include <crypto/skcipher.h>
+#include <linux/cacheflush.h>
#include <linux/err.h>
#include <linux/fips.h>
#include <linux/module.h>
@@ -205,6 +206,8 @@ static void testmgr_free_buf(char *buf[XBUFSIZE])
static inline void testmgr_poison(void *addr, size_t len)
{
memset(addr, TESTMGR_POISON_BYTE, len);
+ /* Be sure data is written to prevent corruption from some DMA sync */
+ flush_icache_range((unsigned long)addr, (unsigned long)addr + len);
}

/* Is the memory region still fully poisoned? */
--
2.35.1


2022-07-01 14:18:21

by Ben Dooks

Subject: Re: [RFC PATCH] crypto: flush poison data

On 01/07/2022 14:27, Corentin Labbe wrote:
> On my Allwinner D1 nezha, the sun8i-ce fail self-tests due to:
> alg: skcipher: cbc-des3-sun8i-ce encryption overran dst buffer on test vector 0
>
> In fact the buffer is not overran by device but by the dma_map_single() operation.
>
> To prevent any corruption of the poisoned data, simply flush them before
> giving the buffer to the tested driver.
>
> Signed-off-by: Corentin Labbe <[email protected]>
> ---
>
> Hello
>
> I put this patch as RFC, since this behavour happen only on non yet merged RISCV code.
> (Mostly riscv: implement Zicbom-based CMO instructions + the t-head variant)
>
> Regards
>
> crypto/testmgr.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> index c59bd9e07978..187163e2e593 100644
> --- a/crypto/testmgr.c
> +++ b/crypto/testmgr.c
> @@ -19,6 +19,7 @@
> #include <crypto/aead.h>
> #include <crypto/hash.h>
> #include <crypto/skcipher.h>
> +#include <linux/cacheflush.h>
> #include <linux/err.h>
> #include <linux/fips.h>
> #include <linux/module.h>
> @@ -205,6 +206,8 @@ static void testmgr_free_buf(char *buf[XBUFSIZE])
> static inline void testmgr_poison(void *addr, size_t len)
> {
> memset(addr, TESTMGR_POISON_BYTE, len);
> + /* Be sure data is written to prevent corruption from some DMA sync */
> + flush_icache_range((unsigned long)addr, (unsigned long)addr + len);
> }
>
> /* Is the memory region still fully poisoned? */

Why are you flushing the instruction cache and not the data cache?

--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius

https://www.codethink.co.uk/privacy.html

2022-07-01 14:46:57

by Andre Przywara

Subject: Re: [RFC PATCH] crypto: flush poison data

On Fri, 1 Jul 2022 13:27:35 +0000
Corentin Labbe <[email protected]> wrote:

Hi,

> On my Allwinner D1 nezha, the sun8i-ce fail self-tests due to:
> alg: skcipher: cbc-des3-sun8i-ce encryption overran dst buffer on test vector 0
>
> In fact the buffer is not overran by device but by the dma_map_single() operation.
>
> To prevent any corruption of the poisoned data, simply flush them before
> giving the buffer to the tested driver.
>
> Signed-off-by: Corentin Labbe <[email protected]>
> ---
>
> Hello
>
> I put this patch as RFC, since this behavour happen only on non yet merged RISCV code.
> (Mostly riscv: implement Zicbom-based CMO instructions + the t-head variant)
>
> Regards
>
> crypto/testmgr.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> index c59bd9e07978..187163e2e593 100644
> --- a/crypto/testmgr.c
> +++ b/crypto/testmgr.c
> @@ -19,6 +19,7 @@
> #include <crypto/aead.h>
> #include <crypto/hash.h>
> #include <crypto/skcipher.h>
> +#include <linux/cacheflush.h>
> #include <linux/err.h>
> #include <linux/fips.h>
> #include <linux/module.h>
> @@ -205,6 +206,8 @@ static void testmgr_free_buf(char *buf[XBUFSIZE])
> static inline void testmgr_poison(void *addr, size_t len)
> {
> memset(addr, TESTMGR_POISON_BYTE, len);
> + /* Be sure data is written to prevent corruption from some DMA sync */
> + flush_icache_range((unsigned long)addr, (unsigned long)addr + len);

As Ben already mentioned, this seems to have nothing to do with the
I-cache. I guess you picked that function because it does the required
cache cleaning and doesn't require a vma parameter?

But more importantly: I think drivers shouldn't do explicit cache
maintenance, this is what the DMA API is for.
So if you get DMA corruption, then this points to some flaw in the DMA API
usage: either the buffer belongs to the CPU, then the device must not write
to it. Or the buffer belongs to the device, then the CPU cannot expect to
write to that without that data potentially getting corrupted.

So can you check if that's the case?

Cheers,
Andre

> }
>
> /* Is the memory region still fully poisoned? */

2022-07-01 15:04:53

by Corentin Labbe

Subject: Re: [RFC PATCH] crypto: flush poison data

On Fri, Jul 01, 2022 at 03:36:14PM +0100, Andre Przywara wrote:
> On Fri, 1 Jul 2022 13:27:35 +0000
> Corentin Labbe <[email protected]> wrote:
>
> Hi,
>
> > On my Allwinner D1 nezha, the sun8i-ce fail self-tests due to:
> > alg: skcipher: cbc-des3-sun8i-ce encryption overran dst buffer on test vector 0
> >
> > In fact the buffer is not overran by device but by the dma_map_single() operation.
> >
> > To prevent any corruption of the poisoned data, simply flush them before
> > giving the buffer to the tested driver.
> >
> > Signed-off-by: Corentin Labbe <[email protected]>
> > ---
> >
> > Hello
> >
> > I put this patch as RFC, since this behavour happen only on non yet merged RISCV code.
> > (Mostly riscv: implement Zicbom-based CMO instructions + the t-head variant)
> >
> > Regards
> >
> > crypto/testmgr.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> > index c59bd9e07978..187163e2e593 100644
> > --- a/crypto/testmgr.c
> > +++ b/crypto/testmgr.c
> > @@ -19,6 +19,7 @@
> > #include <crypto/aead.h>
> > #include <crypto/hash.h>
> > #include <crypto/skcipher.h>
> > +#include <linux/cacheflush.h>
> > #include <linux/err.h>
> > #include <linux/fips.h>
> > #include <linux/module.h>
> > @@ -205,6 +206,8 @@ static void testmgr_free_buf(char *buf[XBUFSIZE])
> > static inline void testmgr_poison(void *addr, size_t len)
> > {
> > memset(addr, TESTMGR_POISON_BYTE, len);
> > + /* Be sure data is written to prevent corruption from some DMA sync */
> > + flush_icache_range((unsigned long)addr, (unsigned long)addr + len);
>
> As Ben already mentioned, this looks like having nothing to do with the I
> cache. I guess you picked that because it does the required cache cleaning
> and doesn't require a vma parameter?

The reality is simpler: I just copied what drivers/crypto/xilinx/zynqmp-sha.c does.

>
> But more importantly: I think drivers shouldn't do explicit cache
> maintenance, this is what the DMA API is for.
> So if you get DMA corruption, then this points to some flaw in the DMA API
> usage: either the buffer belongs to the CPU, then the device must not write
> to it. Or the buffer belongs to the device, then the CPU cannot expect to
> write to that without that data potentially getting corrupted.

The device does nothing wrong: I removed all sun8i-ce device actions (and kept the DMA API calls) and the whole buffer is still corrupted.
Anyway, if the driver were doing something wrong, it should have failed on arm or arm64 too.

See my previous report https://lore.kernel.org/lkml/YllWTN+15CoskNBt@Red/ which shows the problem (the invalidated size is bigger than the dma_sync length parameter).
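
As a rough illustration of why the invalidated size can exceed the requested length: cache maintenance works on whole cache lines, so the range is widened to line boundaries. The sketch below is plain userspace C, not kernel code. The 64-byte CACHE_LINE value and the cache_span() helper are assumptions made for the example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache line size; the real value depends on the CPU. */
#define CACHE_LINE 64u

struct span {
	uintptr_t start;	/* first byte actually touched */
	size_t len;		/* number of bytes actually touched */
};

/* Widen [addr, addr + len) outward to whole cache lines. */
static struct span cache_span(uintptr_t addr, size_t len)
{
	uintptr_t mask = ~(uintptr_t)(CACHE_LINE - 1);
	uintptr_t first = addr & mask;
	uintptr_t last = (addr + len + CACHE_LINE - 1) & mask;
	struct span s = { first, (size_t)(last - first) };

	return s;
}
```

Under these assumptions, a 16-byte sync starting at 0x1010 covers 0x1000 to 0x103f, so 48 bytes outside the requested range are affected as well.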

2022-07-05 08:28:23

by Corentin Labbe

Subject: Re: [RFC PATCH] crypto: flush poison data

On Fri, Jul 01, 2022 at 02:35:41PM +0100, Ben Dooks wrote:
> On 01/07/2022 14:27, Corentin Labbe wrote:
> > On my Allwinner D1 nezha, the sun8i-ce fail self-tests due to:
> > alg: skcipher: cbc-des3-sun8i-ce encryption overran dst buffer on test vector 0
> >
> > In fact the buffer is not overran by device but by the dma_map_single() operation.
> >
> > To prevent any corruption of the poisoned data, simply flush them before
> > giving the buffer to the tested driver.
> >
> > Signed-off-by: Corentin Labbe <[email protected]>
> > ---
> >
> > Hello
> >
> > I put this patch as RFC, since this behavour happen only on non yet merged RISCV code.
> > (Mostly riscv: implement Zicbom-based CMO instructions + the t-head variant)
> >
> > Regards
> >
> > crypto/testmgr.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> > index c59bd9e07978..187163e2e593 100644
> > --- a/crypto/testmgr.c
> > +++ b/crypto/testmgr.c
> > @@ -19,6 +19,7 @@
> > #include <crypto/aead.h>
> > #include <crypto/hash.h>
> > #include <crypto/skcipher.h>
> > +#include <linux/cacheflush.h>
> > #include <linux/err.h>
> > #include <linux/fips.h>
> > #include <linux/module.h>
> > @@ -205,6 +206,8 @@ static void testmgr_free_buf(char *buf[XBUFSIZE])
> > static inline void testmgr_poison(void *addr, size_t len)
> > {
> > memset(addr, TESTMGR_POISON_BYTE, len);
> > + /* Be sure data is written to prevent corruption from some DMA sync */
> > + flush_icache_range((unsigned long)addr, (unsigned long)addr + len);
> > }
> >
> > /* Is the memory region still fully poisoned? */
>
> why are you flushing the instruction cache and not the data-cache?
>

I just copied what drivers/crypto/xilinx/zynqmp-sha.c does.
I tried to use flush_dcache_range(), but it does not seem to be implemented on RISC-V.
And flush_dcache_page(virt_to_page(addr), len) produces a kernel panic.

Any advice on how to go further?

2022-07-05 16:43:12

by Christoph Hellwig

Subject: Re: [RFC PATCH] crypto: flush poison data

On Tue, Jul 05, 2022 at 10:21:13AM +0200, LABBE Corentin wrote:
>
> I just copied what did drivers/crypto/xilinx/zynqmp-sha.c.
> I tried to do flush_dcache_range() but it seems to not be implemented on riscV.

That driver is broken and should not have been merged in that form.

> And flush_dcache_page(virt_to_page(addr), len) produce a kernel panic.

And that's a good thing. Drivers have no business doing their own cache
flushing. That is the job of the dma-mapping implementation, so I'd
suggest to look for problems there.

2022-07-05 18:09:30

by Corentin Labbe

Subject: Re: [RFC PATCH] crypto: flush poison data

On Tue, Jul 05, 2022 at 06:42:13PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 05, 2022 at 10:21:13AM +0200, LABBE Corentin wrote:
> >
> > I just copied what did drivers/crypto/xilinx/zynqmp-sha.c.
> > I tried to do flush_dcache_range() but it seems to not be implemented on riscV.
>
> That driver is broken and should no have been merged in that form.
>
> > And flush_dcache_page(virt_to_page(addr), len) produce a kernel panic.
>
> And that's good so. Drivers have no business doing their own cache
> flushing. That is the job of the dma-mapping implementation, so I'd
> suggest to look for problems there.

I am sorry, but this code is not in a driver; it is in the crypto API code.

It seems that I didn't explain the problem well.

The crypto API runs a number of crypto operations against every driver that registers crypto algorithms.
For each buffer given to the tested driver, the crypto API sets up a poison buffer contiguous to that buffer.
The goal is to detect whether the driver does anything bad outside the buffer it was given.

So the tested driver does not know that this poison buffer exists and therefore cannot handle it.

My problem is that a dma_sync on the data buffer corrupts the poison buffer as collateral damage,
probably because the sync operates on a larger region than the requested dma_sync length.
So I tried to flush the poison data in the crypto API.

Any hint on how to do it properly is welcome.
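
For readers unfamiliar with the poison check described above, here is a minimal userspace sketch of the idea. It is not the testmgr code. POISON_BYTE, POISON_LEN, good_op(), bad_op() and run_with_guard() are hypothetical names and values chosen for illustration. Only the general shape (fill a guard region, run the operation, verify the guard) follows the description in this mail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_MAX     256		/* hypothetical maximum data length */
#define POISON_BYTE 0xfe	/* hypothetical stand-in for TESTMGR_POISON_BYTE */
#define POISON_LEN  16		/* hypothetical guard size */

/* Fill the guard region with the poison pattern. */
static void poison(void *addr, size_t len)
{
	memset(addr, POISON_BYTE, len);
}

/* Return true if every byte still holds the poison pattern. */
static bool is_poison(const void *addr, size_t len)
{
	const unsigned char *p = addr;

	for (size_t i = 0; i < len; i++)
		if (p[i] != POISON_BYTE)
			return false;
	return true;
}

/* Well-behaved operation: writes exactly len bytes. */
static void good_op(unsigned char *dst, size_t len)
{
	memset(dst, 0xaa, len);
}

/* Buggy operation: writes one byte past the end of its buffer. */
static void bad_op(unsigned char *dst, size_t len)
{
	memset(dst, 0xaa, len + 1);
}

/* Run op on a len-byte buffer followed by a guard; true if the guard survived. */
static bool run_with_guard(void (*op)(unsigned char *, size_t), size_t len)
{
	unsigned char buf[BUF_MAX + POISON_LEN];

	if (len > BUF_MAX)
		return false;
	poison(buf + len, POISON_LEN);
	op(buf, len);
	return is_poison(buf + len, POISON_LEN);
}
```

In this sketch run_with_guard(good_op, 24) reports an intact guard, while run_with_guard(bad_op, 24) reports a damaged one. The failure discussed in this thread looks like the second case to the test code, even though the tested operation itself stayed within its buffer.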

2022-07-05 18:09:33

by Christoph Hellwig

Subject: Re: [RFC PATCH] crypto: flush poison data

On Tue, Jul 05, 2022 at 07:56:11PM +0200, LABBE Corentin wrote:
> My problem is that a dma_sync on the data buffer corrupt the poison buffer as collateral dommage.
> Probably because the sync operate on a larger region than the requested dma_sync length.
> So I try to flush poison data in the cryptoAPI.

Data structures that are DMAed to must be aligned to
the value returned by dma_get_cache_alignment(), as non-coherent DMA
by definition can disturb the data inside that boundary. That is not
a bug but fundamentally part of how DMA works when the device attachment
is not cache coherent.
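
The effect described here can be shown with a toy model. This is a userspace sketch under strong simplifying assumptions: a write-back cache with 64-byte lines, and an invalidate that simply discards lines, including CPU data not yet written back. All names (ram, cache, cpu_write(), invalidate(), device_write(), poison_survives()) are hypothetical and do not correspond to kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LINE   64	/* hypothetical cache line size */
#define NLINES 4

static unsigned char ram[LINE * NLINES];	/* what the device sees */
static unsigned char cache[LINE * NLINES];	/* what the CPU sees when a line is valid */
static bool valid[NLINES];

static void reset(void)
{
	memset(ram, 0, sizeof(ram));
	memset(cache, 0, sizeof(cache));
	memset(valid, 0, sizeof(valid));
}

/* CPU store: lands in the write-back cache only, never directly in RAM. */
static void cpu_write(size_t off, unsigned char val, size_t len)
{
	for (size_t i = off; i < off + len; i++) {
		size_t l = i / LINE;

		if (!valid[l]) {	/* line fill from RAM on first touch */
			memcpy(cache + l * LINE, ram + l * LINE, LINE);
			valid[l] = true;
		}
		cache[i] = val;
	}
}

/* CPU load: served from the cache if the line is valid, else from RAM. */
static unsigned char cpu_read(size_t off)
{
	return valid[off / LINE] ? cache[off] : ram[off];
}

/* Invalidate: drops every line overlapping the range, dirty data included. */
static void invalidate(size_t off, size_t len)
{
	for (size_t l = off / LINE; l <= (off + len - 1) / LINE; l++)
		valid[l] = false;
}

/* Device DMA write: goes straight to RAM. */
static void device_write(size_t off, unsigned char val, size_t len)
{
	memset(ram + off, val, len);
}

/* DMA into a 16-byte buffer at offset 0; is the poison at poison_off intact? */
static bool poison_survives(size_t poison_off)
{
	reset();
	cpu_write(poison_off, 0xfe, 16);	/* CPU poisons the guard */
	invalidate(0, 16);			/* map the data buffer for the device */
	device_write(0, 0xaa, 16);
	for (size_t i = 0; i < 16; i++)
		if (cpu_read(poison_off + i) != 0xfe)
			return false;
	return true;
}
```

In this model poison_survives(16) is false, because the guard shares a cache line with the DMA buffer and its unwritten data is discarded by the invalidate. poison_survives(64) is true, because the guard then starts on its own line. Real hardware and real dma-mapping implementations may behave differently, for example by writing back partially covered lines before invalidating them.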

2022-07-06 07:34:46

by Corentin Labbe

Subject: Re: [RFC PATCH] crypto: flush poison data

On Tue, Jul 05, 2022 at 07:58:34PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 05, 2022 at 07:56:11PM +0200, LABBE Corentin wrote:
> > My problem is that a dma_sync on the data buffer corrupt the poison buffer as collateral dommage.
> > Probably because the sync operate on a larger region than the requested dma_sync length.
> > So I try to flush poison data in the cryptoAPI.
>
> Data structures that are DMAed to must be aligned to
> the value returned by dma_get_cache_alignment(), as non-coherent DMA
> by definition can disturb the data inside that boundary. That is not
> a bug but fundamentally part of how DMA works when the device attachment
> is not cache coherent.

I am sorry, but I don't see how this helps with my problem.

2022-07-06 09:53:04

by Ben Dooks

Subject: Re: [RFC PATCH] crypto: flush poison data

On 05/07/2022 17:42, Christoph Hellwig wrote:
> On Tue, Jul 05, 2022 at 10:21:13AM +0200, LABBE Corentin wrote:
>>
>> I just copied what did drivers/crypto/xilinx/zynqmp-sha.c.
>> I tried to do flush_dcache_range() but it seems to not be implemented on riscV.
>
> That driver is broken and should no have been merged in that form.
>
>> And flush_dcache_page(virt_to_page(addr), len) produce a kernel panic.
>
> And that's good so. Drivers have no business doing their own cache
> flushing. That is the job of the dma-mapping implementation, so I'd
> suggest to look for problems there.

I'm not sure the dma-mapping code for non-coherent RISC-V systems
ever got sorted out. I couldn't find any when looking at 5.17.

I expect the icache flush is also implicitly doing a dcache flush,
as from what I've seen it is only used when code or TLBs are being
modified.

--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius

https://www.codethink.co.uk/privacy.html

2022-07-06 12:05:33

by Christoph Hellwig

Subject: Re: [RFC PATCH] crypto: flush poison data

On Wed, Jul 06, 2022 at 10:47:24AM +0100, Ben Dooks wrote:
> I'm not sure that the dma-mapping code for non-coherent riscv systems
> did get sorted. I couldn't find any when looking in 5.17.

Yes, none of that is upstream. But as supporting it is essential for
the Allwinner SoCs, I'm pretty sure Corentin is not actually using an
upstream kernel anyway.

2022-07-06 12:40:21

by Corentin Labbe

Subject: Re: [RFC PATCH] crypto: flush poison data

On Wed, Jul 06, 2022 at 01:58:07PM +0200, Christoph Hellwig wrote:
> On Wed, Jul 06, 2022 at 10:47:24AM +0100, Ben Dooks wrote:
> > I'm not sure that the dma-mapping code for non-coherent riscv systems
> > did get sorted. I couldn't find any when looking in 5.17.
>
> Yes, none of that is upstream. But as supporting it is essential for
> the allwinner SOCs I'm pretty sure Corentin is not actually using an
> upstream kernel anyway.

I use an upstream kernel plus some "not yet merged but sent for review" patch series such as
"riscv: implement Zicbom-based CMO instructions + the t-head variant".

And good news: I just updated to v6 of this series (posted today) and my problem has disappeared.

Regards