2008-09-22 18:13:46

by Dimitri Puzin

[permalink] [raw]
Subject: hifn_795x in Linux-2.6.27-rc7

Hi all,

I'm trying to use a Soekris Hifn 7955 based pci card to accelerate
dm-crypt device on vanilla 2.6.27-rc7 LK.

# cryptsetup luksFormat -c aes-cbc-plain -s 256 /dev/generic/test
completes successfully but the volume is unreadable.
Dmesg is

hifn795x 0000:03:04.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
hifn795x: assuming 66MHz clock speed, override with
hifn_pll_ref=ext<frequency>
hifn0: AES 128 ECB test has been successfully passed.
Driver for HIFN 795x crypto accelerator chip has been successfully
registered.
nommu_map_single: overflow 1fe047e00+512 of device mask ffffffff
nommu_map_single: overflow 1fe047e00+512 of device mask ffffffff
nommu_map_single: overflow 1fe047e00+512 of device mask ffffffff

The same works when I use software AES module. The hardware is a dual
opteron 252 with 8 GB RAM on a tyan S2882D mainboard. OS is Debian Lenny
i386. Kernel built from vanilla sources from kernel.org. The .config is
attached.

Regards,

Dimitri Puzin


Attachments:
config.gz (23.68 kB)
signature.asc (260.00 B)
OpenPGP digital signature
Download all attachments

2008-09-23 16:56:24

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

Hi Dimitri.

On Mon, Sep 22, 2008 at 07:39:23PM +0200, Dimitri Puzin ([email protected]) wrote:
> nommu_map_single: overflow 1fe047e00+512 of device mask ffffffff

Does attached patch fix the problem?

diff --git a/drivers/crypto/hifn_795x.c b/drivers/crypto/hifn_795x.c
index 81f3f95..b288bbd 100644
--- a/drivers/crypto/hifn_795x.c
+++ b/drivers/crypto/hifn_795x.c
@@ -2593,7 +2593,7 @@ static int hifn_probe(struct pci_dev *pdev, const struct pci_device_id *id)
return err;
pci_set_master(pdev);

- err = pci_set_dma_mask(pdev, DMA_32BIT_MASK);
+ err = pci_set_dma_mask(pdev, DMA_64BIT_MASK);
if (err)
goto err_out_disable_pci_device;

--
Evgeniy Polyakov

2008-09-23 17:40:20

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

On Tue, Sep 23, 2008 at 08:55:21PM +0400, Evgeniy Polyakov ([email protected]) wrote:
> diff --git a/drivers/crypto/hifn_795x.c b/drivers/crypto/hifn_795x.c
> index 81f3f95..b288bbd 100644
> --- a/drivers/crypto/hifn_795x.c
> +++ b/drivers/crypto/hifn_795x.c
> @@ -2593,7 +2593,7 @@ static int hifn_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> return err;
> pci_set_master(pdev);
>
> - err = pci_set_dma_mask(pdev, DMA_32BIT_MASK);
> + err = pci_set_dma_mask(pdev, DMA_64BIT_MASK);
> if (err)
> goto err_out_disable_pci_device;

Actually not, I checked documentation, and although HIFN claims to have
some kind of 64 bit support, all addresses are 32-bit wide only, so I
really wonder if HIFN supports 64 bit.
There is no remapping or address mask parameter in the crypto stack,
so that it would not try to provide page to the devices, which do not
support it.

--
Evgeniy Polyakov

2008-09-23 18:07:06

by Dimitri Puzin

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

Evgeniy Polyakov schrieb:
> Hi Dimitri.
>
> On Mon, Sep 22, 2008 at 07:39:23PM +0200, Dimitri Puzin ([email protected]) wrote:
>> nommu_map_single: overflow 1fe047e00+512 of device mask ffffffff
>
> Does attached patch fix the problem?
With this patch applied it still doesn't work as expected. The overflow
messages are gone however syslog shows
[ 120.924266] hifn0: abort: c: 0, s: 1, d: 0, r: 0.
when doing cryptsetup luksFormat as in original e-mail. At this point
cryptsetup hangs and can't be killed with -SIGKILL. I've attached
SysRq-t dump of this condition.

Regards,

Dimitri Puzin
>
> diff --git a/drivers/crypto/hifn_795x.c b/drivers/crypto/hifn_795x.c
> index 81f3f95..b288bbd 100644
> --- a/drivers/crypto/hifn_795x.c
> +++ b/drivers/crypto/hifn_795x.c
> @@ -2593,7 +2593,7 @@ static int hifn_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> return err;
> pci_set_master(pdev);
>
> - err = pci_set_dma_mask(pdev, DMA_32BIT_MASK);
> + err = pci_set_dma_mask(pdev, DMA_64BIT_MASK);
> if (err)
> goto err_out_disable_pci_device;
>


Attachments:
sysrq.gz (9.95 kB)
signature.asc (260.00 B)
OpenPGP digital signature
Download all attachments

2008-09-23 18:19:36

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

On Tue, Sep 23, 2008 at 08:06:32PM +0200, Dimitri Puzin ([email protected]) wrote:
> With this patch applied it still doesn't work as expected. The overflow
> messages are gone however syslog shows
> [ 120.924266] hifn0: abort: c: 0, s: 1, d: 0, r: 0.
> when doing cryptsetup luksFormat as in original e-mail. At this point
> cryptsetup hangs and can't be killed with -SIGKILL. I've attached
> SysRq-t dump of this condition.

Yes, I was wrong with the patch: HIFN does not support 64-bit addresses
afaics.

Attached patch should not allow HIFN to be registered on 64-bit arch, so
crypto layer will fallback to the software algorithms.

Signed-off-by: Evgeniy Polyakov <[email protected]>

diff --git a/drivers/crypto/hifn_795x.c b/drivers/crypto/hifn_795x.c
index 81f3f95..31c75d4 100644
--- a/drivers/crypto/hifn_795x.c
+++ b/drivers/crypto/hifn_795x.c
@@ -2787,6 +2787,11 @@ static int __devinit hifn_init(void)
unsigned int freq;
int err;

+ if (sizeof(dma_addr_t) > 4) {
+ printk(KERN_INFO "HIFN supports only 32-bit addresses.\n");
+ return -EINVAL;
+ }
+
if (strncmp(hifn_pll_ref, "ext", 3) &&
strncmp(hifn_pll_ref, "pci", 3)) {
printk(KERN_ERR "hifn795x: invalid hifn_pll_ref clock, "


--
Evgeniy Polyakov

2008-09-23 21:55:27

by David Miller

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

From: Evgeniy Polyakov <[email protected]>
Date: Tue, 23 Sep 2008 22:18:42 +0400

> On Tue, Sep 23, 2008 at 08:06:32PM +0200, Dimitri Puzin ([email protected]) wrote:
> > With this patch applied it still doesn't work as expected. The overflow
> > messages are gone however syslog shows
> > [ 120.924266] hifn0: abort: c: 0, s: 1, d: 0, r: 0.
> > when doing cryptsetup luksFormat as in original e-mail. At this point
> > cryptsetup hangs and can't be killed with -SIGKILL. I've attached
> > SysRq-t dump of this condition.
>
> Yes, I was wrong with the patch: HIFN does not support 64-bit addresses
> afaics.
>
> Attached patch should not allow HIFN to be registered on 64-bit arch, so
> crypto layer will fallback to the software algorithms.
>
> Signed-off-by: Evgeniy Polyakov <[email protected]>

Well, it will disallow HIFN on 32-bit systems using highmem too. :-)

In the end, you will need to use bouncebuffers or similar, it seems.

2008-09-24 02:54:08

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

On Tue, Sep 23, 2008 at 02:55:15PM -0700, David Miller ([email protected]) wrote:
> > Yes, I was wrong with the patch: HIFN does not support 64-bit addresses
> > afaics.
> >
> > Attached patch should not allow HIFN to be registered on 64-bit arch, so
> > crypto layer will fallback to the software algorithms.
> >
> > Signed-off-by: Evgeniy Polyakov <[email protected]>
>
> Well, it will disallow HIFN on 32-bit systems using highmem too. :-)

Yes, but only 64g, where dma_addr_t is 8 bytes.

> In the end, you will need to use bouncebuffers or similar, it seems.

Maybe, we need to check, since allocation of multiple smaller pages from
4 gb area, copy data between them, hardware crypto, backward copy and
page freeing can be slower than software crypto. Although we can
preallocate buffers for several common sizes, although this will not
help dm-crypto which can submit up to 31 pages in single request on top
of ext3 in the common case.

--
Evgeniy Polyakov

2008-09-24 03:01:32

by Herbert Xu

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

On Wed, Sep 24, 2008 at 06:53:08AM +0400, Evgeniy Polyakov wrote:
>
> Maybe, we need to check, since allocation of multiple smaller pages from
> 4 gb area, copy data between them, hardware crypto, backward copy and
> page freeing can be slower than software crypto. Although we can
> preallocate buffers for several common sizes, although this will not
> help dm-crypto which can submit up to 31 pages in single request on top
> of ext3 in the common case.

IPsec is going to be lowmem most of the time. So you could
allocate a software backup similar to what padlock-sha does
and use that within hifn if you get a highmem request.

Of course not registering it works for me too. I don't think
we need to lose too much sleep over people who can afford this
much memory but still insists on 32-bit. They can afford to
hire a kernel engineer to fix this :)

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2008-09-24 03:06:00

by David Miller

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

From: Evgeniy Polyakov <[email protected]>
Date: Wed, 24 Sep 2008 06:53:08 +0400

> Maybe, we need to check, since allocation of multiple smaller pages from
> 4 gb area, copy data between them, hardware crypto, backward copy and
> page freeing can be slower than software crypto. Although we can
> preallocate buffers for several common sizes, although this will not
> help dm-crypto which can submit up to 31 pages in single request on top
> of ext3 in the common case.

I think you will need to preallocate a pool of pages.

2008-09-24 16:40:55

by Patrick McHardy

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

Dimitri Puzin wrote:
> With this patch applied it still doesn't work as expected. The overflow
> messages are gone however syslog shows
> [ 120.924266] hifn0: abort: c: 0, s: 1, d: 0, r: 0.
> when doing cryptsetup luksFormat as in original e-mail. At this point
> cryptsetup hangs and can't be killed with -SIGKILL. I've attached
> SysRq-t dump of this condition.

There are a few known problems left in the HIFN driver that I have
patches for but unfortunately didn't manage to clean them up and
submit them yet.

Looking at my queue, what's still broken is roughly:

[HIFN]: Have HW invalidate src and dest descriptors after processing

The descriptors need to be invalidated after processing for ring
cleanup to work properly and to avoid using an old destination
descriptor when the src and cmd descriptors are already set up
and the dst descriptor isn't.

Signed-off-by: Patrick McHardy <[email protected]>

That one is actually probably OK but I didn't send it upstream yet because
I didn't finish all follow-up patches that depend on how this works.

Further:

[HIFN]: Fix DMA setup

without a changelog :) IIRC the problem is that HIFN sets up each
DMA descriptor for the full request length, even though it should
only cover on SG entry. The result is memory corruption.

[HIFN]: Don't copy src sg list

also without a changelog. I think this one was just an optimization,
the source descriptors don't have alignment or size restrictions and
thus we don't need to linearize it first.

[HIFN]: Fix request context corruption

HIFN uses the transform context to store per-request data, which breaks
when more than one request is outstanding. Move per request members from
struct hifn_context to a new struct hifn_request_context and convert
the code to use this.

and:

[HIFN]: Fix queue processing

again without a changelog. The problem this patch fixed was missing crypto
backlog handling, causing crashes with dm_crypt under load.

If someone is interested in picking this up I could send my old patches,
I probably won't manage to do it myself in the next time.


2008-09-24 17:01:34

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

Hi Patrick.

On Wed, Sep 24, 2008 at 06:40:29PM +0200, Patrick McHardy ([email protected]) wrote:
> There are a few known problems left in the HIFN driver that I have
> patches for but unfortunately didn't manage to clean them up and
> submit them yet.
>
> Looking at my queue, what's still broken is roughly:
>
> [HIFN]: Have HW invalidate src and dest descriptors after processing
>
> The descriptors need to be invalidated after processing for ring
> cleanup to work properly and to avoid using an old destination
> descriptor when the src and cmd descriptors are already set up
> and the dst descriptor isn't.
>
> Signed-off-by: Patrick McHardy <[email protected]>
>
> That one is actually probably OK but I didn't send it upstream yet because
> I didn't finish all follow-up patches that depend on how this works.
>
> Further:
>
> [HIFN]: Fix DMA setup
>
> without a changelog :) IIRC the problem is that HIFN sets up each
> DMA descriptor for the full request length, even though it should
> only cover on SG entry. The result is memory corruption.

Can not tell without patch itself, but code operates on the number of
bytes provided from the higher levels, not full dma size (0x3ff bytes).

> [HIFN]: Don't copy src sg list
>
> also without a changelog. I think this one was just an optimization,
> the source descriptors don't have alignment or size restrictions and
> thus we don't need to linearize it first.

Also can not say for sure, but there should not be src linearization,
instead we copy data from src to aligned dst and then perform crypto
processin in place, and then copy data back.

> [HIFN]: Fix request context corruption
>
> HIFN uses the transform context to store per-request data, which breaks
> when more than one request is outstanding. Move per request members from
> struct hifn_context to a new struct hifn_request_context and convert
> the code to use this.

Its per-tfm, so things like key and iv are ok, and others should only be
accessed under the lock, but there may be problems.

> and:
>
> [HIFN]: Fix queue processing
>
> again without a changelog. The problem this patch fixed was missing crypto
> backlog handling, causing crashes with dm_crypt under load.

Yup, that's messy :)

> If someone is interested in picking this up I could send my old patches,
> I probably won't manage to do it myself in the next time.

Please do, if they work for you, we can just apply them, but I would
like first to check them out.

Thank you!

--
Evgeniy Polyakov

2008-09-24 17:14:02

by Patrick McHardy

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

Evgeniy Polyakov wrote:
> Hi Patrick.
>
> On Wed, Sep 24, 2008 at 06:40:29PM +0200, Patrick McHardy ([email protected]) wrote:
>
> [HIFN]: Fix DMA setup
>
> without a changelog :) IIRC the problem is that HIFN sets up each
> DMA descriptor for the full request length, even though it should
> only cover on SG entry. The result is memory corruption.
>
>
> Can not tell without patch itself, but code operates on the number of
> bytes provided from the higher levels, not full dma size (0x3ff bytes).
>

Yes, but it uses that size for every descriptor.

I'm going to send you the patches in private (they're quite large)
so you can have a look.

>
>> [HIFN]: Don't copy src sg list
>>
>> also without a changelog. I think this one was just an optimization,
>> the source descriptors don't have alignment or size restrictions and
>> thus we don't need to linearize it first.
>>
>
> Also can not say for sure, but there should not be src linearization,
> instead we copy data from src to aligned dst and then perform crypto
> processin in place, and then copy data back.
>

Right, I remember. The copying to the (temporary) destination area
is not necessary, the chip can just read it from the source are directly.


>
>> [HIFN]: Fix request context corruption
>>
>> HIFN uses the transform context to store per-request data, which breaks
>> when more than one request is outstanding. Move per request members from
>> struct hifn_context to a new struct hifn_request_context and convert
>> the code to use this.
>>
>
> Its per-tfm, so things like key and iv are ok, and others should only be
> accessed under the lock, but there may be problems.
>

Things like op, mode, ... are per request and are set without any
locking.

>> If someone is interested in picking this up I could send my old patches,
>> I probably won't manage to do it myself in the next time.
>>
>
> Please do, if they work for you, we can just apply them, but I would
> like first to check them out.

I'm not sure whether they're ready for applying. IIRC I still got stalls
occasionally,
but I don't remember the details. They made things better for me though :)

Anyways, patches coming up in a minute.

2008-09-24 17:17:13

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

On Wed, Sep 24, 2008 at 07:13:36PM +0200, Patrick McHardy ([email protected]) wrote:
> >Please do, if they work for you, we can just apply them, but I would
> >like first to check them out.
>
> I'm not sure whether they're ready for applying. IIRC I still got stalls
> occasionally,
> but I don't remember the details. They made things better for me though :)
>
> Anyways, patches coming up in a minute.

I will review them and push to Herbert if things are ok.
Thanks a lot, Patrick!

--
Evgeniy Polyakov

2008-10-12 12:15:35

by Herbert Xu

[permalink] [raw]
Subject: Re: hifn_795x in Linux-2.6.27-rc7

On Tue, Sep 23, 2008 at 10:18:42PM +0400, Evgeniy Polyakov wrote:
>
> Attached patch should not allow HIFN to be registered on 64-bit arch, so
> crypto layer will fallback to the software algorithms.
>
> Signed-off-by: Evgeniy Polyakov <jo[email protected]>

Patch applied.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt