Subject: [PATCH 1/1] Crypto: [xp ]cbc: use 64bit regs on 64bit machines (rev. 2)

I removed the "#if BITS_PER_LONG == 64" because the compiler can handle
u64 on 32bit machines. The asm output is the same.

Currently, on 64bit machines, the xor operation uses two 32bit registers
for an 8byte xor instead of a single 64bit register. This patch fixes that
for the cbc, pcbc and xcbc templates.

A quick speed test with the tcrypt module showed the following for aes+cbc+dec:
old:
test 14 (256 bit key, 8192 byte blocks): 1 operation in 183138 cycles
(8192 bytes)
new:
test 14 (256 bit key, 8192 byte blocks): 1 operation in 181419 cycles
(8192 bytes)

Maybe my computer is just as tired as I am. In general I think 64bit
registers should be preferred.

Signed-off-by: Sebastian Siewior <[email protected]>
Index: b/crypto/cbc.c
===================================================================
--- a/crypto/cbc.c
+++ b/crypto/cbc.c
@@ -17,6 +17,7 @@
#include <linux/module.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>
+#include <linux/types.h>

struct crypto_cbc_ctx {
struct crypto_cipher *child;
@@ -226,16 +227,13 @@ static void xor_quad(u8 *dst, const u8 *

static void xor_64(u8 *a, const u8 *b, unsigned int bs)
{
- ((u32 *)a)[0] ^= ((u32 *)b)[0];
- ((u32 *)a)[1] ^= ((u32 *)b)[1];
+ ((u64 *)a)[0] ^= ((u64 *)b)[0];
}

static void xor_128(u8 *a, const u8 *b, unsigned int bs)
{
- ((u32 *)a)[0] ^= ((u32 *)b)[0];
- ((u32 *)a)[1] ^= ((u32 *)b)[1];
- ((u32 *)a)[2] ^= ((u32 *)b)[2];
- ((u32 *)a)[3] ^= ((u32 *)b)[3];
+ xor_64(&a[0], &b[0], bs);
+ xor_64(&a[8], &b[8], bs);
}

static int crypto_cbc_init_tfm(struct crypto_tfm *tfm)
Index: b/crypto/pcbc.c
===================================================================
--- a/crypto/pcbc.c
+++ b/crypto/pcbc.c
@@ -21,6 +21,7 @@
#include <linux/module.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>
+#include <linux/types.h>

struct crypto_pcbc_ctx {
struct crypto_cipher *child;
@@ -230,16 +231,13 @@ static void xor_quad(u8 *dst, const u8 *

static void xor_64(u8 *a, const u8 *b, unsigned int bs)
{
- ((u32 *)a)[0] ^= ((u32 *)b)[0];
- ((u32 *)a)[1] ^= ((u32 *)b)[1];
+ ((u64 *)a)[0] ^= ((u64 *)b)[0];
}

static void xor_128(u8 *a, const u8 *b, unsigned int bs)
{
- ((u32 *)a)[0] ^= ((u32 *)b)[0];
- ((u32 *)a)[1] ^= ((u32 *)b)[1];
- ((u32 *)a)[2] ^= ((u32 *)b)[2];
- ((u32 *)a)[3] ^= ((u32 *)b)[3];
+ xor_64(&a[0], &b[0], bs);
+ xor_64(&a[8], &b[8], bs);
}

static int crypto_pcbc_init_tfm(struct crypto_tfm *tfm)
Index: b/crypto/xcbc.c
===================================================================
--- a/crypto/xcbc.c
+++ b/crypto/xcbc.c
@@ -27,6 +27,7 @@
#include <linux/rtnetlink.h>
#include <linux/slab.h>
#include <linux/scatterlist.h>
+#include <linux/types.h>
#include "internal.h"

static u_int32_t ks[12] = {0x01010101, 0x01010101, 0x01010101, 0x01010101,
@@ -60,10 +61,8 @@ struct crypto_xcbc_ctx {

static void xor_128(u8 *a, const u8 *b, unsigned int bs)
{
- ((u32 *)a)[0] ^= ((u32 *)b)[0];
- ((u32 *)a)[1] ^= ((u32 *)b)[1];
- ((u32 *)a)[2] ^= ((u32 *)b)[2];
- ((u32 *)a)[3] ^= ((u32 *)b)[3];
+ ((u64 *)a)[0] ^= ((u64 *)b)[0];
+ ((u64 *)a)[1] ^= ((u64 *)b)[1];
}

static int _crypto_xcbc_digest_setkey(struct crypto_hash *parent,
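
To illustrate what the patch changes, here is a minimal userspace sketch (not
kernel code) of the two xor forms. The union block64 type and the names
xor_64_old(), xor_64_new() and xor_forms_agree() are made up for this example.
The union is an assumption chosen so that every view of the 8 bytes is properly
aligned, which the casts in the patch do not guarantee on their own.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte block; the union keeps all three views aligned. */
union block64 {
	uint8_t  bytes[8];
	uint32_t w32[2];
	uint64_t w64;
};

/* Old form: two 32-bit xors per 8 bytes. */
static void xor_64_old(union block64 *a, const union block64 *b)
{
	a->w32[0] ^= b->w32[0];
	a->w32[1] ^= b->w32[1];
}

/* New form: one 64-bit xor per 8 bytes. */
static void xor_64_new(union block64 *a, const union block64 *b)
{
	a->w64 ^= b->w64;
}

/* Returns 1 if both forms leave the same bytes in the destination. */
static int xor_forms_agree(const uint8_t a[8], const uint8_t b[8])
{
	union block64 x, y, src;

	memcpy(x.bytes, a, 8);
	memcpy(y.bytes, a, 8);
	memcpy(src.bytes, b, 8);
	xor_64_old(&x, &src);
	xor_64_new(&y, &src);
	return memcmp(x.bytes, y.bytes, 8) == 0;
}
```

Because xor works bit by bit, both forms produce identical bytes regardless of
endianness; only the number of register operations differs.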


2007-06-22 12:41:35

by Herbert Xu

Subject: Re: [PATCH 1/1] Crypto: [xp ]cbc: use 64bit regs on 64bit machines (rev. 2)

On Thu, Jun 14, 2007 at 12:29:00PM +0000, Sebastian Siewior wrote:
>
> Maybe my computer is just as tired as I am. In general I think 64bit
> registers should be preferred.
>
> Signed-off-by: Sebastian Siewior <[email protected]>

OK this makes sense. However you need to make sure that alignmask
is set appropriately (i.e., at least 8/16).

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Subject: Re: [PATCH 1/1] Crypto: [xp ]cbc: use 64bit regs on 64bit machines (rev. 2)

* Herbert Xu | 2007-06-22 20:41:30 [+0800]:

>On Thu, Jun 14, 2007 at 12:29:00PM +0000, Sebastian Siewior wrote:
>>
>> Maybe my computer is just as tired as I am. In general I think 64bit
>> registers should be preferred.
>>
>> Signed-off-by: Sebastian Siewior <[email protected]>
>
>OK this makes sense. However you need to make sure that alignmask
>is set appropriately (i.e., at least 8/16).

I don't think I understand. Why do I have to change the alignmask for
the xor operation? I guess you are talking about crypto_cbc_alloc()

|if (!(alg->cra_blocksize % 4))
| inst->alg.cra_alignmask |= 3;

don't you?
Since this (also) changes the alignment of the in+out data, I would prefer
not to. The speed-up you gain from fewer xors is probably less than what you
spend on the additional kmap()/memcpy()/kmalloc() in case the data is %4 but
not %8.

>Cheers,
Sebastian
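
For reference, the trade-off discussed here can be sketched in userspace C.
This is only an illustration under assumptions: pick_alignmask() and
is_aligned() are hypothetical names, and the "% 8" rule is an assumed extension
of the crypto_cbc_alloc() lines quoted above, not code from the kernel tree.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the quoted "% 4 -> |= 3" rule, extended with an
 * assumed "% 8 -> |= 7" rule for 64-bit xors. child_mask stands in for
 * the alignmask inherited from the underlying cipher.
 */
static unsigned int pick_alignmask(unsigned int blocksize,
				   unsigned int child_mask)
{
	unsigned int mask = child_mask;

	if (!(blocksize % 4))
		mask |= 3;
	if (!(blocksize % 8))
		mask |= 7;
	return mask;
}

/* A pointer satisfies a mask when none of the masked address bits are set. */
static int is_aligned(const void *p, unsigned int mask)
{
	return ((uintptr_t)p & mask) == 0;
}
```

With mask 7, a buffer that starts on a 4-byte boundary but not on an 8-byte
boundary fails is_aligned(), and that is the case where the extra copying
mentioned above would come into play.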

2007-06-22 23:40:27

by Herbert Xu

Subject: Re: [PATCH 1/1] Crypto: [xp ]cbc: use 64bit regs on 64bit machines (rev. 2)

On Sat, Jun 23, 2007 at 12:44:06AM +0200, Sebastian Siewior wrote:
>
> >OK this makes sense. However you need to make sure that alignmask
> >is set appropriately (i.e., at least 8/16).
>
> I don't think I understand. Why do I have to change the alignmask for
> the xor operation? I guess you are talking about crypto_cbc_alloc()
>
> |if (!(alg->cra_blocksize % 4))
> | inst->alg.cra_alignmask |= 3;
>
> don't you?

Yes.

> Since this (also) changes the alignment of the in+out data, I would prefer
> not to. The speed-up you gain from fewer xors is probably less than what you
> spend on the additional kmap()/memcpy()/kmalloc() in case the data is %4 but
> not %8.

Without it the data is not guaranteed to be aligned to 64 bits and
you'll get alignment traps on non-x86 architectures.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
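
As a side note to the alignment-trap point, here is a minimal userspace sketch
of an 8-byte xor that does not depend on pointer alignment. It is a different
technique from the one in the patch: it goes through memcpy() into local
variables instead of casting u8 pointers to u64 pointers. The name
xor_64_unaligned_safe() is hypothetical, and this is not what the thread
settled on.

```c
#include <stdint.h>
#include <string.h>

/*
 * Xor 8 bytes of b into a without assuming either pointer is 8-byte
 * aligned. The loads and stores go through memcpy(), so the compiler is
 * free to pick whatever access the target architecture allows.
 */
static void xor_64_unaligned_safe(uint8_t *a, const uint8_t *b)
{
	uint64_t x, y;

	memcpy(&x, a, sizeof(x));
	memcpy(&y, b, sizeof(y));
	x ^= y;
	memcpy(a, &x, sizeof(x));
}
```

Whether this is as fast as the aligned 64-bit xor depends on the compiler and
the architecture, so it would need measuring (for example with tcrypt) before
drawing any conclusion.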