2009-01-30 00:22:54

by Pierre Habouzit

[permalink] [raw]
Subject: [RFC] MPI module

From: Pierre Habouzit <[email protected]>
Subject: [PATCH] crypto: add multiple precision APIs

Adds CONFIG_CRYPTO_BIGNUM to provide mpi_* APIs.

This introduces the multiple precision integers, through the `struct mpi`
type. MPI instances are just a bunch of unsigned longs (or limbs), in
little endian order, which is a rather usual representation.

Though, MPI instances have to respect a couple of invariants, the most
importants being:
* at any time the length (number of least significant limbs) of the MPI
is known;
* at any time, all allocated limbs after the most significant limb must
be set to 0.

Many mpi_* APIs use that fact to compute a valid result (see for example
mpi_sbits or mpi_is_zero).

Signed-off-by: Pierre Habouzit <[email protected]>
---
arch/x86/include/asm/bignum.h | 11 +
arch/x86/include/asm/bignum_32.h | 127 ++++++
arch/x86/include/asm/bignum_64.h | 51 +++
crypto/Kconfig | 6 +
crypto/Makefile | 1 +
crypto/bignum.c | 805 ++++++++++++++++++++++++++++++++++++++
include/asm-generic/bignum.h | 80 ++++
include/crypto/bignum.h | 209 ++++++++++
8 files changed, 1290 insertions(+), 0 deletions(-)
create mode 100644 arch/x86/include/asm/bignum.h
create mode 100644 arch/x86/include/asm/bignum_32.h
create mode 100644 arch/x86/include/asm/bignum_64.h
create mode 100644 crypto/bignum.c
create mode 100644 include/asm-generic/bignum.h
create mode 100644 include/crypto/bignum.h

diff --git a/arch/x86/include/asm/bignum.h b/arch/x86/include/asm/bignum.h
new file mode 100644
index 0000000..e33da47
--- /dev/null
+++ b/arch/x86/include/asm/bignum.h
@@ -0,0 +1,11 @@
+#ifndef _ASM_X86_BIGNUM_H
+#define _ASM_X86_BIGNUM_H
+
+#ifdef CONFIG_X86_32
+# include "bignum_32.h"
+#else
+# include "bignum_64.h"
+#endif
+#include <asm-generic/bignum.h>
+
+#endif
diff --git a/arch/x86/include/asm/bignum_32.h b/arch/x86/include/asm/bignum_32.h
new file mode 100644
index 0000000..0923585
--- /dev/null
+++ b/arch/x86/include/asm/bignum_32.h
@@ -0,0 +1,127 @@
+/**
+ * Multi-precision integer library
+ *
+ * Based on XySSL:
+ *
+ * Copyright (C) 2006-2008 Christophe Devine
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#ifndef _ASM_X86_BIGNUM_32_H
+#define _ASM_X86_BIGNUM_32_H
+
+#define MPI_MULADDC_INIT \
+ asm("movl %%ebx, %0 " : "=m" (t)); \
+ asm("movl %0, %%esi " :: "m" (s)); \
+ asm("movl %0, %%edi " :: "m" (d)); \
+ asm("movl %0, %%ecx " :: "m" (c)); \
+ asm("movl %0, %%ebx " :: "m" (b))
+
+#define MPI_MULADDC_CORE \
+ asm("lodsl "); \
+ asm("mull %ebx "); \
+ asm("addl %ecx, %eax "); \
+ asm("adcl $0, %edx "); \
+ asm("addl (%edi), %eax "); \
+ asm("adcl $0, %edx "); \
+ asm("movl %edx, %ecx "); \
+ asm("stosl ")
+
+#if defined(__HAVE_SSE2)
+
+#define MPI_MULADDC_8 \
+ asm("movd %ecx, %mm1 "); \
+ asm("movd %ebx, %mm0 "); \
+ asm("movd (%edi), %mm3 "); \
+ asm("paddq %mm3, %mm1 "); \
+ asm("movd (%esi), %mm2 "); \
+ asm("pmuludq %mm0, %mm2 "); \
+ asm("movd 4(%esi), %mm4 "); \
+ asm("pmuludq %mm0, %mm4 "); \
+ asm("movd 8(%esi), %mm6 "); \
+ asm("pmuludq %mm0, %mm6 "); \
+ asm("movd 12(%esi), %mm7 "); \
+ asm("pmuludq %mm0, %mm7 "); \
+ asm("paddq %mm2, %mm1 "); \
+ asm("movd 4(%edi), %mm3 "); \
+ asm("paddq %mm4, %mm3 "); \
+ asm("movd 8(%edi), %mm5 "); \
+ asm("paddq %mm6, %mm5 "); \
+ asm("movd 12(%edi), %mm4 "); \
+ asm("paddq %mm4, %mm7 "); \
+ asm("movd %mm1, (%edi) "); \
+ asm("movd 16(%esi), %mm2 "); \
+ asm("pmuludq %mm0, %mm2 "); \
+ asm("psrlq $32, %mm1 "); \
+ asm("movd 20(%esi), %mm4 "); \
+ asm("pmuludq %mm0, %mm4 "); \
+ asm("paddq %mm3, %mm1 "); \
+ asm("movd 24(%esi), %mm6 "); \
+ asm("pmuludq %mm0, %mm6 "); \
+ asm("movd %mm1, 4(%edi) "); \
+ asm("psrlq $32, %mm1 "); \
+ asm("movd 28(%esi), %mm3 "); \
+ asm("pmuludq %mm0, %mm3 "); \
+ asm("paddq %mm5, %mm1 "); \
+ asm("movd 16(%edi), %mm5 "); \
+ asm("paddq %mm5, %mm2 "); \
+ asm("movd %mm1, 8(%edi) "); \
+ asm("psrlq $32, %mm1 "); \
+ asm("paddq %mm7, %mm1 "); \
+ asm("movd 20(%edi), %mm5 "); \
+ asm("paddq %mm5, %mm4 "); \
+ asm("movd %mm1, 12(%edi) "); \
+ asm("psrlq $32, %mm1 "); \
+ asm("paddq %mm2, %mm1 "); \
+ asm("movd 24(%edi), %mm5 "); \
+ asm("paddq %mm5, %mm6 "); \
+ asm("movd %mm1, 16(%edi) "); \
+ asm("psrlq $32, %mm1 "); \
+ asm("paddq %mm4, %mm1 "); \
+ asm("movd 28(%edi), %mm5 "); \
+ asm("paddq %mm5, %mm3 "); \
+ asm("movd %mm1, 20(%edi) "); \
+ asm("psrlq $32, %mm1 "); \
+ asm("paddq %mm6, %mm1 "); \
+ asm("movd %mm1, 24(%edi) "); \
+ asm("psrlq $32, %mm1 "); \
+ asm("paddq %mm3, %mm1 "); \
+ asm("movd %mm1, 28(%edi) "); \
+ asm("addl $32, %edi "); \
+ asm("addl $32, %esi "); \
+ asm("psrlq $32, %mm1 "); \
+ asm("movd %mm1, %ecx ")
+
+#define MPI_MULADDC_STOP \
+ asm("emms "); \
+ asm("movl %0, %%ebx " :: "m" (t)); \
+ asm("movl %%ecx, %0 " : "=m" (c)); \
+ asm("movl %%edi, %0 " : "=m" (d)); \
+ asm("movl %%esi, %0 " : "=m" (s) :: \
+ "eax", "ecx", "edx", "esi", "edi")
+
+#else
+
+#define MPI_MULADDC_STOP \
+ asm("movl %0, %%ebx " :: "m" (t)); \
+ asm("movl %%ecx, %0 " : "=m" (c)); \
+ asm("movl %%edi, %0 " : "=m" (d)); \
+ asm("movl %%esi, %0 " : "=m" (s) :: \
+ "eax", "ecx", "edx", "esi", "edi")
+
+#endif /* SSE2 */
+
+#endif
diff --git a/arch/x86/include/asm/bignum_64.h b/arch/x86/include/asm/bignum_64.h
new file mode 100644
index 0000000..b22598b
--- /dev/null
+++ b/arch/x86/include/asm/bignum_64.h
@@ -0,0 +1,51 @@
+/**
+ * Multi-precision integer library
+ *
+ * Based on XySSL:
+ *
+ * Copyright (C) 2006-2008 Christophe Devine
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#ifndef _ASM_X86_BIGNUM_64_H
+#define _ASM_X86_BIGNUM_64_H
+
+#define MPI_MULADDC_INIT \
+ asm("movq %0, %%rsi " :: "m" (s)); \
+ asm("movq %0, %%rdi " :: "m" (d)); \
+ asm("movq %0, %%rcx " :: "m" (c)); \
+ asm("movq %0, %%rbx " :: "m" (b)); \
+ asm("xorq %r8, %r8 ")
+
+#define MPI_MULADDC_CORE \
+ asm("movq (%rsi),%rax "); \
+ asm("mulq %rbx "); \
+ asm("addq $8, %rsi "); \
+ asm("addq %rcx, %rax "); \
+ asm("movq %r8, %rcx "); \
+ asm("adcq $0, %rdx "); \
+ asm("nop "); \
+ asm("addq %rax, (%rdi) "); \
+ asm("adcq %rdx, %rcx "); \
+ asm("addq $8, %rdi ")
+
+#define MPI_MULADDC_STOP \
+ asm("movq %%rcx, %0 " : "=m" (c)); \
+ asm("movq %%rdi, %0 " : "=m" (d)); \
+ asm("movq %%rsi, %0 " : "=m" (s) :: \
+ "rax", "rcx", "rdx", "rbx", "rsi", "rdi", "r8")
+
+#endif
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 8dde4fc..12a7d99 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -29,6 +29,12 @@ config CRYPTO_FIPS
certification. You should say no unless you know what
this is.

+config CRYPTO_BIGNUM
+ tristate "Big Number library"
+ help
+ This option provides the APIs for multiple precision integer
+ operations.
+
config CRYPTO_ALGAPI
tristate
select CRYPTO_ALGAPI2
diff --git a/crypto/Makefile b/crypto/Makefile
index 46b08bf..270cdea 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_CRYPTO) += crypto.o
crypto-objs := api.o cipher.o digest.o compress.o

obj-$(CONFIG_CRYPTO_FIPS) += fips.o
+obj-$(CONFIG_CRYPTO_BIGNUM) += bignum.o

crypto_algapi-$(CONFIG_PROC_FS) += proc.o
crypto_algapi-objs := algapi.o scatterwalk.o $(crypto_algapi-y)
diff --git a/crypto/bignum.c b/crypto/bignum.c
new file mode 100644
index 0000000..7e4279c
--- /dev/null
+++ b/crypto/bignum.c
@@ -0,0 +1,805 @@
+/*
+ * Multi-precision integer library
+ *
+ * Based on XySSL:
+ *
+ * Copyright (C) 2006-2008 Christophe Devine
+ * Copyright (C) 2009 Pierre Habouzit <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+/*
+ * This MPI implementation is based on:
+ *
+ * http://www.cacr.math.uwaterloo.ca/hac/about/chap14.pdf
+ * http://www.stillhq.com/extracted/gnupg-api/struct mpi/
+ * http://math.libtomcrypt.com/files/tommath.pdf
+ */
+
+#include <linux/module.h>
+#include <linux/errno.h>
+#include <crypto/bignum.h>
+
+#define ciL ((int)sizeof(unsigned long)) /* chars in limb */
+#define biL (ciL << 3) /* bits in limb */
+#define MPI_CHK(f) if (unlikely(ret = f) != 0) goto cleanup
+#if (BITS_PER_LONG == 32)
+# define __be_long_to_cpu(x) __be32_to_cpu(x)
+# define __cpu_to_be_long(x) __cpu_to_be32(x)
+#else
+# define __be_long_to_cpu(x) __be64_to_cpu(x)
+# define __cpu_to_be_long(x) __cpu_to_be64(x)
+#endif
+
+#define MPI_FROM_INT(i) \
+ { .sign = (i) < 0, \
+ .len = (i) != 0, \
+ .p = (unsigned long []){ (i) < 0 ? -i : i }, \
+ }
+static struct mpi const ONE = MPI_FROM_INT(1);
+
+static inline void mpi_setlen(struct mpi *X, int len, int do_fix)
+{
+ if (len < X->len)
+ memset(X->p + len, 0, (X->len - len) * ciL);
+ if (do_fix) {
+ while (len > 0 && X->p[len - 1] == 0)
+ len--;
+ }
+ X->len = len;
+}
+
+static int mpi_set_2pow(struct mpi *X, int pow)
+{
+ if (mpi_grow(X, pow / biL))
+ return -ENOMEM;
+ mpi_zero(X);
+ X->p[pow / biL] = 1 << (pow % biL);
+ mpi_setlen(X, pow / biL + 1, 0);
+ return 0;
+}
+
+int mpi_grow(struct mpi *X, int nblimbs)
+{
+ unsigned long *p;
+
+ if (unlikely(nblimbs < 0))
+ return -ENOMEM;
+ if (X->alloc < nblimbs) {
+ p = krealloc(X->p, nblimbs * ciL, GFP_KERNEL);
+ if (!p)
+ return -ENOMEM;
+ memset(p + X->alloc, 0, (nblimbs - X->alloc) * ciL);
+ X->alloc = nblimbs;
+ X->p = p;
+ }
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mpi_grow);
+
+static int mpi_copy(struct mpi *X, const struct mpi *Y)
+{
+ if (X == Y)
+ return 0;
+
+ if (mpi_grow(X, Y->len))
+ return -ENOMEM;
+
+ X->sign = Y->sign;
+ memcpy(X->p, Y->p, Y->len * ciL);
+ mpi_setlen(X, Y->len, 0);
+ return 0;
+}
+
+int mpi_read_binary(struct mpi *X, const u8 *buf, int buflen)
+{
+ unsigned long u;
+ int i, v0, v1, xl;
+
+ while (buflen > 0 && *buf == 0) {
+ buf++;
+ buflen--;
+ }
+
+ xl = DIV_ROUND_UP(buflen, ciL);
+ if (mpi_grow(X, xl))
+ return -ENOMEM;
+
+ v0 = buflen / ciL;
+ v1 = buflen % ciL;
+
+ if (v1) {
+ for (u = i = 0; i < v1; i++) {
+ u = (u << 8) | *buf++;
+ }
+ X->p[v0] = u;
+ }
+ for (i = v0 - 1; i >= 0; i--, buf += ciL) {
+ memcpy(&u, buf, ciL);
+ X->p[i] = __be_long_to_cpu(u);
+ }
+ mpi_setlen(X, xl, 0);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mpi_read_binary);
+
+u8 *mpi_write_binary(const struct mpi *X, u8 *buf, int buflen)
+{
+ unsigned long u;
+ int i, v0, v1;
+
+ if (buflen > ciL * X->len) {
+ memset(buf, 0, buflen - ciL * X->len);
+ buf += buflen - ciL * X->len;
+ buflen = ciL * X->len;
+ v0 = X->len;
+ } else {
+ v0 = buflen / ciL;
+ v1 = buflen % ciL;
+
+ for (i = v1 - 1; i >= 0; i--) {
+ *buf++ = X->p[v0] >> (i << 3);
+ }
+ }
+ for (i = v0 - 1; i >= 0; i--, buf += ciL) {
+ u = __cpu_to_be_long(X->p[i]);
+ memcpy(buf, &u, ciL);
+ }
+ return buf;
+}
+EXPORT_SYMBOL_GPL(mpi_write_binary);
+
+
+int mpi_ctz(const struct mpi *X)
+{
+ int i;
+
+ for (i = 0; i < X->len; i++) {
+ if (X->p[i])
+ return i * biL + __builtin_ctzl(X->p[i]);
+ }
+ return -1;
+}
+EXPORT_SYMBOL_GPL(mpi_ctz);
+
+int mpi_sbits(const struct mpi *X)
+{
+ if (X->len == 0)
+ return 0;
+ return X->len * biL + (biL - __builtin_clzl(X->p[X->len - 1]));
+}
+EXPORT_SYMBOL_GPL(mpi_sbits);
+
+int mpi_shift_l(struct mpi *X, int count)
+{
+ int i, v0, v1, bits;
+
+ if (count == 0 || mpi_is_zero(X))
+ return 0;
+
+ v0 = count / biL;
+ v1 = count % biL;
+ bits = mpi_sbits(X);
+
+ if (mpi_grow(X, DIV_ROUND_UP(bits + count, BITS_PER_LONG)))
+ return -ENOMEM;
+
+ if (v1 == 0) {
+ memmove(X->p + v0, X->p, (X->len - v0) * ciL);
+ } else {
+ unsigned long limb, rest = 0;
+
+ for (i = X->len - 1; i >= 0; i--) {
+ limb = X->p[i];
+ X->p[i + v0 + 1] = rest | (limb >> (biL - v1));
+ rest = limb << v1;
+ }
+ X->p[v0] = rest;
+ }
+ memset(X->p, 0, v0 * ciL);
+ X->len = DIV_ROUND_UP(bits + count, BITS_PER_LONG);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mpi_shift_l);
+
+void mpi_shift_r(struct mpi *X, int count)
+{
+ int i, v0, v1, bits;
+
+ if (count == 0)
+ return;
+
+ bits = mpi_sbits(X);
+ if (count >= bits) {
+ mpi_zero(X);
+ return;
+ }
+
+ v0 = count / biL;
+ v1 = count % biL;
+
+ if (v1 == 0) {
+ memmove(X->p, X->p + v0, (X->len - v0) * ciL);
+ } else {
+ unsigned long limb, rest = X->p[v0] >> v1;
+
+ for (i = 0; i < X->len - v0 - 1; i++) {
+ limb = X->p[i + v0 + 1];
+ X->p[i] = rest | (limb << (biL - v1));
+ rest = limb >> v1;
+ }
+ }
+ mpi_setlen(X, DIV_ROUND_UP(bits - count, BITS_PER_LONG), 0);
+}
+EXPORT_SYMBOL_GPL(mpi_shift_r);
+
+static int mpi_cmp_abs(const struct mpi *X, const struct mpi *Y)
+{
+ int xl = X->len;
+
+ if (xl > Y->len)
+ return 1;
+ if (Y->len > xl)
+ return -1;
+
+ while (xl-- > 0) {
+ if (X->p[xl] > Y->p[xl])
+ return 1;
+ if (X->p[xl] < Y->p[xl])
+ return -1;
+ }
+
+ return 0;
+}
+
+int mpi_cmp(const struct mpi *X, const struct mpi *Y)
+{
+ int xl = X->len;
+ int sign = 1 - 2 * X->sign;
+
+ if (X->sign != Y->sign)
+ return sign;
+
+ if (xl > Y->len)
+ return sign;
+ if (Y->len > xl)
+ return -sign;
+
+ while (xl-- > 0) {
+ if (X->p[xl] > Y->p[xl])
+ return sign;
+ if (X->p[xl] < Y->p[xl])
+ return -sign;
+ }
+ return 0;
+}
+int mpi_cmp_int(const struct mpi *X, int y)
+{
+ struct mpi _Y = MPI_FROM_INT(y);
+ return mpi_cmp(X, &_Y);
+}
+EXPORT_SYMBOL_GPL(mpi_cmp);
+EXPORT_SYMBOL_GPL(mpi_cmp_int);
+
+/*
+ * Unsigned addition: X = |A| + |B| (HAC 14.7)
+ */
+static int mpi_add_abs(struct mpi *X, const struct mpi *A, const struct mpi *B)
+{
+ unsigned long *x, *a, *b, c;
+ int i;
+
+ if (B->len > A->len)
+ swap(A, B);
+
+ if (mpi_grow(X, A->len + 1))
+ return -ENOMEM;
+
+ a = A->p; b = B->p; x = X->p;
+
+ for (c = i = 0; i < B->len; i++, x++, a++, b++) {
+ *x = *a + c;
+ c = (*x < c) + (*x + *b < *b);
+ *x += *b;
+ }
+ for (; c && i < A->len; i++, x++, a++) {
+ *x = *a + c;
+ c = (*x < c);
+ }
+ if (X != A)
+ memcpy(x, a, (A->len - i) * ciL);
+ if (c)
+ *x = c;
+ mpi_setlen(X, A->len + c, 0);
+ return 0;
+}
+
+/*
+ * Unsigned substraction: X = |A| - |B| (HAC 14.9)
+ * assumes |A| > |B|
+ */
+static int mpi_sub_abs(struct mpi *X, const struct mpi *A, const struct mpi *B)
+{
+ unsigned long *x, *a, *b, c;
+ int i;
+
+ if (mpi_grow(X, A->len))
+ return -ENOMEM;
+
+ a = A->p; b = B->p; x = X->p;
+
+ for (c = i = 0; i < B->len; i++, x++, a++, b++) {
+ *x = *a - c;
+ c = (*a < c) + (*x < *b);
+ *x = *x - *b;
+ }
+ for (; c && i < A->len; i++, x++, a++) {
+ *x = *a - c;
+ c = (*a < c);
+ }
+ if (X != A)
+ memcpy(x, a, (A->len - i) * ciL);
+ mpi_setlen(X, A->len, 1);
+ return 0;
+}
+
+static int mpi_add_or_sub(struct mpi *X, const struct mpi *A,
+ const struct mpi *B, int is_add)
+{
+ int ret;
+
+ if ((A->sign == B->sign) == is_add) {
+ ret = mpi_add_abs(X, A, B);
+ X->sign = A->sign;
+ } else {
+ ret = mpi_cmp_abs(A, B);
+ if (ret > 0) {
+ ret = mpi_sub_abs(X, A, B);
+ X->sign = A->sign;
+ } else if (ret < 0) {
+ ret = mpi_sub_abs(X, B, A);
+ X->sign = A->sign ^ 1;
+ } else {
+ mpi_zero(X);
+ }
+ }
+
+ return ret;
+}
+
+int mpi_add(struct mpi *X, const struct mpi *A, const struct mpi *B)
+{
+ return mpi_add_or_sub(X, A, B, 1);
+}
+int mpi_add_int(struct mpi *X, const struct mpi *A, int b)
+{
+ struct mpi _B = MPI_FROM_INT(b);
+ return mpi_add(X, A, &_B);
+}
+int mpi_sub(struct mpi *X, const struct mpi *A, const struct mpi *B)
+{
+ return mpi_add_or_sub(X, A, B, 0);
+}
+int mpi_sub_int(struct mpi *X, const struct mpi *A, int b)
+{
+ struct mpi _B = MPI_FROM_INT(b);
+ return mpi_sub(X, A, &_B);
+}
+EXPORT_SYMBOL_GPL(mpi_add);
+EXPORT_SYMBOL_GPL(mpi_add_int);
+EXPORT_SYMBOL_GPL(mpi_sub);
+EXPORT_SYMBOL_GPL(mpi_sub_int);
+
+/*
+ * A /= 2; while (A is even) A /= 2
+ * Assumes A is non zero and positive
+ */
+static void mpi_gcd_hlp(struct mpi *A)
+{
+ if (A->len == 1 && A->p[0] == 1) {
+ mpi_zero(A);
+ } else {
+ A->p[0] &= ~1UL;
+ mpi_shift_r(A, mpi_ctz(A));
+ }
+}
+int mpi_gcd(struct mpi *G, const struct mpi *A, const struct mpi *B)
+{
+ struct mpi X = MPI_INIT, Y = MPI_INIT;
+ int ret, lzx, lzy;
+
+ if (mpi_is_zero(A) || mpi_is_zero(B))
+ return -EINVAL;
+
+ lzx = mpi_ctz(A);
+ lzy = mpi_ctz(B);
+ MPI_CHK(mpi_copy(&X, A));
+ MPI_CHK(mpi_copy(&Y, B));
+ X.sign = Y.sign = 0;
+ mpi_shift_r(&X, lzx);
+ mpi_shift_r(&Y, lzy);
+
+ while (!mpi_is_zero(&X)) {
+ ret = mpi_cmp_abs(&X, &Y);
+ if (ret > 0) {
+ mpi_sub_abs(&X, &X, &Y);
+ mpi_gcd_hlp(&X);
+ } else if (ret < 0) {
+ mpi_sub_abs(&Y, &Y, &X);
+ mpi_gcd_hlp(&Y);
+ } else
+ break;
+ }
+
+ mpi_shift_l(&Y, lzx < lzy ? lzx : lzy);
+ swap(*G, Y);
+
+cleanup:
+ mpi_destroy(&Y);
+ mpi_destroy(&X);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mpi_gcd);
+
+static void mpi_mul_hlp(int i, unsigned long *s, unsigned long *d, unsigned long b)
+{
+ unsigned long c = 0, t = 0;
+
+ if (i & 1) {
+ MPI_MULADDC_INIT; MPI_MULADDC_CORE; MPI_MULADDC_STOP;
+ }
+ if (i & 2) {
+ MPI_MULADDC_INIT; MPI_MULADDC_CORE2; MPI_MULADDC_STOP;
+ }
+ if (i & 4) {
+ MPI_MULADDC_INIT; MPI_MULADDC_CORE4; MPI_MULADDC_STOP;
+ }
+ for (i /= 8; i >= 0; i--) {
+ MPI_MULADDC_INIT; MPI_MULADDC_CORE8; MPI_MULADDC_STOP;
+ }
+
+ /* may be used by asm code and gcc isn't too clever about it */
+ t = t;
+ do {
+ *d += c; c = (*d < c); d++;
+ } while (c);
+}
+
+int mpi_mul(struct mpi *X, const struct mpi *A, const struct mpi *B)
+{
+ struct mpi T = MPI_INIT;
+ int al, bl, i;
+
+ if (A->len < B->len)
+ swap(A, B);
+
+ al = A->len;
+ bl = B->len;
+ if (mpi_grow(X, al + bl))
+ return -ENOMEM;
+
+ if (X == A) {
+ T = *A;
+ A = &T;
+ mpi_init(X);
+ } else if (X == B) {
+ T = *B;
+ B = &T;
+ mpi_init(X);
+ } else {
+ mpi_zero(X);
+ }
+
+ for (i = bl - 1; i >= 0; i--)
+ mpi_mul_hlp(al, A->p, X->p + i, B->p[i]);
+
+ X->sign = A->sign ^ B->sign;
+ mpi_setlen(X, al + bl, 1);
+ kfree(T.p);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mpi_mul);
+
+/*
+ * Baseline multiplication: X = A * b
+ */
+static int mpi_mul_ulong(struct mpi *X, struct mpi *A, unsigned long b)
+{
+ unsigned long p[1] = { b };
+ struct mpi B = { .sign = 0, .len = 1, .p = p };
+
+ return mpi_mul(X, A, &B);
+}
+
+/*
+ * Division by struct mpi: A = Q * B + R (HAC 14.20)
+ */
+int mpi_div(struct mpi *Q, struct mpi *R, const struct mpi *A, const struct mpi *B)
+{
+ struct mpi X = MPI_INIT, Y = MPI_INIT, Z = MPI_INIT;
+ struct mpi T1 = MPI_INIT, T2 = MPI_INIT;
+ int ret = -ENOMEM, i, xl, yl, lambda;
+
+ if (mpi_is_zero(B))
+ return -EINVAL;
+
+ if (mpi_cmp_abs(A, B) < 0) {
+ if (Q)
+ mpi_zero(Q);
+ return R ? mpi_copy(R, A) : 0;
+ }
+
+ MPI_CHK(mpi_copy(&X, A));
+ MPI_CHK(mpi_copy(&Y, B));
+ X.sign = Y.sign = 0;
+
+ MPI_CHK(mpi_grow(&Z, A->len - B->len + 2));
+ MPI_CHK(mpi_grow(&T1, 2));
+ MPI_CHK(mpi_grow(&T2, 3));
+
+ /* HAC 14.23: normalization */
+ lambda = __builtin_clzl(Y.p[Y.len - 1]);
+ if (lambda) {
+ MPI_CHK(mpi_shift_l(&X, lambda));
+ MPI_CHK(mpi_shift_l(&Y, lambda));
+ }
+
+ xl = X.len;
+ yl = Y.len;
+ mpi_shift_l(&Y, biL * (xl - yl));
+
+ while (mpi_cmp(&X, &Y) >= 0) {
+ Z.p[xl - yl]++;
+ mpi_sub(&X, &X, &Y);
+ }
+ mpi_shift_r(&Y, biL * (xl - yl));
+
+ for (i = xl - 1; i >= yl; i--) {
+ if (X.p[i] >= Y.p[yl - 1])
+ Z.p[i - yl] = ULONG_MAX;
+ else {
+#if (BITS_PER_LONG == 32)
+ unsigned long long r;
+
+ r = (unsigned long long)X.p[i] << biL;
+ r |= (unsigned long long)X.p[i - 1];
+ Z.p[i - yl] = r / Y.p[yl - 1];
+#else
+ /*
+ * __udiv_qrnnd_c, from gmp/longlong.h
+ */
+ unsigned long q0, q1, r0, r1;
+ unsigned long d0, d1, d, m;
+
+ d = Y.p[yl];
+ d0 = (d << 32) >> 32;
+ d1 = (d >> 32);
+
+ q1 = X.p[i] / d1;
+ r1 = X.p[i] - d1 * q1;
+ r1 <<= 32;
+ r1 |= (X.p[i - 1] >> 32);
+
+ m = q1 * d0;
+ if (r1 < m) {
+ q1--, r1 += d;
+ while (r1 >= d && r1 < m)
+ q1--, r1 += d;
+ }
+ r1 -= m;
+
+ q0 = r1 / d1;
+ r0 = r1 - d1 * q0;
+ r0 <<= 32;
+ r0 |= (X.p[i - 1] << 32) >> 32;
+
+ m = q0 * d0;
+ if (r0 < m) {
+ q0--, r0 += d;
+ while (r0 >= d && r0 < m)
+ q0--, r0 += d;
+ }
+ r0 -= m;
+
+ Z.p[i - yl] = (q1 << 32) | q0;
+#endif
+ }
+
+ Z.p[i - yl]++;
+ do {
+ Z.p[i - yl]--;
+
+ mpi_zero(&T1);
+ T1.p[0] = (yl <= 1) ? 0 : Y.p[yl - 2];
+ T1.p[1] = Y.p[yl - 1];
+ mpi_setlen(&T1, 2, 1);
+ MPI_CHK(mpi_mul_ulong(&T1, &T1, Z.p[i - yl]));
+
+ mpi_zero(&T2);
+ T2.p[0] = (i < 2) ? 0 : X.p[i - 2];
+ T2.p[1] = (i < 1) ? 0 : X.p[i - 1];
+ T2.p[2] = X.p[i];
+ mpi_setlen(&T2, 3, 1);
+ } while (mpi_cmp(&T1, &T2) > 0);
+
+ MPI_CHK(mpi_mul_ulong(&T1, &Y, Z.p[i - yl]));
+ MPI_CHK(mpi_shift_l(&T1, biL * (i - yl)));
+ MPI_CHK(mpi_sub(&X, &X, &T1));
+
+ if (X.sign) {
+ MPI_CHK(mpi_copy(&T1, &Y));
+ MPI_CHK(mpi_shift_l(&T1, biL * (i - yl)));
+ MPI_CHK(mpi_add(&X, &X, &T1));
+ Z.p[i - yl]--;
+ }
+ }
+
+ if (Q) {
+ mpi_setlen(&Z, Z.alloc, 1);
+ swap(*Q, Z);
+ Q->sign = A->sign ^ B->sign;
+ }
+
+ if (R) {
+ mpi_shift_r(&X, lambda);
+ swap(*R, X);
+ R->sign = mpi_is_zero(R) ? 0 : A->sign;
+ }
+ ret = 0;
+
+cleanup:
+ kfree(X.p);
+ kfree(Y.p);
+ kfree(Z.p);
+ kfree(T1.p);
+ kfree(T2.p);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mpi_div);
+
+/*
+ * Modulo: R = A mod B
+ */
+int mpi_mod(struct mpi *R, const struct mpi *A, const struct mpi *B)
+{
+ int ret = mpi_div(NULL, R, A, B);
+
+ while (ret == 0 && R->sign)
+ ret = mpi_add(R, R, B);
+
+ while (ret == 0 && mpi_cmp(R, B) >= 0)
+ ret = mpi_sub(R, R, B);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mpi_mod);
+
+/*
+ * Fast Montgomery initialization (thanks to Tom St Denis)
+ */
+static unsigned long mpi_montg_init(const struct mpi *N)
+{
+ unsigned long x, m0 = N->p[0];
+
+ x = m0;
+ x += ((m0 + 2) & 4) << 1;
+ x *= (2 - (m0 * x));
+ x *= (2 - (m0 * x));
+ x *= (2 - (m0 * x));
+#if (BITS_PER_LONG == 64)
+ x *= (2 - (m0 * x));
+#endif
+ return ~x + 1;
+}
+
+/*
+ * Montgomery multiplication: A = A * B * R^-1 mod N (HAC 14.36)
+ * T is a buffer that shall be N->len * 2 limbs big at least.
+ */
+static void mpi_montmul(struct mpi *A, const struct mpi *B, const struct mpi *N,
+ unsigned long mm, struct mpi *T)
+{
+ int i, n, m;
+ unsigned long u0, u1, *d;
+
+ mpi_zero(T);
+
+ d = T->p;
+ n = N->len;
+ m = B->len < n ? B->len : n;
+
+ for (i = 0; i < n; i++) {
+ /*
+ * T = (T + u0*B + u1*N) / 2^biL
+ */
+ u0 = A->p[i];
+ u1 = (d[0] + u0 * B->p[0]) * mm;
+
+ mpi_mul_hlp(m, B->p, d, u0);
+ mpi_mul_hlp(n, N->p, d, u1);
+
+ *d++ = u0; d[n + 1] = 0;
+ }
+
+ memcpy(A->p, d, (n + 1) * ciL);
+ mpi_setlen(A, n + 1, 1);
+ mpi_setlen(T, T->alloc, 1);
+ if (mpi_cmp_abs(A, N) >= 0)
+ mpi_sub_abs(A, A, N);
+ else {
+ /* prevent timing attacks */
+ mpi_sub_abs(T, N, A);
+ }
+}
+
+/*
+ * Montgomery exponentiation: A = X^E mod N (HAC 14.94)
+ */
+int mpi_exp_mod(struct mpi *A, const struct mpi *X, const struct mpi *E,
+ const struct mpi *N, struct mpi *_RR)
+{
+ struct mpi RR = MPI_INIT, _X = MPI_INIT, T = MPI_INIT;
+ unsigned long mm;
+ int ret, i;
+
+ MPI_CHK(mpi_grow(&T, N->len * 2));
+ mm = mpi_montg_init(N);
+
+ /* If 1st call, pre-compute R^2 mod N */
+ if (!_RR || !_RR->p) {
+ mpi_set_2pow(&RR, N->len * 2 * biL);
+ MPI_CHK(mpi_mod(&RR, &RR, N));
+ if (_RR)
+ *_RR = RR;
+ } else
+ RR = *_RR;
+
+ /* _X = X * R^2 * R^-1 mod N = X * R mod N */
+ MPI_CHK(mpi_mod(&_X, X, N));
+ mpi_montmul(&_X, &RR, N, mm, &T);
+
+ /* A = R^2 * R^-1 mod N = R mod N */
+ MPI_CHK(mpi_copy(A, &RR));
+ mpi_montmul(A, &ONE, N, mm, &T);
+
+ for (i = mpi_sbits(E) - 1; i >= 0; i--) {
+ mpi_montmul(A, A, N, mm, &T);
+ if (E->p[i / biL] & (1 << (i % biL)))
+ mpi_montmul(A, &_X, N, mm, &T);
+ }
+
+cleanup:
+ if (!_RR)
+ kfree(RR.p);
+ kfree(_X.p);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mpi_exp_mod);
+
+static int __init crypto_bignum_init(void)
+{
+ return 0;
+}
+
+static void __exit crypto_bignum_exit(void)
+{
+}
+
+module_init(crypto_bignum_init);
+module_exit(crypto_bignum_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Multiple Precision Integers API");
+MODULE_AUTHOR("Pierre Habouzit <[email protected]>");
diff --git a/include/asm-generic/bignum.h b/include/asm-generic/bignum.h
new file mode 100644
index 0000000..91f003e
--- /dev/null
+++ b/include/asm-generic/bignum.h
@@ -0,0 +1,80 @@
+/*
+ * Multi-precision integer library
+ *
+ * Based on XySSL:
+ *
+ * Copyright (C) 2006-2008 Christophe Devine
+ * Copyright (C) 2009 Pierre Habouzit <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#ifndef _ASM_GENERIC_BIGNUM
+#define _ASM_GENERIC_BIGNUM
+
+#include <asm/types.h>
+
+#ifndef MPI_MULADDC_CORE
+#if (BITS_PER_LONG == 32)
+
+#define MPI_MULADDC_INIT \
+ unsigned long long r; \
+ unsigned long r0, r1
+
+#define MPI_MULADDC_CORE \
+ r = *(s++) * (t_dbl) b; \
+ r0 = r; \
+ r1 = r >> biL; \
+ r0 += c; r1 += (r0 < c); \
+ r0 += *d; r1 += (r0 < *d); \
+ c = r1; *(d++) = r0;
+
+#else
+
+#define MPI_MULADDC_INIT \
+ unsigned long s0, s1, b0, b1; \
+ unsigned long r0, r1, rx, ry; \
+ b0 = (b << 32) >> 32; \
+ b1 = (b >> 32);
+
+#define MPI_MULADDC_CORE \
+ s0 = (*s << 32) >> 32; \
+ s1 = (*s >> 32); s++; \
+ rx = s0 * b1; r0 = s0 * b0; \
+ ry = s1 * b0; r1 = s1 * b1; \
+ r1 += (rx >> 32); \
+ r1 += (ry >> 32); \
+ rx <<= 32; ry <<= 32; \
+ r0 += rx; r1 += (r0 < rx); \
+ r0 += ry; r1 += (r0 < ry); \
+ r0 += c; r1 += (r0 < c); \
+ r0 += *d; r1 += (r0 < *d); \
+ c = r1; *(d++) = r0;
+
+#endif
+#define MPI_MULADDC_STOP do { } while (0)
+#endif
+
+#ifndef MPI_MULADDC_CORE2
+#define MPI_MULADDC_CORE2 do { MPI_MULADDC_CORE; MPI_MULADDC_CORE; } while (0)
+#endif
+#ifndef MPI_MULADDC_CORE4
+#define MPI_MULADDC_CORE4 do { MPI_MULADDC_CORE2; MPI_MULADDC_CORE2; } while (0)
+#endif
+#ifndef MPI_MULADDC_CORE8
+#define MPI_MULADDC_CORE8 do { MPI_MULADDC_CORE4; MPI_MULADDC_CORE4; } while (0)
+#endif
+
+#endif
diff --git a/include/crypto/bignum.h b/include/crypto/bignum.h
new file mode 100644
index 0000000..dc5f3bd
--- /dev/null
+++ b/include/crypto/bignum.h
@@ -0,0 +1,209 @@
+/*
+ * Multi-precision integer library
+ *
+ * Based on XySSL:
+ *
+ * Copyright (C) 2006-2008 Christophe Devine
+ * Copyright (C) 2009 Pierre Habouzit <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#ifndef _LINUX_BIGNUM_H
+#define _LINUX_BIGNUM_H
+
+#include <linux/string.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <asm/bignum.h>
+
+/**
+ * \brief MPI structure
+ *
+ * invariants:
+ * - alloc is the number of allocated limbs
+ * - len is the number of non zero limbs (p[len - 1] is never 0)
+ * - p[len .. alloc[ must always be set to 0
+ * - "0" is positive.
+ * - sign is set to 1 for negative values.
+ */
+struct mpi {
+ unsigned sign : 1; /*!< integer sign */
+ unsigned alloc : 31; /*!< allocated limbs */
+ unsigned len; /*!< number of used limbs */
+ unsigned long *p; /*!< pointer to limbs */
+};
+
+#define MPI_INIT { .p = NULL }
+
+static inline void mpi_init(struct mpi *X) {
+ memset(X, 0, sizeof(*X));
+}
+
+static inline void mpi_zero(struct mpi *X) {
+ memset(X->p, 0, X->len * sizeof(*X->p));
+ X->sign = 0;
+ X->len = 0;
+}
+
+static inline void mpi_destroy(struct mpi *X) {
+ kfree(X->p);
+}
+
+static inline int mpi_is_zero(const struct mpi *X) {
+ return X->len == 0;
+}
+
+/**
+ * \brief Enlarge to the specified number of limbs
+ *
+ * \return 0 if successful,
+ * -ENOMEM if memory allocation failed
+ */
+int mpi_grow(struct mpi *X, int nblimbs);
+
+/**
+ * \brief Import X from unsigned binary data, big endian
+ *
+ * \param X destination struct mpi
+ * \param buf input buffer
+ * \param buflen input buffer size
+ *
+ * \return 0 if successful,
+ * -ENOMEM if memory allocation failed
+ */
+int mpi_read_binary(struct mpi *X, const u8 *buf, int buflen);
+
+/**
+ * \brief Export X into unsigned binary data, big endian
+ *
+ * If %buflen is too short, the %X is truncated. If %buflen is larger than the
+ * number of octets required to write %X, then the buffer is left-padded with
+ * zeroes.
+ *
+ * \param X source struct mpi
+ * \param buf output buffer
+ * \param buflen number of octets of %X to write
+ *
+ * \return a pointer to the last written byte in buf.
+ */
+u8 *mpi_write_binary(const struct mpi *X, u8 *buf, int buflen);
+
+
+/**
+ * \brief Return the number of traling zeroes
+ *
+ * \return -1 if X is zero
+ */
+int mpi_ctz(const struct mpi *X);
+
+/**
+ * \brief Return the number of significant bits
+ */
+int mpi_sbits(const struct mpi *X);
+
+/**
+ * \brief Left-shift: X <<= count
+ *
+ * \return 0 if successful,
+ * -ENOMEM if memory allocation failed
+ */
+int mpi_shift_l(struct mpi *X, int count);
+
+/**
+ * \brief Right-shift: X >>= count
+ */
+void mpi_shift_r(struct mpi *X, int count);
+
+/**
+ * \brief Compare signed values
+ *
+ * \return 1 if X is greater than Y,
+ * -1 if X is lesser than Y or
+ * 0 if X is equal to Y
+ */
+int mpi_cmp(const struct mpi *X, const struct mpi *Y);
+int mpi_cmp_int(const struct mpi *X, int i);
+
+/**
+ * \brief Signed addition: X = A + B
+ *
+ * \return 0 if successful,
+ * -ENOMEM if memory allocation failed
+ */
+int mpi_add(struct mpi *X, const struct mpi *A, const struct mpi *B);
+int mpi_add_int(struct mpi *X, const struct mpi *A, int b);
+
+/**
+ * \brief Signed substraction: X = A - B
+ *
+ * \return 0 if successful,
+ * -ENOMEM if memory allocation failed
+ */
+int mpi_sub(struct mpi *X, const struct mpi *A, const struct mpi *B);
+int mpi_sub_int(struct mpi *X, const struct mpi *A, int b);
+
+/**
+ * \brief Greatest common divisor: G = gcd(A, B)
+ *
+ * \return 0 if successful,
+ * -ENOMEM if memory allocation failed
+ */
+int mpi_gcd(struct mpi *G, const struct mpi *A, const struct mpi *B);
+
+/**
+ * \brief Baseline multiplication: X = A * B
+ *
+ * \return 0 if successful,
+ * -ENOMEM if memory allocation failed
+ */
+int mpi_mul(struct mpi *X, const struct mpi *A, const struct mpi *B);
+
+/**
+ * \brief Division by struct mpi: A = Q * B + R
+ *
+ * \return 0 if successful,
+ * -ENOMEM if memory allocation failed,
+ * -EINVAL if B == 0
+ *
+ * \note Either Q or R can be NULL.
+ */
+int mpi_div(struct mpi *Q, struct mpi *R,
+ const struct mpi *A, const struct mpi *B);
+
+/**
+ * \brief Modulo: R = A mod B
+ *
+ * \return 0 if successful,
+ * -ENOMEM if memory allocation failed,
+ * -EINVAL if B == 0
+ */
+int mpi_mod(struct mpi *R, const struct mpi *A, const struct mpi *B);
+
+/**
+ * \brief Sliding-window exponentiation: X = A^E mod N
+ *
+ * \return 0 if successful,
+ * -ENOMEM if memory allocation failed,
+ * -EINVAL if N is negative or even
+ *
+ * \note _RR is used to avoid re-computing R*R mod N across
+ * multiple calls, which speeds up things a bit. It can
+ * be set to NULL if the extra performance is unneeded.
+ */
+int mpi_exp_mod(struct mpi *X, const struct mpi *A,
+ const struct mpi *E, const struct mpi *N, struct mpi *_RR);
+
+#endif /* bignum.h */
--
1.6.1.399.g0d272


Attachments:
(No filename) (0.00 B)
(No filename) (197.00 B)
Download all attachments

2009-01-30 03:25:09

by Herbert Xu

[permalink] [raw]
Subject: Re: [RFC] MPI module

Pierre Habouzit <[email protected]> wrote:
>
> My endgame is simple, I'd like to see an in-kernel SSL/TLS
> implementation in Linux happen. There are many reasons to want that,
> ranging from performance reasons (waking the userland each time you
> perform a handshake isn't particularly nice, and it's easy to make
> system-wide session caches) to really cool features being enabled:
> - you can send "secure" file descriptors around through Unix Sockets,
> or prepare a "secure" socket, let it be your stdin/stdout pair and
> exec a service knowing nothing about SSL (think inetd-like stuff) ;
> - you can deploy secure services where the actual server knows nothing
> about the certificate that is used ;
> - you can have a system-wide service dealing with peer certificate
> validations and have a real system-wide policy in this regard ;
> - you could even think of some netfilter stuff to enforce security on
> a given socket, even if the service behind the socket knows nothing
> about it (bye bye stunnel)...
>
> Nowadays, the kernel has most of what we need cipher-wise for TLS/SSL.
> Only the key-exchange protocols and the very TLS protocol are lacking.
> I'm currently addressing the former issue, namely, bringing RSA and
> Diffie-Hellman to the kernel.

Stop right there. There is absolutely no reason why you need
to do the TLS stuff in kernel to achieve the goals you listed
above.

What you should do is have user-space create the session, and
then give the keys to the kernel to do the data-path. Please
take a look at IPsec for an example of how the work is divided
between user-space and the kernel.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2009-01-30 08:12:14

by Pierre Habouzit

[permalink] [raw]
Subject: Re: [RFC] MPI module

On Fri, Jan 30, 2009 at 03:25:06AM +0000, Herbert Xu wrote:
> Pierre Habouzit <[email protected]> wrote:
> >
> > My endgame is simple, I'd like to see an in-kernel SSL/TLS
> > implementation in Linux happen. There are many reasons to want that,
> > ranging from performance reasons (waking the userland each time you
> > perform a handshake isn't particularly nice, and it's easy to make
> > system-wide session caches) to really cool features being enabled:
> > - you can send "secure" file descriptors around through Unix Sockets,
> > or prepare a "secure" socket, let it be your stdin/stdout pair and
> > exec a service knowing nothing about SSL (think inetd-like stuff) ;
> > - you can deploy secure services where the actual server knows nothing
> > about the certificate that is used ;
> > - you can have a system-wide service dealing with peer certificate
> > validations and have a real system-wide policy in this regard ;
> > - you could even think of some netfilter stuff to enforce security on
> > a given socket, even if the service behind the socket knows nothing
> > about it (bye bye stunnel)...
> >
> > Nowadays, the kernel has most of what we need cipher-wise for TLS/SSL.
> > Only the key-exchange protocols and the very TLS protocol are lacking.
> > I'm currently addressing the former issue, namely, bringing RSA and
> > Diffie-Hellman to the kernel.
>
> Stop right there. There is absolutely no reason why you need
> to do the TLS stuff in kernel to achieve the goals you listed
> above.
>
> What you should do is have user-space create the session, and
> then give the keys to the kernel to do the data-path. Please
> take a look at IPsec for an example of how the work is divided
> between user-space and the kernel.

So let me rephrase that to be sure we've understood each other. What you
suggest is to have an IKE-like daemon dealing with the keys and all the
handshakes, and that the kernel would only deal with the symmetric
ciphers used on the data path. Is that right ?

--
·O· Pierre Habouzit
··O [email protected]
OOO http://www.madism.org


Attachments:
(No filename) (2.15 kB)
(No filename) (197.00 B)
Download all attachments

2009-01-30 12:41:12

by Herbert Xu

[permalink] [raw]
Subject: Re: [RFC] MPI module

Pierre Habouzit <[email protected]> wrote:
>
> So let me rephrase that to be sure we've understood each other. What you
> suggest is to have an IKE-like daemon dealing with the keys and all the
> handshakes, and that the kernel would only deal with the symmetric
> ciphers used on the data path. Is that right ?

Either a daemon or a library in user-space should handle the
hard work of negotiating the keys. You can leave the easy work
of encrypting/decrypting the data to the kernel :)

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2009-01-30 18:54:21

by Loc Ho

[permalink] [raw]
Subject: RE: [RFC] MPI module

Hi,

I would like to add that you can even handle the TLS/DTLS/SSL packet formation in the kernel as well if you provide an algorithms that does just that. Right now, most user just use the kernel for the hashing and cipher parts. There is no reason that the current framework cannot handle processing the full packet in hardware. All you need is to create another algorithm name that is aead type. Then, from user space (using Linux CryptoAPI user space interface) creates that algorithms. The underlying CryptoAPI will call the appropriate function that provided by your driver and the result of the operation will be an TLS/DTLS/SSL packet formation.

We currently does this for testing our hardware for non-IPSec protocol.

-Loc


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Herbert Xu
Sent: Friday, January 30, 2009 4:41 AM
To: Pierre Habouzit
Cc: [email protected]
Subject: Re: [RFC] MPI module

Pierre Habouzit <[email protected]> wrote:
>
> So let me rephrase that to be sure we've understood each other. What you
> suggest is to have an IKE-like daemon dealing with the keys and all the
> handshakes, and that the kernel would only deal with the symmetric
> ciphers used on the data path. Is that right ?

Either a daemon or a library in user-space should handle the
hard work of negotiating the keys. You can leave the easy work
of encrypting/decrypting the data to the kernel :)

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2009-01-31 22:35:01

by Pierre Habouzit

[permalink] [raw]
Subject: Re: [RFC] MPI module

On Fri, Jan 30, 2009 at 06:54:16PM +0000, Loc Ho wrote:
> Hi,
>
> I would like to add that you can even handle the TLS/DTLS/SSL packet
> formation in the kernel as well if you provide an algorithms that does
> just that. Right now, most user just use the kernel for the hashing
> and cipher parts. There is no reason that the current framework cannot
> handle processing the full packet in hardware. All you need is to
> create another algorithm name that is aead type. Then, from user space
> (using Linux CryptoAPI user space interface) creates that algorithms.
> The underlying CryptoAPI will call the appropriate function that
> provided by your driver and the result of the operation will be an
> TLS/DTLS/SSL packet formation.

Okay, this sounds nifty, though I'm not 100% sure I follow you. When
you're talking about "Linux CryptoAPI userspace interface" are you
talking about things like cryptodev[1] that aren't (AFAICT) merged into
the mainline kernel ?

Or do you mean that I should write a new aead algorithm, plus provide a
way (probably ioctl ?) to "activate" that algorithm on my socket once
user-land has negociated the ciphers and similar stuff ?

Also, this has the drawback, unless I'm mistaken, that the program using
the socket has to be aware it's using SSL/TLS/DTLS. Of course, when
writing something doing SMTP with STARTTLS it has to be somehow aware of
the SSL layer, since the handshake is delayed. But it would be quite sad
to be unable to secure a socket without the user noticing it at any
time. For example, it would be quite nifty to do through iptables
something that would redirect a given port to another one and adding SSL
at the same time (e.g. redirect 1.2.3.4:443 to 127.0.0.1:80 _and_ adding
SSL on the top).


It makes sense to want to put the handshakes and negociations in
user-space, and the system-wide ssl daemon for that task makes sense to
me. So I was basically trying to figure out a clean way to "redirect"
all non data SSL/TLS payloads (IOW handshakes and stuff like that) to
the daemon, and the rest to the actual socket/application[2]. So either
I'm wrong about what I'm trying to design, or I miss something in how
your hint can help me.

Anyways, I would be delighted if you can give me more details about what
you meant :)




[1] http://www.logix.cz/michal/devel/cryptodev/

[2] Which rises issues since unlike IPSec we have some programs
_aware_ that they want to use SSL: it's okay to _have_ to write
configuration for the system-wide daemon if you're securing
something behind the original program's back. But I want that
programs are still able to chose their certificate themselves and
stuff like that, and it's somehow "hard" (as in racy and/or
insecure) that a given program creates a socket, mark it as secure
(e.g. using a setsockopt) and uploading informations about that
new socket to the regulatory daemon (in the sense that anyone can
claim that it possess that given socket, only the kernel really
knows about it). But I assume those issues can be resolved later
once I've accustomed myself with the kernel crypto a bit more.

--
·O· Pierre Habouzit
··O [email protected]
OOO http://www.madism.org


Attachments:
(No filename) (3.27 kB)
(No filename) (197.00 B)
Download all attachments

2009-02-07 00:59:50

by Loc Ho

[permalink] [raw]
Subject: RE: [RFC] MPI module

Hi,

Sorry for the late response... See inline...

-Loc

-----Original Message-----
From: Pierre Habouzit [mailto:[email protected]]
Sent: Saturday, January 31, 2009 2:35 PM
To: Loc Ho
Cc: Herbert Xu; [email protected]
Subject: Re: [RFC] MPI module

On Fri, Jan 30, 2009 at 06:54:16PM +0000, Loc Ho wrote:
> Hi,
>
> I would like to add that you can even handle the TLS/DTLS/SSL packet
> formation in the kernel as well if you provide an algorithms that does
> just that. Right now, most user just use the kernel for the hashing
> and cipher parts. There is no reason that the current framework cannot
> handle processing the full packet in hardware. All you need is to
> create another algorithm name that is aead type. Then, from user space
> (using Linux CryptoAPI user space interface) creates that algorithms.
> The underlying CryptoAPI will call the appropriate function that
> provided by your driver and the result of the operation will be an
> TLS/DTLS/SSL packet formation.

Okay, this sounds nifty, though I'm not 100% sure I follow you. When you're talking about "Linux CryptoAPI userspace interface" are you talking about things like cryptodev[1] that aren't (AFAICT) merged into the mainline kernel ?
[Loc Ho]
Yes... We use this to test various algorithms supported by our hardware.

Or do you mean that I should write a new aead algorithm, plus provide a way (probably ioctl ?) to "activate" that algorithm on my socket once user-land has negociated the ciphers and similar stuff ?
[Loc Ho]
Yes... You will need an aead type algorithm that does what you want. Let say you want to process "DTLS" algorithm. Then provide a driver implementation for DTLS processing. From User Space, you would create this algorithm and operate on this algorithm over Linux CryptoAPI User Space interface. You might have to change slight in the User space interface code - try and see.

Also, this has the drawback, unless I'm mistaken, that the program using the socket has to be aware it's using SSL/TLS/DTLS. Of course, when writing something doing SMTP with STARTTLS it has to be somehow aware of the SSL layer, since the handshake is delayed. But it would be quite sad to be unable to secure a socket without the user noticing it at any time. For example, it would be quite nifty to do through iptables something that would redirect a given port to another one and adding SSL at the same time (e.g. redirect 1.2.3.4:443 to 127.0.0.1:80 _and_ adding SSL on the top).
[Loc Ho]
Yes... Unless you handle everything in the kernel (like IPSec), there is always a trade off. You could direct traffic to user space program and do all you processing and then inject back to the stack.


It makes sense to want to put the handshakes and negociations in user-space, and the system-wide ssl daemon for that task makes sense to me. So I was basically trying to figure out a clean way to "redirect"
all non data SSL/TLS payloads (IOW handshakes and stuff like that) to the daemon, and the rest to the actual socket/application[2]. So either I'm wrong about what I'm trying to design, or I miss something in how your hint can help me.
[Loc Ho]
Not sure what you referring here...

Anyways, I would be delighted if you can give me more details about what you meant :)




[1] http://www.logix.cz/michal/devel/cryptodev/

[2] Which rises issues since unlike IPSec we have some programs
_aware_ that they want to use SSL: it's okay to _have_ to write
configuration for the system-wide daemon if you're securing
something behind the original program's back. But I want that
programs are still able to chose their certificate themselves and
stuff like that, and it's somehow "hard" (as in racy and/or
insecure) that a given program creates a socket, mark it as secure
(e.g. using a setsockopt) and uploading informations about that
new socket to the regulatory daemon (in the sense that anyone can
claim that it possess that given socket, only the kernel really
knows about it). But I assume those issues can be resolved later
once I've accustomed myself with the kernel crypto a bit more.

--
·O· Pierre Habouzit
··O [email protected]
OOO http://www.madism.org