From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: dm-devel@redhat.com, Ard Biesheuvel, Megha Dey, Eric Biggers, Herbert Xu, Milan Broz, Mike Snitzer
Subject: [RFC PATCH 02/10] crypto: x86/cast6 - switch to XTS template
Date: Wed, 23 Dec 2020 23:38:33 +0100
Message-Id: <20201223223841.11311-3-ardb@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To:
 <20201223223841.11311-1-ardb@kernel.org>
References: <20201223223841.11311-1-ardb@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

Now that the XTS template can wrap accelerated ECB modes, it can be used
to implement CAST6 in XTS mode as well, which turns out to be at least
as fast, and sometimes even faster.

Signed-off-by: Ard Biesheuvel
---
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S | 56 -----------
 arch/x86/crypto/cast6_avx_glue.c          | 98 --------------------
 2 files changed, 154 deletions(-)

diff --git a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
index 932a3ce32a88..0c1ea836215a 100644
--- a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
@@ -212,8 +212,6 @@

 .section	.rodata.cst16, "aM", @progbits, 16
 .align 16
-.Lxts_gf128mul_and_shl1_mask:
-	.byte 0x87, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0
 .Lbswap_mask:
 	.byte 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12
 .Lbswap128_mask:
@@ -440,57 +438,3 @@ SYM_FUNC_START(cast6_ctr_8way)
 	FRAME_END
 	ret;
 SYM_FUNC_END(cast6_ctr_8way)
-
-SYM_FUNC_START(cast6_xts_enc_8way)
-	/* input:
-	 *	%rdi: ctx, CTX
-	 *	%rsi: dst
-	 *	%rdx: src
-	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
-	 */
-	FRAME_BEGIN
-	pushq %r15;
-
-	movq %rdi, CTX
-	movq %rsi, %r11;
-
-	/* regs <= src, dst <= IVs, regs <= regs xor IVs */
-	load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
-		      RX, RKR, RKM, .Lxts_gf128mul_and_shl1_mask);
-
-	call __cast6_enc_blk8;
-
-	/* dst <= regs xor IVs(in dst) */
-	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
-
-	popq %r15;
-	FRAME_END
-	ret;
-SYM_FUNC_END(cast6_xts_enc_8way)
-
-SYM_FUNC_START(cast6_xts_dec_8way)
-	/* input:
-	 *	%rdi: ctx, CTX
-	 *	%rsi: dst
-	 *	%rdx: src
-	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
-	 */
-	FRAME_BEGIN
-	pushq %r15;
-
-	movq %rdi, CTX
-	movq %rsi, %r11;
-
-	/* regs <= src, dst <= IVs, regs <= regs xor IVs */
-	load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
-		      RX, RKR, RKM, .Lxts_gf128mul_and_shl1_mask);
-
-	call __cast6_dec_blk8;
-
-	/* dst <= regs xor IVs(in dst) */
-	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
-
-	popq %r15;
-	FRAME_END
-	ret;
-SYM_FUNC_END(cast6_xts_dec_8way)
diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c
index 48e0f37796fa..5a21d3e9041c 100644
--- a/arch/x86/crypto/cast6_avx_glue.c
+++ b/arch/x86/crypto/cast6_avx_glue.c
@@ -15,7 +15,6 @@
 #include
 #include
 #include
-#include
 #include

 #define CAST6_PARALLEL_BLOCKS 8
@@ -27,27 +26,12 @@ asmlinkage void cast6_cbc_dec_8way(const void *ctx, u8 *dst, const u8 *src);
 asmlinkage void cast6_ctr_8way(const void *ctx, u8 *dst, const u8 *src,
 			       le128 *iv);

-asmlinkage void cast6_xts_enc_8way(const void *ctx, u8 *dst, const u8 *src,
-				   le128 *iv);
-asmlinkage void cast6_xts_dec_8way(const void *ctx, u8 *dst, const u8 *src,
-				   le128 *iv);
-
 static int cast6_setkey_skcipher(struct crypto_skcipher *tfm,
 				 const u8 *key, unsigned int keylen)
 {
 	return cast6_setkey(&tfm->base, key, keylen);
 }

-static void cast6_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
-{
-	glue_xts_crypt_128bit_one(ctx, dst, src, iv, __cast6_encrypt);
-}
-
-static void cast6_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
-{
-	glue_xts_crypt_128bit_one(ctx, dst, src, iv, __cast6_decrypt);
-}
-
 static void cast6_crypt_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
 {
 	be128 ctrblk;
@@ -87,19 +71,6 @@ static const struct common_glue_ctx cast6_ctr = {
 	} }
 };

-static const struct common_glue_ctx cast6_enc_xts = {
-	.num_funcs = 2,
-	.fpu_blocks_limit = CAST6_PARALLEL_BLOCKS,
-
-	.funcs = { {
-		.num_blocks = CAST6_PARALLEL_BLOCKS,
-		.fn_u = { .xts = cast6_xts_enc_8way }
-	}, {
-		.num_blocks = 1,
-		.fn_u = { .xts = cast6_xts_enc }
-	} }
-};
-
 static const struct common_glue_ctx cast6_dec = {
 	.num_funcs = 2,
 	.fpu_blocks_limit = CAST6_PARALLEL_BLOCKS,
@@ -126,19 +97,6 @@ static const struct common_glue_ctx cast6_dec_cbc = {
 	} }
 };

-static const struct common_glue_ctx cast6_dec_xts = {
-	.num_funcs = 2,
-	.fpu_blocks_limit = CAST6_PARALLEL_BLOCKS,
-
-	.funcs = { {
-		.num_blocks = CAST6_PARALLEL_BLOCKS,
-		.fn_u = { .xts = cast6_xts_dec_8way }
-	}, {
-		.num_blocks = 1,
-		.fn_u = { .xts = cast6_xts_dec }
-	} }
-};
-
 static int ecb_encrypt(struct skcipher_request *req)
 {
 	return glue_ecb_req_128bit(&cast6_enc, req);
@@ -164,48 +122,6 @@ static int ctr_crypt(struct skcipher_request *req)
 	return glue_ctr_req_128bit(&cast6_ctr, req);
 }

-struct cast6_xts_ctx {
-	struct cast6_ctx tweak_ctx;
-	struct cast6_ctx crypt_ctx;
-};
-
-static int xts_cast6_setkey(struct crypto_skcipher *tfm, const u8 *key,
-			    unsigned int keylen)
-{
-	struct cast6_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
-	int err;
-
-	err = xts_verify_key(tfm, key, keylen);
-	if (err)
-		return err;
-
-	/* first half of xts-key is for crypt */
-	err = __cast6_setkey(&ctx->crypt_ctx, key, keylen / 2);
-	if (err)
-		return err;
-
-	/* second half of xts-key is for tweak */
-	return __cast6_setkey(&ctx->tweak_ctx, key + keylen / 2, keylen / 2);
-}
-
-static int xts_encrypt(struct skcipher_request *req)
-{
-	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct cast6_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
-
-	return glue_xts_req_128bit(&cast6_enc_xts, req, __cast6_encrypt,
-				   &ctx->tweak_ctx, &ctx->crypt_ctx, false);
-}
-
-static int xts_decrypt(struct skcipher_request *req)
-{
-	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct cast6_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
-
-	return glue_xts_req_128bit(&cast6_dec_xts, req, __cast6_encrypt,
-				   &ctx->tweak_ctx, &ctx->crypt_ctx, true);
-}
-
 static struct skcipher_alg cast6_algs[] = {
 	{
 		.base.cra_name		= "__ecb(cast6)",
@@ -249,20 +165,6 @@ static struct skcipher_alg cast6_algs[] = {
 		.setkey			= cast6_setkey_skcipher,
 		.encrypt		= ctr_crypt,
 		.decrypt		= ctr_crypt,
-	}, {
-		.base.cra_name		= "__xts(cast6)",
-		.base.cra_driver_name	= "__xts-cast6-avx",
-		.base.cra_priority	= 200,
-		.base.cra_flags		= CRYPTO_ALG_INTERNAL,
-		.base.cra_blocksize	= CAST6_BLOCK_SIZE,
-		.base.cra_ctxsize	= sizeof(struct cast6_xts_ctx),
-		.base.cra_module	= THIS_MODULE,
-		.min_keysize		= 2 * CAST6_MIN_KEY_SIZE,
-		.max_keysize		= 2 * CAST6_MAX_KEY_SIZE,
-		.ivsize			= CAST6_BLOCK_SIZE,
-		.setkey			= xts_cast6_setkey,
-		.encrypt		= xts_encrypt,
-		.decrypt		= xts_decrypt,
 	},
 };

-- 
2.17.1
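
Illustrative note, not part of the patch: the commit message relies on
XTS being expressible as a wrapper around a plain ECB block function.
The sketch below shows that composition in portable C, purely as an
assumption-laden illustration and not the kernel's XTS template code.
All names in it (xts_mul_alpha, xts_crypt, toy_ctx, toy_encrypt,
toy_decrypt) are hypothetical, and the "cipher" is a toy stand-in for
CAST6. The 0x87 constant is the same reduction value that appears in the
removed .Lxts_gf128mul_and_shl1_mask table. Ciphertext stealing is left
out, so the length must be a multiple of the block size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16 /* block size in bytes (128-bit block, as for CAST6) */

/* Signature of a single-block ECB primitive (hypothetical type name). */
typedef void (*blk_fn)(const void *ctx, uint8_t *dst, const uint8_t *src);

/* Toy stand-in for a real block cipher: NOT secure, illustration only. */
struct toy_ctx { uint8_t k; };

static void toy_encrypt(const void *ctx, uint8_t *dst, const uint8_t *src)
{
	const struct toy_ctx *c = ctx;

	for (size_t i = 0; i < BLK; i++)
		dst[i] = (uint8_t)(src[i] + c->k);
}

static void toy_decrypt(const void *ctx, uint8_t *dst, const uint8_t *src)
{
	const struct toy_ctx *c = ctx;

	for (size_t i = 0; i < BLK; i++)
		dst[i] = (uint8_t)(src[i] - c->k);
}

/* Multiply a little-endian 128-bit tweak by alpha (x) in GF(2^128). */
static void xts_mul_alpha(uint8_t t[BLK])
{
	uint8_t carry = 0;

	for (size_t i = 0; i < BLK; i++) {
		uint8_t next = t[i] >> 7;

		t[i] = (uint8_t)((t[i] << 1) | carry);
		carry = next;
	}
	if (carry)
		t[0] ^= 0x87; /* reduce by x^128 + x^7 + x^2 + x + 1 */
}

/*
 * XTS built on top of a block primitive: the tweak is always produced
 * with the *encrypt* function under the tweak key, while the data path
 * uses crypt_fn (encrypt or decrypt) under the data key.
 */
static void xts_crypt(blk_fn crypt_fn, const void *crypt_ctx,
		      blk_fn tweak_enc_fn, const void *tweak_ctx,
		      const uint8_t iv[BLK],
		      uint8_t *dst, const uint8_t *src, size_t len)
{
	uint8_t t[BLK], tmp[BLK];

	assert(len % BLK == 0); /* no ciphertext stealing in this sketch */
	tweak_enc_fn(tweak_ctx, t, iv);

	for (size_t off = 0; off < len; off += BLK) {
		for (size_t i = 0; i < BLK; i++)
			tmp[i] = src[off + i] ^ t[i];	/* pre-whitening */
		crypt_fn(crypt_ctx, tmp, tmp);		/* ECB step */
		for (size_t i = 0; i < BLK; i++)
			dst[off + i] = tmp[i] ^ t[i];	/* post-whitening */
		xts_mul_alpha(t);			/* next block's tweak */
	}
}
```

In this sketch only the middle step of the loop touches the cipher, so a
driver that exposes a fast ECB routine needs no XTS-specific assembly;
the two XOR passes and the tweak update are cipher-independent.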