From: Eric Biggers
To: linux-crypto@vger.kernel.org
Cc: linux-fscrypt@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, Herbert Xu, Paul Crowley, Greg Kaiser,
    Michael Halcrow, "Jason A. Donenfeld", Samuel Neves, Tomer Ashur,
    Eric Biggers
Subject: [RFC PATCH 5/9] crypto: arm/chacha20 - add XChaCha20 support
Date: Mon, 6 Aug 2018 15:32:56 -0700
Message-Id: <20180806223300.113891-6-ebiggers@kernel.org>
In-Reply-To: <20180806223300.113891-1-ebiggers@kernel.org>
References: <20180806223300.113891-1-ebiggers@kernel.org>

From: Eric Biggers

Add an XChaCha20 implementation that is hooked up to the ARM NEON
implementation of ChaCha20.  This is needed for use in the HPolyC
construction for disk/file encryption; see the generic code patch,
"crypto: chacha20-generic - add XChaCha20 support", for more details.

We also update the NEON code to support HChaCha20 on one block, so we
can use that in XChaCha20 rather than calling the generic HChaCha20.
This required factoring the permutation out into its own macro.

Signed-off-by: Eric Biggers
---
 arch/arm/crypto/Kconfig              |   2 +-
 arch/arm/crypto/chacha20-neon-core.S |  68 ++++++++++------
 arch/arm/crypto/chacha20-neon-glue.c | 111 ++++++++++++++++++++-------
 3 files changed, 130 insertions(+), 51 deletions(-)

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 925d1364727a..896dcf142719 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -116,7 +116,7 @@ config CRYPTO_CRC32_ARM_CE
 	select CRYPTO_HASH
 
 config CRYPTO_CHACHA20_NEON
-	tristate "NEON accelerated ChaCha20 symmetric cipher"
+	tristate "NEON accelerated ChaCha20 stream cipher algorithms"
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_CHACHA20
diff --git a/arch/arm/crypto/chacha20-neon-core.S b/arch/arm/crypto/chacha20-neon-core.S
index 451a849ad518..8e63208cc025 100644
--- a/arch/arm/crypto/chacha20-neon-core.S
+++ b/arch/arm/crypto/chacha20-neon-core.S
@@ -24,31 +24,20 @@
 	.fpu		neon
 	.align		5
 
-ENTRY(chacha20_block_xor_neon)
-	// r0: Input state matrix, s
-	// r1: 1 data block output, o
-	// r2: 1 data block input, i
-
-	//
-	// This function encrypts one ChaCha20 block by loading the state matrix
-	// in four NEON registers. It performs matrix operation on four words in
-	// parallel, but requireds shuffling to rearrange the words after each
-	// round.
-	//
-
-	// x0..3 = s0..3
-	add		ip, r0, #0x20
-	vld1.32		{q0-q1}, [r0]
-	vld1.32		{q2-q3}, [ip]
-
-	vmov		q8, q0
-	vmov		q9, q1
-	vmov		q10, q2
-	vmov		q11, q3
+/*
+ * _chacha20_permute - permute one block
+ *
+ * Permute one 64-byte block where the state matrix is stored in the four NEON
+ * registers q0-q3.  It performs matrix operation on four words in parallel, but
+ * requires shuffling to rearrange the words after each round.
+ *
+ * Clobbers: r3, q4
+ */
+.macro	_chacha20_permute
 
 	mov		r3, #10
 
-.Ldoubleround:
+.Ldoubleround_\@:
 	// x0 += x1, x3 = rotl32(x3 ^ x0, 16)
 	vadd.i32	q0, q0, q1
 	veor		q3, q3, q0
@@ -110,7 +99,25 @@ ENTRY(chacha20_block_xor_neon)
 	vext.8		q3, q3, q3, #4
 
 	subs		r3, r3, #1
-	bne		.Ldoubleround
+	bne		.Ldoubleround_\@
+.endm
+
+ENTRY(chacha20_block_xor_neon)
+	// r0: Input state matrix, s
+	// r1: 1 data block output, o
+	// r2: 1 data block input, i
+
+	// x0..3 = s0..3
+	add		ip, r0, #0x20
+	vld1.32		{q0-q1}, [r0]
+	vld1.32		{q2-q3}, [ip]
+
+	vmov		q8, q0
+	vmov		q9, q1
+	vmov		q10, q2
+	vmov		q11, q3
+
+	_chacha20_permute
 
 	add		ip, r2, #0x20
 	vld1.8		{q4-q5}, [r2]
@@ -139,6 +146,21 @@ ENTRY(chacha20_block_xor_neon)
 	bx		lr
 ENDPROC(chacha20_block_xor_neon)
 
+ENTRY(hchacha20_block_neon)
+	// r0: Input state matrix, s
+	// r1: output (8 32-bit words)
+
+	vld1.32		{q0-q1}, [r0]!
+	vld1.32		{q2-q3}, [r0]
+
+	_chacha20_permute
+
+	vst1.32		{q0}, [r1]!
+	vst1.32		{q3}, [r1]
+
+	bx		lr
+ENDPROC(hchacha20_block_neon)
+
 	.align		5
 ENTRY(chacha20_4block_xor_neon)
 	push		{r4-r6, lr}
diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha20-neon-glue.c
index ed8dec0f1768..becc7990b1d3 100644
--- a/arch/arm/crypto/chacha20-neon-glue.c
+++ b/arch/arm/crypto/chacha20-neon-glue.c
@@ -1,5 +1,5 @@
 /*
- * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
+ * ChaCha20 (RFC7539) and XChaCha20 stream ciphers, NEON accelerated
  *
  * Copyright (C) 2016 Linaro, Ltd.
  *
@@ -30,6 +30,7 @@
 
 asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
 asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
+asmlinkage void hchacha20_block_neon(const u32 *state, u32 *out);
 
 static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
 			    unsigned int bytes)
@@ -57,22 +58,17 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
 	}
 }
 
-static int chacha20_neon(struct skcipher_request *req)
+static int chacha20_neon_stream_xor(struct skcipher_request *req,
+				    struct chacha_ctx *ctx, u8 *iv)
 {
-	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
 	struct skcipher_walk walk;
 	u32 state[16];
 	int err;
 
-	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
-		return crypto_chacha_crypt(req);
-
 	err = skcipher_walk_virt(&walk, req, true);
 
-	crypto_chacha_init(state, ctx, walk.iv);
+	crypto_chacha_init(state, ctx, iv);
 
-	kernel_neon_begin();
 	while (walk.nbytes > 0) {
 		unsigned int nbytes = walk.nbytes;
 
@@ -83,27 +79,85 @@ static int chacha20_neon(struct skcipher_request *req)
 				nbytes);
 		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
 	}
+
+	return err;
+}
+
+static int chacha20_neon(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	int err;
+
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
+		return crypto_chacha_crypt(req);
+
+	kernel_neon_begin();
+	err = chacha20_neon_stream_xor(req, ctx, req->iv);
+	kernel_neon_end();
+	return err;
+}
+
+static int xchacha20_neon(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx subctx;
+	u32 state[16];
+	u8 real_iv[16];
+	int err;
+
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
+		return crypto_xchacha_crypt(req);
+
+	crypto_chacha_init(state, ctx, req->iv);
+
+	kernel_neon_begin();
+
+	hchacha20_block_neon(state, subctx.key);
+	memcpy(&real_iv[0], req->iv + 24, 8);
+	memcpy(&real_iv[8], req->iv + 16, 8);
+	err = chacha20_neon_stream_xor(req, &subctx, real_iv);
+
 	kernel_neon_end();
 
 	return err;
 }
 
-static struct skcipher_alg alg = {
-	.base.cra_name		= "chacha20",
-	.base.cra_driver_name	= "chacha20-neon",
-	.base.cra_priority	= 300,
-	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct chacha_ctx),
-	.base.cra_module	= THIS_MODULE,
-
-	.min_keysize		= CHACHA_KEY_SIZE,
-	.max_keysize		= CHACHA_KEY_SIZE,
-	.ivsize			= CHACHA_IV_SIZE,
-	.chunksize		= CHACHA_BLOCK_SIZE,
-	.walksize		= 4 * CHACHA_BLOCK_SIZE,
-	.setkey			= crypto_chacha20_setkey,
-	.encrypt		= chacha20_neon,
-	.decrypt		= chacha20_neon,
+static struct skcipher_alg algs[] = {
+	{
+		.base.cra_name		= "chacha20",
+		.base.cra_driver_name	= "chacha20-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= CHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 4 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= chacha20_neon,
+		.decrypt		= chacha20_neon,
+	}, {
+		.base.cra_name		= "xchacha20",
+		.base.cra_driver_name	= "xchacha20-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 4 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= xchacha20_neon,
+		.decrypt		= xchacha20_neon,
+	}
 };
 
 static int __init chacha20_simd_mod_init(void)
@@ -111,12 +165,12 @@ static int __init chacha20_simd_mod_init(void)
 	if (!(elf_hwcap & HWCAP_NEON))
 		return -ENODEV;
 
-	return crypto_register_skcipher(&alg);
+	return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
 }
 
 static void __exit chacha20_simd_mod_fini(void)
 {
-	crypto_unregister_skcipher(&alg);
+	crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
 }
 
 module_init(chacha20_simd_mod_init);
@@ -125,3 +179,6 @@ module_exit(chacha20_simd_mod_fini);
 MODULE_AUTHOR("Ard Biesheuvel ");
 MODULE_LICENSE("GPL v2");
 MODULE_ALIAS_CRYPTO("chacha20");
+MODULE_ALIAS_CRYPTO("chacha20-neon");
+MODULE_ALIAS_CRYPTO("xchacha20");
+MODULE_ALIAS_CRYPTO("xchacha20-neon");
-- 
2.18.0.597.ga71716f1ad-goog
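
For anyone reading the NEON code without the generic patch at hand, here is a
portable C sketch of what I understand hchacha20_block_neon() and the IV
handling in xchacha20_neon() to compute.  It is not part of the patch and is
untested against the kernel code.  All function names below (quarter_round,
chacha20_permute_ref, hchacha20_ref, xchacha20_derive_ref, load_le32) are made
up for illustration.  The state layout (constants, key, first 16 IV bytes in
words 12..15) is my reading of crypto_chacha_init(), and the 32-byte IV split
is taken from the two memcpy() calls in the glue code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on state words a, b, c, d (RFC 7539, 2.1). */
static void quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/* Check the quarter round against the vector in RFC 7539, section 2.1.1. */
static void quarter_round_selfcheck(void)
{
	uint32_t x[16] = { 0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567 };

	quarter_round(x, 0, 1, 2, 3);
	assert(x[0] == 0xea2a92f4 && x[1] == 0xcb1cf8ce);
	assert(x[2] == 0x4581472e && x[3] == 0x5881c4bb);
}

/* 20 rounds = 10 double rounds (column round, then diagonal round). */
static void chacha20_permute_ref(uint32_t x[16])
{
	for (int i = 0; i < 10; i++) {
		quarter_round(x, 0, 4, 8, 12);
		quarter_round(x, 1, 5, 9, 13);
		quarter_round(x, 2, 6, 10, 14);
		quarter_round(x, 3, 7, 11, 15);
		quarter_round(x, 0, 5, 10, 15);
		quarter_round(x, 1, 6, 11, 12);
		quarter_round(x, 2, 7, 8, 13);
		quarter_round(x, 3, 4, 9, 14);
	}
}

/*
 * HChaCha20: permute the state and return rows 0 and 3 (q0 and q3 in the
 * NEON code).  Unlike the ChaCha20 block function, the input state is not
 * added back in afterwards.
 */
static void hchacha20_ref(const uint32_t state[16], uint32_t out[8])
{
	uint32_t x[16];

	memcpy(x, state, sizeof(x));
	chacha20_permute_ref(x);
	memcpy(&out[0], &x[0], 4 * sizeof(uint32_t));
	memcpy(&out[4], &x[12], 4 * sizeof(uint32_t));
}

static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Derive the ChaCha20 subkey and 16-byte IV from a 32-byte XChaCha20 IV.
 * Assumed IV layout: bytes 0..15 feed HChaCha20, bytes 16..23 are the rest
 * of the nonce, bytes 24..31 are the stream position.
 */
static void xchacha20_derive_ref(const uint8_t key[32], const uint8_t iv[32],
				 uint32_t subkey[8], uint8_t real_iv[16])
{
	uint32_t state[16] = { 0x61707865, 0x3320646e, 0x79622d32, 0x6b206574 };
	int i;

	for (i = 0; i < 8; i++)
		state[4 + i] = load_le32(key + 4 * i);
	for (i = 0; i < 4; i++)
		state[12 + i] = load_le32(iv + 4 * i);

	hchacha20_ref(state, subkey);
	memcpy(&real_iv[0], iv + 24, 8);	/* stream position */
	memcpy(&real_iv[8], iv + 16, 8);	/* remaining nonce bytes */
}
```

The subkey and real_iv produced this way would then be passed to plain
ChaCha20, which is what the call to chacha20_neon_stream_xor() with &subctx
and real_iv appears to do.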