Date: Thu, 14 Apr 2022 00:00:51 -0700
From: Eric Biggers
To: Nathan Huckleberry
Cc: linux-crypto@vger.kernel.org, Herbert Xu, "David S. Miller",
	linux-arm-kernel@lists.infradead.org, Paul Crowley, Sami Tolvanen,
	Ard Biesheuvel
Subject: Re: [PATCH v4 4/8] crypto: x86/aesni-xctr: Add accelerated implementation of XCTR
Message-ID:
References: <20220412172816.917723-1-nhuck@google.com> <20220412172816.917723-5-nhuck@google.com>
In-Reply-To: <20220412172816.917723-5-nhuck@google.com>

A few initial comments; I'll take a closer look at the .S file soon...

On Tue, Apr 12, 2022 at 05:28:12PM +0000, Nathan Huckleberry wrote:
> Add hardware accelerated versions of XCTR for x86-64 CPUs with AESNI
> support. These implementations are modified versions of the CTR
> implementations found in aesni-intel_asm.S and aes_ctrby8_avx-x86_64.S.
>
> More information on XCTR can be found in the HCTR2 paper:
> Length-preserving encryption with HCTR2:
> https://enterprint.iacr.org/2021/1441.pdf

The above link doesn't work.

> +#ifdef __x86_64__
> +/*
> + * void aesni_xctr_enc(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
> + *                     size_t len, u8 *iv, int byte_ctr)
> + */

This prototype doesn't match the one declared in the .c file.

> +
> +asmlinkage void aes_xctr_enc_128_avx_by8(const u8 *in, u8 *iv, void *keys, u8
> +	*out, unsigned int num_bytes, unsigned int byte_ctr);
> +
> +asmlinkage void aes_xctr_enc_192_avx_by8(const u8 *in, u8 *iv, void *keys, u8
> +	*out, unsigned int num_bytes, unsigned int byte_ctr);
> +
> +asmlinkage void aes_xctr_enc_256_avx_by8(const u8 *in, u8 *iv, void *keys, u8
> +	*out, unsigned int num_bytes, unsigned int byte_ctr);

Please don't have line breaks between parameter types and their names.
These should look like:

asmlinkage void aes_xctr_enc_128_avx_by8(const u8 *in, u8 *iv, void *keys,
					 u8 *out, unsigned int num_bytes,
					 unsigned int byte_ctr);

Also, why aren't the keys const?

> +static void aesni_xctr_enc_avx_tfm(struct crypto_aes_ctx *ctx, u8 *out, const u8
> +	*in, unsigned int len, u8 *iv, unsigned int
> +	byte_ctr)
> +{
> +	if (ctx->key_length == AES_KEYSIZE_128)
> +		aes_xctr_enc_128_avx_by8(in, iv, (void *)ctx, out, len,
> +					 byte_ctr);
> +	else if (ctx->key_length == AES_KEYSIZE_192)
> +		aes_xctr_enc_192_avx_by8(in, iv, (void *)ctx, out, len,
> +					 byte_ctr);
> +	else
> +		aes_xctr_enc_256_avx_by8(in, iv, (void *)ctx, out, len,
> +					 byte_ctr);
> +}

Same comments above.

> +static int xctr_crypt(struct skcipher_request *req)
> +{
> +	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> +	struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm));
> +	u8 keystream[AES_BLOCK_SIZE];
> +	u8 ctr[AES_BLOCK_SIZE];
> +	struct skcipher_walk walk;
> +	unsigned int nbytes;
> +	unsigned int byte_ctr = 0;
> +	int err;
> +	__le32 ctr32;
> +
> +	err = skcipher_walk_virt(&walk, req, false);
> +
> +	while ((nbytes = walk.nbytes) > 0) {
> +		kernel_fpu_begin();
> +		if (nbytes & AES_BLOCK_MASK)
> +			static_call(aesni_xctr_enc_tfm)(ctx, walk.dst.virt.addr,
> +				walk.src.virt.addr, nbytes & AES_BLOCK_MASK,
> +				walk.iv, byte_ctr);
> +		nbytes &= ~AES_BLOCK_MASK;
> +		byte_ctr += walk.nbytes - nbytes;
> +
> +		if (walk.nbytes == walk.total && nbytes > 0) {
> +			ctr32 = cpu_to_le32(byte_ctr / AES_BLOCK_SIZE + 1);
> +			memcpy(ctr, walk.iv, AES_BLOCK_SIZE);
> +			crypto_xor(ctr, (u8 *)&ctr32, sizeof(ctr32));
> +			aesni_enc(ctx, keystream, ctr);
> +			crypto_xor_cpy(walk.dst.virt.addr + walk.nbytes -
> +				       nbytes, walk.src.virt.addr + walk.nbytes
> +				       - nbytes, keystream, nbytes);
> +			byte_ctr += nbytes;
> +			nbytes = 0;
> +		}

For the final block case, it would be a bit simpler to do something like
this:

	__le32 block[AES_BLOCK_SIZE / sizeof(__le32)];
	...
	memcpy(block, walk.iv, AES_BLOCK_SIZE);
	block[0] ^= cpu_to_le32(1 + byte_ctr / AES_BLOCK_SIZE);
	aesni_enc(ctx, (u8 *)block, (u8 *)block);

I.e., have one buffer, use a regular XOR instead of crypto_xor(), and
encrypt it in-place.
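Spelled out a bit more (just a sketch, untested, reusing the local variable
names from the patch), the tail handling could collapse to:

	/* Sketch only: one counter-block buffer, encrypted in place. */
	__le32 block[AES_BLOCK_SIZE / sizeof(__le32)];
	...
	if (walk.nbytes == walk.total && nbytes > 0) {
		/* Counter block = IV XOR little-endian (1-based) block number */
		memcpy(block, walk.iv, AES_BLOCK_SIZE);
		block[0] ^= cpu_to_le32(1 + byte_ctr / AES_BLOCK_SIZE);
		/* Encrypt in place to produce the keystream for the tail */
		aesni_enc(ctx, (u8 *)block, (u8 *)block);
		/* XOR the keystream into the remaining partial block */
		crypto_xor_cpy(walk.dst.virt.addr + walk.nbytes - nbytes,
			       walk.src.virt.addr + walk.nbytes - nbytes,
			       (u8 *)block, nbytes);
		byte_ctr += nbytes;
		nbytes = 0;
	}

That drops the separate ctr[] and keystream[] buffers and the ctr32
temporary entirely.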
> @@ -1162,6 +1249,8 @@ static int __init aesni_init(void)
>  		/* optimize performance of ctr mode encryption transform */
>  		static_call_update(aesni_ctr_enc_tfm, aesni_ctr_enc_avx_tfm);
>  		pr_info("AES CTR mode by8 optimization enabled\n");
> +		static_call_update(aesni_xctr_enc_tfm, aesni_xctr_enc_avx_tfm);
> +		pr_info("AES XCTR mode by8 optimization enabled\n");
>  	}

Please don't add the log message above, as it would get printed at every
boot-up on most x86 systems, and it's not important enough for that.  The
existing message "AES CTR mode ..." shouldn't really exist in the first
place.

- Eric