Date: Thu, 21 Apr 2022 15:29:10 -0700
From: Eric Biggers
To: Nathan Huckleberry
Cc: linux-crypto@vger.kernel.org, Herbert Xu, "David S. Miller",
 linux-arm-kernel@lists.infradead.org, Paul Crowley, Sami Tolvanen,
 Ard Biesheuvel
Subject: Re: [PATCH v4 4/8] crypto: x86/aesni-xctr: Add accelerated implementation of XCTR
References: <20220412172816.917723-1-nhuck@google.com> <20220412172816.917723-5-nhuck@google.com>
X-Mailing-List: linux-crypto@vger.kernel.org

On Thu, Apr 21, 2022 at 04:59:31PM -0500, Nathan Huckleberry wrote:
> On Mon, Apr 18, 2022 at 7:13 PM Eric Biggers wrote:
> >
> > On Tue, Apr 12, 2022 at 05:28:12PM +0000, Nathan Huckleberry wrote:
> > > diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
> > > index 363699dd7220..ce17fe630150 100644
> > > --- a/arch/x86/crypto/aesni-intel_asm.S
> > > +++ b/arch/x86/crypto/aesni-intel_asm.S
> > > @@ -2821,6 +2821,76 @@ SYM_FUNC_END(aesni_ctr_enc)
> > >
> > >  #endif
> > >
> > > +#ifdef __x86_64__
> > > +/*
> > > + * void aesni_xctr_enc(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
> > > + *		      size_t len, u8 *iv, int byte_ctr)
> > > + */
> > > +SYM_FUNC_START(aesni_xctr_enc)
> > > +	FRAME_BEGIN
> > > +	cmp $16, LEN
> > > +	jb .Lxctr_ret
> > > +	shr $4, %arg6
> > > +	movq %arg6, CTR
> > > +	mov 480(KEYP), KLEN
> > > +	movups (IVP), IV
> > > +	cmp $64, LEN
> > > +	jb .Lxctr_enc_loop1
> > > +.align 4
> > > +.Lxctr_enc_loop4:
> > > +	movaps IV, STATE1
> > > +	vpaddq ONE(%rip), CTR, CTR
> > > +	vpxor CTR, STATE1, STATE1
> > > +	movups (INP), IN1
> > > +	movaps IV, STATE2
> > > +	vpaddq ONE(%rip), CTR, CTR
> > > +	vpxor CTR, STATE2, STATE2
> > > +	movups 0x10(INP), IN2
> > > +	movaps IV, STATE3
> > > +	vpaddq ONE(%rip), CTR, CTR
> > > +	vpxor CTR, STATE3, STATE3
> > > +	movups 0x20(INP), IN3
> > > +	movaps IV, STATE4
> > > +	vpaddq ONE(%rip), CTR, CTR
> > > +	vpxor CTR, STATE4, STATE4
> > > +	movups 0x30(INP), IN4
> > > +	call _aesni_enc4
> > > +	pxor IN1, STATE1
> > > +	movups STATE1, (OUTP)
> > > +	pxor IN2, STATE2
> > > +	movups STATE2, 0x10(OUTP)
> > > +	pxor IN3, STATE3
> > > +	movups STATE3, 0x20(OUTP)
> > > +	pxor IN4, STATE4
> > > +	movups STATE4, 0x30(OUTP)
> > > +	sub $64, LEN
> > > +	add $64, INP
> > > +	add $64, OUTP
> > > +	cmp $64, LEN
> > > +	jge .Lxctr_enc_loop4
> > > +	cmp $16, LEN
> > > +	jb .Lxctr_ret
> > > +.align 4
> > > +.Lxctr_enc_loop1:
> > > +	movaps IV, STATE
> > > +	vpaddq ONE(%rip), CTR, CTR
> > > +	vpxor CTR, STATE1, STATE1
> > > +	movups (INP), IN
> > > +	call _aesni_enc1
> > > +	pxor IN, STATE
> > > +	movups STATE, (OUTP)
> > > +	sub $16, LEN
> > > +	add $16, INP
> > > +	add $16, OUTP
> > > +	cmp $16, LEN
> > > +	jge .Lxctr_enc_loop1
> > > +.Lxctr_ret:
> > > +	FRAME_END
> > > +	RET
> > > +SYM_FUNC_END(aesni_xctr_enc)
> > > +
> > > +#endif
> >
> > Sorry, I missed this file.  This is the non-AVX version, right?  That means that
> > AVX instructions, i.e. basically anything instruction starting with "v", can't
> > be used here.  So the above isn't going to work.  (There might be a way to test
> > this with QEMU; maybe --cpu-type=Nehalem without --enable-kvm?)
> >
> > You could rewrite this without using AVX instructions.  However, polyval-clmulni
> > is broken in the same way; it uses AVX instructions without checking whether
> > they are available.
> > But your patchset doesn't aim to provide a non-AVX polyval implementation
> > at all.  So even if you got the non-AVX XCTR working, it wouldn't
> > be paired with an accelerated polyval.
> >
> > So I think you should just not provide non-AVX versions for now.  That would
> > mean:
> >
> >       1.) Drop the change to aesni-intel_asm.S
> >       2.) Don't register the AES XCTR algorithm unless AVX is available
> >           (in addition to AES-NI)
>
> Is there a preferred way to conditionally register xctr? It looks like
> aesni-intel_glue.c registers a default implementation for all the
> algorithms in the array, then better versions are enabled depending on
> cpu features. Should I remove xctr from the list of other algorithms
> and register it separately?
>

Yes, it will need to be removed from the aesni_skciphers array.  I don't see any
other algorithms in that file that are conditional on AES-NI && AVX, so it will
have to go by itself.

- Eric
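
For illustration only, here is a rough userspace model of the registration
split discussed above.  It is not the kernel code: struct cpu_features,
struct registry, register_alg() and register_all() are hypothetical names
made up for this sketch, and the algorithm name strings are just labels.
The only point it shows is that the baseline array is registered whenever
AES-NI is present, while XCTR sits outside that array and is registered
only when AVX is also present.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CPU feature checks the glue code performs. */
struct cpu_features {
	bool aesni;
	bool avx;
};

/* Toy registry that just records algorithm names (assumption, not a kernel type). */
#define MAX_ALGS 8

struct registry {
	const char *names[MAX_ALGS];
	size_t count;
};

static int register_alg(struct registry *r, const char *name)
{
	if (r->count >= MAX_ALGS)
		return -1;
	r->names[r->count++] = name;
	return 0;
}

/* Models the array of algorithms that only need AES-NI. */
static const char *const baseline_algs[] = {
	"ecb(aes)", "cbc(aes)", "ctr(aes)",
};

/* XCTR is deliberately kept out of the baseline array. */
static const char *const xctr_alg = "xctr(aes)";

/* Returns 0 on success, -1 if nothing can be registered or the registry is full. */
int register_all(const struct cpu_features *cpu, struct registry *r)
{
	size_t i;

	if (!cpu->aesni)
		return -1;

	for (i = 0; i < sizeof(baseline_algs) / sizeof(baseline_algs[0]); i++) {
		if (register_alg(r, baseline_algs[i]))
			return -1;
	}

	/* XCTR requires AES-NI && AVX, so it is registered by itself. */
	if (cpu->avx && register_alg(r, xctr_alg))
		return -1;

	return 0;
}
```

With this model, a CPU that has AES-NI but not AVX ends up with the three
baseline entries only, and a CPU with both ends up with a fourth entry for
XCTR.  A real implementation would also need a matching conditional
unregister on module exit, which this sketch leaves out.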