Date: Wed, 4 May 2022 22:56:49 -0700
From: Eric Biggers
To: Nathan Huckleberry
Cc: linux-crypto@vger.kernel.org, linux-fscrypt@vger.kernel.org, Herbert Xu, "David S. Miller", linux-arm-kernel@lists.infradead.org, Paul Crowley, Sami Tolvanen, Ard Biesheuvel
Subject: Re: [PATCH v6 8/9] crypto: arm64/polyval: Add PMULL accelerated implementation of POLYVAL
References: <20220504001823.2483834-1-nhuck@google.com> <20220504001823.2483834-9-nhuck@google.com>
In-Reply-To: <20220504001823.2483834-9-nhuck@google.com>
X-Mailing-List: linux-crypto@vger.kernel.org

On Wed, May 04, 2022 at 12:18:22AM +0000, Nathan Huckleberry wrote:
> + * X = [X_1 : X_0]
> + * Y = [Y_1 : Y_0]
> + *
> + * The multiplication produces four parts:
> + * LOW: The polynomial given by performing carryless multiplication of X_0 and
> + * Y_0
> + * MID: The polynomial given by performing carryless multiplication of (X_0 +
> + * X_1) and (Y_0 + Y_1)
> + * HIGH: The polynomial given by performing carryless multiplication of X_1
> + * and Y_1
> + *
> + * We compute:
> + * LO += LOW
> + * MI += MID
> + * HI += HIGH

Three parts, not four.
But why not write this as the much more concise:

 * Given:
 *   X = [X_1 : X_0]
 *   Y = [Y_1 : Y_0]
 *
 * We compute:
 *   LO += X_0 * Y_0
 *   MI += (X_0 + X_1) * (Y_0 + Y_1)
 *   HI += X_1 * Y_1

> + * So our final computation is: T = T_1 : T_0 = g*(x) * P_0 V = V_1 : V_0 =
> + * g*(x) * (P_1 + T_0) p(x) / x^{128} mod g(x) = P_3 + P_1 + T_0 + V_1 : P_2 +
> + * P_0 + T_1 + V_0

As on the x86 version, this part is now unreadable.  It was fine in v5.

> + * [HI_1 : HI_0 + HI_1 + MI_1 + LO_1 : LO_1 + HI_0 + MI_0 + LO_0 : LO_0]
[...]
> + * [HI_1 : HI_1 + HI_0 + MI_1 + LO_1 : HI_0 + MI_0 + LO_1 + LO_0 : LO_0]
[...]
> + // TMP_V = T_1 : T_0 = P_0 * g*(x)
> + pmull TMP_V.1q, PL.1d, GSTAR.1d
[...]
> + // TMP_V = V_1 : V_0 = (P_1 + T_0) * g*(x)
> + pmull2 TMP_V.1q, GSTAR.2d, TMP_V.2d
> + eor DEST.16b, PH.16b, TMP_V.16b
[...]
> + pmull TMP_V.1q, GSTAR.1d, PL.1d
[...]
> + pmull2 TMP_V.1q, GSTAR.2d, TMP_V.2d
[...]
> + eor SUM.16b, TMP_V.16b, PH.16b

It looks like you didn't fully address my comments on v5 about putting operands
in a consistent order.  Not a big deal, but assembly code is always hard to
read, and anything to make it easier would be greatly appreciated.

> +/*
> + * Handle any extra blocks afer full_stride loop.
> + */

Typo above.

> diff --git a/arch/arm64/crypto/polyval-ce-glue.c b/arch/arm64/crypto/polyval-ce-glue.c
[...]
> +struct polyval_tfm_ctx {
> + u8 key_powers[NUM_KEY_POWERS][POLYVAL_BLOCK_SIZE];
> +};

This is missing the comment about the order of the key powers that I had
suggested for readability.  It made it into the x86 version but not here.

This file is very similar to arch/x86/crypto/polyval-clmulni_glue.c, so if you
could diff them and eliminate any unintended differences, that would be helpful.

Other than the above readability suggestions this patch looks good, nice job.

- Eric
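
For readers following the LO/MI/HI discussion, here is a minimal portable C sketch of the three-multiplication (Karatsuba) carryless product and of how the quoted
[HI_1 : HI_0 + HI_1 + MI_1 + LO_1 : LO_1 + HI_0 + MI_0 + LO_0 : LO_0] layout falls out of it. It is not the patch's code: the names u128, u256, clmul64, karatsuba_clmul128 and schoolbook_clmul128 are hypothetical, the bit-by-bit clmul64 merely stands in for a PMULL instruction, and the sketch handles a single multiplication with no accumulation across blocks and no reduction step.

```c
#include <stdint.h>

/* 128-bit polynomial over GF(2), stored as [hi : lo]. Hypothetical type. */
struct u128 { uint64_t lo, hi; };

/* 256-bit product as four 64-bit words, w[0] least significant. */
struct u256 { uint64_t w[4]; };

/* Bit-by-bit carryless multiply of two 64-bit polynomials (stand-in for PMULL). */
static struct u128 clmul64(uint64_t a, uint64_t b)
{
	struct u128 r = { 0, 0 };

	for (int i = 0; i < 64; i++) {
		if ((b >> i) & 1) {
			r.lo ^= a << i;
			if (i)	/* avoid an undefined shift by 64 */
				r.hi ^= a >> (64 - i);
		}
	}
	return r;
}

/* Three multiplications: LO, MI and HI as in the suggested comment. */
static struct u256 karatsuba_clmul128(struct u128 x, struct u128 y)
{
	struct u128 lo = clmul64(x.lo, y.lo);               /* X_0 * Y_0 */
	struct u128 mi = clmul64(x.lo ^ x.hi, y.lo ^ y.hi); /* (X_0 + X_1) * (Y_0 + Y_1) */
	struct u128 hi = clmul64(x.hi, y.hi);               /* X_1 * Y_1 */
	struct u256 p;

	p.w[0] = lo.lo;                         /* LO_0 */
	p.w[1] = lo.hi ^ hi.lo ^ mi.lo ^ lo.lo; /* LO_1 + HI_0 + MI_0 + LO_0 */
	p.w[2] = hi.lo ^ hi.hi ^ mi.hi ^ lo.hi; /* HI_0 + HI_1 + MI_1 + LO_1 */
	p.w[3] = hi.hi;                         /* HI_1 */
	return p;
}

/* Four-multiplication reference, used only to cross-check the sketch. */
static struct u256 schoolbook_clmul128(struct u128 x, struct u128 y)
{
	struct u128 ll = clmul64(x.lo, y.lo);
	struct u128 lh = clmul64(x.lo, y.hi);
	struct u128 hl = clmul64(x.hi, y.lo);
	struct u128 hh = clmul64(x.hi, y.hi);
	struct u256 p;

	p.w[0] = ll.lo;
	p.w[1] = ll.hi ^ lh.lo ^ hl.lo;
	p.w[2] = hh.lo ^ lh.hi ^ hl.hi;
	p.w[3] = hh.hi;
	return p;
}
```

Since addition in GF(2) is XOR, the middle term X_0*Y_1 + X_1*Y_0 equals MI + LO + HI, which is why each of the two middle words picks up one half of MI, LO and HI.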