From: Ard Biesheuvel
Date: Thu, 11 Jan 2024 17:35:02 +0100
Subject: Re: [PATCH 4/8] crypto: arm64/aes-ccm - Replace bytewise tail handling with NEON permute
To: Ard Biesheuvel
Cc: linux-crypto@vger.kernel.org, ebiggers@kernel.org, herbert@gondor.apana.org.au
X-Mailing-List: linux-crypto@vger.kernel.org
References: <20240111123302.589910-10-ardb+git@google.com> <20240111123302.589910-14-ardb+git@google.com>
In-Reply-To: <20240111123302.589910-14-ardb+git@google.com>
Content-Type: text/plain; charset="UTF-8"

On Thu, 11 Jan 2024 at 13:33, Ard Biesheuvel wrote:
>
> From: Ard Biesheuvel
>
> Implement the CCM tail handling using a single sequence that uses
> permute vectors and overlapping loads and stores, rather than going over
> the tail byte by byte in a loop, and using scalar operations. This is
> more efficient, even though the measured speedup is only around 1-2% on
> the CPUs I have tried.
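The tail-handling idea described above can be modeled in plain C. This is an illustrative sketch of the technique, not the NEON code from the patch: `tbl16` and `load_tail` are hypothetical names, and the sketch assumes at least 16 bytes of data precede the tail. One overlapping 16-byte load ending at the tail's last byte, followed by one tbl-style permute, replaces the scalar byte-by-byte loop:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Plain-C model of a NEON tbl-style permute:
 * out[i] = in[idx[i]] when idx[i] < 16, else 0 (out-of-range selects zero). */
static void tbl16(uint8_t out[16], const uint8_t in[16], const uint8_t idx[16])
{
	for (int i = 0; i < 16; i++)
		out[i] = idx[i] < 16 ? in[idx[i]] : 0;
}

/* Fetch an n-byte tail (1 <= n < 16) without looping over the bytes:
 * one overlapping 16-byte load ending at the tail's last byte, then one
 * permute that rotates the n valid bytes to the front and zeroes the rest. */
static void load_tail(uint8_t out[16], const uint8_t *end, unsigned int n)
{
	uint8_t in[16], idx[16];

	memcpy(in, end - 16, 16);	/* overlapping load */
	for (unsigned int i = 0; i < 16; i++)
		idx[i] = i < n ? (uint8_t)(16 - n + i) : 0xff;
	tbl16(out, in, idx);
}
```

In real NEON code the permute vector would come from a precomputed mask table and a single `tbl` instruction rather than a loop; the sketch only shows why one load plus one permute suffices.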
>
> Signed-off-by: Ard Biesheuvel
> ---
>  arch/arm64/crypto/aes-ce-ccm-core.S | 59 +++++++++++++-------
>  arch/arm64/crypto/aes-ce-ccm-glue.c | 20 +++----
>  2 files changed, 48 insertions(+), 31 deletions(-)
>
> ...

The hunks below don't belong here: they were supposed to be squashed
into the previous patch. I will fix that up for the next revision.

> diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
> index 2f4e6a318fcd..4710e59075f5 100644
> --- a/arch/arm64/crypto/aes-ce-ccm-glue.c
> +++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
> @@ -181,16 +181,16 @@ static int ccm_encrypt(struct aead_request *req)
> 		if (walk.nbytes == walk.total)
> 			tail = 0;
>
> -		if (unlikely(walk.total < AES_BLOCK_SIZE))
> -			src = dst = memcpy(buf + sizeof(buf) - walk.total,
> -					   src, walk.total);
> +		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
> +			src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
> +					   src, walk.nbytes);
>
> 		ce_aes_ccm_encrypt(dst, src, walk.nbytes - tail,
> 				   ctx->key_enc, num_rounds(ctx),
> 				   mac, walk.iv);
>
> -		if (unlikely(walk.total < AES_BLOCK_SIZE))
> -			memcpy(walk.dst.virt.addr, dst, walk.total);
> +		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
> +			memcpy(walk.dst.virt.addr, dst, walk.nbytes);
>
> 		if (walk.nbytes == walk.total)
> 			ce_aes_ccm_final(mac, orig_iv, ctx->key_enc, num_rounds(ctx));
>
> @@ -248,16 +248,16 @@ static int ccm_decrypt(struct aead_request *req)
> 		if (walk.nbytes == walk.total)
> 			tail = 0;
>
> -		if (unlikely(walk.total < AES_BLOCK_SIZE))
> -			src = dst = memcpy(buf + sizeof(buf) - walk.total,
> -					   src, walk.total);
> +		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
> +			src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
> +					   src, walk.nbytes);
>
> 		ce_aes_ccm_decrypt(dst, src, walk.nbytes - tail,
> 				   ctx->key_enc, num_rounds(ctx),
> 				   mac, walk.iv);
>
> -		if (unlikely(walk.total < AES_BLOCK_SIZE))
> -			memcpy(walk.dst.virt.addr, dst, walk.total);
> +		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
> +			memcpy(walk.dst.virt.addr, dst, walk.nbytes);
>
> 		if (walk.nbytes == walk.total)
> 			ce_aes_ccm_final(mac, orig_iv, ctx->key_enc, num_rounds(ctx));
> --
> 2.43.0.275.g3460e3d667-goog
>
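For context, the bounce-buffer pattern being corrected in the quoted hunks (sizing the copies by `walk.nbytes`, the bytes available in this walk step, rather than `walk.total`) can be sketched in isolation. `process_step` and `handle_step` below are hypothetical stand-ins, not the kernel API; the point is that a short step gets staged at the end of a block-sized stack buffer so a core routine using overlapping 16-byte loads never reads past valid data:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 16

/* Illustrative stand-in for the CCM core routine: XOR with a fixed pad. */
static void process_step(uint8_t *dst, const uint8_t *src, unsigned int len)
{
	for (unsigned int i = 0; i < len; i++)
		dst[i] = src[i] ^ 0xaa;
}

/* Sketch of the glue-code pattern: when this step carries less than one
 * block, bounce it through the END of a block-sized buffer, process in
 * place, and copy the result back out. */
static void handle_step(uint8_t *dst, const uint8_t *src, unsigned int nbytes)
{
	uint8_t buf[BLOCK];
	const uint8_t *s = src;
	uint8_t *d = dst;

	if (nbytes < BLOCK) {
		memcpy(&buf[sizeof(buf) - nbytes], src, nbytes);
		s = d = &buf[sizeof(buf) - nbytes];
	}

	process_step(d, s, nbytes);

	if (nbytes < BLOCK)
		memcpy(dst, d, nbytes);
}
```

Using the per-step byte count for the staging copies is exactly what the `walk.total` → `walk.nbytes` changes in the hunks above achieve; sizing them by the overall remaining total would over-copy when a walk step delivers fewer bytes.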