Subject: Re: [PATCH riscv/for-next] crypto: riscv - parallelize AES-CBC decryption
From: Jerry Shih
Date: Mon, 26 Feb 2024 09:40:14 +0800
To: Eric Biggers
Cc: linux-riscv@lists.infradead.org, Palmer Dabbelt, linux-crypto@vger.kernel.org, Christoph Müllner, Heiko Stuebner, Phoebe Chen, Andy Chiu
Message-Id: <71CC95E2-D5A2-4158-ADD2-C28216B00F3A@sifive.com>
In-Reply-To: <20240210181240.GA1098@sol.localdomain>
References: <20240208060851.154129-1-ebiggers@kernel.org> <04703246-6EF6-4B54-B8F1-96EDEC2FBA6B@sifive.com> <20240210181240.GA1098@sol.localdomain>

On Feb 11, 2024, at 02:12, Eric Biggers wrote:
> On Sat, Feb 10, 2024 at 11:25:27PM +0800, Jerry Shih wrote:
>>>  .macro aes_cbc_decrypt keylen
>>> +    srli          LEN, LEN, 2     // Convert LEN from bytes to words
>>>      vle32.v       v16, (IVP)      // Load IV
>>>  1:
>>> -    vle32.v       v17, (INP)      // Load ciphertext block
>>> -    vmv.v.v       v18, v17        // Save ciphertext block
>>> -    aes_decrypt   v17, \keylen    // Decrypt
>>> -    vxor.vv       v17, v17, v16   // XOR with IV or prev ciphertext block
>>> -    vse32.v       v17, (OUTP)     // Store plaintext block
>>> -    vmv.v.v       v16, v18        // Next "IV" is prev ciphertext block
>>> -    addi          INP, INP, 16
>>> -    addi          OUTP, OUTP, 16
>>> -    addi          LEN, LEN, -16
>>> +    vsetvli       t0, LEN, e32, m4, ta, ma
>>> +    vle32.v       v20, (INP)      // Load ciphertext blocks
>>> +    vslideup.vi   v16, v20, 4     // Setup prev ciphertext blocks
>>> +    addi          t1, t0, -4
>>> +    vslidedown.vx v24, v20, t1    // Save last ciphertext block
>>
>> Do we need to setup the `e32, len=t0` for next IV?
>> I think we only need 128bit IV (with VL=4).
>>
>>> +    aes_decrypt   v20, \keylen    // Decrypt the blocks
>>> +    vxor.vv       v20, v20, v16   // XOR with prev ciphertext blocks
>>> +    vse32.v       v20, (OUTP)     // Store plaintext blocks
>>> +    vmv.v.v       v16, v24        // Next "IV" is last ciphertext block
>>
>> Same VL issue here.
>
> It's true that the vslidedown.vx and vmv.v.v only need vl=4.  But it also works
> fine with vl unchanged.  It just results in some extra data being moved in the
> registers.  My hypothesis is that this is going to be faster than having the
> three extra instructions per loop iteration to change the vl to 4 twice.
>
> I still have no real hardware to test on, so I have no quantitative data.  All I
> can do is go with my instinct which is that the shorter version will be better.
>
> If you have access to a real CPU that supports the RISC-V vector crypto
> extensions, I'd be interested in the performance you get from each variant.
> (Of course, different RISC-V CPU implementations may have quite different
> performance characteristics, so that still won't be definitive.)

Hi Eric,

Thank you. I don't think the larger-than-needed vl affects performance
significantly. Most of the work is still in the AES body.
The original implementation is good enough for now.

> In general, this level of micro-optimization probably needs to wait until
> there are a variety of CPUs to test on.  We know that parallelizing the
> algorithms is helpful, so we should do that, as this patch does.  But the
> effects of small variations in the instruction sequences are currently unclear.
>
> - Eric
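
For reference, the data flow of the parallelized loop quoted above can be modelled in plain Python. This is only a sketch under stated assumptions, not the kernel code: `toy_encrypt`/`toy_decrypt` are hypothetical stand-ins for AES (invertible but not secure), `vlmax` is an assumed number of 32-bit elements in an LMUL=4 register group, and `vsetvli` is simplified to `min(remaining, vlmax)`. The input length is assumed to be a multiple of four words.

```python
MASK = 0xFFFFFFFF
WORDS = 4  # one 128-bit block = four 32-bit elements


def toy_encrypt(words, key):
    # Hypothetical stand-in for AES encryption: invertible, NOT secure.
    return [(w + key[i % WORDS]) & MASK for i, w in enumerate(words)]


def toy_decrypt(words, key):
    # Inverse of toy_encrypt; stands in for the aes_decrypt macro.
    return [(w - key[i % WORDS]) & MASK for i, w in enumerate(words)]


def cbc_encrypt(plain, iv, key):
    # Reference CBC encryption, used only to produce test ciphertext.
    out, prev = [], list(iv)
    for i in range(0, len(plain), WORDS):
        block = [p ^ c for p, c in zip(plain[i:i + WORDS], prev)]
        prev = toy_encrypt(block, key)
        out += prev
    return out


def cbc_decrypt_serial(cipher, iv, key):
    # One block per iteration, like the removed ("-") lines of the patch.
    out, prev = [], list(iv)
    for i in range(0, len(cipher), WORDS):
        block = cipher[i:i + WORDS]
        out += [d ^ p for d, p in zip(toy_decrypt(block, key), prev)]
        prev = block
    return out


def cbc_decrypt_parallel(cipher, iv, key, vlmax=16):
    # Many blocks per iteration, like the added ("+") lines of the patch.
    out = []
    v16 = list(iv) + [0] * (vlmax - WORDS)   # vle32.v v16, (IVP)
    pos = 0
    while pos < len(cipher):
        vl = min(len(cipher) - pos, vlmax)   # vsetvli (simplified)
        v20 = cipher[pos:pos + vl]           # vle32.v v20, (INP)
        v16[WORDS:vl] = v20[:vl - WORDS]     # vslideup.vi v16, v20, 4
        # vslidedown.vx v24, v20, t1: only the first four elements matter;
        # the rest are don't-care values, modelled here as zeros.
        v24 = v20[vl - WORDS:] + [0] * (vl - WORDS)
        v20 = toy_decrypt(v20, key)          # aes_decrypt v20
        out += [d ^ p for d, p in zip(v20, v16[:vl])]  # vxor.vv + vse32.v
        v16[:vl] = v24                       # vmv.v.v v16, v24 (vl unchanged)
        pos += vl
    return out
```

In this model, narrowing the slide-down and the final move to vl=4 would change only how many don't-care elements are copied, since the next `vslideup` overwrites elements 4 and up of `v16` before they are read. It says nothing about which variant is faster on real hardware.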