From: Eric Biggers
To: linux-crypto@vger.kernel.org
Cc: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 0/2] x86_64 AES-GCM improvements
Date: Sun, 2 Jun 2024 15:22:18 -0700
Message-ID: <20240602222221.176625-1-ebiggers@kernel.org>

This patchset adds a VAES and AVX512 / AVX10 implementation of AES-GCM
(Galois/Counter Mode), which improves AES-GCM performance by up to 162%.
In addition, it replaces the old AES-NI GCM code from Intel with new code
that is slightly faster and fixes a number of issues including the
massive binary size of over 250 KB.  See the patches for details.

The end state of the x86_64 AES-GCM assembly code is that we end up with
two assembly files, one that generates AES-NI code with or without AVX,
and one that generates VAES code with AVX512 / AVX10 with 256-bit or
512-bit vectors.  There's no support for VAES alone (without AVX512 /
AVX10).  This differs slightly from what I did with AES-XTS, where one
file generates both AVX and AVX512 / AVX10 code, including code using
VAES alone (without AVX512 / AVX10), and another file generates non-AVX
code only.  For now this seems like the right choice for each particular
algorithm, though, since being limited to 16 SIMD registers and 128-bit
vectors led to significantly different design choices for AES-GCM, but
less so for AES-XTS.  CPUs shipping with VAES alone also seem to be a
temporary thing, so we perhaps shouldn't go too much out of our way to
support that combination.

Changed in v5:
- Fixed sparse warnings in gcm_setkey()
- Fixed some comments in aes-gcm-aesni-x86_64.S

Changed in v4:
- Added AES-NI rewrite patch.
- Adjusted the VAES-AVX10 patch slightly to make it possible to cleanly
  add the AES-NI support on top of it.

Changed in v3:
- Optimized the finalization code slightly.
- Fixed a minor issue in my userspace benchmark program (guard page
  after key struct made "AVX512_Cloudflare" extra slow on some input
  lengths) and regenerated tables 3-4.  Also upgraded to Emerald Rapids.
- Eliminated an instruction from _aes_gcm_precompute.

Changed in v2:
- Additional assembly optimizations
- Improved some comments
- Aligned key struct to 64 bytes
- Added comparison with Cloudflare's implementation of AES-GCM
- Other cleanups

Eric Biggers (2):
  crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM
  crypto: x86/aes-gcm - rewrite the AES-NI optimized AES-GCM

 arch/x86/crypto/Kconfig                  |    1 +
 arch/x86/crypto/Makefile                 |    8 +-
 arch/x86/crypto/aes-gcm-aesni-x86_64.S   | 1128 +++++++++
 arch/x86/crypto/aes-gcm-avx10-x86_64.S   | 1222 ++++++++++
 arch/x86/crypto/aesni-intel_asm.S        | 1503 +-----------
 arch/x86/crypto/aesni-intel_avx-x86_64.S | 2804 ----------------------
 arch/x86/crypto/aesni-intel_glue.c       | 1269 ++++++----
 7 files changed, 3125 insertions(+), 4810 deletions(-)
 create mode 100644 arch/x86/crypto/aes-gcm-aesni-x86_64.S
 create mode 100644 arch/x86/crypto/aes-gcm-avx10-x86_64.S
 delete mode 100644 arch/x86/crypto/aesni-intel_avx-x86_64.S

base-commit: aabbf2135f9a9526991f17cb0c78cf1ec878f1c2
-- 
2.45.1