From: Eric Biggers
To: linux-crypto@vger.kernel.org, x86@kernel.org
Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel, Andy Lutomirski,
 "Chang S .
 Bae"
Subject: [PATCH 0/6] Faster AES-XTS on modern x86_64 CPUs
Date: Tue, 26 Mar 2024 01:02:58 -0700
Message-ID: <20240326080305.402382-1-ebiggers@kernel.org>
X-Mailer: git-send-email 2.44.0
Precedence: bulk
X-Mailing-List: linux-crypto@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

This patchset adds new AES-XTS implementations that accelerate disk and
file encryption on modern x86_64 CPUs.

The largest improvements are seen on CPUs that support the VAES
extension: Intel Ice Lake (2019) and later, and AMD Zen 3 (2020) and
later.  However, an implementation using plain AESNI + AVX is also added
and provides a small boost on older CPUs too.

To try to handle the mess that is x86 SIMD, the code for all the new
AES-XTS implementations is generated from an assembly macro.  This makes
it so that we e.g. don't have to have entirely different source code
just for different vector lengths (xmm, ymm, zmm).

To avoid downclocking effects, zmm registers aren't used on certain
Intel CPU models such as Ice Lake.  These CPU models default to an
implementation using ymm registers instead.

This patchset increases the throughput of AES-256-XTS decryption by the
following amounts on the following CPUs:

                      | 4096-byte messages | 512-byte messages |
----------------------+--------------------+-------------------+
Intel Skylake         |        1%          |       11%         |
Intel Ice Lake        |       92%          |       59%         |
Intel Sapphire Rapids |      115%          |       78%         |
AMD Zen 1             |       25%          |       20%         |
AMD Zen 2             |       26%          |       20%         |
AMD Zen 3             |       82%          |       40%         |
AMD Zen 4             |      118%          |       48%         |

(The results for encryption are very similar to decryption.  I just tend
to measure decryption because decryption performance is more important.)

There's no separate kconfig option for the new AES-XTS implementations,
as they are included in the existing option CONFIG_CRYPTO_AES_NI_INTEL.

To make testing easier, all four new AES-XTS implementations are
registered separately with the crypto API.
They are prioritized appropriately so that the best one for the CPU is
used by default.

Open questions:

- Is the policy that I implemented for preferring ymm registers to zmm
  registers the right one?  arch/x86/crypto/poly1305_glue.c thinks that
  only Skylake has the bad downclocking.  My current proposal is a bit
  more conservative; it also excludes Ice Lake and Tiger Lake.  Those
  CPUs supposedly still have some downclocking, though not as much.

- Should the policy on the use of zmm registers be in a centralized
  place?  It probably doesn't make sense to have random different
  policies for different crypto algorithms (AES, Poly1305, ARIA, etc.).

- Are there any other known issues with using AVX512 in kernel mode?  It
  seems to work, and technically it's not new because Poly1305 and ARIA
  already use AVX512, including the mask registers and zmm registers up
  to 31.  So if there was a major issue, like the new registers not
  being properly saved and restored, it probably would have already been
  found.  But AES-XTS support would introduce a wider use of it.

Eric Biggers (6):
  x86: add kconfig symbols for assembler VAES and VPCLMULQDQ support
  crypto: x86/aes-xts - add AES-XTS assembly macro for modern CPUs
  crypto: x86/aes-xts - wire up AESNI + AVX implementation
  crypto: x86/aes-xts - wire up VAES + AVX2 implementation
  crypto: x86/aes-xts - wire up VAES + AVX10/256 implementation
  crypto: x86/aes-xts - wire up VAES + AVX10/512 implementation

 arch/x86/Kconfig.assembler           |  10 +
 arch/x86/crypto/Makefile             |   3 +-
 arch/x86/crypto/aes-xts-avx-x86_64.S | 796 +++++++++++++++++++++++++++
 arch/x86/crypto/aesni-intel_glue.c   | 263 ++++++++-
 4 files changed, 1070 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/crypto/aes-xts-avx-x86_64.S

base-commit: 4cece764965020c22cff7665b18a012006359095
-- 
2.44.0
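
For illustration only, and not taken from the patches: the selection
scheme described in the cover letter (register every usable
implementation, let the highest priority win, and fall back to ymm on CPU
models where zmm is assumed to downclock) could be sketched in plain
userspace C roughly as follows.  The struct names, labels, and numeric
priorities are all hypothetical; the real code uses the kernel's CPU
feature tests and crypto API registration instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability summary; not the kernel's feature-test API. */
struct cpu_caps {
	bool aesni_avx;    /* AES-NI and AVX available */
	bool vaes_avx2;    /* VAES + VPCLMULQDQ + AVX2 available */
	bool vaes_avx512;  /* VAES + VPCLMULQDQ + needed AVX512 subsets */
	bool prefer_ymm;   /* model assumed to downclock when zmm is used */
};

struct xts_impl {
	const char *name;  /* illustrative label, not a real driver name */
	int priority;      /* higher wins; values are made up */
	bool usable;       /* whether this CPU can run it at all */
};

enum { IMPL_AVX, IMPL_VAES_AVX2, IMPL_VAES_256, IMPL_VAES_512, NUM_IMPLS };

/* Fill in the candidate table for the given CPU. */
void xts_setup(const struct cpu_caps *c, struct xts_impl impls[NUM_IMPLS])
{
	impls[IMPL_AVX] =
		(struct xts_impl){ "aesni+avx (xmm)", 500, c->aesni_avx };
	impls[IMPL_VAES_AVX2] =
		(struct xts_impl){ "vaes+avx2 (ymm)", 600, c->vaes_avx2 };
	impls[IMPL_VAES_256] =
		(struct xts_impl){ "vaes+avx10/256 (ymm)", 700, c->vaes_avx512 };
	impls[IMPL_VAES_512] =
		(struct xts_impl){ "vaes+avx10/512 (zmm)", 800, c->vaes_avx512 };

	/*
	 * Keep the zmm variant available for testing, but demote it so
	 * that a ymm variant becomes the default on these CPU models.
	 */
	if (c->prefer_ymm)
		impls[IMPL_VAES_512].priority = 1;
}

/* Return the usable implementation with the highest priority, or NULL. */
const struct xts_impl *xts_pick(const struct xts_impl impls[NUM_IMPLS])
{
	const struct xts_impl *best = NULL;

	for (size_t i = 0; i < NUM_IMPLS; i++) {
		if (!impls[i].usable)
			continue;
		if (!best || impls[i].priority > best->priority)
			best = &impls[i];
	}
	return best;
}
```

With this sketch, a CPU that has every feature but sets prefer_ymm ends
up with the AVX10/256 entry as the default, while the zmm entry stays in
the table and can still be selected explicitly for testing.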