Date: Fri, 3 May 2024 08:28:10 -0700
From: Eric Biggers
To: Herbert Xu
Cc: linux-crypto@vger.kernel.org, fsverity@lists.linux.dev,
    dm-devel@lists.linux.dev, x86@kernel.org,
    linux-arm-kernel@lists.infradead.org, ardb@kernel.org,
    samitolvanen@google.com, bvanassche@acm.org
Subject: Re: [PATCH v2 1/8] crypto: shash - add support for finup2x
Message-ID: <20240503152810.GA1132@sol.localdomain>
References: <20240422203544.195390-2-ebiggers@kernel.org>

On Fri, May 03, 2024 at 06:18:32PM +0800, Herbert Xu wrote:
> Eric Biggers wrote:
> >
> > For now the API only supports 2-way interleaving, as the usefulness and
> > practicality seems to drop off dramatically after 2. The arm64 CPUs I
> > tested don't support more than 2 concurrent SHA-256 hashes. On x86_64,
> > AMD's Zen 4 can do 4 concurrent SHA-256 hashes (at least based on a
> > microbenchmark of the sha256rnds2 instruction), and it's been reported
> > that the highest SHA-256 throughput on Intel processors comes from using
> > AVX512 to compute 16 hashes in parallel. However, higher interleaving
> > factors would involve tradeoffs such as no longer being able to cache
> > the round constants in registers, further increasing the code size (both
> > source and binary), further increasing the amount of state that users
> > need to keep track of, and causing there to be more "leftover" hashes.
>
> I think the lack of extensibility is the biggest problem with this
> API. Now I confess I too have used the magic number 2 in the
> lskcipher patch-set, but there I think at least it was more
> justifiable based on the set of algorithms we currently support.
>
> Here I think the evidence for limiting this to 2 is weak. And the
> amount of work to extend this beyond 2 would mean ripping this API
> out again.
>
> So let's get this right from the start. Rather than shoehorning
> this into shash, how about we add this to ahash instead where an
> async return is a natural part of the API?
>
> In fact, if we do it there we don't need to make any major changes
> to the API.
> You could simply add an optional flag to the
> request flags to indicate that more requests will be forthcoming
> immediately.
>
> The algorithm could then either delay the current request if it
> is supported, or process it immediately as is the case now.
>

The kernel already had ahash-based multibuffer hashing years ago. It
failed spectacularly: it was extremely complex, buggy, slow, and
potentially insecure because it mixed requests from different contexts.
Sure, it could have been improved slightly by adding flush support, but
most issues would have remained.

Synchronous hashing really is the right model here. One of the main
performance issues we are having with dm-verity and fs-verity is the
scheduling hops associated with the workqueues on which the dm-verity and
fs-verity work runs. If there were another scheduling hop from the worker
task to another task to do the actual hashing, that would be even worse
and would defeat the point of doing multibuffer hashing. And with the
ahash-based API this would be difficult to avoid: when an individual
request gets submitted and put on a queue somewhere, it loses the
information about the original submitter, so when it finally gets hashed
it might be by another task (which the original task would then have to
wait for). I guess the submitter could provide some sort of tag that
makes the request be put on a dedicated queue that would eventually get
processed only by the same task (which might also be needed for security
reasons anyway, due to all the CPU side channels), but adding tag support
to the API and supporting an arbitrary number of queues would add a lot
of complexity.

And then there's the issue of request lengths. With one-at-a-time
submission via 'ahash_request', each request would have its own length.
Having to support multibuffer hashing of different-length requests would
add a massive amount of complexity and edge cases that are difficult to
get correct, as was shown by the old ahash-based code.
This suggests that either the API needs to enforce that all the lengths
are the same, or it needs to provide a clean API (my patch) where the
caller just provides a single length that applies to all messages.

So the synchronous API really seems like the right approach, whereas
shoehorning it into the asynchronous hash API would result in something
much more complex and not actually useful for the intended use cases.

If you're concerned about the hardcoding to 2x specifically, how about the
following API instead:

    int crypto_shash_finup_mb(struct shash_desc *desc,
                              const u8 *datas[], unsigned int len,
                              u8 *outs[], int num_msgs)

This would allow extension to higher interleaving factors.

I do suspect that anything higher than 2x isn't going to be very practical
for in-kernel use cases, where code size, latency, and per-request memory
usage tend to be very important. Regardless, this would make the API able
to support higher interleaving factors.

- Eric