From: Ard Biesheuvel
Subject: Re: [PATCH net-next v3 02/17] zinc: introduce minimal cryptography library
Date: Tue, 18 Sep 2018 09:06:13 -0700
References: <20180911214737.GA81235@gmail.com> <20180911233015.GD11474@lunn.ch> <20180911.165739.2032677219588723041.davem@davemloft.net> <8937D6B1-D21C-4C47-8A89-A466CDB6FB04@amacapital.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Cc: "Jason A. Donenfeld", Andrew Lutomirski, David Miller, Andrew Lunn, Eric Biggers, Greg Kroah-Hartman, LKML, Netdev, Samuel Neves, Jean-Philippe Aumasson, Linux Crypto Mailing List
To: Andy Lutomirski
In-Reply-To: <8937D6B1-D21C-4C47-8A89-A466CDB6FB04@amacapital.net>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-crypto.vger.kernel.org

On 17 September 2018 at 07:53, Andy Lutomirski wrote:
>
>
>
>> On Sep 16, 2018, at 10:07 PM, Jason A. Donenfeld wrote:
>>
>> Hey Andy,
>>
>> Thanks a lot for your feedback.
>>
>>> On Mon, Sep 17, 2018 at 6:09 AM Andy Lutomirski wrote:
>>> 1. Zinc conflates the addition of a new API with the replacement of
>>> some algorithm implementations. This is problematic. Look at the
>>> recent benchmarks of ipsec before and after this series. Apparently
>>> big packets get faster and small packets get slower. It would be
>>> really nice to bisect the series to narrow down *where* the regression
>>> came from, but, as currently structured, you can't.
>>>
>>> The right way to do this is to rearrange the series. First, the new
>>> Zinc APIs should be added, and they should be backed with the
>>> *existing* crypto code. (If the code needs to be moved or copied to a
>>> new location, so be it. The patch will be messy because somehow the
>>> Zinc API is going to have to dispatch to the arch-specific code, and
>>> the way that the crypto API handles it is not exactly friendly to this
>>> type of use. So be it.)
>>> Then another patch should switch the crypto
>>> API to use the Zinc interface. That patch, *by itself*, can be
>>> benchmarked. If it causes a regression for small ipsec packets, then
>>> it can be tracked down relatively easily. Once this is all done, the
>>> actual crypto implementation can be changed, and that change can be
>>> reviewed on its own merits.
>>
>> That ipsec regression was less related to the implementation and more
>> related to calling kernel_fpu_begin() unnecessarily, something I've
>> now fixed. So I'm not sure that's such a good example. However, I can
>> try to implement Zinc over the existing assembly (Martin's and Ard's),
>> first, as you've described. This will be a pretty large amount of
>> work, but if you think it's worth it for the commit history, then I'll
>> do it.
>
> Ard, what do you think? I think it would be nice, but if the authors
> of that assembly are convinced it should be replaced, then this step
> is optional IMO.
>

I don't think there is any problem with switching to faster code
immediately, as long as we have data that supports the claim that it
is actually faster on hardware people care about.

The arm64 ChaCha20 code in the kernel is slower than the OpenSSL code
as far as I know, so I have no problems whatsoever with dropping it.
The ARM version, however, is slower on Cortex-A7 (according to Eric's
benchmarks), which is the only 32-bit ARM core anybody cares about
these days.

>>
>>> 2. The new Zinc crypto implementations look like they're brand new. I
>>> realize that they have some history, some of them are derived from
>>> OpenSSL, etc, but none of this is really apparent in the patches
>>> themselves.
>>
>> The whole point of going with these is that they _aren't_ brand new,
>> yet they are very fast. Eyeballs and fuzzer hours are important, and
>> AndyP's seems to get the most eyeballs and fuzzer hours, generally.
>>
>>> it would be nice if
>>> the patches made it more clear how the code differs from its origin.
>>> At the very least, though, if the replacement of the crypto code were,
>>> as above, a patch that just replaced the crypto code, it would be much
>>> easier to review and benchmark intelligently.
>>
>> You seem to have replied to the v3 thread, not the v4 thread. I've
>> already started to include lots of detail about the origins of the
>> code and [any] important differences in v4, and I'll continue to add
>> more detail for v5.
>
> This is indeed better. Ard’s reply covers this better.