From: Andy Lutomirski
Subject: Re: [PATCH net-next v3 02/17] zinc: introduce minimal cryptography library
Date: Mon, 17 Sep 2018 09:07:43 -0700
Message-ID:
References: <20180911214737.GA81235@gmail.com> <20180911233015.GD11474@lunn.ch> <20180911.165739.2032677219588723041.davem@davemloft.net> <35BC21D7-01F4-4F91-A7E9-8D15DE5B95D6@amacapital.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Cc: Ard Biesheuvel, Andrew Lutomirski, "David S. Miller", andrew@lunn.ch, Eric Biggers, Greg KH, LKML, Network Development, Samuel Neves, Jean-Philippe Aumasson, Linux Crypto Mailing List
To: "Jason A. Donenfeld"
Return-path:
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-Id: linux-crypto.vger.kernel.org

On Mon, Sep 17, 2018 at 8:32 AM Jason A. Donenfeld wrote:
>
> On Mon, Sep 17, 2018 at 4:52 PM Andy Lutomirski wrote:
> > I think the module organization needs to change. It needs to be possible to have chacha20 built in but AES or whatever as a module.
>
> Okay, I'll do that for v5.
>
> > I might have agreed before Spectre :(. Unfortunately, unless we do some magic, I think the code would look something like:
> >
> > if (static_branch_likely(have_simd))
> >         arch_chacha20();
> >
> > ...where arch_chacha20 is a *pointer*. And that will generate a retpoline and run very, very slowly. (I just rewrote some of the x86 entry code to eliminate one retpoline. I got a 5% speedup on some tests according to the kbuild bot.)
>
> Actually, the way it works now benefits from the compiler's inliner and
> the branch predictor. I benchmarked this without any retpoline
> slowdowns, and the branch predictor becomes correct pretty much all
> the time. We can tinker with this after the initial merge, if you
> really want, but avoiding function pointers and instead using ordinary
> branches really winds up being quite fast.

Indeed. What I'm saying is that you shouldn't refactor it this way because it will be slow.
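
To make the distinction concrete, here is a rough userspace sketch of the two dispatch styles. Everything in it is hypothetical and only for illustration: the function names are made up, the "cipher" is a placeholder XOR rather than ChaCha20, and a plain bool stands in for what the kernel would do with a static key. It is not the Zinc code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder transforms (NOT ChaCha20): both XOR with a constant so the
 * two dispatch paths can be compared. All names here are hypothetical. */
static void chacha20_generic_stub(uint8_t *dst, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ 0x5a;
}

static void chacha20_simd_stub(uint8_t *dst, const uint8_t *src, size_t len)
{
	/* Stands in for an arch-specific implementation; same result. */
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ 0x5a;
}

/* Assumption: a plain flag stands in for a kernel static key. */
static bool have_simd;

/* Pointer-based dispatch: the callee is only known at run time, so the
 * call is indirect and would go through a retpoline thunk in a
 * retpoline-enabled build. */
typedef void (*chacha20_fn)(uint8_t *dst, const uint8_t *src, size_t len);
static chacha20_fn arch_chacha20 = chacha20_generic_stub;

/* Hypothetical one-time setup, e.g. after CPU feature detection. */
static void select_impl(bool simd_available)
{
	have_simd = simd_available;
	arch_chacha20 = simd_available ? chacha20_simd_stub
				       : chacha20_generic_stub;
}

/* Branch-based dispatch: both callees are visible at compile time, so the
 * compiler can emit direct calls (or inline them) behind an ordinary,
 * well-predicted conditional branch. */
static void chacha20_direct(uint8_t *dst, const uint8_t *src, size_t len)
{
	if (have_simd)
		chacha20_simd_stub(dst, src, len);
	else
		chacha20_generic_stub(dst, src, len);
}

static void chacha20_indirect(uint8_t *dst, const uint8_t *src, size_t len)
{
	arch_chacha20(dst, src, len);	/* indirect call */
}

/* Sanity helper: both dispatch styles must produce identical output. */
static bool dispatch_paths_agree(const uint8_t *src, size_t len)
{
	uint8_t a[64], b[64];

	if (len > sizeof(a))
		return false;
	chacha20_direct(a, src, len);
	chacha20_indirect(b, src, len);
	return memcmp(a, b, len) == 0;
}
```

The two wrappers compute the same thing; the difference is only in the kind of call instruction that gets emitted. The branch version is roughly the shape the current code has, and the pointer version is roughly the shape a separately loadable arch module would tend to force.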
I agree it would be conceptually nice to be able to blacklist a chacha20_x86_64 module to disable the asm, but I think it would be very hard to get good performance.

--Andy