From: "Jason A. Donenfeld"
Subject: Re: [PATCH net-next v6 07/23] zinc: ChaCha20 ARM and ARM64 implementations
Date: Thu, 27 Sep 2018 19:06:50 +0200
References: <20180925145622.29959-1-Jason@zx2c4.com> <20180925145622.29959-8-Jason@zx2c4.com>
To: Andy Lutomirski
Cc: Thomas Gleixner, Peter Zijlstra, Ard Biesheuvel, LKML, Netdev, Linux Crypto Mailing List, David Miller, Greg Kroah-Hartman, Samuel Neves, Andrew Lutomirski, Jean-Philippe Aumasson, Russell King - ARM Linux, linux-arm-kernel@lists.infradead.org
List-Id: linux-crypto.vger.kernel.org

On Thu, Sep 27, 2018 at 6:27 PM Andy Lutomirski wrote:
> I would add another consideration: if you can get better latency with
> negligible overhead (0.1%? 0.05%), then that might make sense too. For
> example, it seems plausible that checking need_resched() every few
> blocks adds basically no overhead, and the SIMD helpers could do this
> themselves or perhaps only ever do a block at a time.
>
> need_resched() costs a cacheline access, but it’s usually a hot
> cacheline, and the actual check is just whether a certain bit in
> memory is set.

Yes, you're right. I do plan to check quite often, rather than seldom,
for this reason. I'd been toying with the idea of instead processing
65k (the maximum size of a UDP packet) at a time before checking
need_resched(), but armed with the 20µs figure, that isn't remotely
possible on most hardware. So I'll stick with the original
conservative plan of checking very often, and won't deviate from what
the present crypto API has already worked out in this regard.
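
To make the "check very often" plan concrete, here is a minimal userspace sketch of the pattern, not the actual Zinc or crypto API code. Everything in it is hypothetical: CHUNK_SIZE (4096 is only an assumed value), need_resched_stub() and yield_stub() stand in for the kernel's need_resched() check and whatever rescheduling point follows it, and toy_cipher_chunk() is a placeholder XOR rather than ChaCha20.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chunk size: how much data is processed between checks.
 * 4096 is an assumption for illustration, not a value from this thread. */
#define CHUNK_SIZE 4096u

/* Userspace stand-ins for the kernel's scheduler hooks (assumptions). */
static size_t resched_checks; /* number of times the flag was polled */
static size_t yields;         /* number of times we gave up the CPU */

static bool need_resched_stub(void)
{
	resched_checks++;
	/* Pretend the "reschedule needed" bit is set on every 4th poll. */
	return (resched_checks % 4) == 0;
}

static void yield_stub(void)
{
	/* In the kernel, this is roughly where a SIMD section would be
	 * ended so the scheduler can run, then started again. */
	yields++;
}

/* Placeholder for the real cipher: XOR each byte with a constant. */
static void toy_cipher_chunk(uint8_t *dst, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ 0x5a;
}

/* Process the input in bounded chunks, polling the resched flag after
 * each chunk except the last, so latency is bounded by one chunk. */
static void process_chunked(uint8_t *dst, const uint8_t *src, size_t len)
{
	while (len) {
		size_t n = len < CHUNK_SIZE ? len : CHUNK_SIZE;

		assert(n > 0);
		toy_cipher_chunk(dst, src, n);
		dst += n;
		src += n;
		len -= n;

		if (len && need_resched_stub())
			yield_stub();
	}
}
```

With these assumed numbers, a maximum-size 65535-byte buffer is split into 16 chunks, so the flag is polled 15 times. The check itself is one flag test per chunk, so a smaller CHUNK_SIZE trades a few more polls for a shorter worst-case delay before the scheduler can run.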