Subject: Re: [PATCH 0/6] crypto: x86/chacha20 - SIMD performance improvements
From: Martin Willi
To: "Jason A. Donenfeld"
Cc: Eric Biggers, Ard Biesheuvel, Linux Crypto Mailing List, X86 ML,
    Samuel Neves, Herbert Xu
Date: Mon, 19 Nov 2018 08:52:29 +0100
References: <20181111093630.28107-1-martin@strongswan.org>
    <20181116022059.aob3ruc7umg32go6@gondor.apana.org.au>

Hi Jason,

> I'd be inclined to roll with your implementation if it can eventually
> become competitive with Andy Polyakov's, [...]

I think for the SSSE3/AVX2 code paths it is competitive; especially for
small sizes it is faster, which is not that unimportant when
implementing layer 3 VPNs.

> there are still no AVX-512 paths, which means it's considerably
> slower on all newer generation Intel chips. Andy's has the AVX-512VL
> implementation for Skylake (using ymm, so as not to hit throttling)
> and AVX-512F for Cannon Lake and beyond (using zmm).

I don't think that having AVX-512F is that important until it is really
usable on CPUs in the market.

Adding AVX-512VL support is relatively simple. I have a patchset mostly
ready that is more than competitive with the code from Zinc. I'll clean
that up and do more testing before posting it later this week.

Best regards
Martin