From: "Jason A. Donenfeld"
Date: Thu, 3 Nov 2016 08:24:57 +0100
Subject: Re: [PATCH] poly1305: generic C can be faster on chips with slow unaligned access
To: Herbert Xu
Cc: "David S. Miller", linux-crypto@vger.kernel.org, LKML, Martin Willi
In-Reply-To: <20161103004934.GA30775@gondor.apana.org.au>
References: <20161102175810.18647-1-Jason@zx2c4.com> <20161102200959.GA23297@gondor.apana.org.au> <20161102210802.GA26741@gondor.apana.org.au> <20161102212657.GA26887@gondor.apana.org.au> <20161103004934.GA30775@gondor.apana.org.au>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Herbert,

On Thu, Nov 3, 2016 at 1:49 AM, Herbert Xu wrote:
> FWIW I'd rather live with a 6% slowdown than having two different
> code paths in the generic code. Anyone who cares about 6% would
> be much better off writing an assembly version of the code.

Please think twice before deciding that the generic C "is allowed to
be slow". It turns out to be used far more often than might be
obvious. For example, crypto is commonly done on the netdev layer --
like the case with mac80211-based drivers. At this layer, the FPU on
x86 isn't always available, depending on the path used. Some
combinations of drivers, packet family, and workload can result in the
generic C being used instead of the vectorized assembly for a massive
percentage of time. So, I think we do have a good motivation for
wanting the generic C to be as fast as possible.

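As a rough illustration of what two such cases can look like, here is
a minimal userspace sketch of a 32-bit little-endian load. It is not
the code from the patch: the macro HYPOTHETICAL_EFFICIENT_UNALIGNED_ACCESS
and the helper load_le32() are names made up for this example, and the
fast branch assumes a little-endian CPU.

```c
#include <stdint.h>
#include <string.h>

/*
 * Load a 32-bit little-endian word from a possibly unaligned pointer.
 * HYPOTHETICAL_EFFICIENT_UNALIGNED_ACCESS is an illustrative macro,
 * not a real kernel symbol.
 */
static uint32_t load_le32(const uint8_t *p)
{
#ifdef HYPOTHETICAL_EFFICIENT_UNALIGNED_ACCESS
	/*
	 * Fast case: one word-sized copy, which compilers typically turn
	 * into a single load. Assumes a little-endian host.
	 */
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
#else
	/*
	 * Portable case: assemble the word one byte at a time, which is
	 * correct at any alignment and on any endianness.
	 */
	return (uint32_t)p[0] |
	       (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 |
	       (uint32_t)p[3] << 24;
#endif
}
```

Both branches return the same value for the same input bytes on a
little-endian machine; only the way the bytes are fetched differs.
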
In the particular case of poly1305, these are the only spots where
unaligned accesses take place, and they're rather small, and I think
it's pretty obvious what's happening in the two different cases of
code from a quick glance. This isn't the "two different paths" case in
which there's a significant future-facing maintenance burden.

Jason