From: "Jason A. Donenfeld"
Date: Wed, 3 Oct 2018 03:03:09 +0200
Subject: Re: [PATCH net-next v6 19/23] zinc: Curve25519 ARM implementation
To: Ard Biesheuvel
Cc: LKML, Netdev, Linux Crypto Mailing List, David Miller,
    Greg Kroah-Hartman, Samuel Neves, Andrew Lutomirski,
    Jean-Philippe Aumasson, Russell King - ARM Linux,
    linux-arm-kernel@lists.infradead.org, Peter Schwabe,
    "Daniel J. Bernstein"
References: <20180925145622.29959-1-Jason@zx2c4.com>
    <20180925145622.29959-20-Jason@zx2c4.com>
X-Mailing-List: linux-kernel@vger.kernel.org

(+Dan,Peter in CC. Replying to: for context.)

Hi Ard,

On Tue, Oct 2, 2018 at 6:59 PM Ard Biesheuvel wrote:
> Shouldn't this use the new simd abstraction as well?

Yes, it probably should, thanks.

> I guess qhasm means generated code, right?
> Because many of these adds are completely redundant ...
> This looks odd as well.
> Could you elaborate on what qhasm is exactly? And, as with the other
> patches, I would prefer it if we could have your changes as a separate
> patch (although having the qhasm base would be preferred)

Indeed qhasm converts this -- -- into this. It's a thing from Dan
(CC'd now) -- . As you've requested, I can layer the patches to show
our changes on top.

> ... you can drop this add
> same here
> and here
> and here
> and here
> and here
> and here
> and here
> redundant add
> I'll stop here - let me just note that this code does not strike me as
> particularly well optimized for in-order cores (such as A7).
> For instance, the sequence
> can be reordered as
> and not have every other instruction depend on the output of the previous one.
> Obviously, the ultimate truth is in the benchmark numbers, but I'd
> thought I'd mention it anyway.

Yes indeed the output is suboptimal in a lot of places. We can
gradually clean this up -- slowly and carefully over time -- if you
want. I can also look into producing a new implementation within HACL*
so that it's verified. Assurance-wise, though, I feel pretty good
about this implementation considering its origins, its breadth of use
(in BoringSSL), the fuzzing hours it's incurred, and the actual
implementation itself.

Either way, performance-wise, it's really worth having. For example,
on a Cortex-A7, we get these results (according to get_cycles()):

neon: 23142 cycles per call
fiat32: 49136 cycles per call
donna32: 71988 cycles per call

And on a Cortex-A9, we get these results (according to get_cycles()):

neon: 5020 cycles per call
fiat32: 17326 cycles per call
donna32: 28076 cycles per call

Jason
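
For a rough sense of scale, the cycle counts quoted in the message can be turned into ratios against the NEON implementation. The sketch below only performs that division on the six numbers given above; the `CYCLES` table layout and the `speedups` helper are hypothetical names introduced for illustration and are not part of the patch series.

```python
# Cycle counts per call, copied from the figures quoted in the message.
CYCLES = {
    "Cortex-A7": {"neon": 23142, "fiat32": 49136, "donna32": 71988},
    "Cortex-A9": {"neon": 5020, "fiat32": 17326, "donna32": 28076},
}


def speedups(results, baseline="neon"):
    """Return each implementation's cycle count as a multiple of `baseline`."""
    base = results[baseline]
    return {name: cycles / base for name, cycles in results.items()}


if __name__ == "__main__":
    for core, results in CYCLES.items():
        for name, ratio in speedups(results).items():
            print(f"{core} {name}: {ratio:.2f}x the neon cycle count")
```

By this arithmetic, fiat32 and donna32 take roughly 2.1x and 3.1x as many cycles as the NEON code on the Cortex-A7, and roughly 3.5x and 5.6x as many on the Cortex-A9.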