Date: 5 Oct 2018 15:05:38 -0000
Message-ID: <20181005150538.17006.qmail@cr.yp.to>
Mail-Followup-To: Jason@zx2c4.com, ard.biesheuvel@linaro.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, linux-crypto@vger.kernel.org, davem@davemloft.net, gregkh@linuxfoundation.org, sneves@dei.uc.pt, luto@kernel.org, jeanphilippe.aumasson@gmail.com, linux@armlinux.org.uk, linux-arm-kernel@lists.infradead.org, peter@cryptojedi.org
Automatic-Legal-Notices: See https://cr.yp.to/mailcopyright.html.
From: "D. J. Bernstein"
To: "Jason A.
Donenfeld", Ard Biesheuvel, LKML, Netdev, Linux Crypto Mailing List, David Miller, Greg Kroah-Hartman, Samuel Neves, Andrew Lutomirski, Jean-Philippe Aumasson, Russell King - ARM Linux, linux-arm-kernel@lists.infradead.org, Peter Schwabe
Subject: Re: [PATCH net-next v6 19/23] zinc: Curve25519 ARM implementation
References: <20180925145622.29959-1-Jason@zx2c4.com> <20180925145622.29959-20-Jason@zx2c4.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="6c2NcOVqGQ03X4Wi"
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

--6c2NcOVqGQ03X4Wi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

For the in-order ARM Cortex-A8 (the target for this code), adjacent
multiply-add instructions forward summands quickly. A simple in-order
dot-product computation has no latency problems, while interleaving
computations, as suggested in this thread, creates problems. Also, on
this microarchitecture, occasional ARM instructions run in parallel with
NEON, so trying to manually eliminate ARM instructions through global
pointer tracking wouldn't gain speed; it would simply create unnecessary
code-maintenance problems.

See https://cr.yp.to/papers.html#neoncrypto for analysis of the
performance of---and remaining bottlenecks in---this code. Further
speedups should be possible on this microarchitecture, but, for anyone
interested in this, I recommend focusing on building a cycle-accurate
simulator (e.g., fixing inaccuracies in the Sobole simulator) first.

Of course, there are other ARM microarchitectures, and there are many
cases where different microarchitectures prefer different optimizations.
The kernel already has boot-time benchmarks for different optimizations
for raid6, and should do the same for crypto code, so that implementors
can focus on each microarchitecture separately rather than living in the
barbaric world of having to choose which CPUs to favor.

---Dan

--6c2NcOVqGQ03X4Wi
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIcBAEBAgAGBQJbt33CAAoJELDADU47DlRZPQwP/RX2uAhuaumOVvp9njd8n0zY
92cmL/mUx10TUsw/u76ryVjvdDkRd/8BAOtNQ5dYqLReiNcMqCX7keWe7F9myXWO
/LsobUdU28F4Q2iP0a7TwZIzKyxwQzHUnHsAl0tydsWoX0Nb2bs8gbpkf4AJ4BWr
41WDxiynGZezl7FUcXA0RgdxDMdFZxYXSw6wRDUscFs7MfOaDO9nvhNYRWVn+lNZ
DDnTCO3YPLc2qe5uPcH/CVOj+CIBVOd9uzO7ggmBqNfBWyYzKebu4xyY1bg8GXJa
Xj4y0ob+plAf427Svz5X2br4t2Lwg4VSrQ1kU3qXfCY46D1DJKraeKbLeiKijB6N
5vpxl1nKBvMyIQHr6dE/Q7qOvasYZuvKd+A/xv2wUwQxkdTI2u1tz+Oa2BkS2zyR
Oukto1kIYcfq8BlWxCIItjholBY5opfCA3tT1QdghBWILZwMN9IVGyb4kkCdxdcX
Py/DkEY2Cen90LNO5UT+3g4g/D/ALTwAvATg3U1JKgJfti6zQKLxUMQcmEiV4qky
aFcR1mpPoxwttoNd4s9zA1WFWBl6dCQjMhqhOvM3rBwByJyum0SQFqyx+F9P5Cpv
LM5V4MMVng8rGcF/E99FeVSPcdLAa8TjYfFqTfbe/hhFLJcvQ1PtdHeP8oSMvhqM
1IA4fIdgtto2qD3k37Kr
=AM3C
-----END PGP SIGNATURE-----

--6c2NcOVqGQ03X4Wi--
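
The message's suggestion of boot-time benchmarks (as the kernel already
does for raid6) amounts to a select-by-measurement pattern. Below is a
minimal userspace sketch of that pattern, written in Python rather than
kernel C and not taken from the patch series or from the raid6 code.
The names pick_fastest, dot_reference and dot_builtin are hypothetical,
and the two dot-product routines merely stand in for competing
implementations of the same primitive.

```python
import time


def dot_reference(a, b):
    """Plain in-order multiply-accumulate; also serves as the correctness oracle."""
    total = 0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_builtin(a, b):
    """Alternative (hypothetical) implementation of the same primitive."""
    return sum(x * y for x, y in zip(a, b))


def pick_fastest(candidates, args, rounds=5, clock=time.perf_counter):
    """Return (name, fn) of the fastest candidate that gives the right answer.

    candidates: list of (name, callable); the first entry is treated as
                the reference whose result the others must match.
    args:       tuple of arguments passed to every candidate.
    rounds:     number of timed runs; the minimum is used to reduce noise.
    clock:      timing source, injectable so the logic can be tested.
    """
    expected = candidates[0][1](*args)
    best_name, best_fn, best_time = None, None, float("inf")
    for name, fn in candidates:
        # Never select an implementation that computes a wrong result.
        if fn(*args) != expected:
            continue
        samples = []
        for _ in range(rounds):
            start = clock()
            fn(*args)
            samples.append(clock() - start)
        elapsed = min(samples)
        if elapsed < best_time:
            best_name, best_fn, best_time = name, fn, elapsed
    return best_name, best_fn


if __name__ == "__main__":
    a = list(range(1000))
    b = list(range(1000, 2000))
    name, fn = pick_fastest(
        [("reference", dot_reference), ("builtin", dot_builtin)], (a, b)
    )
    # Which one wins depends on the machine, which is the point.
    print("selected implementation:", name)
```

The selection runs once and the chosen function is reused afterwards, so
each implementation can be tuned for one microarchitecture without
anyone deciding ahead of time which CPU to favor. A real boot-time
benchmark would presumably need a more careful timing source and a
representative workload; this sketch only shows the control flow.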