From: David Laight
To: 'Arvind Sankar', Herbert Xu, "David S. Miller", "linux-crypto@vger.kernel.org"
CC: "linux-kernel@vger.kernel.org"
Subject: RE: [PATCH 4/5] crypto: lib/sha256 - Unroll SHA256 loop 8 times intead of 64
Date: Tue, 20 Oct 2020 07:41:33 +0000
Message-ID: <1324eb3519d54ddd9469d30a94c11823@AcuMS.aculab.com>
References: <20201019153016.2698303-1-nivedita@alum.mit.edu> <20201019153016.2698303-5-nivedita@alum.mit.edu>
In-Reply-To: <20201019153016.2698303-5-nivedita@alum.mit.edu>
X-Mailing-List: linux-crypto@vger.kernel.org

From: Arvind Sankar
> Sent: 19 October 2020 16:30
> To: Herbert Xu; David S. Miller; linux-crypto@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Subject: [PATCH 4/5] crypto: lib/sha256 - Unroll SHA256 loop 8 times intead of 64
>
> This reduces code size substantially (on x86_64 with gcc-10 the size of
> sha256_update() goes from 7593 bytes to 1952 bytes including the new
> SHA256_K array), and on x86 is slightly faster than the full unroll.

The speed will depend on exactly which cpu type is used.
It is even possible that the 'not unrolled at all' loop
(with all the extra register moves) is faster on some x86-64 cpu.
>
> Signed-off-by: Arvind Sankar
> ---
>  lib/crypto/sha256.c | 164 ++++++++------------------------------------
>  1 file changed, 28 insertions(+), 136 deletions(-)
>
> diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c
> index c6bfeacc5b81..9f0b71d41ea0 100644
> --- a/lib/crypto/sha256.c
> +++ b/lib/crypto/sha256.c
> @@ -18,6 +18,17 @@
>  #include
>  #include
...
>
> +#define SHA256_ROUND(i, a, b, c, d, e, f, g, h) do {		\
> +	u32 t1, t2;						\
> +	t1 = h + e1(e) + Ch(e, f, g) + SHA256_K[i] + W[i];	\
> +	t2 = e0(a) + Maj(a, b, c); d += t1; h = t1 + t2;	\

Split to 3 lines.

If you can put SHA256_K[] and W[] into a struct then the compiler
can use the same register to address into both arrays (using an
offset of 64*4 for the second one).
(ie keep the two arrays, not an array of struct).
This should reduce the register pressure slightly.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
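
For illustration only, the struct suggestion above might look roughly
like the sketch below. It is not taken from the patch: the names
struct sha256_kw and sha256_one_round() are hypothetical, and userspace
uint32_t stands in for the kernel's u32. It also assumes the round
constants are copied into (or live in) the same writable object as W[],
since W[] changes for every block. Whether the compiler really addresses
both arrays from one base register depends on the compiler and target.
The last statement of the quoted macro is split across three lines, as
requested in the review.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: two separate arrays inside one object. */
struct sha256_kw {
	uint32_t k[64];	/* round constants (SHA256_K[] in the patch) */
	uint32_t w[64];	/* message schedule (W[] in the patch) */
};

static inline uint32_t ror32(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

#define Ch(x, y, z)	((z) ^ ((x) & ((y) ^ (z))))
#define Maj(x, y, z)	(((x) & (y)) | ((z) & ((x) | (y))))
#define e0(x)		(ror32(x, 2) ^ ror32(x, 13) ^ ror32(x, 22))
#define e1(x)		(ror32(x, 6) ^ ror32(x, 11) ^ ror32(x, 25))

/*
 * One round. Both table reads go through the single pointer kw, so
 * w[i] sits at a fixed offset of 64*4 bytes from k[i].
 */
#define SHA256_ROUND(kw, i, a, b, c, d, e, f, g, h) do {		\
	uint32_t t1, t2;						\
	t1 = h + e1(e) + Ch(e, f, g) + (kw)->k[i] + (kw)->w[i];		\
	t2 = e0(a) + Maj(a, b, c);					\
	d += t1;							\
	h = t1 + t2;							\
} while (0)

/* Hypothetical helper: apply round i to an 8-word working state. */
static void sha256_one_round(const struct sha256_kw *kw, int i,
			     uint32_t s[8])
{
	SHA256_ROUND(kw, i, s[0], s[1], s[2], s[3], s[4], s[5], s[6], s[7]);
}
```

The cost of this layout is that the constants can no longer sit in a
shared read-only table next to a per-call W[]: either they are copied
next to W[] for each call, or the combined object is shared and the
function stops being reentrant. That would need measuring against the
register-pressure gain.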