From: Ard Biesheuvel
Date: Mon, 26 Oct 2020 09:02:10 +0100
Subject: Re: [PATCH v4 6/6] crypto: lib/sha256 - Unroll LOAD and BLEND loops
To: Arvind Sankar
Cc: Herbert Xu, "David S. Miller", "linux-crypto@vger.kernel.org",
	Eric Biggers, David Laight, Linux Kernel Mailing List
In-Reply-To: <20201025143119.1054168-7-nivedita@alum.mit.edu>
References: <20201025143119.1054168-1-nivedita@alum.mit.edu>
	<20201025143119.1054168-7-nivedita@alum.mit.edu>
X-Mailing-List: linux-crypto@vger.kernel.org

On Sun, 25 Oct 2020 at 15:31, Arvind Sankar wrote:
>
> Unrolling the LOAD and BLEND loops improves performance by ~8% on x86_64
> (tested on Broadwell Xeon) while not increasing code size too much.
>
> Signed-off-by: Arvind Sankar
> Reviewed-by: Eric Biggers

Acked-by: Ard Biesheuvel

> ---
>  lib/crypto/sha256.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c
> index e2e29d9b0ccd..cdef37c05972 100644
> --- a/lib/crypto/sha256.c
> +++ b/lib/crypto/sha256.c
> @@ -76,12 +76,28 @@ static void sha256_transform(u32 *state, const u8 *input, u32 *W)
>  	int i;
>
>  	/* load the input */
> -	for (i = 0; i < 16; i++)
> -		LOAD_OP(i, W, input);
> +	for (i = 0; i < 16; i += 8) {
> +		LOAD_OP(i + 0, W, input);
> +		LOAD_OP(i + 1, W, input);
> +		LOAD_OP(i + 2, W, input);
> +		LOAD_OP(i + 3, W, input);
> +		LOAD_OP(i + 4, W, input);
> +		LOAD_OP(i + 5, W, input);
> +		LOAD_OP(i + 6, W, input);
> +		LOAD_OP(i + 7, W, input);
> +	}
>
>  	/* now blend */
> -	for (i = 16; i < 64; i++)
> -		BLEND_OP(i, W);
> +	for (i = 16; i < 64; i += 8) {
> +		BLEND_OP(i + 0, W);
> +		BLEND_OP(i + 1, W);
> +		BLEND_OP(i + 2, W);
> +		BLEND_OP(i + 3, W);
> +		BLEND_OP(i + 4, W);
> +		BLEND_OP(i + 5, W);
> +		BLEND_OP(i + 6, W);
> +		BLEND_OP(i + 7, W);
> +	}
>
>  	/* load the state into our registers */
>  	a = state[0]; b = state[1]; c = state[2]; d = state[3];
> --
> 2.26.2
>
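
For context, a minimal standalone C sketch of the message-schedule step the patch unrolls, with LOAD_OP and BLEND_OP expanded by hand. The helper names and bodies here (be32_load, ror32, s0, s1, sha256_schedule) are written from the SHA-256 specification for illustration and are assumptions, not the exact macro definitions in lib/crypto/sha256.c:

/*
 * Illustration only: a from-spec sketch of the schedule expansion with both
 * loops unrolled by 8, mirroring the structure of the patch above. Not the
 * kernel's LOAD_OP/BLEND_OP macro bodies.
 */
#include <stdint.h>

/* Big-endian 32-bit load; in the kernel, LOAD_OP uses get_unaligned_be32(). */
static inline uint32_t be32_load(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static inline uint32_t ror32(uint32_t v, unsigned int n)
{
	return (v >> n) | (v << (32 - n));
}

/* SHA-256 "small sigma" functions used by the message schedule. */
static inline uint32_t s0(uint32_t x) { return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3); }
static inline uint32_t s1(uint32_t x) { return ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10); }

/* Expand one 64-byte block into the 64-word schedule W. */
static void sha256_schedule(uint32_t W[64], const uint8_t input[64])
{
	int i;

	/* load the input: W[0..15] are the big-endian words of the block */
	for (i = 0; i < 16; i += 8) {
		W[i + 0] = be32_load(input + 4 * (i + 0));
		W[i + 1] = be32_load(input + 4 * (i + 1));
		W[i + 2] = be32_load(input + 4 * (i + 2));
		W[i + 3] = be32_load(input + 4 * (i + 3));
		W[i + 4] = be32_load(input + 4 * (i + 4));
		W[i + 5] = be32_load(input + 4 * (i + 5));
		W[i + 6] = be32_load(input + 4 * (i + 6));
		W[i + 7] = be32_load(input + 4 * (i + 7));
	}

	/* now blend: W[t] = s1(W[t-2]) + W[t-7] + s0(W[t-15]) + W[t-16] */
	for (i = 16; i < 64; i += 8) {
		W[i + 0] = s1(W[i - 2]) + W[i - 7] + s0(W[i - 15]) + W[i - 16];
		W[i + 1] = s1(W[i - 1]) + W[i - 6] + s0(W[i - 14]) + W[i - 15];
		W[i + 2] = s1(W[i + 0]) + W[i - 5] + s0(W[i - 13]) + W[i - 14];
		W[i + 3] = s1(W[i + 1]) + W[i - 4] + s0(W[i - 12]) + W[i - 13];
		W[i + 4] = s1(W[i + 2]) + W[i - 3] + s0(W[i - 11]) + W[i - 12];
		W[i + 5] = s1(W[i + 3]) + W[i - 2] + s0(W[i - 10]) + W[i - 11];
		W[i + 6] = s1(W[i + 4]) + W[i - 1] + s0(W[i - 9])  + W[i - 10];
		W[i + 7] = s1(W[i + 5]) + W[i + 0] + s0(W[i - 8])  + W[i - 9];
	}
}

Because each unrolled statement uses a compile-time-constant index, the compiler can keep the offsets in addressing modes instead of recomputing them per iteration, which is where the ~8% cited in the commit message plausibly comes from.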