Received: by 2002:a05:6358:9144:b0:117:f937:c515 with SMTP id r4csp9250335rwr; Thu, 11 May 2023 12:05:20 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5CO6xkaj0ZDla6/Hk9Q44+b3QIygrUy62iXxl9jB9L2rbDDDhqgk0RgdMqo+z/CgqNM4Qu X-Received: by 2002:a17:90a:34c2:b0:23d:376a:c2bc with SMTP id m2-20020a17090a34c200b0023d376ac2bcmr21987085pjf.5.1683831919722; Thu, 11 May 2023 12:05:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1683831919; cv=none; d=google.com; s=arc-20160816; b=MC1OC5dL4C/f6Y2F1LeEdlYFUwX9JpnD6fbsERxAF5ww7pdqsNLp33p/jYAlzUTQQp JT+qrLH540EF/0Dia+1PWJfuB7+QG+tFP+oSLgGVRHvWUGj3Tr+JzkhKifrM0KD6xu+C GXXmR2muFo9liPzoOx5oYOf64OykDhubTteDbvHor2hrB4Qbg17SEg9U+C7Qkq8uOowf 3ULRoWoVt+zuMujf6/85Htdx8QoOsD3ti05lhBJrZxfYFWCYWaALVIXZTqOcDOezLrw1 VwM99Vkjo0UKPKMsmkieSnDrijwkWizOiwXIMmubDfzbotlI8PbCmVXIvmJi7/uUeIxD 9PdQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=aRtr65hf9ccqynjXPIiNojbvPdO5ozceM8RcS1ysWrc=; b=t3xGWixfbtIBudscN1KSHTIhm4Y9m9+P4TXJlUo470j5q7sdZ9h0s+i9xBvMbXFuee 6UplOraxlEKpTm/TrqYRBs4+LOLQCVXFOoVrg7/4z7Z5ITAcaYQAAzvRlZjvO5SBVArf URMsxjfu5R8/14NjAeFMSrdpzuws+rV/1L/woINyvul7S72fQTf0zaEsTcWdo8kWZ8J5 sVYI5T4P7q86wlQuZw8CCOfIlc3vNfLCiN3/qa3f5XsSL5QDLDzuHwBSqqMq8PQeXa1h Io+ZzPKXPC/Sodcellxv4vfZeLdtYDtirgdX/1sNafUxbfuz7dn3haRvLVjhnfE5wICQ Zq+g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=ovDltNBS; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s8-20020a170902b18800b001a979e702b2si767900plr.416.2023.05.11.12.04.57; Thu, 11 May 2023 12:05:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=ovDltNBS; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238954AbjEKTCp (ORCPT + 99 others); Thu, 11 May 2023 15:02:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34062 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229834AbjEKTCo (ORCPT ); Thu, 11 May 2023 15:02:44 -0400 Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DAA01268F for ; Thu, 11 May 2023 12:02:42 -0700 (PDT) Received: by mail-wm1-x32b.google.com with SMTP id 5b1f17b1804b1-3f42397f41fso215575e9.1 for ; Thu, 11 May 2023 12:02:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1683831761; x=1686423761; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=aRtr65hf9ccqynjXPIiNojbvPdO5ozceM8RcS1ysWrc=; b=ovDltNBSrtnPvOFyWQzOYXTGlTtE+TOG9cKwFIL5AdP69UNcoYy5wLhZ/5uq8JhWjt wqaayeZLt0bd3vl25p1Bre8bkvv+cx225RlJe4dvL/1fcYJmD2NVvLblW2G3v5klZ3Qo Oz+WPQ6wI6V2cqEici32aQJtT44Zq7NUsOHa0xAcOamhxIHQm6XyoU/abiCCLT54cnXr 0B4G0xPk/KZdFYZoyESm7BKJDiBxPy1rsqe0gnWwHGX/p9nwRS8oMW0XCNArXDYGLE+2 U/GGopoV4e8XOWHxOWmikbak5JG0Vt6k5knpIr+REfP46AlSKy0WqO9+kgZ8Yj6dDkrB uMfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683831761; x=1686423761; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=aRtr65hf9ccqynjXPIiNojbvPdO5ozceM8RcS1ysWrc=; b=SfproQEhGv16LUALU40rpB3D16TM7jxiZsx/TZ5Vau7sB95uBFTZeP8UkYlRLv2qW/ t52Gv2WChtmQ0Yabf9FO7D14UkaI3NumOItKztWanusa8Jw9pn1a/T8yFBdeh4HtUapd shM6+n/vBfSIQXw0hhehreDHlZDlihat3AndoUOnEfbNjpJfzE7rUE2H2HF8CEQDa2x0 sKx3hqJWs4m09hhJes6OUQiXWBHItQ5sGt8iqvMV9+g86aoAfAceIGLSq4i9nhnzPUef UBgqJDBbANXG3OrFJmxA+4R1KLYQN0BVxFqhawL9gt+kZiQ1jrL+bXEOM2kzoNatlu47 tGPw== X-Gm-Message-State: AC+VfDzo+hE+eVWjIChUYh11aHADgz/hK5h0g0opuxmy6ZupOIjIUizo k8B2Vp4JmB2zhar2HtNwls96/fd9HL0g/OvGKlVRlA== X-Received: by 2002:a05:600c:3b18:b0:3f1:9a3d:4f7f with SMTP id m24-20020a05600c3b1800b003f19a3d4f7fmr25701wms.1.1683831760922; Thu, 11 May 2023 12:02:40 -0700 (PDT) MIME-Version: 1.0 References: <20230329140642.2186644-1-heiko.stuebner@vrull.eu> <20230329140642.2186644-5-heiko.stuebner@vrull.eu> <3540048.LM0AJKV5NW@diego> In-Reply-To: <3540048.LM0AJKV5NW@diego> From: Nathan Huckleberry Date: Thu, 11 May 2023 12:02:00 -0700 Message-ID: Subject: Re: [PATCH v4 4/4] RISC-V: crypto: add accelerated GCM GHASH implementation To: =?UTF-8?Q?Heiko_St=C3=BCbner?= Cc: palmer@dabbelt.com, paul.walmsley@sifive.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, christoph.muellner@vrull.eu Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org On Thu, May 11, 2023 at 3:30=E2=80=AFAM Heiko St=C3=BCbner wrote: > > Hi Nathan, > > Am Dienstag, 11. April 2023, 17:00:00 CEST schrieb Nathan Huckleberry: > > On Wed, Mar 29, 2023 at 7:08=E2=80=AFAM Heiko Stuebner wrote: > > > +struct riscv64_ghash_ctx { > > > + void (*ghash_func)(u64 Xi[2], const u128 Htable[16], > > > + const u8 *inp, size_t len); > > > + > > > + /* key used by vector asm */ > > > + u128 htable[16]; > > > > This field looks too big. The assembly only loads the first 128-byte > > value from this table. > > OpenSSL defines the Htable field handed to the init- and the other > functions as this "u128 Htable[16]" [0] . As I really like the concept > of keeping in sync with openSSL, I guess I'd rather not change that. > > [0] https://github.com/openssl/openssl/blob/master/crypto/modes/gcm128.c#= L88 > > > > Is this copied from another implementation? There's an optimization > > where you precompute the first N powers of H so that you can perform 1 > > finite field reduction for every N multiplications, but it doesn't > > look like that's being used here. > > The whole crypto-specific code comes from openSSL itself, so for now I > guess I'd like to try keeping things the same. > > > > > +#define RISCV64_ZBC_SETKEY(VARIANT, GHASH) = \ > > > +void gcm_init_rv64i_ ## VARIANT(u128 Htable[16], const u64 Xi[2]); = \ > > > +static int riscv64_zbc_ghash_setkey_ ## VARIANT(struct crypto_shash = *tfm, \ > > > + const u8 *key, = \ > > > + unsigned int keylen) = \ > > > +{ = \ > > > + struct riscv64_ghash_ctx *ctx =3D crypto_tfm_ctx(crypto_shash= _tfm(tfm)); \ > > > + const u64 k[2] =3D { cpu_to_be64(((const u64 *)key)[0]), = \ > > > + cpu_to_be64(((const u64 *)key)[1]) }; = \ > > > + = \ > > > + if (keylen !=3D GHASH_BLOCK_SIZE) = \ > > > + return -EINVAL; = \ > > > + = \ > > > + memcpy(&ctx->key, key, GHASH_BLOCK_SIZE); = \ > > > + gcm_init_rv64i_ ## VARIANT(ctx->htable, k); = \ > > > + = \ > > > + ctx->ghash_func =3D gcm_ghash_rv64i_ ## GHASH; = \ > > > + = \ > > > + return 0; = \ > > > +} > > > > I'd prefer three identical functions over a macro here. Code searching > > tools and compiler warnings are significantly worse with macros. > > done :-) > > > > > + > > > +static int riscv64_zbc_ghash_update(struct shash_desc *desc, > > > + const u8 *src, unsigned int srclen) > > > +{ > > > + unsigned int len; > > > + struct riscv64_ghash_ctx *ctx =3D crypto_tfm_ctx(crypto_shash= _tfm(desc->tfm)); > > > + struct riscv64_ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); > > > + > > > + if (dctx->bytes) { > > > + if (dctx->bytes + srclen < GHASH_DIGEST_SIZE) { > > > + memcpy(dctx->buffer + dctx->bytes, src, > > > + srclen); > > > + dctx->bytes +=3D srclen; > > > + return 0; > > > + } > > > + memcpy(dctx->buffer + dctx->bytes, src, > > > + GHASH_DIGEST_SIZE - dctx->bytes); > > > + > > > + ctx->ghash_func(dctx->shash, ctx->htable, > > > + dctx->buffer, GHASH_DIGEST_SIZE); > > > + > > > + src +=3D GHASH_DIGEST_SIZE - dctx->bytes; > > > + srclen -=3D GHASH_DIGEST_SIZE - dctx->bytes; > > > + dctx->bytes =3D 0; > > > + } > > > + len =3D srclen & ~(GHASH_DIGEST_SIZE - 1); > > > + > > > + if (len) { > > > + gcm_ghash_rv64i_zbc(dctx->shash, ctx->htable, > > > + src, len); > > > + src +=3D len; > > > + srclen -=3D len; > > > + } > > > + > > > + if (srclen) { > > > + memcpy(dctx->buffer, src, srclen); > > > + dctx->bytes =3D srclen; > > > + } > > > + return 0; > > > +} > > > + > > > +static int riscv64_zbc_ghash_final(struct shash_desc *desc, u8 *out) > > > +{ > > > + int i; > > > + struct riscv64_ghash_ctx *ctx =3D crypto_tfm_ctx(crypto_shash= _tfm(desc->tfm)); > > > + struct riscv64_ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); > > > + > > > + if (dctx->bytes) { > > > + for (i =3D dctx->bytes; i < GHASH_DIGEST_SIZE; i++) > > > + dctx->buffer[i] =3D 0; > > > + ctx->ghash_func(dctx->shash, ctx->htable, > > > + dctx->buffer, GHASH_DIGEST_SIZE); > > > > Can we do this without an indirect call? > > hmm, the indirect call is in both riscv64_zbc_ghash_update() and > riscv64_zbc_ghash_final() . And I found a missing one at the bottom > of riscv64_zbc_ghash_update(), where gcm_ghash_rv64i_zbc() is > called right now. > > Getting rid of the indirect call would mean duplicating both of these > functions for all instances. Especially with the slightly higher > complexity of the update this somehow seems not the best way to go. Indirect calls are quite expensive. They are an issue for things like disk/filesystem encryption because it introduces a ton of latency per block. I think this is a candidate for static calls. It looks like static call support hasn't been accepted for riscv yet. Maybe just add a TODO for now? See: https://lwn.net/Articles/771209/ https://lore.kernel.org/all/tencent_A8A256967B654625AEE1DB222514B0613B07@qq= .com/ > > > Thanks for your pointers > Heiko > > Thanks, Huck