Date: Mon, 16 Jan 2023 14:59:46 +0000
To: "linux-crypto@vger.kernel.org"
From: Peter Lafreniere
Cc: "ardb@kernel.org", "x86@kernel.org", "linux-kernel@vger.kernel.org",
 Peter Lafreniere
Subject: [PATCH] crypto: x86 - exit fpu context earlier in ECB/CBC macros
X-Mailing-List: linux-crypto@vger.kernel.org

Currently the ecb/cbc macros hold fpu context unnecessarily when using
scalar cipher routines (e.g. when handling odd sizes of blocks per walk).

Change the macros to drop fpu context as soon as the fpu is out of use.

No performance impact found (on Intel Haswell).

Signed-off-by: Peter Lafreniere
---
 arch/x86/crypto/ecb_cbc_helpers.h | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/ecb_cbc_helpers.h b/arch/x86/crypto/ecb_cbc_helpers.h
index eaa15c7b29d6..b83085e18ab0 100644
--- a/arch/x86/crypto/ecb_cbc_helpers.h
+++ b/arch/x86/crypto/ecb_cbc_helpers.h
@@ -14,12 +14,13 @@
 #define ECB_WALK_START(req, bsize, fpu_blocks) do {			\
 	void *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));	\
+	const int __fpu_blocks = (fpu_blocks);				\
 	const int __bsize = (bsize);					\
 	struct skcipher_walk walk;					\
 	int err = skcipher_walk_virt(&walk, (req), false);		\
 	while (walk.nbytes > 0) {					\
 		unsigned int nbytes = walk.nbytes;			\
-		bool do_fpu = (fpu_blocks) != -1 &&			\
-			      nbytes >= (fpu_blocks) * __bsize;		\
+		bool do_fpu = __fpu_blocks != -1 &&			\
+			      nbytes >= __fpu_blocks * __bsize;		\
 		const u8 *src = walk.src.virt.addr;			\
 		u8 *dst = walk.dst.virt.addr;				\
 		u8 __maybe_unused buf[(bsize)];				\
@@ -35,7 +36,12 @@
 } while (0)
 
 #define ECB_BLOCK(blocks, func) do {					\
-	while (nbytes >= (blocks) * __bsize) {				\
+	const int __blocks = (blocks);					\
+	if (do_fpu && __blocks < __fpu_blocks) {			\
+		kernel_fpu_end();					\
+		do_fpu = false;						\
+	}								\
+	while (nbytes >= __blocks * __bsize) {				\
 		(func)(ctx, dst, src);					\
 		ECB_WALK_ADVANCE(blocks);				\
 	}								\
@@ -53,7 +59,12 @@
 } while (0)
 
 #define CBC_DEC_BLOCK(blocks, func) do {				\
-	while (nbytes >= (blocks) * __bsize) {				\
+	const int __blocks = (blocks);					\
+	if (do_fpu && __blocks < __fpu_blocks) {			\
+		kernel_fpu_end();					\
+		do_fpu = false;						\
+	}								\
+	while (nbytes >= __blocks * __bsize) {				\
 		const u8 *__iv = src + ((blocks) - 1) * __bsize;	\
 		if (dst == src)						\
 			__iv = memcpy(buf, __iv, __bsize);		\
-- 
2.39.0
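
For readers who want to see the effect of the change outside the kernel, the following is a rough userspace sketch of the control flow, not the kernel code itself. Everything in it is hypothetical: mock_fpu_begin()/mock_fpu_end() stand in for kernel_fpu_begin()/kernel_fpu_end(), struct walk_step and model_block() are made-up names, and the model assumes one wide routine of fpu_blocks blocks followed by a one-block scalar fallback, with the context released at the end of the step if it is still held. It counts how many scalar calls run while the (mock) FPU context is held, with and without the early exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for kernel_fpu_begin()/kernel_fpu_end(). */
static bool fpu_held;

static void mock_fpu_begin(void) { assert(!fpu_held); fpu_held = true; }
static void mock_fpu_end(void)   { assert(fpu_held);  fpu_held = false; }

/* State that the macros keep in local variables of the walk loop. */
struct walk_step {
	int nblocks;      /* whole blocks left in this step */
	int fpu_blocks;   /* SIMD width in blocks, or -1 for "never use the FPU" */
	bool do_fpu;      /* true while this step owns the FPU context */
	bool early_exit;  /* true: patched behaviour, false: old behaviour */
	int calls_in_fpu; /* narrower-than-SIMD calls made with the FPU held */
};

/* Models ECB_BLOCK(blocks, func): run a "blocks"-wide routine while data remains. */
static void model_block(struct walk_step *s, int blocks)
{
	/* Patched behaviour: a routine narrower than fpu_blocks is scalar,
	 * so release the FPU context before running it. */
	if (s->early_exit && s->do_fpu && blocks < s->fpu_blocks) {
		mock_fpu_end();
		s->do_fpu = false;
	}
	while (s->nblocks >= blocks) {
		if (fpu_held && blocks < s->fpu_blocks)
			s->calls_in_fpu++;
		s->nblocks -= blocks;
	}
}

/* Returns how many scalar (1-block) calls ran while the FPU context was held. */
int scalar_calls_under_fpu(int nblocks, int fpu_blocks, bool early_exit)
{
	struct walk_step s = { nblocks, fpu_blocks, false, early_exit, 0 };

	s.do_fpu = fpu_blocks != -1 && nblocks >= fpu_blocks;
	if (s.do_fpu)
		mock_fpu_begin();

	if (fpu_blocks > 0)
		model_block(&s, fpu_blocks); /* wide SIMD routine */
	model_block(&s, 1);                  /* scalar fallback */

	if (s.do_fpu)                        /* end of step: release if still held */
		mock_fpu_end();
	assert(!fpu_held);
	return s.calls_in_fpu;
}
```

In this model, a step of 11 blocks with fpu_blocks = 8 leaves 3 trailing blocks for the scalar routine. Calling scalar_calls_under_fpu(11, 8, false) reports those 3 calls as running with the context held, while scalar_calls_under_fpu(11, 8, true) reports none, which is the behaviour the patch aims for.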