From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org, linux@armlinux.org.uk
Cc: linux-crypto@vger.kernel.org, Ard Biesheuvel, Linus Walleij, Arnd Bergmann
Subject: [PATCH v2 0/2] ARM: allow kernel mode NEON in softirq context
Date: Wed, 7 Dec 2022 11:39:34 +0100
Message-Id: <20221207103936.2198407-1-ardb@kernel.org>
X-Mailing-List: linux-crypto@vger.kernel.org

Currently on ARM, we only permit kernel mode NEON in task context, and NEON
based processing triggered from softirq context is queued for asynchronous
completion via the crypto API's cryptd layer.

For IPsec packet encryption involving highly performant crypto
implementations, this results in a substantial performance hit, and so
it would be desirable to permit those crypto operations to complete
synchronously even when invoked from softirq context.

For example, on a 1 GHz Cortex-A53 machine (SynQuacer), AES-256-GCM
executes in 7.2 cycles per byte, putting an upper bound of ~140 MB/s on
the achievable throughput of a single CPU.

Without these changes, an IPsec tunnel from a 32-bit VM to the 64-bit
host can achieve a throughput of 9.5 MB/s TX and 11.9 MB/s RX.

When the crypto algorithm is permitted to execute in softirq context,
the throughput increases to 16.5 MB/s TX and 41 MB/s RX.

(This is measured using Debian's iperf3 3.11 with the default options.)

So let's reorganize the VFP state handling so that its critical
handling of the FPU registers runs with softirqs disabled. Then, update
the kernel_neon_begin()/end() logic to keep softirq processing disabled
as long as the NEON is being used in kernel mode.

Cc: Linus Walleij
Cc: Arnd Bergmann
Cc: Russell King

Ard Biesheuvel (2):
  ARM: vfp: Manipulate VFP state with softirqs disabled
  ARM: permit non-nested kernel mode NEON in softirq context

 arch/arm/include/asm/assembler.h | 19 ++++++++++++-------
 arch/arm/include/asm/simd.h      |  8 ++++++++
 arch/arm/kernel/asm-offsets.c    |  1 +
 arch/arm/vfp/entry.S             |  4 ++--
 arch/arm/vfp/vfphw.S             |  4 ++--
 arch/arm/vfp/vfpmodule.c         | 19 ++++++++++++-------
 6 files changed, 37 insertions(+), 18 deletions(-)
 create mode 100644 arch/arm/include/asm/simd.h

-- 
2.35.1
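
As a quick sanity check of the throughput figures quoted in the cover
letter above, the arithmetic can be sketched as follows. This is only an
illustration: the function name is hypothetical, and it assumes that
1 MB means 10^6 bytes and that the CPU spends every cycle on the cipher.

```python
def max_throughput_mb_s(clock_hz: float, cycles_per_byte: float) -> float:
    """Upper bound on single-CPU throughput in MB/s (assumes 1 MB = 10**6 bytes)."""
    return clock_hz / cycles_per_byte / 1e6


# Figures taken from the cover letter: 1 GHz Cortex-A53, AES-256-GCM at 7.2 cycles/byte.
bound = max_throughput_mb_s(1e9, 7.2)

# Ratio of the reported softirq-context throughput to the reported baseline.
tx_speedup = 16.5 / 9.5
rx_speedup = 41.0 / 11.9

print(f"upper bound: {bound:.1f} MB/s")
print(f"TX speedup:  {tx_speedup:.2f}x")
print(f"RX speedup:  {rx_speedup:.2f}x")
```

The computed bound of roughly 139 MB/s matches the ~140 MB/s quoted above,
and the reported RX figure of 41 MB/s is still well below it.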