Received: by 2002:a05:6a10:1a4d:0:0:0:0 with SMTP id nk13csp4833171pxb; Sun, 6 Feb 2022 05:04:40 -0800 (PST) X-Google-Smtp-Source: ABdhPJwWSdUx7d0tDrYMIpT4E6LzlINtYhEySROy1aBhh95sUCJMPkN9aHZWw1G76PEaOwr1jm9r X-Received: by 2002:a62:784b:: with SMTP id t72mr11614274pfc.86.1644152680079; Sun, 06 Feb 2022 05:04:40 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1644152680; cv=none; d=google.com; s=arc-20160816; b=EEr6wO32dSbXriXStesecdJdX2QW51fGSP/74c3bgsb7KX9FHaJlTr6Z8NSLCjObsA dXAIGvAbvq63BH6t13RdVBRBxFjH74FQgONZB59aEkBXIuRzKkW8F2xd2dNmfjeJ5QeX ZMWUrmf8ZI9U0vLdBEz5kW/+ylBnIQ9qfWAycWqio2hlKnl5I4Uh2iilkmj1R5eFTQ08 /KFBbTnhWywbiQ5Of729yVhVk6/nO2fZRQmAatkFGYGpD0ZUNFx+BX+A1Knsj13s2Z2h tZCCT3bYhCUU5d1DIpZP2k/MjjgtUig2Jvrae3Oxxpr4NY0jtUJgURGMDPS2RNAgFpcW nSew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=XpwRsLoPLUrbJQbicMUIZ2C+alR57WAc28DvC5CezH4=; b=MsvHxMFvkNLaij88golGbXy/WG2LvLt/+F2JJjSW6wW0O+ugZo/frVuLD/NSUVxqdL QtlbwIH85kt82DI7ATmbXdGswl4Aaj1jCQdLxJism8omWl6eqLMdXwJxlBpnRjRSq1y9 IDvDnD/LGBFAjrjZUob4mVG5H04HitS9Uz4S1cCfeMmQACwGClhL9d56LCtnsS+QdbSM bIB575b4g+xeHaHr7sMPL2R0LFfG7O2OyIfOjs+11Rd5VbM9wgUFxlR76wgILfdu/iiJ ohzoXTTh6qILpYsHa2Byxznld/5nl5icrmZeqoPiyf5etgv20PDedobYmb7BLTHpXGyk eFfQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Ac1wns1v; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id p42si7732338pfh.29.2022.02.06.05.04.14; Sun, 06 Feb 2022 05:04:40 -0800 (PST) Received-SPF: pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Ac1wns1v; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229379AbiBEH67 (ORCPT + 99 others); Sat, 5 Feb 2022 02:58:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379668AbiBEH67 (ORCPT ); Sat, 5 Feb 2022 02:58:59 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 134B7C061347 for ; Fri, 4 Feb 2022 23:58:58 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id CA826B83992 for ; Sat, 5 Feb 2022 07:58:56 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5782EC340F1; Sat, 5 Feb 2022 07:58:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1644047935; bh=qcipBAZoHSGtkTN9yHJ0ciJ/e0rE2CJLdEsIWqhrb4M=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Ac1wns1vGmOGVybj4WRqy72+w8XMMfrzo7YguCar/9yrbmhGvbz57KOsvrU4itSIK jC1QFf8uUx7IVzqMrAqmkY5of/OBXU487Zo2EORXTN8tCC3k4nD7XNntaRmX9mEQGp fT9UoYBNEUFCVqkSCimrATpwnbt98DjPFO7dDE/gNwPWi1CgGL2QBjgGs6JYfiZwp9 eMe5JCquDSUQNXMRIijlSv/G0PzBkMKLfr3kLjCzzIcOHvktyhr2uatuRQXkyAWdjs 65QRr0nqq0irxy+GsAzgBRcOC8KodnKzFUoeDWNjCNFnRfLLD7J63JdKF5ChjPYA6M Zm4W/6n7WBSEw== From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, Ard Biesheuvel , Nick Desaulniers , Nathan Chancellor Subject: [PATCH v3 2/2] crypto: arm/xor - make vectorized C code Clang-friendly Date: Sat, 5 Feb 2022 08:58:37 +0100 Message-Id: <20220205075837.153418-3-ardb@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220205075837.153418-1-ardb@kernel.org> References: <20220205075837.153418-1-ardb@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=2129; h=from:subject; bh=qcipBAZoHSGtkTN9yHJ0ciJ/e0rE2CJLdEsIWqhrb4M=; b=owEB7QES/pANAwAKAcNPIjmS2Y8kAcsmYgBh/i4s5ntRJgpuwWrmJYcYI53WG3WN6XF6sRBUsVrj R0+0GD2JAbMEAAEKAB0WIQT72WJ8QGnJQhU3VynDTyI5ktmPJAUCYf4uLAAKCRDDTyI5ktmPJNl+C/ 4xnRRyLZFJFxCexy2mNZEsK0JRjfPOwCUQHEAciBx5Y2aer8tXGHNuXuTPK6xfScdYPxrKTIOBpehY FrN/gmfPolFcXhnB3jIGAgRaReGST4QpztID0eKqkyVQC6CJiATPlrUCY0B6R7jAOjiwkfaZPPVs6J U6EoGv8HCB15VVbAb3bu5VVjxrWcwdRdd6qLS6PLRN6dyxkb7wLvezTL5BKUQVzUU6qbvsyAGnrd7x 0ssH9guu2v8rlNtfLeaTULukGJx0q2pEL/1pDY1jKC/jhiN3HfhH3nrVx3E9D4XbzmyRAQLJ7P7fsk R5RmWBgGemKXTmc0/w31NRIAwPkHxrTK/FLbu6hiStZlCDtnSlf7Rsw6moc30D2w+YGExSbligtfCW lYQa7df47c/YdjMfUSr90hLn54qqnBYlhi4myMnu6aYq9PBergXmXl2J9FDTb7Emq8qxQ2UhzwDY/I PrQ5rf+J0T/HkGpJZ1tBNJbGq3klex06DnYsA+w0HXdEI= X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org The ARM version of the accelerated XOR routines are simply the 8-way C routines passed through the auto-vectorizer with SIMD codegen enabled. This used to require GCC version 4.6 at least, but given that 5.1 is now the baseline, this check is no longer necessary, and actually misidentifies Clang as GCC < 4.6 as Clang defines the GCC major/minor as well, but makes no attempt at doing this in a way that conveys feature parity with a certain version of GCC (which would not be a great idea in the first place). So let's drop the version check, and make the auto-vectorize pragma (which is based on a GCC-specific command line option) GCC-only. Since Clang performs SIMD auto-vectorization by default at -O2, no pragma is necessary here. Tested-by: Nathan Chancellor Signed-off-by: Ard Biesheuvel Reviewed-by: Nick Desaulniers Link: https://github.com/ClangBuiltLinux/linux/issues/496 Link: https://github.com/ClangBuiltLinux/linux/issues/503 --- arch/arm/lib/xor-neon.c | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c index b99dd8e1c93f..522510baed49 100644 --- a/arch/arm/lib/xor-neon.c +++ b/arch/arm/lib/xor-neon.c @@ -17,17 +17,11 @@ MODULE_LICENSE("GPL"); /* * Pull in the reference implementations while instructing GCC (through * -ftree-vectorize) to attempt to exploit implicit parallelism and emit - * NEON instructions. + * NEON instructions. Clang does this by default at O2 so no pragma is + * needed. */ -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6) +#ifdef CONFIG_CC_IS_GCC #pragma GCC optimize "tree-vectorize" -#else -/* - * While older versions of GCC do not generate incorrect code, they fail to - * recognize the parallel nature of these functions, and emit plain ARM code, - * which is known to be slower than the optimized ARM code in asm-arm/xor.h. - */ -#warning This code requires at least version 4.6 of GCC #endif #pragma GCC diagnostic ignored "-Wunused-variable" -- 2.30.2