From: Adrian Ratiu
To: linux-arm-kernel@lists.infradead.org
Cc: Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
    clang-built-linux@googlegroups.com, Russell King,
    linux-kernel@vger.kernel.org, kernel@collabora.com
Subject: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization
Date: Fri, 6 Nov 2020 07:14:36 +0200
Message-Id: <20201106051436.2384842-3-adrian.ratiu@collabora.com>
X-Mailer: git-send-email 2.29.0
In-Reply-To: <20201106051436.2384842-1-adrian.ratiu@collabora.com>
References: <20201106051436.2384842-1-adrian.ratiu@collabora.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Due to a Clang bug [1], NEON loop auto-vectorization either does not
happen or happens badly with no gains, and
considering previous GCC experience, where the generated unoptimized
code was worse than the default asm implementation, it is safer to
default Clang builds to the known-good generic implementation.

The kernel currently supports a minimum Clang version of v10.0.1, see
commit 1f7a44f63e6c ("compiler-clang: add build check for clang 10.0.1").

When the bug is eventually fixed, this commit can be reverted or, if
the minimum Clang version bump takes a long time, a warning can be
added telling users to upgrade their compilers, as was done for GCC.

[1] https://bugs.llvm.org/show_bug.cgi?id=40976

Signed-off-by: Adrian Ratiu
---
 arch/arm/include/asm/xor.h | 3 ++-
 arch/arm/lib/Makefile      | 3 +++
 arch/arm/lib/xor-neon.c    | 4 ++++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/xor.h b/arch/arm/include/asm/xor.h
index aefddec79286..49937dafaa71 100644
--- a/arch/arm/include/asm/xor.h
+++ b/arch/arm/include/asm/xor.h
@@ -141,7 +141,8 @@ static struct xor_block_template xor_block_arm4regs = {
 		NEON_TEMPLATES;			\
 	} while (0)
 
-#ifdef CONFIG_KERNEL_MODE_NEON
+/* disabled on clang/arm due to https://bugs.llvm.org/show_bug.cgi?id=40976 */
+#if defined(CONFIG_KERNEL_MODE_NEON) && !defined(CONFIG_CC_IS_CLANG)
 
 extern struct xor_block_template const xor_block_neon_inner;
 
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 6d2ba454f25b..53f9e7dd9714 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -43,8 +43,11 @@ endif
 $(obj)/csumpartialcopy.o:	$(obj)/csumpartialcopygeneric.S
 $(obj)/csumpartialcopyuser.o:	$(obj)/csumpartialcopygeneric.S
 
+# disabled on clang/arm due to https://bugs.llvm.org/show_bug.cgi?id=40976
+ifndef CONFIG_CC_IS_CLANG
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
   NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
   CFLAGS_xor-neon.o += $(NEON_FLAGS)
   obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
 endif
+endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index e1e76186ec23..84c91c48dfa2 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -18,6 +18,10 @@ MODULE_LICENSE("GPL");
  * Pull in the reference implementations while instructing GCC (through
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
+ *
+ * On Clang the loop vectorizer is enabled by default, but due to a bug
+ * (https://bugs.llvm.org/show_bug.cgi?id=40976) vectorization is broken,
+ * so xor-neon is disabled in favor of the default reg implementations.
  */
 #ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-- 
2.29.0