While building arm32 allyesconfig, I ran into the following errors:
arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
'-mfloat-abi=softfp -mfpu=neon'
In file included from lib/raid6/neon1.c:27:
/home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
error: "NEON support not enabled"
Building V=1 showed NEON_FLAGS getting passed along to Clang but
__ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
which is the '-march' value for allyesconfig.
From lib/Basic/Targets/ARM.cpp in the Clang source:
// This only gets set when Neon instructions are actually available, unlike
// the VFP define, hence the soft float and arch check. This is subtly
// different from gcc, we follow the intent which was that it should be set
// when Neon instructions are actually available.
if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
Builder.defineMacro("__ARM_NEON", "1");
Builder.defineMacro("__ARM_NEON__");
// current AArch32 NEON implementations do not support double-precision
// floating-point even when it is present in VFP.
Builder.defineMacro("__ARM_NEON_FP",
"0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
}
Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
definined by Clang. This doesn't functionally change anything because
that code will only run where NEON is supported, which is implicitly
armv7.
Link: https://github.com/ClangBuiltLinux/linux/issues/287
Suggested-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
---
Documentation/arm/kernel_mode_neon.txt | 4 ++--
arch/arm/lib/Makefile | 2 +-
arch/arm/lib/xor-neon.c | 2 +-
lib/raid6/Makefile | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt
index 525452726d31..b9e060c5b61e 100644
--- a/Documentation/arm/kernel_mode_neon.txt
+++ b/Documentation/arm/kernel_mode_neon.txt
@@ -6,7 +6,7 @@ TL;DR summary
* Use only NEON instructions, or VFP instructions that don't rely on support
code
* Isolate your NEON code in a separate compilation unit, and compile it with
- '-mfpu=neon -mfloat-abi=softfp'
+ '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
* Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
NEON code
* Don't sleep in your NEON code, and be aware that it will be executed with
@@ -87,7 +87,7 @@ instructions appearing in unexpected places if no special care is taken.
Therefore, the recommended and only supported way of using NEON/VFP in the
kernel is by adhering to the following rules:
* isolate the NEON code in a separate compilation unit and compile it with
- '-mfpu=neon -mfloat-abi=softfp';
+ '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
* issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
into the unit containing the NEON code from a compilation unit which is *not*
built with the GCC flag '-mfpu=neon' set.
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index ad25fd1872c7..0bff0176db2c 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -39,7 +39,7 @@ $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
$(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
- NEON_FLAGS := -mfloat-abi=softfp -mfpu=neon
+ NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
CFLAGS_xor-neon.o += $(NEON_FLAGS)
obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index a6741a895189..4600b62d845f 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,7 +14,7 @@
MODULE_LICENSE("GPL");
#ifndef __ARM_NEON__
-#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
+#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
#endif
/*
diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
index 2f8b61dfd9b0..bfec7c87c61e 100644
--- a/lib/raid6/Makefile
+++ b/lib/raid6/Makefile
@@ -25,7 +25,7 @@ endif
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
NEON_FLAGS := -ffreestanding
ifeq ($(ARCH),arm)
-NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
+NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
endif
CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
ifeq ($(ARCH),arm64)
--
2.20.1
On Sat, 15 Dec 2018, Nathan Chancellor wrote:
> While building arm32 allyesconfig, I ran into the following errors:
>
> arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
> '-mfloat-abi=softfp -mfpu=neon'
>
> In file included from lib/raid6/neon1.c:27:
> /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
> error: "NEON support not enabled"
>
> Building V=1 showed NEON_FLAGS getting passed along to Clang but
> __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
> only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
> which is the '-march' value for allyesconfig.
>
> From lib/Basic/Targets/ARM.cpp in the Clang source:
>
> // This only gets set when Neon instructions are actually available, unlike
> // the VFP define, hence the soft float and arch check. This is subtly
> // different from gcc, we follow the intent which was that it should be set
> // when Neon instructions are actually available.
> if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
> Builder.defineMacro("__ARM_NEON", "1");
> Builder.defineMacro("__ARM_NEON__");
> // current AArch32 NEON implementations do not support double-precision
> // floating-point even when it is present in VFP.
> Builder.defineMacro("__ARM_NEON_FP",
> "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
> }
>
> Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
> beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
> definined by Clang. This doesn't functionally change anything because
> that code will only run where NEON is supported, which is implicitly
> armv7.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/287
> Suggested-by: Ard Biesheuvel <[email protected]>
> Signed-off-by: Nathan Chancellor <[email protected]>
Did you test that this doesn't create issues with gcc e.g. complaints
from the linker that objects have incompatible architecture
specifications or similar annoyance? This already happened in the past
but I forget the exact scenario. If you already did, or after you do
validate with gcc as well, then you may add:
Acked-by: Nicolas Pitre <[email protected]>
> ---
> Documentation/arm/kernel_mode_neon.txt | 4 ++--
> arch/arm/lib/Makefile | 2 +-
> arch/arm/lib/xor-neon.c | 2 +-
> lib/raid6/Makefile | 2 +-
> 4 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt
> index 525452726d31..b9e060c5b61e 100644
> --- a/Documentation/arm/kernel_mode_neon.txt
> +++ b/Documentation/arm/kernel_mode_neon.txt
> @@ -6,7 +6,7 @@ TL;DR summary
> * Use only NEON instructions, or VFP instructions that don't rely on support
> code
> * Isolate your NEON code in a separate compilation unit, and compile it with
> - '-mfpu=neon -mfloat-abi=softfp'
> + '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
> * Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
> NEON code
> * Don't sleep in your NEON code, and be aware that it will be executed with
> @@ -87,7 +87,7 @@ instructions appearing in unexpected places if no special care is taken.
> Therefore, the recommended and only supported way of using NEON/VFP in the
> kernel is by adhering to the following rules:
> * isolate the NEON code in a separate compilation unit and compile it with
> - '-mfpu=neon -mfloat-abi=softfp';
> + '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
> * issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
> into the unit containing the NEON code from a compilation unit which is *not*
> built with the GCC flag '-mfpu=neon' set.
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index ad25fd1872c7..0bff0176db2c 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -39,7 +39,7 @@ $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
> $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
>
> ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> - NEON_FLAGS := -mfloat-abi=softfp -mfpu=neon
> + NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> CFLAGS_xor-neon.o += $(NEON_FLAGS)
> obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
> endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index a6741a895189..4600b62d845f 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -14,7 +14,7 @@
> MODULE_LICENSE("GPL");
>
> #ifndef __ARM_NEON__
> -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
> #endif
>
> /*
> diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
> index 2f8b61dfd9b0..bfec7c87c61e 100644
> --- a/lib/raid6/Makefile
> +++ b/lib/raid6/Makefile
> @@ -25,7 +25,7 @@ endif
> ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> NEON_FLAGS := -ffreestanding
> ifeq ($(ARCH),arm)
> -NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
> +NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> endif
> CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
> ifeq ($(ARCH),arm64)
> --
> 2.20.1
>
>
On Mon, Dec 17, 2018 at 01:23:52PM -0500, Nicolas Pitre wrote:
> On Sat, 15 Dec 2018, Nathan Chancellor wrote:
>
> > While building arm32 allyesconfig, I ran into the following errors:
> >
> > arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
> > '-mfloat-abi=softfp -mfpu=neon'
> >
> > In file included from lib/raid6/neon1.c:27:
> > /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
> > error: "NEON support not enabled"
> >
> > Building V=1 showed NEON_FLAGS getting passed along to Clang but
> > __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
> > only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
> > which is the '-march' value for allyesconfig.
> >
> > From lib/Basic/Targets/ARM.cpp in the Clang source:
> >
> > // This only gets set when Neon instructions are actually available, unlike
> > // the VFP define, hence the soft float and arch check. This is subtly
> > // different from gcc, we follow the intent which was that it should be set
> > // when Neon instructions are actually available.
> > if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
> > Builder.defineMacro("__ARM_NEON", "1");
> > Builder.defineMacro("__ARM_NEON__");
> > // current AArch32 NEON implementations do not support double-precision
> > // floating-point even when it is present in VFP.
> > Builder.defineMacro("__ARM_NEON_FP",
> > "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
> > }
> >
> > Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
> > beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
> > definined by Clang. This doesn't functionally change anything because
> > that code will only run where NEON is supported, which is implicitly
> > armv7.
> >
> > Link: https://github.com/ClangBuiltLinux/linux/issues/287
> > Suggested-by: Ard Biesheuvel <[email protected]>
> > Signed-off-by: Nathan Chancellor <[email protected]>
>
> Did you test that this doesn't create issues with gcc e.g. complaints
> from the linker that objects have incompatible architecture
> specifications or similar annoyance? This already happened in the past
> but I forget the exact scenario. If you already did, or after you do
> validate with gcc as well, then you may add:
>
> Acked-by: Nicolas Pitre <[email protected]>
>
>
Hi Nicolas,
I was 99% sure that I checked GCC before sending this but I just did
another run to confirm and everything links successfully. We still use
binutils for assembling/linking the kernel so I assume that I would have
seen a warning from ld.bfd even with Clang.
Thank you for the review!
Nathan
>
> > ---
> > Documentation/arm/kernel_mode_neon.txt | 4 ++--
> > arch/arm/lib/Makefile | 2 +-
> > arch/arm/lib/xor-neon.c | 2 +-
> > lib/raid6/Makefile | 2 +-
> > 4 files changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt
> > index 525452726d31..b9e060c5b61e 100644
> > --- a/Documentation/arm/kernel_mode_neon.txt
> > +++ b/Documentation/arm/kernel_mode_neon.txt
> > @@ -6,7 +6,7 @@ TL;DR summary
> > * Use only NEON instructions, or VFP instructions that don't rely on support
> > code
> > * Isolate your NEON code in a separate compilation unit, and compile it with
> > - '-mfpu=neon -mfloat-abi=softfp'
> > + '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
> > * Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
> > NEON code
> > * Don't sleep in your NEON code, and be aware that it will be executed with
> > @@ -87,7 +87,7 @@ instructions appearing in unexpected places if no special care is taken.
> > Therefore, the recommended and only supported way of using NEON/VFP in the
> > kernel is by adhering to the following rules:
> > * isolate the NEON code in a separate compilation unit and compile it with
> > - '-mfpu=neon -mfloat-abi=softfp';
> > + '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
> > * issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
> > into the unit containing the NEON code from a compilation unit which is *not*
> > built with the GCC flag '-mfpu=neon' set.
> > diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> > index ad25fd1872c7..0bff0176db2c 100644
> > --- a/arch/arm/lib/Makefile
> > +++ b/arch/arm/lib/Makefile
> > @@ -39,7 +39,7 @@ $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
> > $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
> >
> > ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> > - NEON_FLAGS := -mfloat-abi=softfp -mfpu=neon
> > + NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> > CFLAGS_xor-neon.o += $(NEON_FLAGS)
> > obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
> > endif
> > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> > index a6741a895189..4600b62d845f 100644
> > --- a/arch/arm/lib/xor-neon.c
> > +++ b/arch/arm/lib/xor-neon.c
> > @@ -14,7 +14,7 @@
> > MODULE_LICENSE("GPL");
> >
> > #ifndef __ARM_NEON__
> > -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> > +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
> > #endif
> >
> > /*
> > diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
> > index 2f8b61dfd9b0..bfec7c87c61e 100644
> > --- a/lib/raid6/Makefile
> > +++ b/lib/raid6/Makefile
> > @@ -25,7 +25,7 @@ endif
> > ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> > NEON_FLAGS := -ffreestanding
> > ifeq ($(ARCH),arm)
> > -NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
> > +NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> > endif
> > CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
> > ifeq ($(ARCH),arm64)
> > --
> > 2.20.1
> >
> >
On Sat, Dec 15, 2018 at 1:23 PM Nathan Chancellor
<[email protected]> wrote:
>
> While building arm32 allyesconfig, I ran into the following errors:
>
> arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
> '-mfloat-abi=softfp -mfpu=neon'
>
> In file included from lib/raid6/neon1.c:27:
> /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
> error: "NEON support not enabled"
>
> Building V=1 showed NEON_FLAGS getting passed along to Clang but
> __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
> only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
> which is the '-march' value for allyesconfig.
>
> From lib/Basic/Targets/ARM.cpp in the Clang source:
>
> // This only gets set when Neon instructions are actually available, unlike
> // the VFP define, hence the soft float and arch check. This is subtly
> // different from gcc, we follow the intent which was that it should be set
> // when Neon instructions are actually available.
> if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
> Builder.defineMacro("__ARM_NEON", "1");
> Builder.defineMacro("__ARM_NEON__");
> // current AArch32 NEON implementations do not support double-precision
> // floating-point even when it is present in VFP.
> Builder.defineMacro("__ARM_NEON_FP",
> "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
> }
>
> Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
> beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
> definined by Clang. This doesn't functionally change anything because
> that code will only run where NEON is supported, which is implicitly
> armv7.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/287
> Suggested-by: Ard Biesheuvel <[email protected]>
> Signed-off-by: Nathan Chancellor <[email protected]>
Nathan,
Thanks for sending the patch, and thanks to Ard for the suggestion.
Reviewed-by: Nick Desaulniers <[email protected]>
> ---
> Documentation/arm/kernel_mode_neon.txt | 4 ++--
> arch/arm/lib/Makefile | 2 +-
> arch/arm/lib/xor-neon.c | 2 +-
> lib/raid6/Makefile | 2 +-
> 4 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt
> index 525452726d31..b9e060c5b61e 100644
> --- a/Documentation/arm/kernel_mode_neon.txt
> +++ b/Documentation/arm/kernel_mode_neon.txt
> @@ -6,7 +6,7 @@ TL;DR summary
> * Use only NEON instructions, or VFP instructions that don't rely on support
> code
> * Isolate your NEON code in a separate compilation unit, and compile it with
> - '-mfpu=neon -mfloat-abi=softfp'
> + '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
> * Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
> NEON code
> * Don't sleep in your NEON code, and be aware that it will be executed with
> @@ -87,7 +87,7 @@ instructions appearing in unexpected places if no special care is taken.
> Therefore, the recommended and only supported way of using NEON/VFP in the
> kernel is by adhering to the following rules:
> * isolate the NEON code in a separate compilation unit and compile it with
> - '-mfpu=neon -mfloat-abi=softfp';
> + '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
> * issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
> into the unit containing the NEON code from a compilation unit which is *not*
> built with the GCC flag '-mfpu=neon' set.
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index ad25fd1872c7..0bff0176db2c 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -39,7 +39,7 @@ $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
> $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
>
> ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> - NEON_FLAGS := -mfloat-abi=softfp -mfpu=neon
> + NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> CFLAGS_xor-neon.o += $(NEON_FLAGS)
> obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
> endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index a6741a895189..4600b62d845f 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -14,7 +14,7 @@
> MODULE_LICENSE("GPL");
>
> #ifndef __ARM_NEON__
> -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
> #endif
>
> /*
> diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
> index 2f8b61dfd9b0..bfec7c87c61e 100644
> --- a/lib/raid6/Makefile
> +++ b/lib/raid6/Makefile
> @@ -25,7 +25,7 @@ endif
> ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> NEON_FLAGS := -ffreestanding
> ifeq ($(ARCH),arm)
> -NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
> +NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> endif
> CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
> ifeq ($(ARCH),arm64)
> --
> 2.20.1
>
--
Thanks,
~Nick Desaulniers
While building arm32 allyesconfig, I ran into the following errors:
arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
'-mfloat-abi=softfp -mfpu=neon'
In file included from lib/raid6/neon1.c:27:
/home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
error: "NEON support not enabled"
Building V=1 showed NEON_FLAGS getting passed along to Clang but
__ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
which is the '-march' value for allyesconfig.
>From lib/Basic/Targets/ARM.cpp in the Clang source:
// This only gets set when Neon instructions are actually available, unlike
// the VFP define, hence the soft float and arch check. This is subtly
// different from gcc, we follow the intent which was that it should be set
// when Neon instructions are actually available.
if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
Builder.defineMacro("__ARM_NEON", "1");
Builder.defineMacro("__ARM_NEON__");
// current AArch32 NEON implementations do not support double-precision
// floating-point even when it is present in VFP.
Builder.defineMacro("__ARM_NEON_FP",
"0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
}
Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
definined by Clang. This doesn't functionally change anything because
that code will only run where NEON is supported, which is implicitly
armv7.
Link: https://github.com/ClangBuiltLinux/linux/issues/287
Suggested-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Acked-by: Nicolas Pitre <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
---
Resending with Nicolas's ack and Nick's review, to give others a chance
to pitch in with review/testing before submitting it to the patch
system (specifically Stefan, sorry I should have included you in the
previous posting).
With this patch and:
* A tip of tree LLVM build (I used apt.llvm.org)
* Clang GCOV support: https://lore.kernel.org/lkml/[email protected]/
* Two minor hacks for unrelated issues that are still being worked through:
* https://raw.githubusercontent.com/nathanchance/patches/c313b2fa0efb/linux/build-hax/0003-DO-NOT-UPSTREAM-ARM-Don-t-select-HAVE_FUNCTION_TRACE.patch (see https://github.com/ClangBuiltLinux/linux/issues/35)
* https://gist.githubusercontent.com/nathanchance/b2f5a4015abade1a41e78d5fc3235c5b/raw/744321882ab05511331f26896bad7c9f0056a6a5/gistfile1.txt (see https://github.com/ClangBuiltLinux/linux/issues/325)
I can build and link a little endian allyesconfig ARM kernel. This
patch alone works with GCC 8.2.0 and binutils 2.31.1.
Documentation/arm/kernel_mode_neon.txt | 4 ++--
arch/arm/lib/Makefile | 2 +-
arch/arm/lib/xor-neon.c | 2 +-
lib/raid6/Makefile | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt
index 525452726d31..b9e060c5b61e 100644
--- a/Documentation/arm/kernel_mode_neon.txt
+++ b/Documentation/arm/kernel_mode_neon.txt
@@ -6,7 +6,7 @@ TL;DR summary
* Use only NEON instructions, or VFP instructions that don't rely on support
code
* Isolate your NEON code in a separate compilation unit, and compile it with
- '-mfpu=neon -mfloat-abi=softfp'
+ '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
* Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
NEON code
* Don't sleep in your NEON code, and be aware that it will be executed with
@@ -87,7 +87,7 @@ instructions appearing in unexpected places if no special care is taken.
Therefore, the recommended and only supported way of using NEON/VFP in the
kernel is by adhering to the following rules:
* isolate the NEON code in a separate compilation unit and compile it with
- '-mfpu=neon -mfloat-abi=softfp';
+ '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
* issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
into the unit containing the NEON code from a compilation unit which is *not*
built with the GCC flag '-mfpu=neon' set.
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index ad25fd1872c7..0bff0176db2c 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -39,7 +39,7 @@ $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
$(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
- NEON_FLAGS := -mfloat-abi=softfp -mfpu=neon
+ NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
CFLAGS_xor-neon.o += $(NEON_FLAGS)
obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index 2c40aeab3eaa..c691b901092f 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,7 +14,7 @@
MODULE_LICENSE("GPL");
#ifndef __ARM_NEON__
-#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
+#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
#endif
/*
diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
index 4e90d443d1b0..e723eacf7868 100644
--- a/lib/raid6/Makefile
+++ b/lib/raid6/Makefile
@@ -39,7 +39,7 @@ endif
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
NEON_FLAGS := -ffreestanding
ifeq ($(ARCH),arm)
-NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
+NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
endif
CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
ifeq ($(ARCH),arm64)
--
2.20.1
On 26.01.2019 05:01, Nathan Chancellor wrote:
> While building arm32 allyesconfig, I ran into the following errors:
>
> arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
> '-mfloat-abi=softfp -mfpu=neon'
>
> In file included from lib/raid6/neon1.c:27:
> /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
> error: "NEON support not enabled"
>
> Building V=1 showed NEON_FLAGS getting passed along to Clang but
> __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
> only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
> which is the '-march' value for allyesconfig.
>
>>From lib/Basic/Targets/ARM.cpp in the Clang source:
>
> // This only gets set when Neon instructions are actually available, unlike
> // the VFP define, hence the soft float and arch check. This is subtly
> // different from gcc, we follow the intent which was that it should be set
> // when Neon instructions are actually available.
> if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
> Builder.defineMacro("__ARM_NEON", "1");
> Builder.defineMacro("__ARM_NEON__");
> // current AArch32 NEON implementations do not support double-precision
> // floating-point even when it is present in VFP.
> Builder.defineMacro("__ARM_NEON_FP",
> "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
> }
>
> Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
> beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
> definined by Clang. This doesn't functionally change anything because
> that code will only run where NEON is supported, which is implicitly
> armv7.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/287
> Suggested-by: Ard Biesheuvel <[email protected]>
> Signed-off-by: Nathan Chancellor <[email protected]>
> Acked-by: Nicolas Pitre <[email protected]>
> Reviewed-by: Nick Desaulniers <[email protected]>
> ---
>
> Resending with Nicolas's ack and Nick's review, to give others a chance
> to pitch in with review/testing before submitting it to the patch
> system (specifically Stefan, sorry I should have included you in the
> previous posting).
No worries.
Looks good to me.
Reviewed-by: Stefan Agner <[email protected]>
--
Stefan
>
> With this patch and:
> * A tip of tree LLVM build (I used apt.llvm.org)
> * Clang GCOV support:
> https://lore.kernel.org/lkml/[email protected]/
> * Two minor hacks for unrelated issues that are still being worked through:
> *
> https://raw.githubusercontent.com/nathanchance/patches/c313b2fa0efb/linux/build-hax/0003-DO-NOT-UPSTREAM-ARM-Don-t-select-HAVE_FUNCTION_TRACE.patch
> (see https://github.com/ClangBuiltLinux/linux/issues/35)
> *
> https://gist.githubusercontent.com/nathanchance/b2f5a4015abade1a41e78d5fc3235c5b/raw/744321882ab05511331f26896bad7c9f0056a6a5/gistfile1.txt
> (see https://github.com/ClangBuiltLinux/linux/issues/325)
>
> I can build and link a little endian allyesconfig ARM kernel. This
> patch alone works with GCC 8.2.0 and binutils 2.31.1.
>
> Documentation/arm/kernel_mode_neon.txt | 4 ++--
> arch/arm/lib/Makefile | 2 +-
> arch/arm/lib/xor-neon.c | 2 +-
> lib/raid6/Makefile | 2 +-
> 4 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/arm/kernel_mode_neon.txt
> b/Documentation/arm/kernel_mode_neon.txt
> index 525452726d31..b9e060c5b61e 100644
> --- a/Documentation/arm/kernel_mode_neon.txt
> +++ b/Documentation/arm/kernel_mode_neon.txt
> @@ -6,7 +6,7 @@ TL;DR summary
> * Use only NEON instructions, or VFP instructions that don't rely on support
> code
> * Isolate your NEON code in a separate compilation unit, and compile it with
> - '-mfpu=neon -mfloat-abi=softfp'
> + '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
> * Put kernel_neon_begin() and kernel_neon_end() calls around the
> calls into your
> NEON code
> * Don't sleep in your NEON code, and be aware that it will be executed with
> @@ -87,7 +87,7 @@ instructions appearing in unexpected places if no
> special care is taken.
> Therefore, the recommended and only supported way of using NEON/VFP in the
> kernel is by adhering to the following rules:
> * isolate the NEON code in a separate compilation unit and compile it with
> - '-mfpu=neon -mfloat-abi=softfp';
> + '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
> * issue the calls to kernel_neon_begin(), kernel_neon_end() as well
> as the calls
> into the unit containing the NEON code from a compilation unit which is *not*
> built with the GCC flag '-mfpu=neon' set.
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index ad25fd1872c7..0bff0176db2c 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -39,7 +39,7 @@ $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
> $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
>
> ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> - NEON_FLAGS := -mfloat-abi=softfp -mfpu=neon
> + NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> CFLAGS_xor-neon.o += $(NEON_FLAGS)
> obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
> endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index 2c40aeab3eaa..c691b901092f 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -14,7 +14,7 @@
> MODULE_LICENSE("GPL");
>
> #ifndef __ARM_NEON__
> -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> +#error You should compile this file with '-march=armv7-a
> -mfloat-abi=softfp -mfpu=neon'
> #endif
>
> /*
> diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
> index 4e90d443d1b0..e723eacf7868 100644
> --- a/lib/raid6/Makefile
> +++ b/lib/raid6/Makefile
> @@ -39,7 +39,7 @@ endif
> ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> NEON_FLAGS := -ffreestanding
> ifeq ($(ARCH),arm)
> -NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
> +NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> endif
> CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
> ifeq ($(ARCH),arm64)
On Sat, Dec 15, 2018 at 10:24 PM Nathan Chancellor
<[email protected]> wrote:
> endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index a6741a895189..4600b62d845f 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -14,7 +14,7 @@
> MODULE_LICENSE("GPL");
>
> #ifndef __ARM_NEON__
> -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
> #endif
>
I see this patch has made it in now, but I also see two other problems with the
same file that prevent it from working right with clang:
- it triggers #warning This code requires at least version 4.6 of GCC
- As I reported in https://bugs.llvm.org/show_bug.cgi?id=40976, even
when it builds cleanly, it does not get vectorized.
Has anyone actually managed to get this to do the right thing?
Arnd
On Mon, 11 Mar 2019 at 17:22, Arnd Bergmann <[email protected]> wrote:
>
> On Sat, Dec 15, 2018 at 10:24 PM Nathan Chancellor
> <[email protected]> wrote:
> > endif
> > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> > index a6741a895189..4600b62d845f 100644
> > --- a/arch/arm/lib/xor-neon.c
> > +++ b/arch/arm/lib/xor-neon.c
> > @@ -14,7 +14,7 @@
> > MODULE_LICENSE("GPL");
> >
> > #ifndef __ARM_NEON__
> > -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> > +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
> > #endif
> >
>
> I see this patch has made it in now, but I also see two other problems with the
> same file that prevent it from working right with clang:
>
> - it triggers #warning This code requires at least version 4.6 of GCC
What is currently the oldest GCC we support for ARM?
> - As I reported in https://bugs.llvm.org/show_bug.cgi?id=40976, even
> when it builds cleanly, it does not get vectorized.
>
> Has anyone actually managed to get this to do the right thing?
>
On my Cortex-A57 under KVM, I get this at boot
[ 0.002287] xor: measuring software checksum speed
[ 0.100054] arm4regs : 5212.800 MB/sec
[ 0.200131] 8regs : 3472.800 MB/sec
[ 0.300205] 32regs : 3282.000 MB/sec
[ 0.400281] neon : 7011.600 MB/sec
So that means that
a) the cost model is inaccurate, at least for some cores,
b) given that the code is only selected if it is faster, it would be
nice if we could override the cost model based decisions made by the
vectorizer.
On Mon, Mar 11, 2019 at 5:49 PM Ard Biesheuvel
<[email protected]> wrote:
>
> On Mon, 11 Mar 2019 at 17:22, Arnd Bergmann <[email protected]> wrote:
> >
> > On Sat, Dec 15, 2018 at 10:24 PM Nathan Chancellor
> > <[email protected]> wrote:
> > > endif
> > > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> > > index a6741a895189..4600b62d845f 100644
> > > --- a/arch/arm/lib/xor-neon.c
> > > +++ b/arch/arm/lib/xor-neon.c
> > > @@ -14,7 +14,7 @@
> > > MODULE_LICENSE("GPL");
> > >
> > > #ifndef __ARM_NEON__
> > > -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> > > +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
> > > #endif
> > >
> >
> > I see this patch has made it in now, but I also see two other problems with the
> > same file that prevent it from working right with clang:
> >
> > - it triggers #warning This code requires at least version 4.6 of GCC
>
> What is currently the oldest GCC we support for ARM?
Linux overall requires gcc-4.6, so we could just as well drop this
check, good point.
Arnd