2020-08-05 18:21:14

by Sami Tolvanen

Subject: [PATCH] arm64: tlb: fix ARM64_TLB_RANGE with LLVM's integrated assembler

Commit 7c78f67e9bd9 ("arm64: enable tlbi range instructions") breaks
LLVM's integrated assembler, because -Wa,-march is only passed to
external assemblers and therefore, the new instructions are not enabled
when IAS is used.

As binutils doesn't support .arch_extension tlb-rmi, this change adds
.arch armv8.4-a to __TLBI_0 and __TLBI_1 to fix the issue with both LLVM
IAS and binutils.

Fixes: 7c78f67e9bd9 ("arm64: enable tlbi range instructions")
Link: https://github.com/ClangBuiltLinux/linux/issues/1106
Signed-off-by: Sami Tolvanen <[email protected]>
---
arch/arm64/include/asm/tlbflush.h | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index d493174415db..66c2aab5e9cb 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -16,6 +16,16 @@
#include <asm/cputype.h>
#include <asm/mmu.h>

+/*
+ * Enable ARMv8.4-TLBI instructions with ARM64_TLB_RANGE. Note that binutils
+ * doesn't support .arch_extension tlb-rmi, so use .arch armv8.4-a instead.
+ */
+#ifdef CONFIG_ARM64_TLB_RANGE
+#define __TLBI_PREAMBLE ".arch armv8.4-a\n"
+#else
+#define __TLBI_PREAMBLE
+#endif
+
/*
* Raw TLBI operations.
*
@@ -28,14 +38,16 @@
* not. The macros handles invoking the asm with or without the
* register argument as appropriate.
*/
-#define __TLBI_0(op, arg) asm ("tlbi " #op "\n" \
+#define __TLBI_0(op, arg) asm (__TLBI_PREAMBLE \
+ "tlbi " #op "\n" \
ALTERNATIVE("nop\n nop", \
"dsb ish\n tlbi " #op, \
ARM64_WORKAROUND_REPEAT_TLBI, \
CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
: : )

-#define __TLBI_1(op, arg) asm ("tlbi " #op ", %0\n" \
+#define __TLBI_1(op, arg) asm (__TLBI_PREAMBLE \
+ "tlbi " #op ", %0\n" \
ALTERNATIVE("nop\n nop", \
"dsb ish\n tlbi " #op ", %0", \
ARM64_WORKAROUND_REPEAT_TLBI, \

base-commit: 4834ce9d8e074bb7ae197632e0708219b9f389b5
--
2.28.0.163.g6104cc2f0b6-goog


2020-08-05 19:19:02

by Nick Desaulniers

Subject: Re: [PATCH] arm64: tlb: fix ARM64_TLB_RANGE with LLVM's integrated assembler

On Wed, Aug 5, 2020 at 11:19 AM Sami Tolvanen <[email protected]> wrote:
>
> Commit 7c78f67e9bd9 ("arm64: enable tlbi range instructions") breaks
> LLVM's integrated assembler, because -Wa,-march is only passed to
> external assemblers and therefore, the new instructions are not enabled
> when IAS is used.
>
> As binutils doesn't support .arch_extension tlb-rmi, this change adds
> .arch armv8.4-a to __TLBI_0 and __TLBI_1 to fix the issue with both LLVM
> IAS and binutils.
>
> Fixes: 7c78f67e9bd9 ("arm64: enable tlbi range instructions")
> Link: https://github.com/ClangBuiltLinux/linux/issues/1106
> Signed-off-by: Sami Tolvanen <[email protected]>

I've filed https://sourceware.org/bugzilla/show_bug.cgi?id=26339 to
discuss more with ARM binutils devs about some of the compat issues
around these assembler directives.

> ---
> arch/arm64/include/asm/tlbflush.h | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index d493174415db..66c2aab5e9cb 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -16,6 +16,16 @@
> #include <asm/cputype.h>
> #include <asm/mmu.h>
>
> +/*
> + * Enable ARMv8.4-TLBI instructions with ARM64_TLB_RANGE. Note that binutils
> + * doesn't support .arch_extension tlb-rmi, so use .arch armv8.4-a instead.
> + */
> +#ifdef CONFIG_ARM64_TLB_RANGE
> +#define __TLBI_PREAMBLE ".arch armv8.4-a\n"
> +#else
> +#define __TLBI_PREAMBLE
> +#endif
> +
> /*
> * Raw TLBI operations.
> *
> @@ -28,14 +38,16 @@
> * not. The macros handles invoking the asm with or without the
> * register argument as appropriate.
> */
> -#define __TLBI_0(op, arg) asm ("tlbi " #op "\n" \
> +#define __TLBI_0(op, arg) asm (__TLBI_PREAMBLE \
> + "tlbi " #op "\n" \
> ALTERNATIVE("nop\n nop", \
> "dsb ish\n tlbi " #op, \
> ARM64_WORKAROUND_REPEAT_TLBI, \
> CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
> : : )
>
> -#define __TLBI_1(op, arg) asm ("tlbi " #op ", %0\n" \
> +#define __TLBI_1(op, arg) asm (__TLBI_PREAMBLE \
> + "tlbi " #op ", %0\n" \
> ALTERNATIVE("nop\n nop", \
> "dsb ish\n tlbi " #op ", %0", \
> ARM64_WORKAROUND_REPEAT_TLBI, \
>
> base-commit: 4834ce9d8e074bb7ae197632e0708219b9f389b5
> --
> 2.28.0.163.g6104cc2f0b6-goog
>


--
Thanks,
~Nick Desaulniers

2020-08-06 07:19:07

by Zhenyu Ye

Subject: Re: [PATCH] arm64: tlb: fix ARM64_TLB_RANGE with LLVM's integrated assembler

Hi,

On 2020/8/6 2:19, Sami Tolvanen wrote:
> Commit 7c78f67e9bd9 ("arm64: enable tlbi range instructions") breaks
> LLVM's integrated assembler, because -Wa,-march is only passed to
> external assemblers and therefore, the new instructions are not enabled
> when IAS is used.
>

I have looked through the discussion on the GitHub issue. The best way to
solve this problem would be to pass the "-Wa,-march" parameter to clang
even when the IAS is enabled, but that may need cooperation from the
compiler toolchains :(

For now, I think we can solve the problem by passing
'-march=armv8.4-a' when using the integrated assembler, like this:

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 55bc8546d9c7..e5ce184e98c2 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -91,8 +91,12 @@ KBUILD_CFLAGS += $(branch-prot-flags-y)

ifeq ($(CONFIG_AS_HAS_ARMV8_4), y)
# make sure to pass the newest target architecture to -march.
+ifneq ($(LLVM),)
+KBUILD_CFLAGS += -march=armv8.4-a
+else
KBUILD_CFLAGS += -Wa,-march=armv8.4-a
endif
+endif

ifeq ($(CONFIG_SHADOW_CALL_STACK), y)


There is no need to worry that this might generate instructions that are
not supported on older hardware, because the 'TLB range' feature is only
enabled when the hardware supports ARMv8.4.

> As binutils doesn't support .arch_extension tlb-rmi, this change adds
> .arch armv8.4-a to __TLBI_0 and __TLBI_1 to fix the issue with both LLVM
> IAS and binutils.
>
> Fixes: 7c78f67e9bd9 ("arm64: enable tlbi range instructions")
> Link: https://github.com/ClangBuiltLinux/linux/issues/1106
> Signed-off-by: Sami Tolvanen <[email protected]>

Thanks,
Zhenyu

2020-08-06 16:50:36

by Catalin Marinas

Subject: Re: [PATCH] arm64: tlb: fix ARM64_TLB_RANGE with LLVM's integrated assembler

On Wed, Aug 05, 2020 at 12:15:54PM -0700, Nick Desaulniers wrote:
> On Wed, Aug 5, 2020 at 11:19 AM Sami Tolvanen <[email protected]> wrote:
> >
> > Commit 7c78f67e9bd9 ("arm64: enable tlbi range instructions") breaks
> > LLVM's integrated assembler, because -Wa,-march is only passed to
> > external assemblers and therefore, the new instructions are not enabled
> > when IAS is used.
> >
> > As binutils doesn't support .arch_extension tlb-rmi, this change adds
> > .arch armv8.4-a to __TLBI_0 and __TLBI_1 to fix the issue with both LLVM
> > IAS and binutils.
> >
> > Fixes: 7c78f67e9bd9 ("arm64: enable tlbi range instructions")
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1106
> > Signed-off-by: Sami Tolvanen <[email protected]>
>
> I've filed https://sourceware.org/bugzilla/show_bug.cgi?id=26339 to
> discuss more with ARM binutils devs about some of the compat issues
> around these assembler directives.

Until we get some alignment between binutils and LLVM's integrated
assembler, the latter will be officially unsupported by the kernel. It's
just insane to maintain different options for architecture extensions,
e.g. memtag vs mte, armv8.4-a vs tlb-rmi. Even worse, I think you can't
add an .arch_extension in binutils without bumping the .arch version.
So maybe ".arch_extension tlb-rmi" works for the integrated assembler
but, if such an option is added to binutils, it would require ".arch
armv8.4-a" as well.

So please sort it out, guys: collaborate between yourselves when
inventing architecture mnemonics so that you stay aligned.

We may take the occasional patch to fix the integrated assembler if
it's not intrusive, but at some point we may say it's just not supported
and reject the fix. We have a hard line on the compiler not generating
instructions newer than ARMv8.0 (unless they are in the NOP/HINT space),
so we limit newer instructions to (inline) asm. That's why -march
doesn't work; it needs to be -Wa,-march.

--
Catalin

2020-08-06 17:13:38

by Catalin Marinas

Subject: Re: [PATCH] arm64: tlb: fix ARM64_TLB_RANGE with LLVM's integrated assembler

On Wed, Aug 05, 2020 at 11:19:20AM -0700, Sami Tolvanen wrote:
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index d493174415db..66c2aab5e9cb 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -16,6 +16,16 @@
> #include <asm/cputype.h>
> #include <asm/mmu.h>
>
> +/*
> + * Enable ARMv8.4-TLBI instructions with ARM64_TLB_RANGE. Note that binutils
> + * doesn't support .arch_extension tlb-rmi, so use .arch armv8.4-a instead.
> + */
> +#ifdef CONFIG_ARM64_TLB_RANGE
> +#define __TLBI_PREAMBLE ".arch armv8.4-a\n"
> +#else
> +#define __TLBI_PREAMBLE
> +#endif
> +
> /*
> * Raw TLBI operations.
> *
> @@ -28,14 +38,16 @@
> * not. The macros handles invoking the asm with or without the
> * register argument as appropriate.
> */
> -#define __TLBI_0(op, arg) asm ("tlbi " #op "\n" \
> +#define __TLBI_0(op, arg) asm (__TLBI_PREAMBLE \
> + "tlbi " #op "\n" \
> ALTERNATIVE("nop\n nop", \
> "dsb ish\n tlbi " #op, \
> ARM64_WORKAROUND_REPEAT_TLBI, \
> CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
> : : )
>
> -#define __TLBI_1(op, arg) asm ("tlbi " #op ", %0\n" \
> +#define __TLBI_1(op, arg) asm (__TLBI_PREAMBLE \
> + "tlbi " #op ", %0\n" \
> ALTERNATIVE("nop\n nop", \
> "dsb ish\n tlbi " #op ", %0", \
> ARM64_WORKAROUND_REPEAT_TLBI, \

A potential problem here is that for gas (not sure about the integrated
assembler), .arch overrides any other .arch. So if we end up with two
different preambles included in the same generated .S file in the future,
it will lead to some random behaviour.

Does the LLVM integrated assembler have the same behaviour on .arch
overriding a prior .arch?
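
To make this concrete, consider a hypothetical set of asm blocks in the
same translation unit (function names made up, not actual kernel code):

/*
 * With gas, a ".arch" directive applies from that point to the end of
 * the generated .s file, so whichever block is emitted later wins for
 * everything assembled after it.
 */
static inline void preamble_a(unsigned long arg)
{
	asm volatile(".arch armv8.4-a\n"
		     "tlbi rvae1is, %0\n"	/* needs ARMv8.4 */
		     : : "r" (arg) : "memory");
}

static inline void preamble_b(void)
{
	asm volatile(".arch armv8.2-a\n"	/* overrides the 8.4-a above */
		     "nop\n"
		     : : : "memory");
}

static inline void assumes_v8_4(unsigned long arg)
{
	/* Assembles or fails depending purely on which ".arch" happens
	 * to be in effect at this point in the assembler output. */
	asm volatile("tlbi rvae1is, %0\n" : : "r" (arg) : "memory");
}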

Maybe a better solution is for all inline asm on arm64 to have a
standard preamble that sets the maximum supported architecture version.
We can then add individual .arch_extension directives, as those are not
overriding.
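
Roughly what I have in mind, as a sketch only (macro and comment wording
illustrative, not a finished patch):

/* One preamble naming the maximum architecture version the assembler is
 * known to accept, shared by every inline asm block that needs it. */
#ifdef CONFIG_AS_HAS_ARMV8_4
#define ARM64_ASM_PREAMBLE	".arch armv8.4-a\n"
#else
#define ARM64_ASM_PREAMBLE
#endif

/* Per-feature ".arch_extension" directives can then be appended where
 * both assemblers agree on the extension name, since those do not
 * override each other. */
#define __TLBI_EXAMPLE(op, arg)					\
	asm volatile(ARM64_ASM_PREAMBLE				\
		     "tlbi " #op ", %0\n"			\
		     : : "r" (arg) : "memory")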

--
Catalin

2020-08-06 17:33:45

by Catalin Marinas

Subject: Re: [PATCH] arm64: tlb: fix ARM64_TLB_RANGE with LLVM's integrated assembler

On Thu, Aug 06, 2020 at 03:17:40PM +0800, Zhenyu Ye wrote:
> On 2020/8/6 2:19, Sami Tolvanen wrote:
> > Commit 7c78f67e9bd9 ("arm64: enable tlbi range instructions") breaks
> > LLVM's integrated assembler, because -Wa,-march is only passed to
> > external assemblers and therefore, the new instructions are not enabled
> > when IAS is used.
>
> I have looked through the discussion on Github issues. The best way to
> solve this problem is try to pass the "-Wa,-march" parameter to clang
> even when IAS is enabled, which may need the cooperation of compilation
> tool chains :(
>
> Currently, I think we can solve the problem by passing
> the '-march=armv8.4-a' when using the integrated assembler, just like:
>
> diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
> index 55bc8546d9c7..e5ce184e98c2 100644
> --- a/arch/arm64/Makefile
> +++ b/arch/arm64/Makefile
> @@ -91,8 +91,12 @@ KBUILD_CFLAGS += $(branch-prot-flags-y)
>
> ifeq ($(CONFIG_AS_HAS_ARMV8_4), y)
> # make sure to pass the newest target architecture to -march.
> +ifneq ($(LLVM),)
> +KBUILD_CFLAGS += -march=armv8.4-a
> +else
> KBUILD_CFLAGS += -Wa,-march=armv8.4-a
> endif
> +endif

No, see my other reply. This only works for .S files. For inline
assembly, passing -march to .c files will let the compiler generate
ARMv8.4 instructions and break the single kernel Image that's supposed
to run on ARMv8.0 hardware (even if you don't have any range TLBI
instructions).

--
Catalin

2020-08-06 19:25:05

by Sami Tolvanen

Subject: Re: [PATCH] arm64: tlb: fix ARM64_TLB_RANGE with LLVM's integrated assembler

On Thu, Aug 06, 2020 at 01:01:09PM +0100, Catalin Marinas wrote:
> On Wed, Aug 05, 2020 at 11:19:20AM -0700, Sami Tolvanen wrote:
> > diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> > index d493174415db..66c2aab5e9cb 100644
> > --- a/arch/arm64/include/asm/tlbflush.h
> > +++ b/arch/arm64/include/asm/tlbflush.h
> > @@ -16,6 +16,16 @@
> > #include <asm/cputype.h>
> > #include <asm/mmu.h>
> >
> > +/*
> > + * Enable ARMv8.4-TLBI instructions with ARM64_TLB_RANGE. Note that binutils
> > + * doesn't support .arch_extension tlb-rmi, so use .arch armv8.4-a instead.
> > + */
> > +#ifdef CONFIG_ARM64_TLB_RANGE
> > +#define __TLBI_PREAMBLE ".arch armv8.4-a\n"
> > +#else
> > +#define __TLBI_PREAMBLE
> > +#endif
> > +
> > /*
> > * Raw TLBI operations.
> > *
> > @@ -28,14 +38,16 @@
> > * not. The macros handles invoking the asm with or without the
> > * register argument as appropriate.
> > */
> > -#define __TLBI_0(op, arg) asm ("tlbi " #op "\n" \
> > +#define __TLBI_0(op, arg) asm (__TLBI_PREAMBLE \
> > + "tlbi " #op "\n" \
> > ALTERNATIVE("nop\n nop", \
> > "dsb ish\n tlbi " #op, \
> > ARM64_WORKAROUND_REPEAT_TLBI, \
> > CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
> > : : )
> >
> > -#define __TLBI_1(op, arg) asm ("tlbi " #op ", %0\n" \
> > +#define __TLBI_1(op, arg) asm (__TLBI_PREAMBLE \
> > + "tlbi " #op ", %0\n" \
> > ALTERNATIVE("nop\n nop", \
> > "dsb ish\n tlbi " #op ", %0", \
> > ARM64_WORKAROUND_REPEAT_TLBI, \
>
> A potential problem here is that for gas (not sure about the integrated
> assembler), .arch overrides any other .arch. So if we end up with two
> preambles included in the same generated .S files in the future, it will
> lead to some random behaviour.
>
> Does the LLVM integrated assembler have the same behaviour on .arch
> overriding a prior .arch?

I would assume so, but each inline assembly block is independent in
LLVM, so unless there are .arch changes within the block, that shouldn't
be an issue for the integrated assembler.

> Maybe a better solution is for all inline asm on arm64 to have a
> standard preamble which is the maximum supported architecture version.
> We can add individual .arch_extension as those are not overriding.

Sure, that works. How would you feel about something like this, so we can
keep the preamble in sync with future -Wa,-march changes? I'm not sure if
asm/compiler.h is the correct place for the definition though.

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 55bc8546d9c7..0dd07059beaa 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -82,8 +82,8 @@ endif
# compiler to generate them and consequently to break the single image contract
# we pass it only to the assembler. This option is utilized only in case of non
# integrated assemblers.
-ifneq ($(CONFIG_AS_HAS_ARMV8_4), y)
-branch-prot-flags-$(CONFIG_AS_HAS_PAC) += -Wa,-march=armv8.3-a
+ifeq ($(CONFIG_AS_HAS_PAC), y)
+asm-arch := armv8.3-a
endif
endif

@@ -91,7 +91,12 @@ KBUILD_CFLAGS += $(branch-prot-flags-y)

ifeq ($(CONFIG_AS_HAS_ARMV8_4), y)
# make sure to pass the newest target architecture to -march.
-KBUILD_CFLAGS += -Wa,-march=armv8.4-a
+asm-arch := armv8.4-a
+endif
+
+ifdef asm-arch
+KBUILD_CFLAGS += -Wa,-march=$(asm-arch) \
+ -DARM64_ASM_ARCH='"$(asm-arch)"'
endif

ifeq ($(CONFIG_SHADOW_CALL_STACK), y)
diff --git a/arch/arm64/include/asm/compiler.h b/arch/arm64/include/asm/compiler.h
index 51a7ce87cdfe..6fb2e6bcc392 100644
--- a/arch/arm64/include/asm/compiler.h
+++ b/arch/arm64/include/asm/compiler.h
@@ -2,6 +2,12 @@
#ifndef __ASM_COMPILER_H
#define __ASM_COMPILER_H

+#ifdef ARM64_ASM_ARCH
+#define ARM64_ASM_PREAMBLE ".arch " ARM64_ASM_ARCH "\n"
+#else
+#define ARM64_ASM_PREAMBLE
+#endif
+
/*
* The EL0/EL1 pointer bits used by a pointer authentication code.
* This is dependent on TBI0/TBI1 being enabled, or bits 63:56 would also apply.
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index d493174415db..cc3f5a33ff9c 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -28,14 +28,16 @@
* not. The macros handles invoking the asm with or without the
* register argument as appropriate.
*/
-#define __TLBI_0(op, arg) asm ("tlbi " #op "\n" \
+#define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE \
+ "tlbi " #op "\n" \
ALTERNATIVE("nop\n nop", \
"dsb ish\n tlbi " #op, \
ARM64_WORKAROUND_REPEAT_TLBI, \
CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
: : )

-#define __TLBI_1(op, arg) asm ("tlbi " #op ", %0\n" \
+#define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \
+ "tlbi " #op ", %0\n" \
ALTERNATIVE("nop\n nop", \
"dsb ish\n tlbi " #op ", %0", \
ARM64_WORKAROUND_REPEAT_TLBI, \
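
For what it's worth, any other inline asm that later needs a
post-ARMv8.0 instruction could then reuse the same preamble instead of
carrying its own ".arch" directive; a hypothetical caller (not part of
this patch) would look like:

#include <asm/compiler.h>

static inline void example_tlbi_range(unsigned long arg)
{
	/* The shared preamble names the highest architecture version the
	 * assembler may accept, so the range TLBI below assembles with
	 * both gas and LLVM's integrated assembler. */
	asm volatile(ARM64_ASM_PREAMBLE
		     "tlbi rvae1is, %0\n"
		     : : "r" (arg) : "memory");
}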

Sami