2019-10-07 20:15:36

by Sami Tolvanen

Subject: [PATCH] arm64: lse: fix LSE atomics with LLVM's integrated assembler

Unlike gcc, clang considers each inline assembly block to be independent
and therefore, when using the integrated assembler for inline assembly,
any preambles that enable features must be repeated in each block.

Instead of changing all inline assembly blocks that use LSE, this change
adds -march=armv8-a+lse to KBUILD_CFLAGS, which works with both clang
and gcc.

Signed-off-by: Sami Tolvanen <[email protected]>
---
arch/arm64/Makefile | 2 ++
arch/arm64/include/asm/lse.h | 2 --
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 84a3d502c5a5..7a7c0cb8ed60 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -36,6 +36,8 @@ lseinstr := $(call as-instr,.arch_extension lse,-DCONFIG_AS_LSE=1)
ifeq ($(CONFIG_ARM64_LSE_ATOMICS), y)
ifeq ($(lseinstr),)
$(warning LSE atomics not supported by binutils)
+ else
+KBUILD_CFLAGS += -march=armv8-a+lse
endif
endif

diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
index 80b388278149..8603a9881529 100644
--- a/arch/arm64/include/asm/lse.h
+++ b/arch/arm64/include/asm/lse.h
@@ -14,8 +14,6 @@
#include <asm/atomic_lse.h>
#include <asm/cpucaps.h>

-__asm__(".arch_extension lse");
-
extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
extern struct static_key_false arm64_const_caps_ready;

--
2.23.0.581.g78d2f28ef7-goog
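
As an illustration of the behaviour described in the commit message, here is a
minimal stand-alone sketch (hypothetical file and build line, not part of the
patch): with GNU as, the file-scope preamble below covers every later inline
asm block, while clang's integrated assembler validates each block on its own,
so a block without its own directive may be rejected.

/*
 * sketch.c - illustrative only, not from the patch.
 * Hypothetical build line: clang --target=aarch64-linux-gnu -c sketch.c
 */

/* File-scope preamble, as the old <asm/lse.h> does. GNU as applies it to
 * everything that follows; clang's integrated assembler parses each inline
 * asm block independently, so the directive does not carry over. */
__asm__(".arch_extension lse");

void stadd_no_local_directive(int i, int *p)
{
	/* Likely rejected by clang's integrated assembler with something
	 * like "instruction requires: lse", since this block carries no
	 * feature directive of its own. */
	asm volatile("stadd %w[i], %[v]"
		     : [i] "+r" (i), [v] "+Q" (*p));
}

void stadd_local_directive(int i, int *p)
{
	/* Repeating the directive per block, or passing -march=armv8-a+lse
	 * on the command line as this patch does, keeps both assemblers
	 * happy. */
	asm volatile(".arch armv8-a+lse\n"
		     "stadd %w[i], %[v]"
		     : [i] "+r" (i), [v] "+Q" (*p));
}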


2019-10-07 20:31:28

by Nick Desaulniers

Subject: Re: [PATCH] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Mon, Oct 7, 2019 at 1:14 PM 'Sami Tolvanen' via Clang Built Linux
<[email protected]> wrote:
>
> Unlike gcc, clang considers each inline assembly block to be independent
> and therefore, when using the integrated assembler for inline assembly,
> any preambles that enable features must be repeated in each block.
>
> Instead of changing all inline assembly blocks that use LSE, this change
> adds -march=armv8-a+lse to KBUILD_CFLAGS, which works with both clang
> and gcc.
>
> Signed-off-by: Sami Tolvanen <[email protected]>

Thanks Sami, looks like this addresses:
Link: https://github.com/ClangBuiltLinux/linux/issues/671
I tried adding `.arch armv8-a+lse` directives to all of the inline asm:
https://github.com/ClangBuiltLinux/linux/issues/573#issuecomment-535098996
though that quickly ran aground on a failed assertion in the alternatives
system that I don't quite understand yet.

One thing to be careful about is that this blankets the entire kernel in
`+lse`, allowing LSE atomics to be selected at any point. The
assembler directive in the header (or per inline asm block) is more
fine-grained. I'm not sure that compiler-selected LSE atomics would be
guarded by the alternatives system. Maybe that's not an issue, but it is
the reason I tried to localize the assembler directive first.

Note that Clang really wants the target arch specified by one of:
1. a command line argument (as in this patch)
2. a per-function attribute
3. a per-asm-statement assembler directive

Option 1 is the smallest incision.

> ---
> arch/arm64/Makefile | 2 ++
> arch/arm64/include/asm/lse.h | 2 --
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
> index 84a3d502c5a5..7a7c0cb8ed60 100644
> --- a/arch/arm64/Makefile
> +++ b/arch/arm64/Makefile
> @@ -36,6 +36,8 @@ lseinstr := $(call as-instr,.arch_extension lse,-DCONFIG_AS_LSE=1)
> ifeq ($(CONFIG_ARM64_LSE_ATOMICS), y)
> ifeq ($(lseinstr),)
> $(warning LSE atomics not supported by binutils)
> + else
> +KBUILD_CFLAGS += -march=armv8-a+lse
> endif
> endif
>
> diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
> index 80b388278149..8603a9881529 100644
> --- a/arch/arm64/include/asm/lse.h
> +++ b/arch/arm64/include/asm/lse.h
> @@ -14,8 +14,6 @@
> #include <asm/atomic_lse.h>
> #include <asm/cpucaps.h>
>
> -__asm__(".arch_extension lse");
> -
> extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
> extern struct static_key_false arm64_const_caps_ready;
>
> --
> 2.23.0.581.g78d2f28ef7-goog
>



--
Thanks,
~Nick Desaulniers

2019-10-07 21:25:09

by Sami Tolvanen

Subject: Re: [PATCH] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Mon, Oct 7, 2019 at 1:28 PM 'Nick Desaulniers' via Clang Built
Linux <[email protected]> wrote:
> I tried adding `.arch armv8-a+lse` directives to all of the inline asm:
> https://github.com/ClangBuiltLinux/linux/issues/573#issuecomment-535098996

Yes, I had a similar patch earlier. I feel like using a command line
parameter here is cleaner, but I'm fine with either solution.

> One thing to be careful about is that blankets the entire kernel in
> `+lse`, allowing LSE atomics to be selected at any point.

Is that a problem? The current code allows LSE instructions with gcc
in any file that includes <asm/lse.h>, which turns out to be quite a
few places.

Sami

2019-10-07 21:50:12

by Nick Desaulniers

Subject: Re: [PATCH] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Mon, Oct 7, 2019 at 2:24 PM Sami Tolvanen <[email protected]> wrote:
>
> On Mon, Oct 7, 2019 at 1:28 PM 'Nick Desaulniers' via Clang Built
> Linux <[email protected]> wrote:
> > I tried adding `.arch armv8-a+lse` directives to all of the inline asm:
> > https://github.com/ClangBuiltLinux/linux/issues/573#issuecomment-535098996
>
> Yes, I had a similar patch earlier. I feel like using a command line
> parameter here is cleaner, but I'm fine with either solution.
>
> > One thing to be careful about is that blankets the entire kernel in
> > `+lse`, allowing LSE atomics to be selected at any point.
>
> Is that a problem? The current code allows LSE instructions with gcc
> in any file that includes <asm/lse.h>, which turns out to be quite a
> few places.

I may be mistaken, but I don't think inline asm directives allow the C
compiler to change which instructions it selects for C code, whereas
command line arguments to the C compiler do.

Grepping the kernel for some of the __atomic builtins and memory orderings
documented at https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
turns up a few hits.

I'm worried that one of these might lower to LSE atomics without
ALTERNATIVE guards if we blanket all C code with `-march=armv8-a+lse`.
That said, I did boot test this patch using GAS in QEMU (on a -cpu
cortex-a72, which I suspect should not have LSE instructions by
default), FWIW.
Tested-by: Nick Desaulniers <[email protected]>
Maybe the maintainers have more thoughts?
--
Thanks,
~Nick Desaulniers
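
To make the worry above concrete, a hedged sketch (hypothetical function, not
kernel code) of how a plain C atomic could end up as an unguarded LSE
instruction once `+lse` is enabled for all C code:

/* Illustrative only. With -march=armv8-a+lse in KBUILD_CFLAGS the compiler
 * itself is permitted to select LSE instructions for C-level atomics;
 * nothing routes those through the kernel's ALTERNATIVE patching, so they
 * would run unconditionally even on CPUs without LSE. */
long fetch_add(long *p, long v)
{
	/* With +lse this may compile to a single ldaddal instruction instead
	 * of an ldxr/stxr loop; a per-block assembler directive (or the old
	 * .arch_extension preamble) would not change what the compiler
	 * emits here. */
	return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}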

2019-10-08 08:38:31

by Andrew Murray

Subject: Re: [PATCH] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Mon, Oct 07, 2019 at 01:28:19PM -0700, Nick Desaulniers wrote:
> On Mon, Oct 7, 2019 at 1:14 PM 'Sami Tolvanen' via Clang Built Linux
> <[email protected]> wrote:
> >
> > Unlike gcc, clang considers each inline assembly block to be independent
> > and therefore, when using the integrated assembler for inline assembly,
> > any preambles that enable features must be repeated in each block.
> >
> > Instead of changing all inline assembly blocks that use LSE, this change
> > adds -march=armv8-a+lse to KBUILD_CFLAGS, which works with both clang
> > and gcc.
> >
> > Signed-off-by: Sami Tolvanen <[email protected]>
>
> Thanks Sami, looks like this addresses:
> Link: https://github.com/ClangBuiltLinux/linux/issues/671
> I tried adding `.arch armv8-a+lse` directives to all of the inline asm:
> https://github.com/ClangBuiltLinux/linux/issues/573#issuecomment-535098996
> though that quickly ran aground in some failed assertion using the
> alternatives system that I don't quite yet understand.

I think these issues are somehow related to the strange way we used to
jump to the out-of-line fallbacks; that approach was removed around
addfc38672c7 ("arm64: atomics: avoid out-of-line ll/sc atomics").

I reproduced your patch on 5.4-rc2 and no longer get the clang build
errors, so this approach is a viable option.

>
> One thing to be careful about is that blankets the entire kernel in
> `+lse`, allowing LSE atomics to be selected at any point. The
> assembler directive in the header (or per inline asm block) is more
> fine grained. I'm not sure that they would be guarded by the
> alternative system. Maybe that's not an issue, but it is the reason I
> tried to localize the assembler directive first.
>
> Note that Clang really wants the target arch specified by either:
> 1. command line argument (as in this patch)

This makes me nervous, as we're telling the compiler that the machine
we're building for has LSE - obviously it would be perfectly acceptable
for the compiler to then throw in some LSE instructions at some point.
Thus something may break further down the line.

> 2. per function function attribute
> 3. per asm statement assembler directive

Keen to hear Will's thoughts - but I'd suggest this is probably the
safest way forward.

Thanks,

Andrew Murray

>
> 1 is the smallest incision.
>
> > ---
> > arch/arm64/Makefile | 2 ++
> > arch/arm64/include/asm/lse.h | 2 --
> > 2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
> > index 84a3d502c5a5..7a7c0cb8ed60 100644
> > --- a/arch/arm64/Makefile
> > +++ b/arch/arm64/Makefile
> > @@ -36,6 +36,8 @@ lseinstr := $(call as-instr,.arch_extension lse,-DCONFIG_AS_LSE=1)
> > ifeq ($(CONFIG_ARM64_LSE_ATOMICS), y)
> > ifeq ($(lseinstr),)
> > $(warning LSE atomics not supported by binutils)
> > + else
> > +KBUILD_CFLAGS += -march=armv8-a+lse
> > endif
> > endif
> >
> > diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
> > index 80b388278149..8603a9881529 100644
> > --- a/arch/arm64/include/asm/lse.h
> > +++ b/arch/arm64/include/asm/lse.h
> > @@ -14,8 +14,6 @@
> > #include <asm/atomic_lse.h>
> > #include <asm/cpucaps.h>
> >
> > -__asm__(".arch_extension lse");
> > -
> > extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
> > extern struct static_key_false arm64_const_caps_ready;
> >
> > --
> > 2.23.0.581.g78d2f28ef7-goog
> >
>
>
>
> --
> Thanks,
> ~Nick Desaulniers

2019-10-08 15:23:01

by Sami Tolvanen

Subject: Re: [PATCH] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Mon, Oct 7, 2019 at 2:46 PM 'Nick Desaulniers' via Clang Built
Linux <[email protected]> wrote:
> I'm worried that one of these might lower to LSE atomics without
> ALTERNATIVE guards by blanketing all C code with `-march=armv8-a+lse`.

True, that's a valid concern. I think adding the directive to each
assembly block is the way forward then, assuming the maintainers are
fine with that.

Sami

2019-10-08 16:20:00

by Robin Murphy

Subject: Re: [PATCH] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On 08/10/2019 16:22, Sami Tolvanen wrote:
> On Mon, Oct 7, 2019 at 2:46 PM 'Nick Desaulniers' via Clang Built
> Linux <[email protected]> wrote:
>> I'm worried that one of these might lower to LSE atomics without
>> ALTERNATIVE guards by blanketing all C code with `-march=armv8-a+lse`.
>
> True, that's a valid concern. I think adding the directive to each
> assembly block is the way forward then, assuming the maintainers are
> fine with that.

It's definitely a valid concern in principle, but in practice note that
lse.h ends up included in ~99% of C files, so the extension is enabled
more or less everywhere already.

(based on a quick hack involving '#pragma message' and grepping the
build logs)

Robin.

2019-10-08 21:07:46

by Ard Biesheuvel

Subject: Re: [PATCH] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Tue, 8 Oct 2019 at 18:19, Robin Murphy <[email protected]> wrote:
>
> On 08/10/2019 16:22, Sami Tolvanen wrote:
> > On Mon, Oct 7, 2019 at 2:46 PM 'Nick Desaulniers' via Clang Built
> > Linux <[email protected]> wrote:
> >> I'm worried that one of these might lower to LSE atomics without
> >> ALTERNATIVE guards by blanketing all C code with `-march=armv8-a+lse`.
> >
> > True, that's a valid concern. I think adding the directive to each
> > assembly block is the way forward then, assuming the maintainers are
> > fine with that.
>
> It's definitely a valid concern in principle, but in practice note that
> lse.h ends up included in ~99% of C files, so the extension is enabled
> more or less everywhere already.
>

lse.h currently does

__asm__(".arch_extension lse");

which instructs the assembler to permit the use of LSE opcodes, but it
does not instruct the compiler to emit them, so this is not quite the
same thing.

2019-10-08 21:28:24

by Sami Tolvanen

Subject: [PATCH v2] arm64: lse: fix LSE atomics with LLVM's integrated assembler

Unlike gcc, clang considers each inline assembly block to be independent
and therefore, when using the integrated assembler for inline assembly,
any preambles that enable features must be repeated in each block.

This change defines __LSE_PREAMBLE and adds it to each inline assembly
block that has LSE instructions, which allows them to also be compiled
with clang's integrated assembler.

Link: https://github.com/ClangBuiltLinux/linux/issues/671
Signed-off-by: Sami Tolvanen <[email protected]>
---
v2:
- Add a preamble to inline assembly blocks that use LSE instead
of allowing the compiler to emit LSE instructions everywhere.

---
arch/arm64/include/asm/atomic_lse.h | 19 +++++++++++++++++++
arch/arm64/include/asm/lse.h | 6 +++---
2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index c6bd87d2915b..3ee600043042 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -14,6 +14,7 @@
static inline void __lse_atomic_##op(int i, atomic_t *v) \
{ \
asm volatile( \
+ __LSE_PREAMBLE \
" " #asm_op " %w[i], %[v]\n" \
: [i] "+r" (i), [v] "+Q" (v->counter) \
: "r" (v)); \
@@ -30,6 +31,7 @@ ATOMIC_OP(add, stadd)
static inline int __lse_atomic_fetch_##op##name(int i, atomic_t *v) \
{ \
asm volatile( \
+ __LSE_PREAMBLE \
" " #asm_op #mb " %w[i], %w[i], %[v]" \
: [i] "+r" (i), [v] "+Q" (v->counter) \
: "r" (v) \
@@ -58,6 +60,7 @@ static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \
u32 tmp; \
\
asm volatile( \
+ __LSE_PREAMBLE \
" ldadd" #mb " %w[i], %w[tmp], %[v]\n" \
" add %w[i], %w[i], %w[tmp]" \
: [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \
@@ -77,6 +80,7 @@ ATOMIC_OP_ADD_RETURN( , al, "memory")
static inline void __lse_atomic_and(int i, atomic_t *v)
{
asm volatile(
+ __LSE_PREAMBLE
" mvn %w[i], %w[i]\n"
" stclr %w[i], %[v]"
: [i] "+&r" (i), [v] "+Q" (v->counter)
@@ -87,6 +91,7 @@ static inline void __lse_atomic_and(int i, atomic_t *v)
static inline int __lse_atomic_fetch_and##name(int i, atomic_t *v) \
{ \
asm volatile( \
+ __LSE_PREAMBLE \
" mvn %w[i], %w[i]\n" \
" ldclr" #mb " %w[i], %w[i], %[v]" \
: [i] "+&r" (i), [v] "+Q" (v->counter) \
@@ -106,6 +111,7 @@ ATOMIC_FETCH_OP_AND( , al, "memory")
static inline void __lse_atomic_sub(int i, atomic_t *v)
{
asm volatile(
+ __LSE_PREAMBLE
" neg %w[i], %w[i]\n"
" stadd %w[i], %[v]"
: [i] "+&r" (i), [v] "+Q" (v->counter)
@@ -118,6 +124,7 @@ static inline int __lse_atomic_sub_return##name(int i, atomic_t *v) \
u32 tmp; \
\
asm volatile( \
+ __LSE_PREAMBLE \
" neg %w[i], %w[i]\n" \
" ldadd" #mb " %w[i], %w[tmp], %[v]\n" \
" add %w[i], %w[i], %w[tmp]" \
@@ -139,6 +146,7 @@ ATOMIC_OP_SUB_RETURN( , al, "memory")
static inline int __lse_atomic_fetch_sub##name(int i, atomic_t *v) \
{ \
asm volatile( \
+ __LSE_PREAMBLE \
" neg %w[i], %w[i]\n" \
" ldadd" #mb " %w[i], %w[i], %[v]" \
: [i] "+&r" (i), [v] "+Q" (v->counter) \
@@ -159,6 +167,7 @@ ATOMIC_FETCH_OP_SUB( , al, "memory")
static inline void __lse_atomic64_##op(s64 i, atomic64_t *v) \
{ \
asm volatile( \
+ __LSE_PREAMBLE \
" " #asm_op " %[i], %[v]\n" \
: [i] "+r" (i), [v] "+Q" (v->counter) \
: "r" (v)); \
@@ -175,6 +184,7 @@ ATOMIC64_OP(add, stadd)
static inline long __lse_atomic64_fetch_##op##name(s64 i, atomic64_t *v)\
{ \
asm volatile( \
+ __LSE_PREAMBLE \
" " #asm_op #mb " %[i], %[i], %[v]" \
: [i] "+r" (i), [v] "+Q" (v->counter) \
: "r" (v) \
@@ -203,6 +213,7 @@ static inline long __lse_atomic64_add_return##name(s64 i, atomic64_t *v)\
unsigned long tmp; \
\
asm volatile( \
+ __LSE_PREAMBLE \
" ldadd" #mb " %[i], %x[tmp], %[v]\n" \
" add %[i], %[i], %x[tmp]" \
: [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \
@@ -222,6 +233,7 @@ ATOMIC64_OP_ADD_RETURN( , al, "memory")
static inline void __lse_atomic64_and(s64 i, atomic64_t *v)
{
asm volatile(
+ __LSE_PREAMBLE
" mvn %[i], %[i]\n"
" stclr %[i], %[v]"
: [i] "+&r" (i), [v] "+Q" (v->counter)
@@ -232,6 +244,7 @@ static inline void __lse_atomic64_and(s64 i, atomic64_t *v)
static inline long __lse_atomic64_fetch_and##name(s64 i, atomic64_t *v) \
{ \
asm volatile( \
+ __LSE_PREAMBLE \
" mvn %[i], %[i]\n" \
" ldclr" #mb " %[i], %[i], %[v]" \
: [i] "+&r" (i), [v] "+Q" (v->counter) \
@@ -251,6 +264,7 @@ ATOMIC64_FETCH_OP_AND( , al, "memory")
static inline void __lse_atomic64_sub(s64 i, atomic64_t *v)
{
asm volatile(
+ __LSE_PREAMBLE
" neg %[i], %[i]\n"
" stadd %[i], %[v]"
: [i] "+&r" (i), [v] "+Q" (v->counter)
@@ -263,6 +277,7 @@ static inline long __lse_atomic64_sub_return##name(s64 i, atomic64_t *v) \
unsigned long tmp; \
\
asm volatile( \
+ __LSE_PREAMBLE \
" neg %[i], %[i]\n" \
" ldadd" #mb " %[i], %x[tmp], %[v]\n" \
" add %[i], %[i], %x[tmp]" \
@@ -284,6 +299,7 @@ ATOMIC64_OP_SUB_RETURN( , al, "memory")
static inline long __lse_atomic64_fetch_sub##name(s64 i, atomic64_t *v) \
{ \
asm volatile( \
+ __LSE_PREAMBLE \
" neg %[i], %[i]\n" \
" ldadd" #mb " %[i], %[i], %[v]" \
: [i] "+&r" (i), [v] "+Q" (v->counter) \
@@ -305,6 +321,7 @@ static inline s64 __lse_atomic64_dec_if_positive(atomic64_t *v)
unsigned long tmp;

asm volatile(
+ __LSE_PREAMBLE
"1: ldr %x[tmp], %[v]\n"
" subs %[ret], %x[tmp], #1\n"
" b.lt 2f\n"
@@ -331,6 +348,7 @@ static inline u##sz __lse__cmpxchg_case_##name##sz(volatile void *ptr, \
unsigned long tmp; \
\
asm volatile( \
+ __LSE_PREAMBLE \
" mov %" #w "[tmp], %" #w "[old]\n" \
" cas" #mb #sfx "\t%" #w "[tmp], %" #w "[new], %[v]\n" \
" mov %" #w "[ret], %" #w "[tmp]" \
@@ -377,6 +395,7 @@ static inline long __lse__cmpxchg_double##name(unsigned long old1, \
register unsigned long x4 asm ("x4") = (unsigned long)ptr; \
\
asm volatile( \
+ __LSE_PREAMBLE \
" casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\
" eor %[old1], %[old1], %[oldval1]\n" \
" eor %[old2], %[old2], %[oldval2]\n" \
diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
index 80b388278149..73834996c4b6 100644
--- a/arch/arm64/include/asm/lse.h
+++ b/arch/arm64/include/asm/lse.h
@@ -6,6 +6,8 @@

#if defined(CONFIG_AS_LSE) && defined(CONFIG_ARM64_LSE_ATOMICS)

+#define __LSE_PREAMBLE ".arch armv8-a+lse\n"
+
#include <linux/compiler_types.h>
#include <linux/export.h>
#include <linux/jump_label.h>
@@ -14,8 +16,6 @@
#include <asm/atomic_lse.h>
#include <asm/cpucaps.h>

-__asm__(".arch_extension lse");
-
extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
extern struct static_key_false arm64_const_caps_ready;

@@ -34,7 +34,7 @@ static inline bool system_uses_lse_atomics(void)

/* In-line patching at runtime */
#define ARM64_LSE_ATOMIC_INSN(llsc, lse) \
- ALTERNATIVE(llsc, lse, ARM64_HAS_LSE_ATOMICS)
+ ALTERNATIVE(llsc, __LSE_PREAMBLE lse, ARM64_HAS_LSE_ATOMICS)

#else /* CONFIG_AS_LSE && CONFIG_ARM64_LSE_ATOMICS */

--
2.23.0.581.g78d2f28ef7-goog
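
For readers untangling the macros, here is roughly what one of the patched
wrappers expands to with the preamble in place (paraphrased from the patch
above; the atomic_t stand-in is only there so the sketch is self-contained):

#define __LSE_PREAMBLE	".arch armv8-a+lse\n"

typedef struct { int counter; } atomic_t;	/* stand-in for the kernel type */

/* Expansion of ATOMIC_OP(add, stadd) from atomic_lse.h after this patch:
 * every LSE block now starts with its own feature directive, and the
 * ARM64_LSE_ATOMIC_INSN() users get the same directive prepended to the
 * "lse" alternative via ALTERNATIVE(llsc, __LSE_PREAMBLE lse, ...). */
static inline void __lse_atomic_add(int i, atomic_t *v)
{
	asm volatile(
	__LSE_PREAMBLE
	"	stadd	%w[i], %[v]\n"
	: [i] "+r" (i), [v] "+Q" (v->counter)
	: "r" (v));
}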

2019-10-08 22:20:18

by Nick Desaulniers

Subject: Re: [PATCH v2] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Tue, Oct 8, 2019 at 2:27 PM 'Sami Tolvanen' via Clang Built Linux
<[email protected]> wrote:
>
> Unlike gcc, clang considers each inline assembly block to be independent
> and therefore, when using the integrated assembler for inline assembly,
> any preambles that enable features must be repeated in each block.
>
> This change defines __LSE_PREAMBLE and adds it to each inline assembly
> block that has LSE instructions, which allows them to be compiled also
> with clang's assembler.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/671
> Signed-off-by: Sami Tolvanen <[email protected]>


Thanks, I think this better limits the assembler's acceptance of these
instructions to the blocks that use them, while preventing the C compiler
from emitting them.
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang AS=clang -j71 clean
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang AS=clang -j71 defconfig
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang AS=clang -j71 fs/ext4/balloc.o
<error explosion>
$ git am <patch.eml>
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang AS=clang -j71 fs/ext4/balloc.o
...
$ echo $?
0
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang AS=clang -j71 clean
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang -j71 defconfig
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang -j71
<builds successfully>
$ qemu-system-aarch64 -kernel arch/arm64/boot/Image.gz -machine virt -cpu cortex-a72 -nographic --append "console=ttyAMA0" -m 2048 -initrd /android1/buildroot/output/images/rootfs.cpio
<boots successfully; doesn't appear to regress the case of GAS, though I doubt such a compiler directive would>

Reviewed-by: Nick Desaulniers <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>

> ---
> v2:
> - Add a preamble to inline assembly blocks that use LSE instead
> of allowing the compiler to emit LSE instructions everywhere.
>
> ---
> arch/arm64/include/asm/atomic_lse.h | 19 +++++++++++++++++++
> arch/arm64/include/asm/lse.h | 6 +++---
> 2 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
> index c6bd87d2915b..3ee600043042 100644
> --- a/arch/arm64/include/asm/atomic_lse.h
> +++ b/arch/arm64/include/asm/atomic_lse.h
> @@ -14,6 +14,7 @@
> static inline void __lse_atomic_##op(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op " %w[i], %[v]\n" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v)); \
> @@ -30,6 +31,7 @@ ATOMIC_OP(add, stadd)
> static inline int __lse_atomic_fetch_##op##name(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op #mb " %w[i], %w[i], %[v]" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v) \
> @@ -58,6 +60,7 @@ static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \
> u32 tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \
> " add %w[i], %w[i], %w[tmp]" \
> : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \
> @@ -77,6 +80,7 @@ ATOMIC_OP_ADD_RETURN( , al, "memory")
> static inline void __lse_atomic_and(int i, atomic_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " mvn %w[i], %w[i]\n"
> " stclr %w[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -87,6 +91,7 @@ static inline void __lse_atomic_and(int i, atomic_t *v)
> static inline int __lse_atomic_fetch_and##name(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " mvn %w[i], %w[i]\n" \
> " ldclr" #mb " %w[i], %w[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -106,6 +111,7 @@ ATOMIC_FETCH_OP_AND( , al, "memory")
> static inline void __lse_atomic_sub(int i, atomic_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " neg %w[i], %w[i]\n"
> " stadd %w[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -118,6 +124,7 @@ static inline int __lse_atomic_sub_return##name(int i, atomic_t *v) \
> u32 tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %w[i], %w[i]\n" \
> " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \
> " add %w[i], %w[i], %w[tmp]" \
> @@ -139,6 +146,7 @@ ATOMIC_OP_SUB_RETURN( , al, "memory")
> static inline int __lse_atomic_fetch_sub##name(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %w[i], %w[i]\n" \
> " ldadd" #mb " %w[i], %w[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -159,6 +167,7 @@ ATOMIC_FETCH_OP_SUB( , al, "memory")
> static inline void __lse_atomic64_##op(s64 i, atomic64_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op " %[i], %[v]\n" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v)); \
> @@ -175,6 +184,7 @@ ATOMIC64_OP(add, stadd)
> static inline long __lse_atomic64_fetch_##op##name(s64 i, atomic64_t *v)\
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op #mb " %[i], %[i], %[v]" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v) \
> @@ -203,6 +213,7 @@ static inline long __lse_atomic64_add_return##name(s64 i, atomic64_t *v)\
> unsigned long tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " ldadd" #mb " %[i], %x[tmp], %[v]\n" \
> " add %[i], %[i], %x[tmp]" \
> : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \
> @@ -222,6 +233,7 @@ ATOMIC64_OP_ADD_RETURN( , al, "memory")
> static inline void __lse_atomic64_and(s64 i, atomic64_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " mvn %[i], %[i]\n"
> " stclr %[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -232,6 +244,7 @@ static inline void __lse_atomic64_and(s64 i, atomic64_t *v)
> static inline long __lse_atomic64_fetch_and##name(s64 i, atomic64_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " mvn %[i], %[i]\n" \
> " ldclr" #mb " %[i], %[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -251,6 +264,7 @@ ATOMIC64_FETCH_OP_AND( , al, "memory")
> static inline void __lse_atomic64_sub(s64 i, atomic64_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " neg %[i], %[i]\n"
> " stadd %[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -263,6 +277,7 @@ static inline long __lse_atomic64_sub_return##name(s64 i, atomic64_t *v) \
> unsigned long tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %[i], %[i]\n" \
> " ldadd" #mb " %[i], %x[tmp], %[v]\n" \
> " add %[i], %[i], %x[tmp]" \
> @@ -284,6 +299,7 @@ ATOMIC64_OP_SUB_RETURN( , al, "memory")
> static inline long __lse_atomic64_fetch_sub##name(s64 i, atomic64_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %[i], %[i]\n" \
> " ldadd" #mb " %[i], %[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -305,6 +321,7 @@ static inline s64 __lse_atomic64_dec_if_positive(atomic64_t *v)
> unsigned long tmp;
>
> asm volatile(
> + __LSE_PREAMBLE
> "1: ldr %x[tmp], %[v]\n"
> " subs %[ret], %x[tmp], #1\n"
> " b.lt 2f\n"
> @@ -331,6 +348,7 @@ static inline u##sz __lse__cmpxchg_case_##name##sz(volatile void *ptr, \
> unsigned long tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " mov %" #w "[tmp], %" #w "[old]\n" \
> " cas" #mb #sfx "\t%" #w "[tmp], %" #w "[new], %[v]\n" \
> " mov %" #w "[ret], %" #w "[tmp]" \
> @@ -377,6 +395,7 @@ static inline long __lse__cmpxchg_double##name(unsigned long old1, \
> register unsigned long x4 asm ("x4") = (unsigned long)ptr; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\
> " eor %[old1], %[old1], %[oldval1]\n" \
> " eor %[old2], %[old2], %[oldval2]\n" \
> diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
> index 80b388278149..73834996c4b6 100644
> --- a/arch/arm64/include/asm/lse.h
> +++ b/arch/arm64/include/asm/lse.h
> @@ -6,6 +6,8 @@
>
> #if defined(CONFIG_AS_LSE) && defined(CONFIG_ARM64_LSE_ATOMICS)
>
> +#define __LSE_PREAMBLE ".arch armv8-a+lse\n"
> +
> #include <linux/compiler_types.h>
> #include <linux/export.h>
> #include <linux/jump_label.h>
> @@ -14,8 +16,6 @@
> #include <asm/atomic_lse.h>
> #include <asm/cpucaps.h>
>
> -__asm__(".arch_extension lse");
> -
> extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
> extern struct static_key_false arm64_const_caps_ready;
>
> @@ -34,7 +34,7 @@ static inline bool system_uses_lse_atomics(void)
>
> /* In-line patching at runtime */
> #define ARM64_LSE_ATOMIC_INSN(llsc, lse) \
> - ALTERNATIVE(llsc, lse, ARM64_HAS_LSE_ATOMICS)
> + ALTERNATIVE(llsc, __LSE_PREAMBLE lse, ARM64_HAS_LSE_ATOMICS)
>
> #else /* CONFIG_AS_LSE && CONFIG_ARM64_LSE_ATOMICS */
>
> --
> 2.23.0.581.g78d2f28ef7-goog
>



--
Thanks,
~Nick Desaulniers

2019-10-08 23:35:35

by Andrew Murray

Subject: Re: [PATCH v2] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Tue, Oct 08, 2019 at 02:27:30PM -0700, Sami Tolvanen wrote:
> Unlike gcc, clang considers each inline assembly block to be independent
> and therefore, when using the integrated assembler for inline assembly,
> any preambles that enable features must be repeated in each block.
>
> This change defines __LSE_PREAMBLE and adds it to each inline assembly
> block that has LSE instructions, which allows them to be compiled also
> with clang's assembler.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/671
> Signed-off-by: Sami Tolvanen <[email protected]>

This looks good to me. I can build and boot in a model with both Clang
(9.0.6) and GCC (7.3.1) and boot a guest without anything going bang.

Though when I build with AS=clang, e.g.

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- CC=clang AS=clang Image

I get errors like this:

CC init/main.o
In file included from init/main.c:17:
In file included from ./include/linux/module.h:9:
In file included from ./include/linux/list.h:9:
In file included from ./include/linux/kernel.h:12:
In file included from ./include/linux/bitops.h:26:
In file included from ./arch/arm64/include/asm/bitops.h:26:
In file included from ./include/asm-generic/bitops/atomic.h:5:
In file included from ./include/linux/atomic.h:7:
In file included from ./arch/arm64/include/asm/atomic.h:16:
In file included from ./arch/arm64/include/asm/cmpxchg.h:14:
In file included from ./arch/arm64/include/asm/lse.h:13:
In file included from ./include/linux/jump_label.h:117:
./arch/arm64/include/asm/jump_label.h:24:20: error: expected a symbol reference in '.long' directive
" .align 3 \n\t"
^
<inline asm>:4:21: note: instantiated into assembly here
.long 1b - ., "" - .
^

I'm assuming that I'm doing something wrong?

Thanks,

Andrew Murray

> ---
> v2:
> - Add a preamble to inline assembly blocks that use LSE instead
> of allowing the compiler to emit LSE instructions everywhere.
>
> ---
> arch/arm64/include/asm/atomic_lse.h | 19 +++++++++++++++++++
> arch/arm64/include/asm/lse.h | 6 +++---
> 2 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
> index c6bd87d2915b..3ee600043042 100644
> --- a/arch/arm64/include/asm/atomic_lse.h
> +++ b/arch/arm64/include/asm/atomic_lse.h
> @@ -14,6 +14,7 @@
> static inline void __lse_atomic_##op(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op " %w[i], %[v]\n" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v)); \
> @@ -30,6 +31,7 @@ ATOMIC_OP(add, stadd)
> static inline int __lse_atomic_fetch_##op##name(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op #mb " %w[i], %w[i], %[v]" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v) \
> @@ -58,6 +60,7 @@ static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \
> u32 tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \
> " add %w[i], %w[i], %w[tmp]" \
> : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \
> @@ -77,6 +80,7 @@ ATOMIC_OP_ADD_RETURN( , al, "memory")
> static inline void __lse_atomic_and(int i, atomic_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " mvn %w[i], %w[i]\n"
> " stclr %w[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -87,6 +91,7 @@ static inline void __lse_atomic_and(int i, atomic_t *v)
> static inline int __lse_atomic_fetch_and##name(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " mvn %w[i], %w[i]\n" \
> " ldclr" #mb " %w[i], %w[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -106,6 +111,7 @@ ATOMIC_FETCH_OP_AND( , al, "memory")
> static inline void __lse_atomic_sub(int i, atomic_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " neg %w[i], %w[i]\n"
> " stadd %w[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -118,6 +124,7 @@ static inline int __lse_atomic_sub_return##name(int i, atomic_t *v) \
> u32 tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %w[i], %w[i]\n" \
> " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \
> " add %w[i], %w[i], %w[tmp]" \
> @@ -139,6 +146,7 @@ ATOMIC_OP_SUB_RETURN( , al, "memory")
> static inline int __lse_atomic_fetch_sub##name(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %w[i], %w[i]\n" \
> " ldadd" #mb " %w[i], %w[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -159,6 +167,7 @@ ATOMIC_FETCH_OP_SUB( , al, "memory")
> static inline void __lse_atomic64_##op(s64 i, atomic64_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op " %[i], %[v]\n" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v)); \
> @@ -175,6 +184,7 @@ ATOMIC64_OP(add, stadd)
> static inline long __lse_atomic64_fetch_##op##name(s64 i, atomic64_t *v)\
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op #mb " %[i], %[i], %[v]" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v) \
> @@ -203,6 +213,7 @@ static inline long __lse_atomic64_add_return##name(s64 i, atomic64_t *v)\
> unsigned long tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " ldadd" #mb " %[i], %x[tmp], %[v]\n" \
> " add %[i], %[i], %x[tmp]" \
> : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \
> @@ -222,6 +233,7 @@ ATOMIC64_OP_ADD_RETURN( , al, "memory")
> static inline void __lse_atomic64_and(s64 i, atomic64_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " mvn %[i], %[i]\n"
> " stclr %[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -232,6 +244,7 @@ static inline void __lse_atomic64_and(s64 i, atomic64_t *v)
> static inline long __lse_atomic64_fetch_and##name(s64 i, atomic64_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " mvn %[i], %[i]\n" \
> " ldclr" #mb " %[i], %[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -251,6 +264,7 @@ ATOMIC64_FETCH_OP_AND( , al, "memory")
> static inline void __lse_atomic64_sub(s64 i, atomic64_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " neg %[i], %[i]\n"
> " stadd %[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -263,6 +277,7 @@ static inline long __lse_atomic64_sub_return##name(s64 i, atomic64_t *v) \
> unsigned long tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %[i], %[i]\n" \
> " ldadd" #mb " %[i], %x[tmp], %[v]\n" \
> " add %[i], %[i], %x[tmp]" \
> @@ -284,6 +299,7 @@ ATOMIC64_OP_SUB_RETURN( , al, "memory")
> static inline long __lse_atomic64_fetch_sub##name(s64 i, atomic64_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %[i], %[i]\n" \
> " ldadd" #mb " %[i], %[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -305,6 +321,7 @@ static inline s64 __lse_atomic64_dec_if_positive(atomic64_t *v)
> unsigned long tmp;
>
> asm volatile(
> + __LSE_PREAMBLE
> "1: ldr %x[tmp], %[v]\n"
> " subs %[ret], %x[tmp], #1\n"
> " b.lt 2f\n"
> @@ -331,6 +348,7 @@ static inline u##sz __lse__cmpxchg_case_##name##sz(volatile void *ptr, \
> unsigned long tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " mov %" #w "[tmp], %" #w "[old]\n" \
> " cas" #mb #sfx "\t%" #w "[tmp], %" #w "[new], %[v]\n" \
> " mov %" #w "[ret], %" #w "[tmp]" \
> @@ -377,6 +395,7 @@ static inline long __lse__cmpxchg_double##name(unsigned long old1, \
> register unsigned long x4 asm ("x4") = (unsigned long)ptr; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\
> " eor %[old1], %[old1], %[oldval1]\n" \
> " eor %[old2], %[old2], %[oldval2]\n" \
> diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
> index 80b388278149..73834996c4b6 100644
> --- a/arch/arm64/include/asm/lse.h
> +++ b/arch/arm64/include/asm/lse.h
> @@ -6,6 +6,8 @@
>
> #if defined(CONFIG_AS_LSE) && defined(CONFIG_ARM64_LSE_ATOMICS)
>
> +#define __LSE_PREAMBLE ".arch armv8-a+lse\n"
> +
> #include <linux/compiler_types.h>
> #include <linux/export.h>
> #include <linux/jump_label.h>
> @@ -14,8 +16,6 @@
> #include <asm/atomic_lse.h>
> #include <asm/cpucaps.h>
>
> -__asm__(".arch_extension lse");
> -
> extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
> extern struct static_key_false arm64_const_caps_ready;
>
> @@ -34,7 +34,7 @@ static inline bool system_uses_lse_atomics(void)
>
> /* In-line patching at runtime */
> #define ARM64_LSE_ATOMIC_INSN(llsc, lse) \
> - ALTERNATIVE(llsc, lse, ARM64_HAS_LSE_ATOMICS)
> + ALTERNATIVE(llsc, __LSE_PREAMBLE lse, ARM64_HAS_LSE_ATOMICS)
>
> #else /* CONFIG_AS_LSE && CONFIG_ARM64_LSE_ATOMICS */
>
> --
> 2.23.0.581.g78d2f28ef7-goog
>

2019-10-09 00:02:28

by Sami Tolvanen

Subject: Re: [PATCH v2] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Tue, Oct 8, 2019 at 4:31 PM Andrew Murray <[email protected]> wrote:
> This looks good to me. I can build and boot in a model with both Clang
> (9.0.6) and GCC (7.3.1) and boot a guest without anything going bang.

Great, thank you for testing this!

> Though when I build with AS=clang, e.g.
>
> make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- CC=clang AS=clang Image

Note that this patch only fixes issues with inline assembly, which
should at some point allow us to drop -no-integrated-as from clang
builds. I believe there are still other fixes needed before AS=clang
works.

> I get errors like this:
>
> CC init/main.o
> In file included from init/main.c:17:
> In file included from ./include/linux/module.h:9:
> In file included from ./include/linux/list.h:9:
> In file included from ./include/linux/kernel.h:12:
> In file included from ./include/linux/bitops.h:26:
> In file included from ./arch/arm64/include/asm/bitops.h:26:
> In file included from ./include/asm-generic/bitops/atomic.h:5:
> In file included from ./include/linux/atomic.h:7:
> In file included from ./arch/arm64/include/asm/atomic.h:16:
> In file included from ./arch/arm64/include/asm/cmpxchg.h:14:
> In file included from ./arch/arm64/include/asm/lse.h:13:
> In file included from ./include/linux/jump_label.h:117:
> ./arch/arm64/include/asm/jump_label.h:24:20: error: expected a symbol reference in '.long' directive
> " .align 3 \n\t"
> ^
> <inline asm>:4:21: note: instantiated into assembly here
> .long 1b - ., "" - .
> ^
>
> I'm assuming that I'm doing something wrong?

No, this particular issue will be fixed in clang 10:
https://github.com/ClangBuiltLinux/linux/issues/500

Sami

2019-10-09 00:02:43

by Nathan Chancellor

Subject: Re: [PATCH v2] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Tue, Oct 08, 2019 at 04:59:25PM -0700, 'Sami Tolvanen' via Clang Built Linux wrote:
> On Tue, Oct 8, 2019 at 4:31 PM Andrew Murray <[email protected]> wrote:
> > This looks good to me. I can build and boot in a model with both Clang
> > (9.0.6) and GCC (7.3.1) and boot a guest without anything going bang.
>
> Great, thank you for testing this!
>
> > Though when I build with AS=clang, e.g.
> >
> > make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- CC=clang AS=clang Image
>
> Note that this patch only fixes issues with inline assembly, which
> should at some point allow us to drop -no-integrated-as from clang
> builds. I believe there are still other fixes needed before AS=clang
> works.
>
> > I get errors like this:
> >
> > CC init/main.o
> > In file included from init/main.c:17:
> > In file included from ./include/linux/module.h:9:
> > In file included from ./include/linux/list.h:9:
> > In file included from ./include/linux/kernel.h:12:
> > In file included from ./include/linux/bitops.h:26:
> > In file included from ./arch/arm64/include/asm/bitops.h:26:
> > In file included from ./include/asm-generic/bitops/atomic.h:5:
> > In file included from ./include/linux/atomic.h:7:
> > In file included from ./arch/arm64/include/asm/atomic.h:16:
> > In file included from ./arch/arm64/include/asm/cmpxchg.h:14:
> > In file included from ./arch/arm64/include/asm/lse.h:13:
> > In file included from ./include/linux/jump_label.h:117:
> > ./arch/arm64/include/asm/jump_label.h:24:20: error: expected a symbol reference in '.long' directive
> > " .align 3 \n\t"
> > ^
> > <inline asm>:4:21: note: instantiated into assembly here
> > .long 1b - ., "" - .
> > ^
> >
> > I'm assuming that I'm doing something wrong?
>
> No, this particular issue will be fixed in clang 10:
> https://github.com/ClangBuiltLinux/linux/issues/500
>
> Sami

I believe that it should be fixed with AOSP's Clang 9.0.8 or upstream
Clang 9.0.0.

Cheers,
Nathan

2019-10-09 08:29:56

by Andrew Murray

Subject: Re: [PATCH v2] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Tue, Oct 08, 2019 at 05:01:59PM -0700, Nathan Chancellor wrote:
> On Tue, Oct 08, 2019 at 04:59:25PM -0700, 'Sami Tolvanen' via Clang Built Linux wrote:
> > On Tue, Oct 8, 2019 at 4:31 PM Andrew Murray <[email protected]> wrote:
> > > This looks good to me. I can build and boot in a model with both Clang
> > > (9.0.6) and GCC (7.3.1) and boot a guest without anything going bang.
> >
> > Great, thank you for testing this!
> >
> > > Though when I build with AS=clang, e.g.
> > >
> > > make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- CC=clang AS=clang Image
> >
> > Note that this patch only fixes issues with inline assembly, which
> > should at some point allow us to drop -no-integrated-as from clang
> > builds. I believe there are still other fixes needed before AS=clang
> > works.
> >
> > > I get errors like this:
> > >
> > > CC init/main.o
> > > In file included from init/main.c:17:
> > > In file included from ./include/linux/module.h:9:
> > > In file included from ./include/linux/list.h:9:
> > > In file included from ./include/linux/kernel.h:12:
> > > In file included from ./include/linux/bitops.h:26:
> > > In file included from ./arch/arm64/include/asm/bitops.h:26:
> > > In file included from ./include/asm-generic/bitops/atomic.h:5:
> > > In file included from ./include/linux/atomic.h:7:
> > > In file included from ./arch/arm64/include/asm/atomic.h:16:
> > > In file included from ./arch/arm64/include/asm/cmpxchg.h:14:
> > > In file included from ./arch/arm64/include/asm/lse.h:13:
> > > In file included from ./include/linux/jump_label.h:117:
> > > ./arch/arm64/include/asm/jump_label.h:24:20: error: expected a symbol reference in '.long' directive
> > > " .align 3 \n\t"
> > > ^
> > > <inline asm>:4:21: note: instantiated into assembly here
> > > .long 1b - ., "" - .
> > > ^
> > >
> > > I'm assuming that I'm doing something wrong?
> >
> > No, this particular issue will be fixed in clang 10:
> > https://github.com/ClangBuiltLinux/linux/issues/500
> >
> > Sami
>
> I believe that it should be fixed with AOSP's Clang 9.0.8 or upstream
> Clang 9.0.0.

OK, understood. You can add:

Reviewed-by: Andrew Murray <[email protected]>
Tested-by: Andrew Murray <[email protected]>

>
> Cheers,
> Nathan

2019-10-09 10:06:03

by Robin Murphy

Subject: Re: [PATCH] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On 2019-10-08 10:03 pm, Ard Biesheuvel wrote:
> On Tue, 8 Oct 2019 at 18:19, Robin Murphy <[email protected]> wrote:
>>
>> On 08/10/2019 16:22, Sami Tolvanen wrote:
>>> On Mon, Oct 7, 2019 at 2:46 PM 'Nick Desaulniers' via Clang Built
>>> Linux <[email protected]> wrote:
>>>> I'm worried that one of these might lower to LSE atomics without
>>>> ALTERNATIVE guards by blanketing all C code with `-march=armv8-a+lse`.
>>>
>>> True, that's a valid concern. I think adding the directive to each
>>> assembly block is the way forward then, assuming the maintainers are
>>> fine with that.
>>
>> It's definitely a valid concern in principle, but in practice note that
>> lse.h ends up included in ~99% of C files, so the extension is enabled
>> more or less everywhere already.
>>
>
> lse.h currently does
>
> __asm__(".arch_extension lse");
>
> which instructs the assembler to permit the use of LSE opcodes, but it
> does not instruct the compiler to emit them, so this is not quite the
> same thing.

Derp, of course it isn't. And IIRC we can't just pass the option through
with -Wa either because at least some versions of GCC emit an explicit
.arch directive at the top of the output. Oh well; sorry for the
distraction.

Robin.

2019-10-10 21:00:52

by Kees Cook

Subject: Re: [PATCH v2] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Tue, Oct 08, 2019 at 02:27:30PM -0700, Sami Tolvanen wrote:
> Unlike gcc, clang considers each inline assembly block to be independent
> and therefore, when using the integrated assembler for inline assembly,
> any preambles that enable features must be repeated in each block.
>
> This change defines __LSE_PREAMBLE and adds it to each inline assembly
> block that has LSE instructions, which allows them to be compiled also
> with clang's assembler.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/671
> Signed-off-by: Sami Tolvanen <[email protected]>

FWIW, my arm64 builds remain happy with this too.

Tested-by: Kees Cook <[email protected]>
Reviewed-by: Kees Cook <[email protected]>

-Kees

> ---
> v2:
> - Add a preamble to inline assembly blocks that use LSE instead
> of allowing the compiler to emit LSE instructions everywhere.
>
> ---
> arch/arm64/include/asm/atomic_lse.h | 19 +++++++++++++++++++
> arch/arm64/include/asm/lse.h | 6 +++---
> 2 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
> index c6bd87d2915b..3ee600043042 100644
> --- a/arch/arm64/include/asm/atomic_lse.h
> +++ b/arch/arm64/include/asm/atomic_lse.h
> @@ -14,6 +14,7 @@
> static inline void __lse_atomic_##op(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op " %w[i], %[v]\n" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v)); \
> @@ -30,6 +31,7 @@ ATOMIC_OP(add, stadd)
> static inline int __lse_atomic_fetch_##op##name(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op #mb " %w[i], %w[i], %[v]" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v) \
> @@ -58,6 +60,7 @@ static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \
> u32 tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \
> " add %w[i], %w[i], %w[tmp]" \
> : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \
> @@ -77,6 +80,7 @@ ATOMIC_OP_ADD_RETURN( , al, "memory")
> static inline void __lse_atomic_and(int i, atomic_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " mvn %w[i], %w[i]\n"
> " stclr %w[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -87,6 +91,7 @@ static inline void __lse_atomic_and(int i, atomic_t *v)
> static inline int __lse_atomic_fetch_and##name(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " mvn %w[i], %w[i]\n" \
> " ldclr" #mb " %w[i], %w[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -106,6 +111,7 @@ ATOMIC_FETCH_OP_AND( , al, "memory")
> static inline void __lse_atomic_sub(int i, atomic_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " neg %w[i], %w[i]\n"
> " stadd %w[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -118,6 +124,7 @@ static inline int __lse_atomic_sub_return##name(int i, atomic_t *v) \
> u32 tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %w[i], %w[i]\n" \
> " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \
> " add %w[i], %w[i], %w[tmp]" \
> @@ -139,6 +146,7 @@ ATOMIC_OP_SUB_RETURN( , al, "memory")
> static inline int __lse_atomic_fetch_sub##name(int i, atomic_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %w[i], %w[i]\n" \
> " ldadd" #mb " %w[i], %w[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -159,6 +167,7 @@ ATOMIC_FETCH_OP_SUB( , al, "memory")
> static inline void __lse_atomic64_##op(s64 i, atomic64_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op " %[i], %[v]\n" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v)); \
> @@ -175,6 +184,7 @@ ATOMIC64_OP(add, stadd)
> static inline long __lse_atomic64_fetch_##op##name(s64 i, atomic64_t *v)\
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " " #asm_op #mb " %[i], %[i], %[v]" \
> : [i] "+r" (i), [v] "+Q" (v->counter) \
> : "r" (v) \
> @@ -203,6 +213,7 @@ static inline long __lse_atomic64_add_return##name(s64 i, atomic64_t *v)\
> unsigned long tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " ldadd" #mb " %[i], %x[tmp], %[v]\n" \
> " add %[i], %[i], %x[tmp]" \
> : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \
> @@ -222,6 +233,7 @@ ATOMIC64_OP_ADD_RETURN( , al, "memory")
> static inline void __lse_atomic64_and(s64 i, atomic64_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " mvn %[i], %[i]\n"
> " stclr %[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -232,6 +244,7 @@ static inline void __lse_atomic64_and(s64 i, atomic64_t *v)
> static inline long __lse_atomic64_fetch_and##name(s64 i, atomic64_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " mvn %[i], %[i]\n" \
> " ldclr" #mb " %[i], %[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -251,6 +264,7 @@ ATOMIC64_FETCH_OP_AND( , al, "memory")
> static inline void __lse_atomic64_sub(s64 i, atomic64_t *v)
> {
> asm volatile(
> + __LSE_PREAMBLE
> " neg %[i], %[i]\n"
> " stadd %[i], %[v]"
> : [i] "+&r" (i), [v] "+Q" (v->counter)
> @@ -263,6 +277,7 @@ static inline long __lse_atomic64_sub_return##name(s64 i, atomic64_t *v) \
> unsigned long tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %[i], %[i]\n" \
> " ldadd" #mb " %[i], %x[tmp], %[v]\n" \
> " add %[i], %[i], %x[tmp]" \
> @@ -284,6 +299,7 @@ ATOMIC64_OP_SUB_RETURN( , al, "memory")
> static inline long __lse_atomic64_fetch_sub##name(s64 i, atomic64_t *v) \
> { \
> asm volatile( \
> + __LSE_PREAMBLE \
> " neg %[i], %[i]\n" \
> " ldadd" #mb " %[i], %[i], %[v]" \
> : [i] "+&r" (i), [v] "+Q" (v->counter) \
> @@ -305,6 +321,7 @@ static inline s64 __lse_atomic64_dec_if_positive(atomic64_t *v)
> unsigned long tmp;
>
> asm volatile(
> + __LSE_PREAMBLE
> "1: ldr %x[tmp], %[v]\n"
> " subs %[ret], %x[tmp], #1\n"
> " b.lt 2f\n"
> @@ -331,6 +348,7 @@ static inline u##sz __lse__cmpxchg_case_##name##sz(volatile void *ptr, \
> unsigned long tmp; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " mov %" #w "[tmp], %" #w "[old]\n" \
> " cas" #mb #sfx "\t%" #w "[tmp], %" #w "[new], %[v]\n" \
> " mov %" #w "[ret], %" #w "[tmp]" \
> @@ -377,6 +395,7 @@ static inline long __lse__cmpxchg_double##name(unsigned long old1, \
> register unsigned long x4 asm ("x4") = (unsigned long)ptr; \
> \
> asm volatile( \
> + __LSE_PREAMBLE \
> " casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\
> " eor %[old1], %[old1], %[oldval1]\n" \
> " eor %[old2], %[old2], %[oldval2]\n" \
> diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
> index 80b388278149..73834996c4b6 100644
> --- a/arch/arm64/include/asm/lse.h
> +++ b/arch/arm64/include/asm/lse.h
> @@ -6,6 +6,8 @@
>
> #if defined(CONFIG_AS_LSE) && defined(CONFIG_ARM64_LSE_ATOMICS)
>
> +#define __LSE_PREAMBLE ".arch armv8-a+lse\n"
> +
> #include <linux/compiler_types.h>
> #include <linux/export.h>
> #include <linux/jump_label.h>
> @@ -14,8 +16,6 @@
> #include <asm/atomic_lse.h>
> #include <asm/cpucaps.h>
>
> -__asm__(".arch_extension lse");
> -
> extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
> extern struct static_key_false arm64_const_caps_ready;
>
> @@ -34,7 +34,7 @@ static inline bool system_uses_lse_atomics(void)
>
> /* In-line patching at runtime */
> #define ARM64_LSE_ATOMIC_INSN(llsc, lse) \
> - ALTERNATIVE(llsc, lse, ARM64_HAS_LSE_ATOMICS)
> + ALTERNATIVE(llsc, __LSE_PREAMBLE lse, ARM64_HAS_LSE_ATOMICS)
>
> #else /* CONFIG_AS_LSE && CONFIG_ARM64_LSE_ATOMICS */
>
> --
> 2.23.0.581.g78d2f28ef7-goog
>

--
Kees Cook

2019-10-15 02:00:55

by Will Deacon

Subject: Re: [PATCH v2] arm64: lse: fix LSE atomics with LLVM's integrated assembler

On Tue, Oct 08, 2019 at 02:27:30PM -0700, Sami Tolvanen wrote:
> Unlike gcc, clang considers each inline assembly block to be independent
> and therefore, when using the integrated assembler for inline assembly,
> any preambles that enable features must be repeated in each block.
>
> This change defines __LSE_PREAMBLE and adds it to each inline assembly
> block that has LSE instructions, which allows them to be compiled also
> with clang's assembler.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/671
> Signed-off-by: Sami Tolvanen <[email protected]>
> ---
> v2:
> - Add a preamble to inline assembly blocks that use LSE instead
> of allowing the compiler to emit LSE instructions everywhere.
>
> ---
> arch/arm64/include/asm/atomic_lse.h | 19 +++++++++++++++++++
> arch/arm64/include/asm/lse.h | 6 +++---
> 2 files changed, 22 insertions(+), 3 deletions(-)

One thing I've always wanted from binutils is the ability to pass a flag to
the assembler that makes it accept all of the instructions it knows about for
a given major architecture (a bit like the '-cpu max' option to qemu). Even
better would be the ability to supply a file at build time specifying the
encodings, so that we could ship that with the kernel and avoid some of the
mess we have in places like sysreg.h, where we end up fighting against the
assembler when trying to define new system register accessors.

The latter suggestion is a bit "pie in the sky", but do you think there is
any scope for the former with clang?

Will