2023-06-23 13:46:05

by WANG Xuerui

Subject: [PATCH 0/9] LoongArch: Preliminary ClangBuiltLinux enablement

From: WANG Xuerui <[email protected]>

Hi,

It's been a long time since the LoongArch port was upstreamed to LLVM,
and there seems to be evidence that Linux was successfully built with
Clang inside Loongson roughly around that time; however, a lot has
changed since then, and the Linux/LoongArch codebase now makes use of
more novel features that necessitate further work. (The enablement work
is tracked at [1].)

With this patch series and a patched LLVM/LLD ([2] for llvm-objcopy,
and [3] for LLD), a working kernel can be built with `make LLVM=1`.
Support for CONFIG_RELOCATABLE and CONFIG_MODULES is still TODO, but
we've decided to post the series early to hopefully reduce the rebase
burden. The series contains several useful cleanups anyway.

Regarding how to merge this: because only Patch 8 is outside
arch/loongarch, I'd prefer the series to get merged through Huacai's
tree. The series applies cleanly on top of next-20230622.

Thanks go to the ClangBuiltLinux team, and LoongArch toolchain
maintainers from Loongson and the community alike; without your help
this would come much later, if at all (my free time has been steadily
dwindling this year already).

Your comments are welcome!

[1]: https://github.com/ClangBuiltLinux/linux/issues/1787
[2]: https://reviews.llvm.org/D153609
[3]: https://reviews.llvm.org/D138135

WANG Rui (2):
LoongArch: Calculate various sizes in the linker script
LoongArch: extable: Also recognize ABI names of registers

WANG Xuerui (7):
LoongArch: Prepare for assemblers with proper FCSR bank support
LoongArch: Make {read,write}_fcsr compatible with LLVM/Clang
LoongArch: Make the CPUCFG and CSR ops simple aliases of compiler
built-ins
LoongArch: Simplify the invtlb wrappers
LoongArch: Tweak CFLAGS for Clang compatibility
Makefile: Add loongarch target flag for Clang compilation
LoongArch: Mark Clang LTO as working

arch/loongarch/Kconfig | 5 ++
arch/loongarch/Makefile | 14 ++++-
arch/loongarch/include/asm/fpregdef.h | 7 +++
arch/loongarch/include/asm/gpr-num.h | 30 ++++++++++
arch/loongarch/include/asm/loongarch.h | 82 ++++++++------------------
arch/loongarch/include/asm/tlb.h | 45 +++++---------
arch/loongarch/kernel/efi-header.S | 6 +-
arch/loongarch/kernel/head.S | 8 +--
arch/loongarch/kernel/traps.c | 2 +-
arch/loongarch/kernel/vmlinux.lds.S | 7 +++
arch/loongarch/lib/dump_tlb.c | 6 +-
arch/loongarch/mm/tlb.c | 10 ++--
arch/loongarch/vdso/Makefile | 6 +-
scripts/Makefile.clang | 1 +
14 files changed, 122 insertions(+), 107 deletions(-)

--
2.40.0



2023-06-23 13:46:12

by WANG Xuerui

Subject: [PATCH 2/9] LoongArch: extable: Also recognize ABI names of registers

From: WANG Rui <[email protected]>

When the kernel is compiled with LLVM, the register names being handled
during exception fixup building are ABI names instead of bare $rNN
style. Add mapping for the ABI names for LLVM compatibility.

Signed-off-by: WANG Rui <[email protected]>
Signed-off-by: WANG Xuerui <[email protected]>
---
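Editorial sketch, not part of the patch: the mapping that the .equ
table below establishes can be modeled in plain C. The helper name
gpr_num() is hypothetical and exists only for illustration; the numbers
mirror the table in this patch ($ra=1, $tp=2, $sp=3, $a0..$a7=4..11,
$t0..$t8=12..20, $fp/$s9=22, $s0..$s8=23..31).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative only: map a LoongArch GPR name, either the bare "$rNN"
 * form or an ABI name, to its register number, following the .equ table
 * in gpr-num.h. Returns -1 for names the table does not cover.
 */
int gpr_num(const char *name)
{
	char *end;
	long n;

	assert(name != NULL);

	/* Names without a numeric suffix, plus the $s9 alias of $fp. */
	if (!strcmp(name, "$ra"))
		return 1;
	if (!strcmp(name, "$tp"))
		return 2;
	if (!strcmp(name, "$sp"))
		return 3;
	if (!strcmp(name, "$fp") || !strcmp(name, "$s9"))
		return 22;

	/* Everything else must look like "$<class><decimal index>". */
	if (name[0] != '$' || name[1] == '\0')
		return -1;
	if (name[2] < '0' || name[2] > '9')
		return -1;
	n = strtol(name + 2, &end, 10);
	if (*end != '\0')
		return -1;

	switch (name[1]) {
	case 'r':	/* bare register number */
		return n <= 31 ? (int)n : -1;
	case 'a':	/* $a0..$a7 -> 4..11 */
		return n <= 7 ? 4 + (int)n : -1;
	case 't':	/* $t0..$t8 -> 12..20 */
		return n <= 8 ? 12 + (int)n : -1;
	case 's':	/* $s0..$s8 -> 23..31 */
		return n <= 8 ? 23 + (int)n : -1;
	}
	return -1;
}
```

With this model, gpr_num("$t8") and gpr_num("$r20") both yield 20,
which is the property the extable fixup code relies on when the
compiler emits ABI names.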
arch/loongarch/include/asm/gpr-num.h | 30 ++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/arch/loongarch/include/asm/gpr-num.h b/arch/loongarch/include/asm/gpr-num.h
index e0941af20c7e..996038da806d 100644
--- a/arch/loongarch/include/asm/gpr-num.h
+++ b/arch/loongarch/include/asm/gpr-num.h
@@ -9,6 +9,22 @@
.equ .L__gpr_num_$r\num, \num
.endr

+ /* ABI names of registers */
+ .equ .L__gpr_num_$ra, 1
+ .equ .L__gpr_num_$tp, 2
+ .equ .L__gpr_num_$sp, 3
+ .irp num,0,1,2,3,4,5,6,7
+ .equ .L__gpr_num_$a\num, 4 + \num
+ .endr
+ .irp num,0,1,2,3,4,5,6,7,8
+ .equ .L__gpr_num_$t\num, 12 + \num
+ .endr
+ .equ .L__gpr_num_$s9, 22
+ .equ .L__gpr_num_$fp, 22
+ .irp num,0,1,2,3,4,5,6,7,8
+ .equ .L__gpr_num_$s\num, 23 + \num
+ .endr
+
#else /* __ASSEMBLY__ */

#define __DEFINE_ASM_GPR_NUMS \
@@ -16,6 +32,20 @@
" .irp num,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31\n" \
" .equ .L__gpr_num_$r\\num, \\num\n" \
" .endr\n" \
+" .equ .L__gpr_num_$ra, 1\n" \
+" .equ .L__gpr_num_$tp, 2\n" \
+" .equ .L__gpr_num_$sp, 3\n" \
+" .irp num,0,1,2,3,4,5,6,7\n" \
+" .equ .L__gpr_num_$a\\num, 4 + \\num\n" \
+" .endr\n" \
+" .irp num,0,1,2,3,4,5,6,7,8\n" \
+" .equ .L__gpr_num_$t\\num, 12 + \\num\n" \
+" .endr\n" \
+" .equ .L__gpr_num_$s9, 22\n" \
+" .equ .L__gpr_num_$fp, 22\n" \
+" .irp num,0,1,2,3,4,5,6,7,8\n" \
+" .equ .L__gpr_num_$s\\num, 23 + \\num\n" \
+" .endr\n" \

#endif /* __ASSEMBLY__ */

--
2.40.0


2023-06-23 13:52:55

by WANG Xuerui

Subject: [PATCH 9/9] LoongArch: Mark Clang LTO as working

From: WANG Xuerui <[email protected]>

Confirmed working with QEMU system emulation.

Signed-off-by: WANG Xuerui <[email protected]>
---
arch/loongarch/Kconfig | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index c8e4f8b03c55..7c5d562b2623 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -51,6 +51,8 @@ config LOONGARCH
select ARCH_SUPPORTS_ACPI
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_HUGETLBFS
+ select ARCH_SUPPORTS_LTO_CLANG
+ select ARCH_SUPPORTS_LTO_CLANG_THIN
select ARCH_SUPPORTS_NUMA_BALANCING
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_CMPXCHG_LOCKREF
--
2.40.0


2023-06-23 13:53:57

by WANG Xuerui

Subject: [PATCH 7/9] LoongArch: Tweak CFLAGS for Clang compatibility

From: WANG Xuerui <[email protected]>

Now that the arch code is mostly ready for LLVM/Clang consumption, it is
time to re-organize the CFLAGS a little to actually enable the LLVM build.

A build with !RELOCATABLE && !MODULES is confirmed working within a QEMU
environment; support for the two features is currently blocked by
LLVM/Clang, and will come later.

Signed-off-by: WANG Xuerui <[email protected]>
---
arch/loongarch/Makefile | 14 +++++++++++---
arch/loongarch/vdso/Makefile | 6 +++++-
2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index a27e264bdaa5..efe9b50bd829 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -46,12 +46,18 @@ ld-emul = $(64bit-emul)
cflags-y += -mabi=lp64s
endif

-cflags-y += -G0 -pipe -msoft-float
-LDFLAGS_vmlinux += -G0 -static -n -nostdlib
+ifndef CONFIG_CC_IS_CLANG
+cflags-y += -G0
+LDFLAGS_vmlinux += -G0
+endif
+cflags-y += -pipe
+LDFLAGS_vmlinux += -static -n -nostdlib

# When the assembler supports explicit relocation hint, we must use it.
# GCC may have -mexplicit-relocs off by default if it was built with an old
-# assembler, so we force it via an option.
+# assembler, so we force it via an option. For LLVM/Clang the desired behavior
+# is the default, and the flag is not supported, so don't pass it if Clang is
+# being used.
#
# When the assembler does not supports explicit relocation hint, we can't use
# it. Disable it if the compiler supports it.
@@ -61,8 +67,10 @@ LDFLAGS_vmlinux += -G0 -static -n -nostdlib
# combination of a "new" assembler and "old" compiler is not supported. Either
# upgrade the compiler or downgrade the assembler.
ifdef CONFIG_AS_HAS_EXPLICIT_RELOCS
+ifndef CONFIG_CC_IS_CLANG
cflags-y += -mexplicit-relocs
KBUILD_CFLAGS_KERNEL += -mdirect-extern-access
+endif
else
cflags-y += $(call cc-option,-mno-explicit-relocs)
KBUILD_AFLAGS_KERNEL += -Wa,-mla-global-with-pcrel
diff --git a/arch/loongarch/vdso/Makefile b/arch/loongarch/vdso/Makefile
index 4c859a0e4754..19f6c75a1106 100644
--- a/arch/loongarch/vdso/Makefile
+++ b/arch/loongarch/vdso/Makefile
@@ -25,13 +25,17 @@ endif
cflags-vdso := $(ccflags-vdso) \
-isystem $(shell $(CC) -print-file-name=include) \
$(filter -W%,$(filter-out -Wa$(comma)%,$(KBUILD_CFLAGS))) \
- -O2 -g -fno-strict-aliasing -fno-common -fno-builtin -G0 \
+ -O2 -g -fno-strict-aliasing -fno-common -fno-builtin \
-fno-stack-protector -fno-jump-tables -DDISABLE_BRANCH_PROFILING \
$(call cc-option, -fno-asynchronous-unwind-tables) \
$(call cc-option, -fno-stack-protector)
aflags-vdso := $(ccflags-vdso) \
-D__ASSEMBLY__ -Wa,-gdwarf-2

+ifndef CONFIG_CC_IS_CLANG
+cflags-vdso += -G0
+endif
+
ifneq ($(c-gettimeofday-y),)
CFLAGS_vgettimeofday.o += -include $(c-gettimeofday-y)
endif
--
2.40.0


2023-06-23 13:58:12

by WANG Xuerui

Subject: [PATCH 1/9] LoongArch: Calculate various sizes in the linker script

From: WANG Rui <[email protected]>

Taking the address delta between symbols in different sections is not
supported by the LLVM IAS. Instead, do this in the linker script, so
the same data can be properly referenced in assembly.

Signed-off-by: WANG Rui <[email protected]>
Signed-off-by: WANG Xuerui <[email protected]>
---
arch/loongarch/kernel/efi-header.S | 6 +++---
arch/loongarch/kernel/head.S | 8 ++++----
arch/loongarch/kernel/vmlinux.lds.S | 7 +++++++
3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/loongarch/kernel/efi-header.S b/arch/loongarch/kernel/efi-header.S
index 8c1d229a2afa..5f23b85d78ca 100644
--- a/arch/loongarch/kernel/efi-header.S
+++ b/arch/loongarch/kernel/efi-header.S
@@ -24,7 +24,7 @@
.byte 0x02 /* MajorLinkerVersion */
.byte 0x14 /* MinorLinkerVersion */
.long __inittext_end - .Lefi_header_end /* SizeOfCode */
- .long _end - __initdata_begin /* SizeOfInitializedData */
+ .long _kernel_vsize /* SizeOfInitializedData */
.long 0 /* SizeOfUninitializedData */
.long __efistub_efi_pe_entry - _head /* AddressOfEntryPoint */
.long .Lefi_header_end - _head /* BaseOfCode */
@@ -79,9 +79,9 @@
IMAGE_SCN_MEM_EXECUTE /* Characteristics */

.ascii ".data\0\0\0"
- .long _end - __initdata_begin /* VirtualSize */
+ .long _kernel_vsize /* VirtualSize */
.long __initdata_begin - _head /* VirtualAddress */
- .long _edata - __initdata_begin /* SizeOfRawData */
+ .long _kernel_rsize /* SizeOfRawData */
.long __initdata_begin - _head /* PointerToRawData */

.long 0 /* PointerToRelocations */
diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
index 0d8180153ec0..53b883db0786 100644
--- a/arch/loongarch/kernel/head.S
+++ b/arch/loongarch/kernel/head.S
@@ -23,7 +23,7 @@ _head:
.word MZ_MAGIC /* "MZ", MS-DOS header */
.org 0x8
.dword kernel_entry /* Kernel entry point */
- .dword _end - _text /* Kernel image effective size */
+ .dword _kernel_asize /* Kernel image effective size */
.quad PHYS_LINK_KADDR /* Kernel image load offset from start of RAM */
.org 0x38 /* 0x20 ~ 0x37 reserved */
.long LINUX_PE_MAGIC
@@ -32,9 +32,9 @@ _head:
pe_header:
__EFI_PE_HEADER

-SYM_DATA(kernel_asize, .long _end - _text);
-SYM_DATA(kernel_fsize, .long _edata - _text);
-SYM_DATA(kernel_offset, .long kernel_offset - _text);
+SYM_DATA(kernel_asize, .long _kernel_asize);
+SYM_DATA(kernel_fsize, .long _kernel_fsize);
+SYM_DATA(kernel_offset, .long _kernel_offset);

#endif

diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
index 0c7b041be9d8..79f238df029e 100644
--- a/arch/loongarch/kernel/vmlinux.lds.S
+++ b/arch/loongarch/kernel/vmlinux.lds.S
@@ -136,6 +136,13 @@ SECTIONS
DWARF_DEBUG
ELF_DETAILS

+ /* header symbols */
+ _kernel_asize = _end - _text;
+ _kernel_fsize = _edata - _text;
+ _kernel_offset = kernel_offset - _text;
+ _kernel_vsize = _end - __initdata_begin;
+ _kernel_rsize = _edata - __initdata_begin;
+
.gptab.sdata : {
*(.gptab.data)
*(.gptab.sdata)
--
2.40.0


2023-06-23 13:58:30

by WANG Xuerui

Subject: [PATCH 5/9] LoongArch: Make the CPUCFG and CSR ops simple aliases of compiler built-ins

From: WANG Xuerui <[email protected]>

In addition to less visual clutter, this also makes Clang happy
regarding the const-ness of arguments. In the original approach, all
Clang gets to see is the incoming arguments whose const-ness cannot be
proven without first being inlined; so Clang errors out here while GCC
is fine.

While at it, tweak several printk format strings because the return type
of csr_read64 becomes effectively unsigned long, instead of unsigned
long long.

Signed-off-by: WANG Xuerui <[email protected]>
---
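Editorial sketch, not part of the patch: the const-ness issue described
above can be reproduced on any C11 compiler with a stand-in for the
built-in. fake_csrrd_w() is hypothetical; it only imitates a built-in
that insists on a constant operand, and it returns a made-up value. The
real built-in is __csrrd_w, which is only available when targeting
LoongArch.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a built-in that requires an integer constant
 * operand. The _Static_assert only compiles when `reg` is an integer
 * constant expression; the sizeof term contributes 0 to the result.
 */
#define fake_csrrd_w(reg)						\
	((unsigned int)(0x1000u + (reg)) +				\
	 0u * (unsigned int)sizeof(struct {				\
		_Static_assert((reg) >= 0, "operand must be a constant"); \
		int pad;						\
	 }))

/* Macro alias, as in the patch: the constant reaches the "built-in" as-is. */
#define csr_read32(reg) fake_csrrd_w(reg)

/*
 * The function-wrapper form would be rejected here, because inside the
 * function `reg` is an ordinary parameter, not a constant expression:
 *
 *	static inline unsigned int csr_read32_fn(unsigned int reg)
 *	{
 *		return fake_csrrd_w(reg);
 *	}
 */

#define DEMO_CSR_NUM 0x10	/* hypothetical register number */

unsigned int read_demo_csr(void)
{
	return csr_read32(DEMO_CSR_NUM);
}

void demo_selfcheck(void)
{
	/* 0x1000 + 0x10, per the made-up return value of the stand-in. */
	assert(read_demo_csr() == 0x1010u);
}
```

The stand-in is stricter than GCC, which per the commit message accepts
the wrapper form; it is meant to show only why a macro alias sidesteps
the problem regardless of when the compiler checks the operand.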
arch/loongarch/include/asm/loongarch.h | 63 +++++---------------------
arch/loongarch/kernel/traps.c | 2 +-
arch/loongarch/lib/dump_tlb.c | 6 +--
3 files changed, 15 insertions(+), 56 deletions(-)

diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
index eedc313b5241..c2a6f698a3af 100644
--- a/arch/loongarch/include/asm/loongarch.h
+++ b/arch/loongarch/include/asm/loongarch.h
@@ -56,10 +56,7 @@ __asm__(".macro parse_r var r\n\t"
#undef _IFC_REG

/* CPUCFG */
-static inline u32 read_cpucfg(u32 reg)
-{
- return __cpucfg(reg);
-}
+#define read_cpucfg(reg) __cpucfg(reg)

#endif /* !__ASSEMBLY__ */

@@ -207,56 +204,18 @@ static inline u32 read_cpucfg(u32 reg)
#ifndef __ASSEMBLY__

/* CSR */
-static __always_inline u32 csr_read32(u32 reg)
-{
- return __csrrd_w(reg);
-}
-
-static __always_inline u64 csr_read64(u32 reg)
-{
- return __csrrd_d(reg);
-}
-
-static __always_inline void csr_write32(u32 val, u32 reg)
-{
- __csrwr_w(val, reg);
-}
-
-static __always_inline void csr_write64(u64 val, u32 reg)
-{
- __csrwr_d(val, reg);
-}
-
-static __always_inline u32 csr_xchg32(u32 val, u32 mask, u32 reg)
-{
- return __csrxchg_w(val, mask, reg);
-}
-
-static __always_inline u64 csr_xchg64(u64 val, u64 mask, u32 reg)
-{
- return __csrxchg_d(val, mask, reg);
-}
+#define csr_read32(reg) __csrrd_w(reg)
+#define csr_read64(reg) __csrrd_d(reg)
+#define csr_write32(val, reg) __csrwr_w(val, reg)
+#define csr_write64(val, reg) __csrwr_d(val, reg)
+#define csr_xchg32(val, mask, reg) __csrxchg_w(val, mask, reg)
+#define csr_xchg64(val, mask, reg) __csrxchg_d(val, mask, reg)

/* IOCSR */
-static __always_inline u32 iocsr_read32(u32 reg)
-{
- return __iocsrrd_w(reg);
-}
-
-static __always_inline u64 iocsr_read64(u32 reg)
-{
- return __iocsrrd_d(reg);
-}
-
-static __always_inline void iocsr_write32(u32 val, u32 reg)
-{
- __iocsrwr_w(val, reg);
-}
-
-static __always_inline void iocsr_write64(u64 val, u32 reg)
-{
- __iocsrwr_d(val, reg);
-}
+#define iocsr_read32(reg) __iocsrrd_w(reg)
+#define iocsr_read64(reg) __iocsrrd_d(reg)
+#define iocsr_write32(val, reg) __iocsrwr_w(val, reg)
+#define iocsr_write64(val, reg) __iocsrwr_d(val, reg)

#endif /* !__ASSEMBLY__ */

diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c
index 22179cf6f33c..8fb5e7a77145 100644
--- a/arch/loongarch/kernel/traps.c
+++ b/arch/loongarch/kernel/traps.c
@@ -999,7 +999,7 @@ asmlinkage void cache_parity_error(void)
/* For the moment, report the problem and hang. */
pr_err("Cache error exception:\n");
pr_err("csr_merrctl == %08x\n", csr_read32(LOONGARCH_CSR_MERRCTL));
- pr_err("csr_merrera == %016llx\n", csr_read64(LOONGARCH_CSR_MERRERA));
+ pr_err("csr_merrera == %016lx\n", csr_read64(LOONGARCH_CSR_MERRERA));
panic("Can't handle the cache error!");
}

diff --git a/arch/loongarch/lib/dump_tlb.c b/arch/loongarch/lib/dump_tlb.c
index c2cc7ce343c9..0b886a6e260f 100644
--- a/arch/loongarch/lib/dump_tlb.c
+++ b/arch/loongarch/lib/dump_tlb.c
@@ -20,9 +20,9 @@ void dump_tlb_regs(void)

pr_info("Index : 0x%0x\n", read_csr_tlbidx());
pr_info("PageSize : 0x%0x\n", read_csr_pagesize());
- pr_info("EntryHi : 0x%0*llx\n", field, read_csr_entryhi());
- pr_info("EntryLo0 : 0x%0*llx\n", field, read_csr_entrylo0());
- pr_info("EntryLo1 : 0x%0*llx\n", field, read_csr_entrylo1());
+ pr_info("EntryHi : 0x%0*lx\n", field, read_csr_entryhi());
+ pr_info("EntryLo0 : 0x%0*lx\n", field, read_csr_entrylo0());
+ pr_info("EntryLo1 : 0x%0*lx\n", field, read_csr_entrylo1());
}

static void dump_tlb(int first, int last)
--
2.40.0


2023-06-23 13:59:11

by WANG Xuerui

Subject: [PATCH 8/9] Makefile: Add loongarch target flag for Clang compilation

From: WANG Xuerui <[email protected]>

The LoongArch kernel is 64-bit and built with the soft-float ABI,
hence the loongarch64-linux-gnusf target. (The "libc" part doesn't
matter.)

Signed-off-by: WANG Xuerui <[email protected]>
---
scripts/Makefile.clang | 1 +
1 file changed, 1 insertion(+)

diff --git a/scripts/Makefile.clang b/scripts/Makefile.clang
index 058a4c0f864e..6c23c6af797f 100644
--- a/scripts/Makefile.clang
+++ b/scripts/Makefile.clang
@@ -4,6 +4,7 @@
CLANG_TARGET_FLAGS_arm := arm-linux-gnueabi
CLANG_TARGET_FLAGS_arm64 := aarch64-linux-gnu
CLANG_TARGET_FLAGS_hexagon := hexagon-linux-musl
+CLANG_TARGET_FLAGS_loongarch := loongarch64-linux-gnusf
CLANG_TARGET_FLAGS_m68k := m68k-linux-gnu
CLANG_TARGET_FLAGS_mips := mipsel-linux-gnu
CLANG_TARGET_FLAGS_powerpc := powerpc64le-linux-gnu
--
2.40.0


2023-06-23 13:59:24

by WANG Xuerui

Subject: [PATCH 6/9] LoongArch: Simplify the invtlb wrappers

From: WANG Xuerui <[email protected]>

Of the 3 existing invtlb wrappers, invtlb_info is not used at all,
so it is removed; invtlb_all and invtlb_addr have their unused
argument(s) removed from their signatures.

Also, the invtlb instruction has been supported by upstream LoongArch
toolchains from day one, so ditch the raw opcode trickery and just use
plain inline asm for it.

Signed-off-by: WANG Xuerui <[email protected]>
---
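Editorial sketch, not part of the patch: the "raw opcode trickery"
being removed amounts to assembling the instruction word by hand. The
function below reproduces the formula from the comment deleted in this
patch; encode_invtlb() is a hypothetical name, and the register
arguments are GPR numbers (0 for $zero), not register contents.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed opcode bits, from the removed comment; evaluates to 0x6498000. */
#define INVTLB_OPCODE_BASE \
	((UINT32_C(0x1) << 26) | (UINT32_C(0x24) << 20) | (UINT32_C(0x13) << 15))

/*
 * Build the 32-bit word for "invtlb op, info, addr":
 *   bits 14..10  GPR number holding addr
 *   bits  9..5   GPR number holding info
 *   bits  4..0   op (an immediate)
 */
uint32_t encode_invtlb(uint32_t op, uint32_t info_reg, uint32_t addr_reg)
{
	/* Each field is 5 bits wide. */
	assert(op < 32 && info_reg < 32 && addr_reg < 32);

	return INVTLB_OPCODE_BASE | (addr_reg << 10) | (info_reg << 5) | op;
}
```

Passing 0 for both register fields corresponds to the
"invtlb op, $zero, $zero" form that invtlb_all now spells out in
inline asm, and lets the assembler do this encoding instead.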
arch/loongarch/include/asm/tlb.h | 45 ++++++++++++--------------------
arch/loongarch/mm/tlb.c | 10 +++----
2 files changed, 21 insertions(+), 34 deletions(-)

diff --git a/arch/loongarch/include/asm/tlb.h b/arch/loongarch/include/asm/tlb.h
index 0dc9ee2b05d2..5e6ee9a15f0f 100644
--- a/arch/loongarch/include/asm/tlb.h
+++ b/arch/loongarch/include/asm/tlb.h
@@ -88,52 +88,39 @@ enum invtlb_ops {
INVTLB_GID_ADDR = 0x16,
};

-/*
- * invtlb op info addr
- * (0x1 << 26) | (0x24 << 20) | (0x13 << 15) |
- * (addr << 10) | (info << 5) | op
- */
static inline void invtlb(u32 op, u32 info, u64 addr)
{
__asm__ __volatile__(
- "parse_r addr,%0\n\t"
- "parse_r info,%1\n\t"
- ".word ((0x6498000) | (addr << 10) | (info << 5) | %2)\n\t"
- :
- : "r"(addr), "r"(info), "i"(op)
- :
- );
-}
-
-static inline void invtlb_addr(u32 op, u32 info, u64 addr)
-{
- __asm__ __volatile__(
- "parse_r addr,%0\n\t"
- ".word ((0x6498000) | (addr << 10) | (0 << 5) | %1)\n\t"
- :
- : "r"(addr), "i"(op)
+ "invtlb %0, %1, %2\n\t"
:
+ : "i"(op), "r"(info), "r"(addr)
+ : "memory"
);
}

-static inline void invtlb_info(u32 op, u32 info, u64 addr)
+static inline void invtlb_addr(u32 op, u64 addr)
{
+ /*
+ * The ISA manual says $zero shall be used in case a particular op
+ * does not take the respective argument, hence the invtlb helper is
+ * not re-used to make sure this is the case.
+ */
__asm__ __volatile__(
- "parse_r info,%0\n\t"
- ".word ((0x6498000) | (0 << 10) | (info << 5) | %1)\n\t"
- :
- : "r"(info), "i"(op)
+ "invtlb %0, $zero, %1\n\t"
:
+ : "i"(op), "r"(addr)
+ : "memory"
);
}

-static inline void invtlb_all(u32 op, u32 info, u64 addr)
+static inline void invtlb_all(u32 op)
{
+ /* Similar to invtlb_addr, ensure the operands are actually $zero. */
__asm__ __volatile__(
- ".word ((0x6498000) | (0 << 10) | (0 << 5) | %0)\n\t"
+ "invtlb %0, $zero, $zero\n\t"
:
: "i"(op)
- :
+ : "memory"
);
}

diff --git a/arch/loongarch/mm/tlb.c b/arch/loongarch/mm/tlb.c
index 00bb563e3c89..de04d2624ef4 100644
--- a/arch/loongarch/mm/tlb.c
+++ b/arch/loongarch/mm/tlb.c
@@ -17,19 +17,19 @@

void local_flush_tlb_all(void)
{
- invtlb_all(INVTLB_CURRENT_ALL, 0, 0);
+ invtlb_all(INVTLB_CURRENT_ALL);
}
EXPORT_SYMBOL(local_flush_tlb_all);

void local_flush_tlb_user(void)
{
- invtlb_all(INVTLB_CURRENT_GFALSE, 0, 0);
+ invtlb_all(INVTLB_CURRENT_GFALSE);
}
EXPORT_SYMBOL(local_flush_tlb_user);

void local_flush_tlb_kernel(void)
{
- invtlb_all(INVTLB_CURRENT_GTRUE, 0, 0);
+ invtlb_all(INVTLB_CURRENT_GTRUE);
}
EXPORT_SYMBOL(local_flush_tlb_kernel);

@@ -100,7 +100,7 @@ void local_flush_tlb_kernel_range(unsigned long start, unsigned long end)
end &= (PAGE_MASK << 1);

while (start < end) {
- invtlb_addr(INVTLB_ADDR_GTRUE_OR_ASID, 0, start);
+ invtlb_addr(INVTLB_ADDR_GTRUE_OR_ASID, start);
start += (PAGE_SIZE << 1);
}
} else {
@@ -131,7 +131,7 @@ void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long page)
void local_flush_tlb_one(unsigned long page)
{
page &= (PAGE_MASK << 1);
- invtlb_addr(INVTLB_ADDR_GTRUE_OR_ASID, 0, page);
+ invtlb_addr(INVTLB_ADDR_GTRUE_OR_ASID, page);
}

static void __update_hugetlb(struct vm_area_struct *vma, unsigned long address, pte_t *ptep)
--
2.40.0


2023-06-23 14:09:07

by WANG Xuerui

Subject: [PATCH 3/9] LoongArch: Prepare for assemblers with proper FCSR bank support

From: WANG Xuerui <[email protected]>

The GNU assembler (as of 2.40) mis-treats FCSR operands as GPRs, but
the LLVM IAS does not. Probe for this and refer to FCSRs as "$fcsrNN"
if support is present.

Signed-off-by: WANG Xuerui <[email protected]>
---
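Editorial sketch, not part of the patch: the effect of the probe on the
operand text that ends up in the assembly can be checked with plain
preprocessor stringification. STRINGIFY() and fcsr0_operand() are
hypothetical helpers for this illustration; the kernel uses
__stringify() for the same purpose. The sketch is built as if the probe
had failed, i.e. CONFIG_AS_HAS_FCSR_BANK is left undefined.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification so that macros are expanded before quoting. */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)

/* Uncomment to emulate an assembler that passed the $fcsr0 probe. */
/* #define CONFIG_AS_HAS_FCSR_BANK 1 */

#ifdef CONFIG_AS_HAS_FCSR_BANK
#define fcsr0 $fcsr0
#else
#define fcsr0 $r0	/* assemblers without FCSR bank support want a GPR name */
#endif

/* The operand text that an inline-asm template would receive. */
const char *fcsr0_operand(void)
{
	return STRINGIFY(fcsr0);
}
```

As written this yields the string "$r0"; with the define uncommented it
would yield "$fcsr0" instead, which is the only difference the
assembler sees.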
arch/loongarch/Kconfig | 3 +++
arch/loongarch/include/asm/fpregdef.h | 7 +++++++
2 files changed, 10 insertions(+)

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 743d87655742..c8e4f8b03c55 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -242,6 +242,9 @@ config SCHED_OMIT_FRAME_POINTER
config AS_HAS_EXPLICIT_RELOCS
def_bool $(as-instr,x:pcalau12i \$t0$(comma)%pc_hi20(x))

+config AS_HAS_FCSR_BANK
+ def_bool $(as-instr,x:movfcsr2gr \$t0$(comma)\$fcsr0)
+
config CC_HAS_LSX_EXTENSION
def_bool $(cc-option,-mlsx)

diff --git a/arch/loongarch/include/asm/fpregdef.h b/arch/loongarch/include/asm/fpregdef.h
index b6be527831dd..b0ac640db74c 100644
--- a/arch/loongarch/include/asm/fpregdef.h
+++ b/arch/loongarch/include/asm/fpregdef.h
@@ -40,6 +40,12 @@
#define fs6 $f30
#define fs7 $f31

+#ifdef CONFIG_AS_HAS_FCSR_BANK
+#define fcsr0 $fcsr0
+#define fcsr1 $fcsr1
+#define fcsr2 $fcsr2
+#define fcsr3 $fcsr3
+#else
/*
* Current binutils expects *GPRs* at FCSR position for the FCSR
* operation instructions, so define aliases for those used.
@@ -48,5 +54,6 @@
#define fcsr1 $r1
#define fcsr2 $r2
#define fcsr3 $r3
+#endif

#endif /* _ASM_FPREGDEF_H */
--
2.40.0


2023-06-23 14:12:51

by WANG Xuerui

Subject: [PATCH 4/9] LoongArch: Make {read,write}_fcsr compatible with LLVM/Clang

From: WANG Xuerui <[email protected]>

LLVM/Clang does not see FCSRs as GPRs, so make use of compiler
built-ins instead for better maintainability with less code.

The existing version cannot be wholly removed though, because the
built-ins, while available on GCC too, are predicated on
TARGET_HARD_FLOAT, which means soft-float code cannot make use of them.

Signed-off-by: WANG Xuerui <[email protected]>
---
arch/loongarch/include/asm/loongarch.h | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
index ac83e60c60d1..eedc313b5241 100644
--- a/arch/loongarch/include/asm/loongarch.h
+++ b/arch/loongarch/include/asm/loongarch.h
@@ -1445,12 +1445,6 @@ __BUILD_CSR_OP(tlbidx)
#define EXCCODE_INT_START 64
#define EXCCODE_INT_END (EXCCODE_INT_START + EXCCODE_INT_NUM - 1)

-/* FPU register names */
-#define LOONGARCH_FCSR0 $r0
-#define LOONGARCH_FCSR1 $r1
-#define LOONGARCH_FCSR2 $r2
-#define LOONGARCH_FCSR3 $r3
-
/* FPU Status Register Values */
#define FPU_CSR_RSVD 0xe0e0fce0

@@ -1487,6 +1481,18 @@ __BUILD_CSR_OP(tlbidx)
#define FPU_CSR_RU 0x200 /* towards +Infinity */
#define FPU_CSR_RD 0x300 /* towards -Infinity */

+#ifdef CONFIG_CC_IS_CLANG
+#define LOONGARCH_FCSR0 0
+#define LOONGARCH_FCSR1 1
+#define LOONGARCH_FCSR2 2
+#define LOONGARCH_FCSR3 3
+#define read_fcsr(source) __movfcsr2gr(source)
+#define write_fcsr(dest, val) __movgr2fcsr(dest, val)
+#else /* CONFIG_CC_IS_CLANG */
+#define LOONGARCH_FCSR0 $r0
+#define LOONGARCH_FCSR1 $r1
+#define LOONGARCH_FCSR2 $r2
+#define LOONGARCH_FCSR3 $r3
#define read_fcsr(source) \
({ \
unsigned int __res; \
@@ -1503,5 +1509,6 @@ do { \
" movgr2fcsr "__stringify(dest)", %0 \n" \
: : "r" (val)); \
} while (0)
+#endif /* CONFIG_CC_IS_CLANG */

#endif /* _ASM_LOONGARCH_H */
--
2.40.0


2023-06-23 16:28:48

by Huacai Chen

Subject: Re: [PATCH 4/9] LoongArch: Make {read,write}_fcsr compatible with LLVM/Clang

Hi, Xuerui,

On Fri, Jun 23, 2023 at 9:44 PM WANG Xuerui <[email protected]> wrote:
>
> From: WANG Xuerui <[email protected]>
>
> LLVM/Clang does not see FCSRs as GPRs, so make use of compiler
> built-ins instead for better maintainability with less code.
>
> The existing version cannot be wholly removed though, because the
> built-ins, while available on GCC too, is predicated TARGET_HARD_FLOAT,
> which means soft-float code cannot make use of them.
>
> Signed-off-by: WANG Xuerui <[email protected]>
> ---
> arch/loongarch/include/asm/loongarch.h | 19 +++++++++++++------
> 1 file changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
> index ac83e60c60d1..eedc313b5241 100644
> --- a/arch/loongarch/include/asm/loongarch.h
> +++ b/arch/loongarch/include/asm/loongarch.h
> @@ -1445,12 +1445,6 @@ __BUILD_CSR_OP(tlbidx)
> #define EXCCODE_INT_START 64
> #define EXCCODE_INT_END (EXCCODE_INT_START + EXCCODE_INT_NUM - 1)
>
> -/* FPU register names */
> -#define LOONGARCH_FCSR0 $r0
> -#define LOONGARCH_FCSR1 $r1
> -#define LOONGARCH_FCSR2 $r2
> -#define LOONGARCH_FCSR3 $r3
> -
> /* FPU Status Register Values */
> #define FPU_CSR_RSVD 0xe0e0fce0
>
> @@ -1487,6 +1481,18 @@ __BUILD_CSR_OP(tlbidx)
> #define FPU_CSR_RU 0x200 /* towards +Infinity */
> #define FPU_CSR_RD 0x300 /* towards -Infinity */
>
> +#ifdef CONFIG_CC_IS_CLANG
> +#define LOONGARCH_FCSR0 0
> +#define LOONGARCH_FCSR1 1
> +#define LOONGARCH_FCSR2 2
> +#define LOONGARCH_FCSR3 3
> +#define read_fcsr(source) __movfcsr2gr(source)
> +#define write_fcsr(dest, val) __movgr2fcsr(dest, val)
> +#else /* CONFIG_CC_IS_CLANG */
> +#define LOONGARCH_FCSR0 $r0
> +#define LOONGARCH_FCSR1 $r1
> +#define LOONGARCH_FCSR2 $r2
> +#define LOONGARCH_FCSR3 $r3
> #define read_fcsr(source) \
> ({ \
> unsigned int __res; \
The latest binutils now also supports $fcsr, so I suggest always using
inline asm and changing CONFIG_CC_IS_CLANG to
CONFIG_AS_HAS_FCSR_CLASS. Patch 3 and Patch 4 can then be merged, of
course.

Huacai

> @@ -1503,5 +1509,6 @@ do { \
> " movgr2fcsr "__stringify(dest)", %0 \n" \
> : : "r" (val)); \
> } while (0)
> +#endif /* CONFIG_CC_IS_CLANG */
>
> #endif /* _ASM_LOONGARCH_H */
> --
> 2.40.0
>

2023-06-23 16:46:40

by Nick Desaulniers

Subject: Re: [PATCH 7/9] LoongArch: Tweak CFLAGS for Clang compatibility

On Fri, Jun 23, 2023 at 6:44 AM WANG Xuerui <[email protected]> wrote:
>
> From: WANG Xuerui <[email protected]>
>
> Now the arch code is mostly ready for LLVM/Clang consumption, it is time
> to re-organize the CFLAGS a little to actually enable the LLVM build.
>
> A build with !RELOCATABLE && !MODULE is confirmed working within a QEMU
> environment; support for the two features are currently blocked by
> LLVM/Clang, and will come later.
>
> Signed-off-by: WANG Xuerui <[email protected]>
> ---
> arch/loongarch/Makefile | 14 +++++++++++---
> arch/loongarch/vdso/Makefile | 6 +++++-
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> index a27e264bdaa5..efe9b50bd829 100644
> --- a/arch/loongarch/Makefile
> +++ b/arch/loongarch/Makefile
> @@ -46,12 +46,18 @@ ld-emul = $(64bit-emul)
> cflags-y += -mabi=lp64s
> endif
>
> -cflags-y += -G0 -pipe -msoft-float

This seems to drop -msoft-float for GCC. Intentional?

> -LDFLAGS_vmlinux += -G0 -static -n -nostdlib
> +ifndef CONFIG_CC_IS_CLANG
> +cflags-y += -G0
> +LDFLAGS_vmlinux += -G0

Thanks for the patch!

I can understand not passing -G0 to clang if clang doesn't understand
it, but should you be using CONFIG_LD_IS_LLD for LDFLAGS?

What does -G0 do?

Is there a plan to support it in clang and lld?

If so, please file a bug in LLVM's issue tracker
https://github.com/llvm/llvm-project/issues
then link to it in a comment in this Makefile above the relevant condition.

> +endif
> +cflags-y += -pipe
> +LDFLAGS_vmlinux += -static -n -nostdlib
>
> # When the assembler supports explicit relocation hint, we must use it.
> # GCC may have -mexplicit-relocs off by default if it was built with an old
> -# assembler, so we force it via an option.
> +# assembler, so we force it via an option. For LLVM/Clang the desired behavior
> +# is the default, and the flag is not supported, so don't pass it if Clang is
> +# being used.
> #
> # When the assembler does not supports explicit relocation hint, we can't use
> # it. Disable it if the compiler supports it.
> @@ -61,8 +67,10 @@ LDFLAGS_vmlinux += -G0 -static -n -nostdlib
> # combination of a "new" assembler and "old" compiler is not supported. Either
> # upgrade the compiler or downgrade the assembler.
> ifdef CONFIG_AS_HAS_EXPLICIT_RELOCS
> +ifndef CONFIG_CC_IS_CLANG
> cflags-y += -mexplicit-relocs
> KBUILD_CFLAGS_KERNEL += -mdirect-extern-access
> +endif

Why would AS_HAS_EXPLICIT_RELOCS be set if -mexplicit-relocs isn't
supported? Is the kconfig for that broken?

Does AS_HAS_EXPLICIT_RELOCS also need to test for the support for
-mdirect-extern-access or should there be a new config for that?
CC_SUPPORTS_DIRECT_EXTERN_ACCESS

> else
> cflags-y += $(call cc-option,-mno-explicit-relocs)
> KBUILD_AFLAGS_KERNEL += -Wa,-mla-global-with-pcrel
> diff --git a/arch/loongarch/vdso/Makefile b/arch/loongarch/vdso/Makefile
> index 4c859a0e4754..19f6c75a1106 100644
> --- a/arch/loongarch/vdso/Makefile
> +++ b/arch/loongarch/vdso/Makefile
> @@ -25,13 +25,17 @@ endif
> cflags-vdso := $(ccflags-vdso) \
> -isystem $(shell $(CC) -print-file-name=include) \
> $(filter -W%,$(filter-out -Wa$(comma)%,$(KBUILD_CFLAGS))) \
> - -O2 -g -fno-strict-aliasing -fno-common -fno-builtin -G0 \
> + -O2 -g -fno-strict-aliasing -fno-common -fno-builtin \
> -fno-stack-protector -fno-jump-tables -DDISABLE_BRANCH_PROFILING \
> $(call cc-option, -fno-asynchronous-unwind-tables) \
> $(call cc-option, -fno-stack-protector)
> aflags-vdso := $(ccflags-vdso) \
> -D__ASSEMBLY__ -Wa,-gdwarf-2
>
> +ifndef CONFIG_CC_IS_CLANG
> +cflags-vdso += -G0
> +endif
> +
> ifneq ($(c-gettimeofday-y),)
> CFLAGS_vgettimeofday.o += -include $(c-gettimeofday-y)
> endif
> --
> 2.40.0
>
>


--
Thanks,
~Nick Desaulniers

2023-06-23 16:54:18

by Huacai Chen

Subject: Re: [PATCH 6/9] LoongArch: Simplify the invtlb wrappers

Hi, Xuerui,

To minimize modifications and make rebasing more convenient, please
modify only the implementation of these functions; don't remove
functions or parameters. Thank you.

Huacai

On Fri, Jun 23, 2023 at 9:44 PM WANG Xuerui <[email protected]> wrote:
>
> From: WANG Xuerui <[email protected]>
>
> Of the 3 existing invtlb wrappers, invtlb_info is not used at all,
> so it is removed; invtlb_all and invtlb_addr have their unused
> argument(s) removed from their signatures.
>
> Also, the invtlb instruction has been supported by upstream LoongArch
> toolchains from day one, so ditch the raw opcode trickery and just use
> plain inline asm for it.
>
> Signed-off-by: WANG Xuerui <[email protected]>
> ---
> arch/loongarch/include/asm/tlb.h | 45 ++++++++++++--------------------
> arch/loongarch/mm/tlb.c | 10 +++----
> 2 files changed, 21 insertions(+), 34 deletions(-)
>
> diff --git a/arch/loongarch/include/asm/tlb.h b/arch/loongarch/include/asm/tlb.h
> index 0dc9ee2b05d2..5e6ee9a15f0f 100644
> --- a/arch/loongarch/include/asm/tlb.h
> +++ b/arch/loongarch/include/asm/tlb.h
> @@ -88,52 +88,39 @@ enum invtlb_ops {
> INVTLB_GID_ADDR = 0x16,
> };
>
> -/*
> - * invtlb op info addr
> - * (0x1 << 26) | (0x24 << 20) | (0x13 << 15) |
> - * (addr << 10) | (info << 5) | op
> - */
> static inline void invtlb(u32 op, u32 info, u64 addr)
> {
> __asm__ __volatile__(
> - "parse_r addr,%0\n\t"
> - "parse_r info,%1\n\t"
> - ".word ((0x6498000) | (addr << 10) | (info << 5) | %2)\n\t"
> - :
> - : "r"(addr), "r"(info), "i"(op)
> - :
> - );
> -}
> -
> -static inline void invtlb_addr(u32 op, u32 info, u64 addr)
> -{
> - __asm__ __volatile__(
> - "parse_r addr,%0\n\t"
> - ".word ((0x6498000) | (addr << 10) | (0 << 5) | %1)\n\t"
> - :
> - : "r"(addr), "i"(op)
> + "invtlb %0, %1, %2\n\t"
> :
> + : "i"(op), "r"(info), "r"(addr)
> + : "memory"
> );
> }
>
> -static inline void invtlb_info(u32 op, u32 info, u64 addr)
> +static inline void invtlb_addr(u32 op, u64 addr)
> {
> + /*
> + * The ISA manual says $zero shall be used in case a particular op
> + * does not take the respective argument, hence the invtlb helper is
> + * not re-used to make sure this is the case.
> + */
> __asm__ __volatile__(
> - "parse_r info,%0\n\t"
> - ".word ((0x6498000) | (0 << 10) | (info << 5) | %1)\n\t"
> - :
> - : "r"(info), "i"(op)
> + "invtlb %0, $zero, %1\n\t"
> :
> + : "i"(op), "r"(addr)
> + : "memory"
> );
> }
>
> -static inline void invtlb_all(u32 op, u32 info, u64 addr)
> +static inline void invtlb_all(u32 op)
> {
> + /* Similar to invtlb_addr, ensure the operands are actually $zero. */
> __asm__ __volatile__(
> - ".word ((0x6498000) | (0 << 10) | (0 << 5) | %0)\n\t"
> + "invtlb %0, $zero, $zero\n\t"
> :
> : "i"(op)
> - :
> + : "memory"
> );
> }
>
> diff --git a/arch/loongarch/mm/tlb.c b/arch/loongarch/mm/tlb.c
> index 00bb563e3c89..de04d2624ef4 100644
> --- a/arch/loongarch/mm/tlb.c
> +++ b/arch/loongarch/mm/tlb.c
> @@ -17,19 +17,19 @@
>
> void local_flush_tlb_all(void)
> {
> - invtlb_all(INVTLB_CURRENT_ALL, 0, 0);
> + invtlb_all(INVTLB_CURRENT_ALL);
> }
> EXPORT_SYMBOL(local_flush_tlb_all);
>
> void local_flush_tlb_user(void)
> {
> - invtlb_all(INVTLB_CURRENT_GFALSE, 0, 0);
> + invtlb_all(INVTLB_CURRENT_GFALSE);
> }
> EXPORT_SYMBOL(local_flush_tlb_user);
>
> void local_flush_tlb_kernel(void)
> {
> - invtlb_all(INVTLB_CURRENT_GTRUE, 0, 0);
> + invtlb_all(INVTLB_CURRENT_GTRUE);
> }
> EXPORT_SYMBOL(local_flush_tlb_kernel);
>
> @@ -100,7 +100,7 @@ void local_flush_tlb_kernel_range(unsigned long start, unsigned long end)
> end &= (PAGE_MASK << 1);
>
> while (start < end) {
> - invtlb_addr(INVTLB_ADDR_GTRUE_OR_ASID, 0, start);
> + invtlb_addr(INVTLB_ADDR_GTRUE_OR_ASID, start);
> start += (PAGE_SIZE << 1);
> }
> } else {
> @@ -131,7 +131,7 @@ void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long page)
> void local_flush_tlb_one(unsigned long page)
> {
> page &= (PAGE_MASK << 1);
> - invtlb_addr(INVTLB_ADDR_GTRUE_OR_ASID, 0, page);
> + invtlb_addr(INVTLB_ADDR_GTRUE_OR_ASID, page);
> }
>
> static void __update_hugetlb(struct vm_area_struct *vma, unsigned long address, pte_t *ptep)
> --
> 2.40.0
>
>
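
For readers following the ".word" trickery that the patch above removes, here is a minimal sketch of the encoding it spelled out by hand. The field layout is taken from the removed comment; the helper name invtlb_encode and the register-number parameters are hypothetical and exist only for illustration (the kernel never builds the word at run time).

```c
#include <assert.h>
#include <stdint.h>

/* Fixed opcode bits, as written in the comment the patch removes. */
#define INVTLB_OPCODE_BASE ((0x1u << 26) | (0x24u << 20) | (0x13u << 15))

/* The old .word expressions hard-coded the same value as 0x6498000. */
static_assert(INVTLB_OPCODE_BASE == 0x6498000u, "opcode base mismatch");

/*
 * Hypothetical helper (not kernel code): combine the 5-bit op immediate
 * with the register numbers holding "info" and "addr". Register number 0
 * is $zero, which is what invtlb_all and invtlb_addr rely on.
 */
static inline uint32_t invtlb_encode(uint32_t op, uint32_t info_reg,
				     uint32_t addr_reg)
{
	return INVTLB_OPCODE_BASE |
	       ((addr_reg & 0x1fu) << 10) |
	       ((info_reg & 0x1fu) << 5) |
	       (op & 0x1fu);
}
```

With plain inline asm the assembler performs this packing itself, which is why the parse_r helper and the manual shifts are no longer needed.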

2023-06-23 16:54:20

by Huacai Chen

Subject: Re: [PATCH 8/9] Makefile: Add loongarch target flag for Clang compilation

Hi, Xuerui,

This is the enablement patch, so I think it would be better to move it
to the end of the series.

Huacai

On Fri, Jun 23, 2023 at 9:44 PM WANG Xuerui <[email protected]> wrote:
>
> From: WANG Xuerui <[email protected]>
>
> The LoongArch kernel is 64-bit and built with the soft-float ABI,
> hence the loongarch64-linux-gnusf target. (The "libc" part doesn't
> matter.)
>
> Signed-off-by: WANG Xuerui <[email protected]>
> ---
> scripts/Makefile.clang | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/scripts/Makefile.clang b/scripts/Makefile.clang
> index 058a4c0f864e..6c23c6af797f 100644
> --- a/scripts/Makefile.clang
> +++ b/scripts/Makefile.clang
> @@ -4,6 +4,7 @@
> CLANG_TARGET_FLAGS_arm := arm-linux-gnueabi
> CLANG_TARGET_FLAGS_arm64 := aarch64-linux-gnu
> CLANG_TARGET_FLAGS_hexagon := hexagon-linux-musl
> +CLANG_TARGET_FLAGS_loongarch := loongarch64-linux-gnusf
> CLANG_TARGET_FLAGS_m68k := m68k-linux-gnu
> CLANG_TARGET_FLAGS_mips := mipsel-linux-gnu
> CLANG_TARGET_FLAGS_powerpc := powerpc64le-linux-gnu
> --
> 2.40.0
>
>

2023-06-23 17:08:45

by Nick Desaulniers

Subject: Re: [PATCH 8/9] Makefile: Add loongarch target flag for Clang compilation

On Fri, Jun 23, 2023 at 6:44 AM WANG Xuerui <[email protected]> wrote:
>
> From: WANG Xuerui <[email protected]>
>
> The LoongArch kernel is 64-bit and built with the soft-float ABI,
> hence the loongarch64-linux-gnusf target. (The "libc" part doesn't
> matter.)

Technically, IIRC llvm may make different decisions on libcall
optimizations based on the libc part of the target triple. For
instance, is bcmp defined in that libc or not? That's why we specify
-gnu or -musl (I forgot we did that for hexagon) explicitly rather
than leave that part of the triple blank. Minutiae that don't need
to be in this commit message, but now they're explicitly documented on
LKML and linkable-to.
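
A minimal sketch of the kind of code such a libcall decision applies to, assuming the commonly described LLVM behavior in which a memcmp() whose result is only compared against zero may be lowered to bcmp() when the target's libc is known to provide it. The function name buffers_equal is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Only the "equal or not" answer is consumed here, so the ordering
 * information memcmp() computes is not needed. A compiler that knows
 * the target libc has bcmp() is free to call that instead.
 */
static bool buffers_equal(const void *a, const void *b, size_t n)
{
	assert(a != NULL && b != NULL);
	return memcmp(a, b, n) == 0;
}
```

Whether the substitution actually happens depends on the compiler version and the triple, which is the reason for spelling out the libc part explicitly.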

>
> Signed-off-by: WANG Xuerui <[email protected]>

Thanks for the patch!
Reviewed-by: Nick Desaulniers <[email protected]>

> ---
> scripts/Makefile.clang | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/scripts/Makefile.clang b/scripts/Makefile.clang
> index 058a4c0f864e..6c23c6af797f 100644
> --- a/scripts/Makefile.clang
> +++ b/scripts/Makefile.clang
> @@ -4,6 +4,7 @@
> CLANG_TARGET_FLAGS_arm := arm-linux-gnueabi
> CLANG_TARGET_FLAGS_arm64 := aarch64-linux-gnu
> CLANG_TARGET_FLAGS_hexagon := hexagon-linux-musl
> +CLANG_TARGET_FLAGS_loongarch := loongarch64-linux-gnusf
> CLANG_TARGET_FLAGS_m68k := m68k-linux-gnu
> CLANG_TARGET_FLAGS_mips := mipsel-linux-gnu
> CLANG_TARGET_FLAGS_powerpc := powerpc64le-linux-gnu
> --
> 2.40.0
>
>


--
Thanks,
~Nick Desaulniers

2023-06-23 17:11:37

by Huacai Chen

Subject: Re: [PATCH 3/9] LoongArch: Prepare for assemblers with proper FCSR bank support

Hi, Xuerui,

On Fri, Jun 23, 2023 at 9:44 PM WANG Xuerui <[email protected]> wrote:
>
> From: WANG Xuerui <[email protected]>
>
> The GNU assembler (as of 2.40) mis-treats FCSR operands as GPRs, but
> the LLVM IAS does not. Probe for this and refer to FCSRs as "$fcsrNN"
> if support is present.
>
> Signed-off-by: WANG Xuerui <[email protected]>
> ---
> arch/loongarch/Kconfig | 3 +++
> arch/loongarch/include/asm/fpregdef.h | 7 +++++++
> 2 files changed, 10 insertions(+)
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 743d87655742..c8e4f8b03c55 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -242,6 +242,9 @@ config SCHED_OMIT_FRAME_POINTER
> config AS_HAS_EXPLICIT_RELOCS
> def_bool $(as-instr,x:pcalau12i \$t0$(comma)%pc_hi20(x))
>
> +config AS_HAS_FCSR_BANK
> + def_bool $(as-instr,x:movfcsr2gr \$t0$(comma)\$fcsr0)
The word "bank" is difficult to understand, at least for me, so using
"class" may be better.

Huacai
> +
> config CC_HAS_LSX_EXTENSION
> def_bool $(cc-option,-mlsx)
>
> diff --git a/arch/loongarch/include/asm/fpregdef.h b/arch/loongarch/include/asm/fpregdef.h
> index b6be527831dd..b0ac640db74c 100644
> --- a/arch/loongarch/include/asm/fpregdef.h
> +++ b/arch/loongarch/include/asm/fpregdef.h
> @@ -40,6 +40,12 @@
> #define fs6 $f30
> #define fs7 $f31
>
> +#ifdef CONFIG_AS_HAS_FCSR_BANK
> +#define fcsr0 $fcsr0
> +#define fcsr1 $fcsr1
> +#define fcsr2 $fcsr2
> +#define fcsr3 $fcsr3
> +#else
> /*
> * Current binutils expects *GPRs* at FCSR position for the FCSR
> * operation instructions, so define aliases for those used.
> @@ -48,5 +54,6 @@
> #define fcsr1 $r1
> #define fcsr2 $r2
> #define fcsr3 $r3
> +#endif
>
> #endif /* _ASM_FPREGDEF_H */
> --
> 2.40.0
>
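
To make the effect of the fpregdef.h hunk above concrete, here is a small sketch of how the preprocessor picks the operand spelling. It reuses the CONFIG_AS_HAS_FCSR_BANK name and the fcsr1 definitions from the patch; the STR/STR_ macros and the two helpers are hypothetical and only serve to make the chosen spelling visible. It assumes a compiler that accepts '$' in macro bodies, as GCC and Clang do by default.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Two-level stringification so the argument is macro-expanded first. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Build with -DCONFIG_AS_HAS_FCSR_BANK to take the first branch. */
#ifdef CONFIG_AS_HAS_FCSR_BANK
#define fcsr1 $fcsr1	/* assembler understands real FCSR names */
#else
#define fcsr1 $r1	/* older binutils expects a GPR name here */
#endif

static_assert(sizeof(STR(fcsr1)) > 1, "operand name must not be empty");

/* Hypothetical helper: the text that would be emitted for "fcsr1". */
static const char *fcsr1_operand(void)
{
	return STR(fcsr1);
}

/* Hypothetical helper: true when the GPR-alias fallback is in effect. */
static bool fcsr1_uses_gpr_alias(void)
{
	return strcmp(fcsr1_operand(), "$r1") == 0;
}
```

Assembly sources keep writing fcsr1 either way; only the expansion differs depending on what the Kconfig probe detected.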

2023-06-23 17:35:12

by Xi Ruoyao

Subject: Re: [PATCH 7/9] LoongArch: Tweak CFLAGS for Clang compatibility

On Fri, 2023-06-23 at 21:43 +0800, WANG Xuerui wrote:

> -cflags-y                       += -G0 -pipe -msoft-float

-msoft-float should not be removed. Our consensus (made when I was
developing https://gcc.gnu.org/r13-6500) is that -mabi=lp64s does *not*
disable floating point instructions, but only disables the use of FPRs
for passing arguments and return values. So w/o -msoft-float (or
-mfpu=none) GCC is allowed to generate FP instructions anywhere in the
kernel, and that may cause a kernel FPD exception in the future.
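
As a rough illustration of the distinction (the function name average is hypothetical, and which instructions are emitted depends on the compiler and flags): under -mabi=lp64s the pointer, the count and the returned double all travel in GPRs, but per the consensus above nothing prevents the arithmetic in the body from being done with FP instructions unless -msoft-float or -mfpu=none is also given.

```c
#include <assert.h>
#include <stddef.h>

/*
 * The ABI flag only decides how "v", "n" and the result are passed.
 * The additions and the division below are where FP instructions could
 * appear if the compiler is not told to avoid them.
 */
static double average(const double *v, size_t n)
{
	double sum = 0.0;

	assert(v != NULL && n > 0);
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}
```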

--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University

2023-06-23 17:35:29

by Nick Desaulniers

Subject: Re: [PATCH 9/9] LoongArch: Mark Clang LTO as working

On Fri, Jun 23, 2023 at 6:44 AM WANG Xuerui <[email protected]> wrote:
>
> From: WANG Xuerui <[email protected]>
>
> Confirmed working with QEMU system emulation.
>
> Signed-off-by: WANG Xuerui <[email protected]>

Acked-by: Nick Desaulniers <[email protected]>

Untested though.

> ---
> arch/loongarch/Kconfig | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index c8e4f8b03c55..7c5d562b2623 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -51,6 +51,8 @@ config LOONGARCH
> select ARCH_SUPPORTS_ACPI
> select ARCH_SUPPORTS_ATOMIC_RMW
> select ARCH_SUPPORTS_HUGETLBFS
> + select ARCH_SUPPORTS_LTO_CLANG
> + select ARCH_SUPPORTS_LTO_CLANG_THIN
> select ARCH_SUPPORTS_NUMA_BALANCING
> select ARCH_USE_BUILTIN_BSWAP
> select ARCH_USE_CMPXCHG_LOCKREF
> --
> 2.40.0
>
>


--
Thanks,
~Nick Desaulniers

2023-06-23 17:44:48

by WANG Xuerui

Subject: Re: [PATCH 7/9] LoongArch: Tweak CFLAGS for Clang compatibility

On 6/24/23 01:00, Xi Ruoyao wrote:
> On Fri, 2023-06-23 at 21:43 +0800, WANG Xuerui wrote:
>
>> -cflags-y                       += -G0 -pipe -msoft-float
> -msoft-float should not be removed. Our consensus (made when I was
> developing https://gcc.gnu.org/r13-6500) is that -mabi=lp64s does *not*
> disable floating point instructions, but only disables the use of FPRs
> for passing arguments and return values. So w/o -msoft-float (or
> -mfpu=none) GCC is allowed to generate FP instructions anywhere in the
> kernel, and that may cause a kernel FPD exception in the future.
Hmm, now I remember (still vaguely) about the discussion... I'll have to
check how to minimize churn around FPU-touching code though if
-msoft-float is to be kept.

--
WANG "xen0n" Xuerui

Linux/LoongArch mailing list: https://lore.kernel.org/loongarch/


2023-06-23 17:50:57

by WANG Xuerui

Subject: Re: [PATCH 7/9] LoongArch: Tweak CFLAGS for Clang compatibility


On 6/24/23 00:39, Nick Desaulniers wrote:
> On Fri, Jun 23, 2023 at 6:44 AM WANG Xuerui <[email protected]> wrote:
>> From: WANG Xuerui <[email protected]>
>>
>> Now the arch code is mostly ready for LLVM/Clang consumption, it is time
>> to re-organize the CFLAGS a little to actually enable the LLVM build.
>>
>> A build with !RELOCATABLE && !MODULE is confirmed working within a QEMU
>> environment; support for the two features are currently blocked by
>> LLVM/Clang, and will come later.
>>
>> Signed-off-by: WANG Xuerui <[email protected]>
>> ---
>> arch/loongarch/Makefile | 14 +++++++++++---
>> arch/loongarch/vdso/Makefile | 6 +++++-
>> 2 files changed, 16 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
>> index a27e264bdaa5..efe9b50bd829 100644
>> --- a/arch/loongarch/Makefile
>> +++ b/arch/loongarch/Makefile
>> @@ -46,12 +46,18 @@ ld-emul = $(64bit-emul)
>> cflags-y += -mabi=lp64s
>> endif
>>
>> -cflags-y += -G0 -pipe -msoft-float
> This seems to drop -msoft-float for GCC. Intentional?

Kind-of; according to the LoongArch Toolchain Conventions [1],
-msoft-float basically selects the soft-float ABI, but *also prevents
use of any FP instructions*. This is where things get hairy, because it
means e.g. any translation unit can't manipulate the FP context at all
without special-casing its CFLAGS to have the -msoft-float flag removed.
I've tried and stopped when I noticed >3 files needed such treatment
even in arch/loongarch/kernel alone; -mabi=lp64s is always present right
now and that's enough.

[1]:
https://loongson.github.io/LoongArch-Documentation/LoongArch-toolchain-conventions-EN.html#_compiler_options

>
>> -LDFLAGS_vmlinux += -G0 -static -n -nostdlib
>> +ifndef CONFIG_CC_IS_CLANG
>> +cflags-y += -G0
>> +LDFLAGS_vmlinux += -G0
> Thanks for the patch!
>
> I can understand not passing -G0 to clang if clang doesn't understand
> it, but should you be using CONFIG_LD_IS_LLD for LDFLAGS?
>
> What does -G0 do?
Just as Ruoyao explained earlier, it's the "small data threshold". It's
not implemented on LoongArch yet, and we don't have ABI provisions for
that either, so IMO it's even okay to just drop it unconditionally. (I
haven't double-checked the GCC behavior though.)
>
> Is there a plan to support it in clang and lld?
>
> If so, please file a bug in LLVM's issue tracker
> https://github.com/llvm/llvm-project/issues
> then link to it in a comment in this Makefile above the relevant condition.
As explained above, proper support for "small data optimization"
probably means some cooperation from ABI side (e.g. reserving a GP
register for being able to reference +/-4KiB from it with a single
insn), so I don't expect this to happen anytime soon.
>
>> +endif
>> +cflags-y += -pipe
>> +LDFLAGS_vmlinux += -static -n -nostdlib
>>
>> # When the assembler supports explicit relocation hint, we must use it.
>> # GCC may have -mexplicit-relocs off by default if it was built with an old
>> -# assembler, so we force it via an option.
>> +# assembler, so we force it via an option. For LLVM/Clang the desired behavior
>> +# is the default, and the flag is not supported, so don't pass it if Clang is
>> +# being used.
>> #
>> # When the assembler does not supports explicit relocation hint, we can't use
>> # it. Disable it if the compiler supports it.
>> @@ -61,8 +67,10 @@ LDFLAGS_vmlinux += -G0 -static -n -nostdlib
>> # combination of a "new" assembler and "old" compiler is not supported. Either
>> # upgrade the compiler or downgrade the assembler.
>> ifdef CONFIG_AS_HAS_EXPLICIT_RELOCS
>> +ifndef CONFIG_CC_IS_CLANG
>> cflags-y += -mexplicit-relocs
>> KBUILD_CFLAGS_KERNEL += -mdirect-extern-access
>> +endif
> Why would AS_HAS_EXPLICIT_RELOCS be set if -mexplicit-relocs isn't
> supported? Is the kconfig for that broken?
>
> Does AS_HAS_EXPLICIT_RELOCS also need to test for the support for
> -mdirect-extern-access or should there be a new config for that?
> CC_SUPPORTS_DIRECT_EXTERN_ACCESS
>
>> else
>> cflags-y += $(call cc-option,-mno-explicit-relocs)
>> KBUILD_AFLAGS_KERNEL += -Wa,-mla-global-with-pcrel
>> diff --git a/arch/loongarch/vdso/Makefile b/arch/loongarch/vdso/Makefile
>> index 4c859a0e4754..19f6c75a1106 100644
>> --- a/arch/loongarch/vdso/Makefile
>> +++ b/arch/loongarch/vdso/Makefile
>> @@ -25,13 +25,17 @@ endif
>> cflags-vdso := $(ccflags-vdso) \
>> -isystem $(shell $(CC) -print-file-name=include) \
>> $(filter -W%,$(filter-out -Wa$(comma)%,$(KBUILD_CFLAGS))) \
>> - -O2 -g -fno-strict-aliasing -fno-common -fno-builtin -G0 \
>> + -O2 -g -fno-strict-aliasing -fno-common -fno-builtin \
>> -fno-stack-protector -fno-jump-tables -DDISABLE_BRANCH_PROFILING \
>> $(call cc-option, -fno-asynchronous-unwind-tables) \
>> $(call cc-option, -fno-stack-protector)
>> aflags-vdso := $(ccflags-vdso) \
>> -D__ASSEMBLY__ -Wa,-gdwarf-2
>>
>> +ifndef CONFIG_CC_IS_CLANG
>> +cflags-vdso += -G0
>> +endif
>> +
>> ifneq ($(c-gettimeofday-y),)
>> CFLAGS_vgettimeofday.o += -include $(c-gettimeofday-y)
>> endif
>> --
>> 2.40.0
>>
>>
>
--
WANG "xen0n" Xuerui

Linux/LoongArch mailing list: https://lore.kernel.org/loongarch/


2023-06-23 17:53:56

by Xi Ruoyao

Subject: Re: [PATCH 7/9] LoongArch: Tweak CFLAGS for Clang compatibility

On Fri, 2023-06-23 at 09:39 -0700, Nick Desaulniers wrote:
> On Fri, Jun 23, 2023 at 6:44 AM WANG Xuerui <[email protected]> wrote:
> >
> > From: WANG Xuerui <[email protected]>
> >
> > Now the arch code is mostly ready for LLVM/Clang consumption, it is time
> > to re-organize the CFLAGS a little to actually enable the LLVM build.
> >
> > A build with !RELOCATABLE && !MODULE is confirmed working within a QEMU
> > environment; support for the two features are currently blocked by
> > LLVM/Clang, and will come later.
> >
> > Signed-off-by: WANG Xuerui <[email protected]>
> > ---
> >  arch/loongarch/Makefile      | 14 +++++++++++---
> >  arch/loongarch/vdso/Makefile |  6 +++++-
> >  2 files changed, 16 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> > index a27e264bdaa5..efe9b50bd829 100644
> > --- a/arch/loongarch/Makefile
> > +++ b/arch/loongarch/Makefile
> > @@ -46,12 +46,18 @@ ld-emul                     = $(64bit-emul)
> >  cflags-y               += -mabi=lp64s
> >  endif
> >
> > -cflags-y                       += -G0 -pipe -msoft-float
>
> This seems to drop -msoft-float for GCC. Intentional?
>
> > -LDFLAGS_vmlinux                        += -G0 -static -n -nostdlib
> > +ifndef CONFIG_CC_IS_CLANG
> > +cflags-y                       += -G0
> > +LDFLAGS_vmlinux                        += -G0
>
> Thanks for the patch!
>
> I can understand not passing -G0 to clang if clang doesn't understand
> it, but should you be using CONFIG_LD_IS_LLD for LDFLAGS?
>
> What does -G0 do?

-G0 is a no-op for now because there is no small bss/data optimization
implemented for LoongArch yet.

/* snip */

> Why would AS_HAS_EXPLICIT_RELOCS be set if -mexplicit-relocs isn't
> supported? Is the kconfig for that broken?

Using GCC 12 (w/o -mexplicit-relocs support) together with Binutils >=
2.39 (with explicit relocs support) will cause kernel modules to fail to
load (because there will be R_LARCH_ABS_* relocations in the modules
and the module loader does not support them), so we deliberately reject
such a combination at compile time.

I could add an R_LARCH_ABS_* implementation to the module loader to make
it work, but Huacai suggested just declaring the combination of GCC 12
and Binutils >= 2.39 unsupported.

--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University