Version 2.00 of the LoongArch ELF ABI specification introduced new
relocation types, and the development trees of Binutils and GCC have
started to use them. If the kernel is built with the latest snapshot of
Binutils or GCC, it fails to load modules because the module loader does
not recognize the new relocation types.
Add support for the GOT and the new relocation types to the module
loader, so the kernel (with modules) can be built with the "normal" code
model and function properly.
This series does not break compatibility with old toolchains using
stack-based relocation types, so with the patches applied the kernel can
be built with both old and new toolchains. However, the combination of
"new" Binutils and "old" GCC is not supported.
Tested by building the kernel with the following combinations:
- GCC 12 and Binutils 2.39
- GCC trunk and Binutils trunk
and running the builds with 35 in-tree modules loaded, and loading one
module with 20 GOT loads and a per-CPU variable (loaded addresses
verified by comparing with /proc/kallsyms).
Changes from v6 to v7:
- Simplify apply_r_larch_pcala.
- Remove a build check only for excluding early GCC 13 dev snapshots.
- Squash model attribute addition into the previous patch.
- Retain "-fplt".
Changes from v5 to v6:
- Restore version number.
- Rename CONFIG_CC_HAS_EXPLICIT_RELOCS to CONFIG_AS_HAS_EXPLICIT_RELOCS.
It now only checks assembler.
- No longer support "old GCC with new Binutils", so R_LARCH_ABS* is
dropped.
- "Old GCC with old Binutils" is still supported until Arnd ack.
- "New GCC with old Binutils" is still supported as it does not
require additional code.
- Remove "cc-option" around "-mexplicit-relocs". For unsupported
"old GCC with new Binutils" combination, forcing -mexplicit-relocs
makes assembling fail, instead of silently producing unloadable
modules.
- Move the error report for "lacking model attribute" into Makefile.
- Squash the two patches for R_LARCH_B26 and R_LARCH_PCALA* into one.
Changes from v4 to v5 ("v5" missed in the subject):
- Change subject.
- Introduce CONFIG_CC_HAS_EXPLICIT_RELOCS.
- Retain -Wa,-mla-* options for old toolchains
(!CONFIG_CC_HAS_EXPLICIT_RELOCS).
- Use __attribute__((model("extreme"))) in PER_CPU_ATTRIBUTES, to fix
a breakage with per-CPU variables defined in modules.
- Handle R_LARCH_PCALA64_{HI12,LO12} for extreme model.
- Handle R_LARCH_ABS* for "old GCC with new Binutils".
- Separate the last patch into more small patches.
- Avoid BUG_ON() for the handling of GOT.
Changes from v3 to v4:
- No code change. Reword the commit message of the 3rd patch again
based on suggestion from Huacai.
Changes from v2 to v3:
- Use `union loongarch_instruction` instead of explicit bit shifts
applying the relocation. Suggested by Youling.
- For R_LARCH_B26, move the alignment check before the range check to be
consistent with stack pop relocations. Suggested by Youling.
- Reword the commit message of the 3rd patch. Suggested by Huacai.
Changes from v1 to v2:
- Fix a stupid programming error (confusion between the number of PLT
entries and the number of GOT entries). (Bug spotted by Youling).
- Synthesize the _GLOBAL_OFFSET_TABLE_ symbol with module.lds, instead
of faking it at runtime. The 3rd patch from V1 is now merged into
the 1st patch because it would be a one-line change. (Suggested by
Jinyang).
- Keep reloc_rela_handlers[] ordered by the relocation type ID.
(Suggested by Youling).
- Remove -fplt along with -Wa,-mla-* options because it's the default.
(Suggested by Youling).
Xi Ruoyao (5):
LoongArch: Add CONFIG_AS_HAS_EXPLICIT_RELOCS
LoongArch: Adjust symbol addressing for CONFIG_AS_HAS_EXPLICIT_RELOCS
LoongArch: Define ELF relocation types added in v2.00 ABI
LoongArch: Support PC-relative relocations in modules
LoongArch: Support R_LARCH_GOT_PC* in modules
arch/loongarch/Kconfig | 3 +
arch/loongarch/Makefile | 17 +++++
arch/loongarch/include/asm/elf.h | 37 ++++++++++
arch/loongarch/include/asm/module.h | 23 ++++++
arch/loongarch/include/asm/module.lds.h | 1 +
arch/loongarch/include/asm/percpu.h | 8 +++
arch/loongarch/kernel/head.S | 10 +--
arch/loongarch/kernel/module-sections.c | 56 +++++++++++++--
arch/loongarch/kernel/module.c | 93 ++++++++++++++++++++++++-
9 files changed, 238 insertions(+), 10 deletions(-)
--
2.37.0
GCC >= 13 and GNU assembler >= 2.40 use these relocations to address
external symbols, so we need to add them.
Let the module loader emit GOT entries for data symbols so that it can
handle GOT relocations. A GOT entry is just the address of the data
symbol.
In module.lds, emit a stub .got section so that the section header table
has an entry for it. The actual content of that section header entry is
filled in at runtime by module_frob_arch_sections().
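The "find an existing entry or emit a new one" behavior described above can be sketched in plain user-space C. This is only an illustration under stated assumptions: the `sketch_*` names and types are hypothetical stand-ins, not the kernel's `struct mod_section` or `struct got_entry`, and unlike the patch the sketch checks the capacity before writing the slot instead of relying on a spare trailing entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct sketch_got_entry {
	uint64_t symbol_addr;	/* a GOT entry is just the symbol address */
};

struct sketch_got {
	struct sketch_got_entry *entries;	/* storage sized before relocation */
	size_t num_entries;			/* slots filled so far */
	size_t max_entries;			/* capacity counted from the relocations */
};

/*
 * Return the slot that holds val, reusing a matching slot if one exists.
 * Return NULL when a new slot is needed but the table is already full,
 * which corresponds to the loader rejecting a broken module.
 */
struct sketch_got_entry *sketch_got_emit(struct sketch_got *got, uint64_t val)
{
	size_t i;

	for (i = 0; i < got->num_entries; i++)
		if (got->entries[i].symbol_addr == val)
			return &got->entries[i];

	if (got->num_entries >= got->max_entries)
		return NULL;

	got->entries[got->num_entries].symbol_addr = val;
	return &got->entries[got->num_entries++];
}
```

Because lookups are a linear scan, emitting N distinct symbols costs O(N^2) comparisons in this sketch; that is acceptable only when the number of GOT entries per module is small.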
Signed-off-by: Xi Ruoyao <[email protected]>
---
arch/loongarch/include/asm/module.h | 23 ++++++++++++
arch/loongarch/include/asm/module.lds.h | 1 +
arch/loongarch/kernel/module-sections.c | 49 +++++++++++++++++++++++--
arch/loongarch/kernel/module.c | 24 ++++++++++++
4 files changed, 94 insertions(+), 3 deletions(-)
diff --git a/arch/loongarch/include/asm/module.h b/arch/loongarch/include/asm/module.h
index 9f6718df1854..76a98a0ab8a0 100644
--- a/arch/loongarch/include/asm/module.h
+++ b/arch/loongarch/include/asm/module.h
@@ -19,6 +19,7 @@ struct mod_section {
struct mod_arch_specific {
struct mod_section plt;
struct mod_section plt_idx;
+ struct mod_section got;
};
struct plt_entry {
@@ -28,11 +29,16 @@ struct plt_entry {
u32 inst_jirl;
};
+struct got_entry {
+ Elf_Addr symbol_addr;
+};
+
struct plt_idx_entry {
unsigned long symbol_addr;
};
Elf_Addr module_emit_plt_entry(struct module *mod, unsigned long val);
+Elf_Addr module_emit_got_entry(struct module *mod, Elf_Addr val);
static inline struct plt_entry emit_plt_entry(unsigned long val)
{
@@ -51,6 +57,11 @@ static inline struct plt_idx_entry emit_plt_idx_entry(unsigned long val)
return (struct plt_idx_entry) { val };
}
+static inline struct got_entry emit_got_entry(Elf_Addr val)
+{
+ return (struct got_entry) { val };
+}
+
static inline int get_plt_idx(unsigned long val, const struct mod_section *sec)
{
int i;
@@ -77,4 +88,16 @@ static inline struct plt_entry *get_plt_entry(unsigned long val,
return plt + plt_idx;
}
+static inline struct got_entry *get_got_entry(Elf_Addr val,
+ const struct mod_section *sec)
+{
+ struct got_entry *got = (struct got_entry *)sec->shdr->sh_addr;
+ int i;
+
+ for (i = 0; i < sec->num_entries; i++)
+ if (got[i].symbol_addr == val)
+ return &got[i];
+ return NULL;
+}
+
#endif /* _ASM_MODULE_H */
diff --git a/arch/loongarch/include/asm/module.lds.h b/arch/loongarch/include/asm/module.lds.h
index 31c1c0db11a3..57bbd0cedd26 100644
--- a/arch/loongarch/include/asm/module.lds.h
+++ b/arch/loongarch/include/asm/module.lds.h
@@ -4,4 +4,5 @@ SECTIONS {
. = ALIGN(4);
.plt : { BYTE(0) }
.plt.idx : { BYTE(0) }
+ .got : { BYTE(0) }
}
diff --git a/arch/loongarch/kernel/module-sections.c b/arch/loongarch/kernel/module-sections.c
index c67b9cb220eb..4c99737cd8dc 100644
--- a/arch/loongarch/kernel/module-sections.c
+++ b/arch/loongarch/kernel/module-sections.c
@@ -33,6 +33,31 @@ Elf_Addr module_emit_plt_entry(struct module *mod, unsigned long val)
return (Elf_Addr)&plt[nr];
}
+Elf_Addr module_emit_got_entry(struct module *mod, Elf_Addr val)
+{
+ struct mod_section *got_sec = &mod->arch.got;
+ int i = got_sec->num_entries;
+ struct got_entry *got = get_got_entry(val, got_sec);
+
+ if (got)
+ return (Elf_Addr)got;
+
+ /* There is no GOT entry existing for val yet. Create a new one. */
+ got = (struct got_entry *)got_sec->shdr->sh_addr;
+ got[i] = emit_got_entry(val);
+
+ got_sec->num_entries++;
+ if (got_sec->num_entries > got_sec->max_entries) {
+ /* This may happen when the module contains a GOT_HI20 without
+ * a paired GOT_LO12. Such a module is broken, reject it.
+ */
+ pr_err("%s: module contains bad GOT relocation\n", mod->name);
+ return 0;
+ }
+
+ return (Elf_Addr)&got[i];
+}
+
static int is_rela_equal(const Elf_Rela *x, const Elf_Rela *y)
{
return x->r_info == y->r_info && x->r_addend == y->r_addend;
@@ -50,7 +75,8 @@ static bool duplicate_rela(const Elf_Rela *rela, int idx)
return false;
}
-static void count_max_entries(Elf_Rela *relas, int num, unsigned int *plts)
+static void count_max_entries(Elf_Rela *relas, int num,
+ unsigned int *plts, unsigned int *gots)
{
unsigned int i, type;
@@ -62,6 +88,10 @@ static void count_max_entries(Elf_Rela *relas, int num, unsigned int *plts)
if (!duplicate_rela(relas, i))
(*plts)++;
break;
+ case R_LARCH_GOT_PC_HI20:
+ if (!duplicate_rela(relas, i))
+ (*gots)++;
+ break;
default:
/* Do nothing. */
}
@@ -71,7 +101,7 @@ static void count_max_entries(Elf_Rela *relas, int num, unsigned int *plts)
int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
char *secstrings, struct module *mod)
{
- unsigned int i, num_plts = 0;
+ unsigned int i, num_plts = 0, num_gots = 0;
/*
* Find the empty .plt sections.
@@ -81,6 +111,8 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
mod->arch.plt.shdr = sechdrs + i;
else if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt.idx"))
mod->arch.plt_idx.shdr = sechdrs + i;
+ else if (!strcmp(secstrings + sechdrs[i].sh_name, ".got"))
+ mod->arch.got.shdr = sechdrs + i;
}
if (!mod->arch.plt.shdr) {
@@ -91,6 +123,10 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
pr_err("%s: module PLT.IDX section(s) missing\n", mod->name);
return -ENOEXEC;
}
+ if (!mod->arch.got.shdr) {
+ pr_err("%s: module GOT section(s) missing\n", mod->name);
+ return -ENOEXEC;
+ }
/* Calculate the maxinum number of entries */
for (i = 0; i < ehdr->e_shnum; i++) {
@@ -105,7 +141,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
if (!(dst_sec->sh_flags & SHF_EXECINSTR))
continue;
- count_max_entries(relas, num_rela, &num_plts);
+ count_max_entries(relas, num_rela, &num_plts, &num_gots);
}
mod->arch.plt.shdr->sh_type = SHT_NOBITS;
@@ -122,5 +158,12 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
mod->arch.plt_idx.num_entries = 0;
mod->arch.plt_idx.max_entries = num_plts;
+ mod->arch.got.shdr->sh_type = SHT_NOBITS;
+ mod->arch.got.shdr->sh_flags = SHF_ALLOC;
+ mod->arch.got.shdr->sh_addralign = L1_CACHE_BYTES;
+ mod->arch.got.shdr->sh_size = (num_gots + 1) * sizeof(struct got_entry);
+ mod->arch.got.num_entries = 0;
+ mod->arch.got.max_entries = num_gots;
+
return 0;
}
diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
index c09ddbe5ed8b..059bc6c86a99 100644
--- a/arch/loongarch/kernel/module.c
+++ b/arch/loongarch/kernel/module.c
@@ -346,6 +346,29 @@ static int apply_r_larch_pcala(struct module *mod, u32 *location, Elf_Addr v,
return 0;
}
+static int apply_r_larch_got_pc(struct module *mod, u32 *location, Elf_Addr v,
+ s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+ Elf_Addr got = module_emit_got_entry(mod, v);
+
+ if (!got)
+ return -EINVAL;
+
+ switch (type) {
+ case R_LARCH_GOT_PC_LO12:
+ type = R_LARCH_PCALA_LO12;
+ break;
+ case R_LARCH_GOT_PC_HI20:
+ type = R_LARCH_PCALA_HI20;
+ break;
+ default:
+ pr_err("%s: Unsupport relocation type %u\n", mod->name, type);
+ return -EINVAL;
+ }
+
+ return apply_r_larch_pcala(mod, location, got, rela_stack, rela_stack_top, type);
+}
+
/*
* reloc_handlers_rela() - Apply a particular relocation to a module
* @mod: the module to apply the reloc to
@@ -377,6 +400,7 @@ static reloc_rela_handler reloc_rela_handlers[] = {
[R_LARCH_ADD32 ... R_LARCH_SUB64] = apply_r_larch_add_sub,
[R_LARCH_B26] = apply_r_larch_b26,
[R_LARCH_PCALA_HI20...R_LARCH_PCALA64_HI12] = apply_r_larch_pcala,
+ [R_LARCH_GOT_PC_HI20...R_LARCH_GOT_PC_LO12] = apply_r_larch_got_pc,
};
int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
--
2.37.0
GNU as >= 2.40 and GCC >= 13 will support using explicit relocation
hints in the assembly code, instead of la.* macros. The usage of
explicit relocation hints can improve code generation so it's enabled
by default by GCC >= 13.
Introduce CONFIG_AS_HAS_EXPLICIT_RELOCS as the switch for
"use explicit relocation hints or not."
Signed-off-by: Xi Ruoyao <[email protected]>
---
arch/loongarch/Kconfig | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index d3ce96a1a744..0721b4b2207a 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -204,6 +204,9 @@ config SCHED_OMIT_FRAME_POINTER
bool
default y
+config AS_HAS_EXPLICIT_RELOCS
+ def_bool $(as-instr,x:pcalau12i \$t0$(comma)%pc_hi20(x))
+
menu "Kernel type and options"
source "kernel/Kconfig.hz"
--
2.37.0
Binutils >= 2.40 uses R_LARCH_B26 instead of R_LARCH_SOP_PUSH_PLT_PCREL,
and R_LARCH_PCALA* instead of R_LARCH_SOP_PUSH_PCREL.
Handle R_LARCH_B26 and R_LARCH_PCALA* in the module loader. For
R_LARCH_B26, also create a PLT entry as needed.
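As a rough illustration of the arithmetic behind these two relocation kinds, the following user-space sketch splits a target address into the 20-bit and 12-bit immediates of a pcalau12i/addi.d pair, and a branch offset into the two immediate fields of a b/bl instruction. The `sketch_*` helper names are hypothetical. The sketch assumes that pcalau12i adds the sign-extended (si20 << 12) to the 4 KiB page of the PC and that the following instruction sign-extends its 12-bit immediate, and it covers only the 32-bit (non-"extreme") case.

```c
#include <stdint.h>

/* Hypothetical helper names; field widths follow the patch in this mail. */

/* 20-bit immediate for pcalau12i: page delta, rounded up by 0x800 so that
 * adding the sign-extended low 12 bits afterwards lands exactly on dest. */
uint32_t sketch_pcala_hi20(uint64_t pc, uint64_t dest)
{
	uint64_t delta = ((dest + 0x800) & ~0xfffULL) - (pc & ~0xfffULL);

	return (uint32_t)(delta >> 12) & 0xfffff;
}

/* 12-bit immediate for the paired addi.d or load/store. */
uint32_t sketch_pcala_lo12(uint64_t dest)
{
	return (uint32_t)(dest & 0xfff);
}

/* Recompute the address the instruction pair would produce. */
uint64_t sketch_pcala_resolve(uint64_t pc, uint32_t hi20, uint32_t lo12)
{
	int64_t hi = (int64_t)(hi20 ^ 0x80000) - 0x80000;	/* sign-extend 20 bits */
	int64_t lo = (int64_t)(lo12 ^ 0x800) - 0x800;		/* sign-extend 12 bits */

	return (pc & ~0xfffULL) + ((uint64_t)hi << 12) + (uint64_t)lo;
}

/* Split a byte offset into the 16-bit and 10-bit fields of b/bl.
 * Returns 0 on success, -1 if the offset is unaligned or beyond +-128 MiB. */
int sketch_b26_fields(int64_t offset, uint32_t *imm_l, uint32_t *imm_h)
{
	uint64_t words;

	if (offset & 3)
		return -1;
	if (offset < -(1LL << 27) || offset >= (1LL << 27))
		return -1;

	words = (uint64_t)offset >> 2;	/* only the low 26 bits are kept below */
	*imm_l = (uint32_t)(words & 0xffff);
	*imm_h = (uint32_t)((words >> 16) & 0x3ff);
	return 0;
}
```

When sketch_b26_fields() would fail on range, the loader in the patch does not give up: it redirects the branch to a PLT entry placed near the module and recomputes the offset from there.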
Signed-off-by: Xi Ruoyao <[email protected]>
---
arch/loongarch/kernel/module-sections.c | 7 ++-
arch/loongarch/kernel/module.c | 67 +++++++++++++++++++++++++
2 files changed, 73 insertions(+), 1 deletion(-)
diff --git a/arch/loongarch/kernel/module-sections.c b/arch/loongarch/kernel/module-sections.c
index 6d498288977d..c67b9cb220eb 100644
--- a/arch/loongarch/kernel/module-sections.c
+++ b/arch/loongarch/kernel/module-sections.c
@@ -56,9 +56,14 @@ static void count_max_entries(Elf_Rela *relas, int num, unsigned int *plts)
for (i = 0; i < num; i++) {
type = ELF_R_TYPE(relas[i].r_info);
- if (type == R_LARCH_SOP_PUSH_PLT_PCREL) {
+ switch (type) {
+ case R_LARCH_SOP_PUSH_PLT_PCREL:
+ case R_LARCH_B26:
if (!duplicate_rela(relas, i))
(*plts)++;
+ break;
+ default:
+ /* Do nothing. */
}
}
}
diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
index 755d91ef8d85..c09ddbe5ed8b 100644
--- a/arch/loongarch/kernel/module.c
+++ b/arch/loongarch/kernel/module.c
@@ -281,6 +281,71 @@ static int apply_r_larch_add_sub(struct module *mod, u32 *location, Elf_Addr v,
}
}
+static int apply_r_larch_b26(struct module *mod, u32 *location, Elf_Addr v,
+ s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+ ptrdiff_t offset = (void *)v - (void *)location;
+ union loongarch_instruction *insn = (union loongarch_instruction *)location;
+
+ if (offset >= SZ_128M)
+ v = module_emit_plt_entry(mod, v);
+
+ if (offset < -SZ_128M)
+ v = module_emit_plt_entry(mod, v);
+
+ offset = (void *)v - (void *)location;
+
+ if (offset & 3) {
+ pr_err("module %s: jump offset = 0x%llx unaligned! dangerous R_LARCH_B26 (%u) relocation\n",
+ mod->name, (long long)offset, type);
+ return -ENOEXEC;
+ }
+
+ if (!signed_imm_check(offset, 28)) {
+ pr_err("module %s: jump offset = 0x%llx overflow! dangerous R_LARCH_B26 (%u) relocation\n",
+ mod->name, (long long)offset, type);
+ return -ENOEXEC;
+ }
+
+ offset >>= 2;
+ insn->reg0i26_format.immediate_l = offset & 0xffff;
+ insn->reg0i26_format.immediate_h = (offset >> 16) & 0x3ff;
+ return 0;
+}
+
+static int apply_r_larch_pcala(struct module *mod, u32 *location, Elf_Addr v,
+ s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
+{
+ union loongarch_instruction *insn = (union loongarch_instruction *)location;
+ /* Use s32 for a sign-extension deliberately. */
+ s32 offset_hi20 = (void *)((v + 0x800) & ~0xfff) -
+ (void *)((Elf_Addr)location & ~0xfff);
+ Elf_Addr anchor = (((Elf_Addr)location) & ~0xfff) + offset_hi20;
+ ptrdiff_t offset_rem = (void *)v - (void *)anchor;
+
+ switch (type) {
+ case R_LARCH_PCALA_HI20:
+ v = offset_hi20 >> 12;
+ insn->reg1i20_format.immediate = v & 0xfffff;
+ break;
+ case R_LARCH_PCALA64_LO20:
+ v = offset_rem >> 32;
+ insn->reg1i20_format.immediate = v & 0xfffff;
+ break;
+ case R_LARCH_PCALA64_HI12:
+ v = offset_rem >> 52;
+ /* fall through */
+ case R_LARCH_PCALA_LO12:
+ insn->reg2i12_format.immediate = v & 0xfff;
+ break;
+ default:
+ pr_err("%s: Unsupport relocation type %u\n", mod->name, type);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
/*
* reloc_handlers_rela() - Apply a particular relocation to a module
* @mod: the module to apply the reloc to
@@ -310,6 +375,8 @@ static reloc_rela_handler reloc_rela_handlers[] = {
[R_LARCH_SOP_SUB ... R_LARCH_SOP_IF_ELSE] = apply_r_larch_sop,
[R_LARCH_SOP_POP_32_S_10_5 ... R_LARCH_SOP_POP_32_U] = apply_r_larch_sop_imm_field,
[R_LARCH_ADD32 ... R_LARCH_SUB64] = apply_r_larch_add_sub,
+ [R_LARCH_B26] = apply_r_larch_b26,
+ [R_LARCH_PCALA_HI20...R_LARCH_PCALA64_HI12] = apply_r_larch_pcala,
};
int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
--
2.37.0
These relocation types are used by GNU binutils >= 2.40 and GCC >= 13.
Add their definitions so we will be able to use them in later patches.
Link: https://github.com/loongson/LoongArch-Documentation/pull/57
Signed-off-by: Xi Ruoyao <[email protected]>
---
arch/loongarch/include/asm/elf.h | 37 ++++++++++++++++++++++++++++++++
arch/loongarch/kernel/module.c | 2 +-
2 files changed, 38 insertions(+), 1 deletion(-)
diff --git a/arch/loongarch/include/asm/elf.h b/arch/loongarch/include/asm/elf.h
index 5f3ff4781fda..7af0cebf28d7 100644
--- a/arch/loongarch/include/asm/elf.h
+++ b/arch/loongarch/include/asm/elf.h
@@ -74,6 +74,43 @@
#define R_LARCH_SUB64 56
#define R_LARCH_GNU_VTINHERIT 57
#define R_LARCH_GNU_VTENTRY 58
+#define R_LARCH_B16 64
+#define R_LARCH_B21 65
+#define R_LARCH_B26 66
+#define R_LARCH_ABS_HI20 67
+#define R_LARCH_ABS_LO12 68
+#define R_LARCH_ABS64_LO20 69
+#define R_LARCH_ABS64_HI12 70
+#define R_LARCH_PCALA_HI20 71
+#define R_LARCH_PCALA_LO12 72
+#define R_LARCH_PCALA64_LO20 73
+#define R_LARCH_PCALA64_HI12 74
+#define R_LARCH_GOT_PC_HI20 75
+#define R_LARCH_GOT_PC_LO12 76
+#define R_LARCH_GOT64_PC_LO20 77
+#define R_LARCH_GOT64_PC_HI12 78
+#define R_LARCH_GOT_HI20 79
+#define R_LARCH_GOT_LO12 80
+#define R_LARCH_GOT64_LO20 81
+#define R_LARCH_GOT64_HI12 82
+#define R_LARCH_TLS_LE_HI20 83
+#define R_LARCH_TLS_LE_LO12 84
+#define R_LARCH_TLS_LE64_LO20 85
+#define R_LARCH_TLS_LE64_HI12 86
+#define R_LARCH_TLS_IE_PC_HI20 87
+#define R_LARCH_TLS_IE_PC_LO12 88
+#define R_LARCH_TLS_IE64_PC_LO20 89
+#define R_LARCH_TLS_IE64_PC_HI12 90
+#define R_LARCH_TLS_IE_HI20 91
+#define R_LARCH_TLS_IE_LO12 92
+#define R_LARCH_TLS_IE64_LO20 93
+#define R_LARCH_TLS_IE64_HI12 94
+#define R_LARCH_TLS_LD_PC_HI20 95
+#define R_LARCH_TLS_LD_HI20 96
+#define R_LARCH_TLS_GD_PC_HI20 97
+#define R_LARCH_TLS_GD_HI20 98
+#define R_LARCH_32_PCREL 99
+#define R_LARCH_RELAX 100
#ifndef ELF_ARCH
diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
index 638427ff0d51..755d91ef8d85 100644
--- a/arch/loongarch/kernel/module.c
+++ b/arch/loongarch/kernel/module.c
@@ -296,7 +296,7 @@ typedef int (*reloc_rela_handler)(struct module *mod, u32 *location, Elf_Addr v,
/* The handlers for known reloc types */
static reloc_rela_handler reloc_rela_handlers[] = {
- [R_LARCH_NONE ... R_LARCH_SUB64] = apply_r_larch_error,
+ [R_LARCH_NONE ... R_LARCH_RELAX] = apply_r_larch_error,
[R_LARCH_NONE] = apply_r_larch_none,
[R_LARCH_32] = apply_r_larch_32,
--
2.37.0
If explicit relocation hints are used by the toolchain, -Wa,-mla-*
options are useless for C code. Only use them for
!CONFIG_AS_HAS_EXPLICIT_RELOCS.
Replace "la" with "la.pcrel" in head.S to keep the semantics consistent
between new and old toolchains for the low-level startup code.
For per-CPU variables, the "address" of the symbol is actually an
offset from $r21. The value is close to the load address of the main
kernel image, but far from the address of modules. Use the
model("extreme") attribute to tell the compiler that PC-relative
addressing with a 32-bit offset is not sufficient for local per-CPU
variables.
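The reach problem can be made concrete with a small user-space check of whether a pcalau12i-based pair can span the distance between code and a symbol "address". This is a sketch only: `sketch_pcala_reachable` is a hypothetical helper, and the addresses in the usage note are illustrative assumptions rather than values taken from a running kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper: nonzero if dest is within reach of a 32-bit
 * PC-relative pair (signed 20-bit page immediate plus signed 12-bit
 * low part) issued at pc.
 */
int sketch_pcala_reachable(uint64_t pc, uint64_t dest)
{
	/* Page delta after rounding for the sign-extended low 12 bits. */
	int64_t delta = (int64_t)(((dest + 0x800) & ~0xfffULL) - (pc & ~0xfffULL));

	return delta >= -(1LL << 31) && delta < (1LL << 31);
}
```

For example, with code at 0x9000000000200000 and a per-CPU "address" of 0x9000000001000000 the check succeeds, while the same symbol seen from code at 0xffff800002000000 (standing in for a module mapping) is out of reach, which is the situation the model("extreme") attribute is meant to cover.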
The behavior with different assemblers and compilers is summarized in
the following table:
AS has            CC has
explicit reloc    explicit reloc *  Behavior
==============================================================
No                No                Use la.* macros.
                                    No change from Linux 6.0.
--------------------------------------------------------------
No                Yes               Disable explicit reloc.
                                    No change from Linux 6.0.
--------------------------------------------------------------
Yes               No                Not supported.
--------------------------------------------------------------
Yes               Yes               Use explicit relocs.
                                    No -Wa,-mla* options.
==============================================================
*: We assume CC must have the model attribute if it has explicit reloc.
Both features were added in the GCC 13 development cycle, so any GCC
release >= 13 should be OK. Using early GCC 13 development snapshots
may produce modules with unsupported relocations.
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=f09482a
Link: https://gcc.gnu.org/r13-1834
Link: https://gcc.gnu.org/r13-2199
Signed-off-by: Xi Ruoyao <[email protected]>
---
arch/loongarch/Makefile | 17 +++++++++++++++++
arch/loongarch/include/asm/percpu.h | 8 ++++++++
arch/loongarch/kernel/head.S | 10 +++++-----
3 files changed, 30 insertions(+), 5 deletions(-)
diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index 7051a95f7f31..92c4a52c4c3e 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -40,10 +40,27 @@ endif
cflags-y += -G0 -pipe -msoft-float
LDFLAGS_vmlinux += -G0 -static -n -nostdlib
+
+# When the assembler supports explicit relocation hints, we must use them.
+# GCC may have -mexplicit-relocs off by default if it was built with an old
+# assembler, so we force it via an option.
+#
+# When the assembler does not support explicit relocation hints, we can't use
+# them. Disable them if the compiler supports them.
+#
+# If you've seen "unknown reloc hint" message building the kernel and you are
+# now wondering why "-mexplicit-relocs" is not wrapped with cc-option: the
+# combination of a "new" assembler and "old" compiler is not supported. Either
+# upgrade the compiler or downgrade the assembler.
+ifdef CONFIG_AS_HAS_EXPLICIT_RELOCS
+cflags-y += -mexplicit-relocs
+else
+cflags-y += $(call cc-option,-mno-explicit-relocs)
KBUILD_AFLAGS_KERNEL += -Wa,-mla-global-with-pcrel
KBUILD_CFLAGS_KERNEL += -Wa,-mla-global-with-pcrel
KBUILD_AFLAGS_MODULE += -Wa,-mla-global-with-abs
KBUILD_CFLAGS_MODULE += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
+endif
cflags-y += -ffreestanding
cflags-y += $(call cc-option, -mno-check-zero-division)
diff --git a/arch/loongarch/include/asm/percpu.h b/arch/loongarch/include/asm/percpu.h
index 0bd6b0110198..dd7fcc553efa 100644
--- a/arch/loongarch/include/asm/percpu.h
+++ b/arch/loongarch/include/asm/percpu.h
@@ -8,6 +8,14 @@
#include <asm/cmpxchg.h>
#include <asm/loongarch.h>
+#if defined(MODULE) && defined(CONFIG_AS_HAS_EXPLICIT_RELOCS)
+/* The "address" (in fact, offset from $r21) of a per-CPU variable is close
+ * to the load address of main kernel image, but far from where the modules are
+ * loaded. Tell the compiler this fact.
+ */
+# define PER_CPU_ATTRIBUTES __attribute__((model("extreme")))
+#endif
+
/* Use r21 for fast access */
register unsigned long __my_cpu_offset __asm__("$r21");
diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
index 01bac62a6442..eb3f641d5915 100644
--- a/arch/loongarch/kernel/head.S
+++ b/arch/loongarch/kernel/head.S
@@ -55,17 +55,17 @@ SYM_CODE_START(kernel_entry) # kernel entry point
li.w t0, 0x00 # FPE=0, SXE=0, ASXE=0, BTE=0
csrwr t0, LOONGARCH_CSR_EUEN
- la t0, __bss_start # clear .bss
+ la.pcrel t0, __bss_start # clear .bss
st.d zero, t0, 0
- la t1, __bss_stop - LONGSIZE
+ la.pcrel t1, __bss_stop - LONGSIZE
1:
addi.d t0, t0, LONGSIZE
st.d zero, t0, 0
bne t0, t1, 1b
- la t0, fw_arg0
+ la.pcrel t0, fw_arg0
st.d a0, t0, 0 # firmware arguments
- la t0, fw_arg1
+ la.pcrel t0, fw_arg1
st.d a1, t0, 0
/* KSave3 used for percpu base, initialized as 0 */
@@ -73,7 +73,7 @@ SYM_CODE_START(kernel_entry) # kernel entry point
/* GPR21 used for percpu base (runtime), initialized as 0 */
move u0, zero
- la tp, init_thread_union
+ la.pcrel tp, init_thread_union
/* Set the SP after an empty pt_regs. */
PTR_LI sp, (_THREAD_SIZE - 32 - PT_SIZE)
PTR_ADD sp, sp, tp
--
2.37.0
Hi Xi,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on kees/for-next/execve]
[also build test WARNING on linus/master v6.0-rc3 next-20220830]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Xi-Ruoyao/LoongArch-Support-toolchain-with-new-relocation-types/20220830-185350
base: https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/execve
config: loongarch-allyesconfig (https://download.01.org/0day-ci/archive/20220830/[email protected]/config)
compiler: loongarch64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/529c0f36d2dad7dd4bcec3815f821547d9e9643c
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Xi-Ruoyao/LoongArch-Support-toolchain-with-new-relocation-types/20220830-185350
git checkout 529c0f36d2dad7dd4bcec3815f821547d9e9643c
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=loongarch SHELL=/bin/bash arch/loongarch/kernel/
If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>
All warnings (new ones prefixed by >>):
366 | [R_LARCH_NONE] = apply_r_larch_none,
| ^~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:366:64: note: (near initialization for 'reloc_rela_handlers[0]')
arch/loongarch/kernel/module.c:367:64: warning: initialized field overwritten [-Woverride-init]
367 | [R_LARCH_32] = apply_r_larch_32,
| ^~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:367:64: note: (near initialization for 'reloc_rela_handlers[1]')
arch/loongarch/kernel/module.c:368:64: warning: initialized field overwritten [-Woverride-init]
368 | [R_LARCH_64] = apply_r_larch_64,
| ^~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:368:64: note: (near initialization for 'reloc_rela_handlers[2]')
arch/loongarch/kernel/module.c:369:64: warning: initialized field overwritten [-Woverride-init]
369 | [R_LARCH_MARK_LA] = apply_r_larch_none,
| ^~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:369:64: note: (near initialization for 'reloc_rela_handlers[20]')
arch/loongarch/kernel/module.c:370:64: warning: initialized field overwritten [-Woverride-init]
370 | [R_LARCH_MARK_PCREL] = apply_r_larch_none,
| ^~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:370:64: note: (near initialization for 'reloc_rela_handlers[21]')
arch/loongarch/kernel/module.c:371:64: warning: initialized field overwritten [-Woverride-init]
371 | [R_LARCH_SOP_PUSH_PCREL] = apply_r_larch_sop_push_pcrel,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:371:64: note: (near initialization for 'reloc_rela_handlers[22]')
arch/loongarch/kernel/module.c:372:64: warning: initialized field overwritten [-Woverride-init]
372 | [R_LARCH_SOP_PUSH_ABSOLUTE] = apply_r_larch_sop_push_absolute,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:372:64: note: (near initialization for 'reloc_rela_handlers[23]')
arch/loongarch/kernel/module.c:373:64: warning: initialized field overwritten [-Woverride-init]
373 | [R_LARCH_SOP_PUSH_DUP] = apply_r_larch_sop_push_dup,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:373:64: note: (near initialization for 'reloc_rela_handlers[24]')
arch/loongarch/kernel/module.c:374:64: warning: initialized field overwritten [-Woverride-init]
374 | [R_LARCH_SOP_PUSH_PLT_PCREL] = apply_r_larch_sop_push_plt_pcrel,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:374:64: note: (near initialization for 'reloc_rela_handlers[29]')
arch/loongarch/kernel/module.c:375:64: warning: initialized field overwritten [-Woverride-init]
375 | [R_LARCH_SOP_SUB ... R_LARCH_SOP_IF_ELSE] = apply_r_larch_sop,
| ^~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:375:64: note: (near initialization for 'reloc_rela_handlers[32]')
arch/loongarch/kernel/module.c:375:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:375:64: note: (near initialization for 'reloc_rela_handlers[33]')
arch/loongarch/kernel/module.c:375:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:375:64: note: (near initialization for 'reloc_rela_handlers[34]')
arch/loongarch/kernel/module.c:375:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:375:64: note: (near initialization for 'reloc_rela_handlers[35]')
arch/loongarch/kernel/module.c:375:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:375:64: note: (near initialization for 'reloc_rela_handlers[36]')
arch/loongarch/kernel/module.c:375:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:375:64: note: (near initialization for 'reloc_rela_handlers[37]')
arch/loongarch/kernel/module.c:376:64: warning: initialized field overwritten [-Woverride-init]
376 | [R_LARCH_SOP_POP_32_S_10_5 ... R_LARCH_SOP_POP_32_U] = apply_r_larch_sop_imm_field,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:376:64: note: (near initialization for 'reloc_rela_handlers[38]')
arch/loongarch/kernel/module.c:376:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:376:64: note: (near initialization for 'reloc_rela_handlers[39]')
arch/loongarch/kernel/module.c:376:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:376:64: note: (near initialization for 'reloc_rela_handlers[40]')
arch/loongarch/kernel/module.c:376:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:376:64: note: (near initialization for 'reloc_rela_handlers[41]')
arch/loongarch/kernel/module.c:376:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:376:64: note: (near initialization for 'reloc_rela_handlers[42]')
arch/loongarch/kernel/module.c:376:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:376:64: note: (near initialization for 'reloc_rela_handlers[43]')
arch/loongarch/kernel/module.c:376:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:376:64: note: (near initialization for 'reloc_rela_handlers[44]')
arch/loongarch/kernel/module.c:376:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:376:64: note: (near initialization for 'reloc_rela_handlers[45]')
arch/loongarch/kernel/module.c:376:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:376:64: note: (near initialization for 'reloc_rela_handlers[46]')
arch/loongarch/kernel/module.c:377:64: warning: initialized field overwritten [-Woverride-init]
377 | [R_LARCH_ADD32 ... R_LARCH_SUB64] = apply_r_larch_add_sub,
| ^~~~~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:377:64: note: (near initialization for 'reloc_rela_handlers[50]')
arch/loongarch/kernel/module.c:377:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:377:64: note: (near initialization for 'reloc_rela_handlers[51]')
arch/loongarch/kernel/module.c:377:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:377:64: note: (near initialization for 'reloc_rela_handlers[52]')
arch/loongarch/kernel/module.c:377:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:377:64: note: (near initialization for 'reloc_rela_handlers[53]')
arch/loongarch/kernel/module.c:377:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:377:64: note: (near initialization for 'reloc_rela_handlers[54]')
arch/loongarch/kernel/module.c:377:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:377:64: note: (near initialization for 'reloc_rela_handlers[55]')
arch/loongarch/kernel/module.c:377:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:377:64: note: (near initialization for 'reloc_rela_handlers[56]')
arch/loongarch/kernel/module.c:378:64: warning: initialized field overwritten [-Woverride-init]
378 | [R_LARCH_B26] = apply_r_larch_b26,
| ^~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:378:64: note: (near initialization for 'reloc_rela_handlers[66]')
arch/loongarch/kernel/module.c:379:64: warning: initialized field overwritten [-Woverride-init]
379 | [R_LARCH_PCALA_HI20...R_LARCH_PCALA64_HI12] = apply_r_larch_pcala,
| ^~~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:379:64: note: (near initialization for 'reloc_rela_handlers[71]')
arch/loongarch/kernel/module.c:379:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:379:64: note: (near initialization for 'reloc_rela_handlers[72]')
arch/loongarch/kernel/module.c:379:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:379:64: note: (near initialization for 'reloc_rela_handlers[73]')
arch/loongarch/kernel/module.c:379:64: warning: initialized field overwritten [-Woverride-init]
arch/loongarch/kernel/module.c:379:64: note: (near initialization for 'reloc_rela_handlers[74]')
arch/loongarch/kernel/module.c: In function 'apply_r_larch_pcala':
>> arch/loongarch/kernel/module.c:336:19: warning: this statement may fall through [-Wimplicit-fallthrough=]
336 | v = offset_rem >> 52;
| ~~^~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module.c:338:9: note: here
338 | case R_LARCH_PCALA_LO12:
| ^~~~
vim +336 arch/loongarch/kernel/module.c
315
316 static int apply_r_larch_pcala(struct module *mod, u32 *location, Elf_Addr v,
317 s64 *rela_stack, size_t *rela_stack_top, unsigned int type)
318 {
319 union loongarch_instruction *insn = (union loongarch_instruction *)location;
320 /* Use s32 for a sign-extension deliberately. */
321 s32 offset_hi20 = (void *)((v + 0x800) & ~0xfff) -
322 (void *)((Elf_Addr)location & ~0xfff);
323 Elf_Addr anchor = (((Elf_Addr)location) & ~0xfff) + offset_hi20;
324 ptrdiff_t offset_rem = (void *)v - (void *)anchor;
325
326 switch (type) {
327 case R_LARCH_PCALA_HI20:
328 v = offset_hi20 >> 12;
329 insn->reg1i20_format.immediate = v & 0xfffff;
330 break;
331 case R_LARCH_PCALA64_LO20:
332 v = offset_rem >> 32;
333 insn->reg1i20_format.immediate = v & 0xfffff;
334 break;
335 case R_LARCH_PCALA64_HI12:
> 336 v = offset_rem >> 52;
337 /* fall through */
338 case R_LARCH_PCALA_LO12:
339 insn->reg2i12_format.immediate = v & 0xfff;
340 break;
341 default:
342 pr_err("%s: Unsupport relocation type %u\n", mod->name, type);
343 return -EINVAL;
344 }
345
346 return 0;
347 }
348
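To make the arithmetic in apply_r_larch_pcala() easier to follow, here is a
minimal user-space sketch of how a target address is split into the
pcalau12i and addi.d immediates, and how the two instructions recombine
them. The names pcala_split(), pcala_resolve() and sext() are made up for
illustration and are not kernel functions; only the 32-bit (non-extreme)
part of the relocation is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not a kernel structure. */
struct pcala_imm {
	uint32_t hi20;	/* immediate for pcalau12i */
	uint32_t lo12;	/* immediate for addi.d or ld.d */
};

/* Sign-extend the low "bits" bits of x. */
static int64_t sext(uint64_t x, unsigned int bits)
{
	uint64_t sign;

	assert(bits > 0 && bits < 64);
	sign = UINT64_C(1) << (bits - 1);
	x &= (sign << 1) - 1;
	return (int64_t)(x ^ sign) - (int64_t)sign;
}

/* Split "target" as seen from an instruction located at "pc". */
static struct pcala_imm pcala_split(uint64_t pc, uint64_t target)
{
	struct pcala_imm imm;
	/*
	 * addi.d sign-extends its 12-bit immediate, so the target page is
	 * rounded by 0x800 before the page difference is taken.
	 */
	uint64_t page_delta = ((target + 0x800) & ~UINT64_C(0xfff)) -
			      (pc & ~UINT64_C(0xfff));

	imm.hi20 = (uint32_t)(page_delta >> 12) & 0xfffff;
	imm.lo12 = (uint32_t)(target & 0xfff);
	return imm;
}

/* Emulate what the pcalau12i + addi.d pair computes at run time. */
static uint64_t pcala_resolve(uint64_t pc, struct pcala_imm imm)
{
	uint64_t page = (pc & ~UINT64_C(0xfff)) +
			(uint64_t)(sext(imm.hi20, 20) * 4096);

	return page + (uint64_t)sext(imm.lo12, 12);
}
```

The rounding by 0x800 is the part that is easy to get wrong: when the low
12 bits of the target are 0x800 or above, addi.d subtracts from the page
base, so pcalau12i has to point one page higher.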
--
0-DAY CI Kernel Test Service
https://01.org/lkp
Hi, Ruoyao,
Thank you for your contribution, this whole series will be queued for
6.1. Though lkp reported some warnings, I will fix them myself.
Huacai
On Tue, Aug 30, 2022 at 6:48 PM Xi Ruoyao <[email protected]> wrote:
>
> The version 2.00 of LoongArch ELF ABI specification introduced new
> relocation types, and the development tree of Binutils and GCC has
> started to use them. If the kernel is built with the latest snapshot of
> Binutils or GCC, it will fail to load the modules because of unrecognized
> relocation types in modules.
>
> Add support for GOT and new relocation types for the module loader, so
> the kernel (with modules) can be built with the "normal" code model and
> function properly.
>
> This series does not break the compatibility with old toolchain using
> stack-based relocation types, so with the patches applied the kernel can
> be built with both old and new toolchains. But, the combination of
> "new" Binutils and "old" GCC is not supported.
>
> Tested by building the kernel with the following combinations:
>
> - GCC 12 and Binutils 2.39
> - GCC trunk and Binutils trunk
>
> and running the builds with 35 in-tree modules loaded, and loading one
> module with 20 GOT loads and a per-CPU variable (loaded addresses
> verified by comparing with /proc/kallsyms).
>
> Changes from v6 to v7:
>
> - Simplify apply_r_larch_pcala.
> - Remove a build check only for excluding early GCC 13 dev snapshots.
> - Squash model attribute addition into the previous patch.
> - Retain "-fplt".
>
> Changes from v5 to v6:
>
> - Restore version number.
> - Rename CONFIG_CC_HAS_EXPLICIT_RELOCS to CONFIG_AS_HAS_EXPLICIT_RELOCS.
> It now only checks assembler.
> - No longer support "old GCC with new Binutils", so R_LARCH_ABS* is
> dropped.
> - "Old GCC with old Binutils" is still supported until Arnd ack.
> - "New GCC with old Binutils" is still supported as it does not
> require additional code.
> - Remove "cc-option" around "-mexplicit-relocs". For unsupported
> "old GCC with new Binutils" combination, forcing -mexplicit-relocs
> makes assembling fail, instead of silently producing unloadable
> modules.
> - Move the error report for "lacking model attribute" into Makefile.
> - Squash the two patches for R_LARCH_B26 and R_LARCH_PCALA* into one.
>
> Changes from v4 to v5 ("v5" missed in the subject):
>
> - Change subject.
> - Introduce CONFIG_CC_HAS_EXPLICIT_RELOCS.
> - Retain -Wa,-mla-* options for old toolchains
> (!CONFIG_CC_HAS_EXPLICIT_RELOCS).
> - Use __attribute__((model("extreme"))) in PER_CPU_ATTRIBUTES, to fix
> a breakage with per-CPU variables defined in modules.
> - Handle R_LARCH_PCALA64_{HI12,LO12} for extreme model.
> - Handle R_LARCH_ABS* for "old GCC with new Binutils".
> - Separate the last patch into more small patches.
> - Avoid BUG_ON() for the handling of GOT.
>
> Changes from v3 to v4:
>
> - No code change. Reword the commit message of the 3rd patch again
> based on suggestion from Huacai.
>
> Changes from v2 to v3:
>
> - Use `union loongarch_instruction` instead of explicit bit shifts
> applying the relocation. Suggested by Youling.
> - For R_LARCH_B26, move the alignment check before the range check to be
> consistent with stack pop relocations. Suggested by Youling.
> - Reword the commit message of the 3rd patch. Suggested by Huacai.
>
> Changes from v1 to v2:
>
> - Fix a stupid programming error (confusion between the number of PLT
> entries and the number of GOT entries). (Bug spotted by Youling).
> - Synthesize the _GLOBAL_OFFSET_TABLE_ symbol with module.lds, instead
> of faking it at runtime. The 3rd patch from V1 is now merged into
> the 1st patch because it would be a one-line change. (Suggested by
> Jinyang).
> - Keep reloc_rela_handlers[] ordered by the relocation type ID.
> (Suggested by Youling).
> - Remove -fplt along with -Wa,-mla-* options because it's the default.
> (Suggested by Youling).
>
> Xi Ruoyao (5):
> LoongArch: Add CONFIG_AS_HAS_EXPLICIT_RELOCS
> LoongArch: Adjust symbol addressing for CONFIG_AS_HAS_EXPLICIT_RELOCS
> LoongArch: Define ELF relocation types added in v2.00 ABI
> LoongArch: Support PC-relative relocations in modules
> LoongArch: Support R_LARCH_GOT_PC* in modules
>
> arch/loongarch/Kconfig | 3 +
> arch/loongarch/Makefile | 17 +++++
> arch/loongarch/include/asm/elf.h | 37 ++++++++++
> arch/loongarch/include/asm/module.h | 23 ++++++
> arch/loongarch/include/asm/module.lds.h | 1 +
> arch/loongarch/include/asm/percpu.h | 8 +++
> arch/loongarch/kernel/head.S | 10 +--
> arch/loongarch/kernel/module-sections.c | 56 +++++++++++++--
> arch/loongarch/kernel/module.c | 93 ++++++++++++++++++++++++-
> 9 files changed, 238 insertions(+), 10 deletions(-)
>
> --
> 2.37.0
>
On Tue, 2022-08-30 at 21:05 +0800, Huacai Chen wrote:
> Hi, Ruoyao,
>
> Thank you for your contribution, this whole series will be queued for
> 6.1. Though lkp reported some warnings, I will fix them myself.
Hmm, we are using -Wimplicit-fallthrough=5, so "fallthrough;" (a macro
that expands to a statement attribute) should be used instead of the
"/* fall through */" comment.
For -Woverride-init I'm not sure how to fix it properly.
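A minimal sketch of the fallthrough change, written as a stand-alone
program rather than as a patch. The macro definition below is a stand-in
for the kernel's own one and assumes GCC 7 or later (or a compatible
Clang); imm12() and the two enum constants are hypothetical and only
mirror the shape of the switch in apply_r_larch_pcala().

```c
#include <stdint.h>

/* Stand-in for the kernel macro; assumes the compiler knows the attribute. */
#define fallthrough __attribute__((__fallthrough__))

/* Hypothetical type IDs, not the real ELF relocation numbers. */
enum { PCALA_LO12, PCALA64_HI12 };

static uint32_t imm12(int type, uint64_t v, int64_t offset_rem)
{
	switch (type) {
	case PCALA64_HI12:
		/* Only the low 12 bits survive the mask below. */
		v = (uint64_t)offset_rem >> 52;
		fallthrough;	/* an attribute, so level 5 accepts it */
	case PCALA_LO12:
		return (uint32_t)(v & 0xfff);
	default:
		return 0;
	}
}
```

At level 5 GCC ignores every comment form, so only the attribute silences
the warning.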
--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University
On 8/30/22 21:05, Huacai Chen wrote:
> Hi, Ruoyao,
>
> Thank you for your contribution, this whole series will be queued for
> 6.1. Though lkp reported some warnings, I will fix them myself.
And of course add my Tested-by:
Tested-by: WANG Xuerui <[email protected]>
(I've actually run the v6, but there is no substantial change from v6
to v7 so the status should stay good.)
--
WANG "xen0n" Xuerui
Linux/LoongArch mailing list: https://lore.kernel.org/loongarch/
Hi, Ruoyao and Xuerui,
On Wed, Aug 31, 2022 at 12:38 AM WANG Xuerui <[email protected]> wrote:
>
> On 8/30/22 21:05, Huacai Chen wrote:
> > Hi, Ruoyao,
> >
> > Thank you for your contribution, this whole series will be queued for
> > 6.1. Though lkp reported some warnings, I will fix them myself.
>
> And of course add my Tested-by:
>
> Tested-by: WANG Xuerui <[email protected]>
>
> (I've run the v6 actually, but there is not substantial change from v6
> to v7 so the status should stay good.)
With this series applied and ARCH_WANT_LD_ORPHAN_WARN enabled, we get
loongarch64-unknown-linux-gnu-ld: warning: orphan section `.got' from
`arch/loongarch/kernel/head.o' being placed in section `.got'
loongarch64-unknown-linux-gnu-ld: warning: orphan section `.got.plt'
from `arch/loongarch/kernel/head.o' being placed in section `.got.plt'
I think we should add these lines in vmlinux.lds.S
.got : { *(.got) }
.got.plt : { *(.got.plt) }
But which patch should they go into? Patch 2 or Patch 5?
Huacai
>
> --
> WANG "xen0n" Xuerui
>
> Linux/LoongArch mailing list: https://lore.kernel.org/loongarch/
>
On Wed, 2022-08-31 at 13:44 +0800, Huacai Chen wrote:
> With this series applied and ARCH_WANT_LD_ORPHAN_WARN enabled, we get
> loongarch64-unknown-linux-gnu-ld: warning: orphan section `.got' from
> `arch/loongarch/kernel/head.o' being placed in section `.got'
> loongarch64-unknown-linux-gnu-ld: warning: orphan section `.got.plt'
> from `arch/loongarch/kernel/head.o' being placed in section `.got.plt'
>
> I think we should add this lines in vmlinux.lds.S
> .got : { *(.got) }
> .got.plt : { *(.got.plt) }
>
> But put them to which patch? Patch 2 or Patch 5?
In patch 2 IMO, because in patch 2 we already know that
"-Wa,-mla-global-with-pcrel" does not prevent the generation of GOT with
the new toolchain.
If you need a v8 please tell me to send it, but I don't know how to
handle -Woverride-init warnings (IMO the fix for this warning should be
a standalone patch outside of the series).
P. S. The ld warning message seems a little strange because "head.o"
does not contain .got or .got.plt sections... I guess there is a linker
bug causing it to output the very first input file in the message, instead
of the first input file that really contains an orphaned section.
Another P. S.: the use of GOT is actually unneeded in main kernel image
but we don't have something equivalent to "-Wa,-mla-global-with-pcrel"
in the new toolchain. Perhaps we can add this feature to GCC later.
--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University
On 08/31/2022 02:10 PM, Xi Ruoyao wrote:
> On Wed, 2022-08-31 at 13:44 +0800, Huacai Chen wrote:
>
>> With this series applied and ARCH_WANT_LD_ORPHAN_WARN enabled, we get
>> loongarch64-unknown-linux-gnu-ld: warning: orphan section `.got' from
>> `arch/loongarch/kernel/head.o' being placed in section `.got'
>> loongarch64-unknown-linux-gnu-ld: warning: orphan section `.got.plt'
>> from `arch/loongarch/kernel/head.o' being placed in section `.got.plt'
>>
>> I think we should add this lines in vmlinux.lds.S
>> .got : { *(.got) }
>> .got.plt : { *(.got.plt) }
>>
>> But put them to which patch? Patch 2 or Patch 5?
> In patch 2 IMO. Because in patch 2 we already know "-Wa,-mla-global-
> with-pcrel" does not prevent the generation of GOT with new toolchain.
>
> If you need a v8 please tell me to send it, but I don't know how to
> handle -Woverride-init warnings (IMO the fix for this warning should be
> a standalone patch outside of the series).
>
> P. S. The ld warning message seems a little strange because "head.o"
> does not contain .got or .got.plt sections... I guess there is a linker
> bug causing it outputs the very first input file in the message, instead
> of the first input file really containing an orphaned section.
>
> Another P. S.: the use of GOT is actually unneeded in main kernel image
> but we don't have something equivalent to "-Wa,-mla-global-with-pcrel"
> in the new toolchain. Perhaps we can add this feature to GCC later.
>
That's right. I am also wondering why the new toolchain produces .got* in
the kernel. It's unneeded. In the past, gcc emitted la.global, which gas
turned into la.pcrel, and the kernel worked well. Now it seems we have
lost this feature in gcc. I checked the x86 asm code just now. Some info
follows:
LoongArch64, ./net/ipv4/udp_diag.s, *have reloc hint*
pcalau12i $r4,%got_pc_hi20(udplite_table)
ld.d $r4,$r4,%got_pc_lo12(udplite_table)
b udp_dump
x86_64, ./net/ipv4/udp_diag.s
movq $udplite_table, %rdi
jmp udp_dump
It seems related to -fno-PIE and -mcmodel=kernel on x86_64.
I hope the new gcc can get this feature.
On Wed, 2022-08-31 at 14:58 +0800, Jinyang He wrote:
> That's right. Also I am wondering why new toolchain produce .got* in
> kernel. It's unneeded. In the past, gcc create la.global and parsed
> to la.pcrel by gas, and kernel works well. Now it seems we lost this
> feature in gcc. I checked the x86 asm code just now. And some info
> follows,
>
> LoongArch64, ./net/ipv4/udp_diag.s, *have reloc hint*
> pcalau12i $r4,%got_pc_hi20(udplite_table)
> ld.d $r4,$r4,%got_pc_lo12(udplite_table)
> b udp_dump
>
> x86_64, ./net/ipv4/udp_diag.s
> movq $udplite_table, %rdi
> jmp udp_dump
>
> It seems related to -fno-PIE and -cmodel=kernel on x86_64.
> Hope new gcc with this feature now.
On x86_64 -mcmodel=kernel means "all code and data are located in the
[-2GiB, 0) range". We actually don't strictly require a "high" range as
we're mostly a PIC-friendly architecture: note that we use a
pcalau12i/addi.d pair for PIC addressing in [PC-2GiB, PC+2GiB), and a
lu12i.w/addi.d pair for "non-PIC" addressing in [-2GiB, 2GiB); both are
2-insn sequences.
If we can put the main kernel image and the modules in one 2GiB VA
range, we can avoid GOT completely. But it's not possible for now
because main kernel image is loaded in XKPRANGE but the modules are in
XKVRANGE. So the best we can achieve before implementing
CONFIG_RELOCATION is using GOT in modules, and avoid GOT in the main
kernel image (with a new code model in GCC, which will benefit both the
kernel and statically linked executables).
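As a rough illustration of the range argument, the sketch below checks
whether a target can be reached from a given PC with one pcalau12i/addi.d
pair. pcala_reachable() is a hypothetical helper, and the two base
addresses in the usage note are example values assumed here for XKPRANGE
and XKVRANGE, not numbers taken from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if "target" is addressable from "pc" with a
 * single pcalau12i/addi.d pair, i.e. the rounded page difference fits in
 * a signed 32-bit value (20-bit immediate shifted left by 12).
 */
static bool pcala_reachable(uint64_t pc, uint64_t target)
{
	uint64_t udelta = ((target + 0x800) & ~UINT64_C(0xfff)) -
			  (pc & ~UINT64_C(0xfff));
	/* Reinterpret as signed; relies on two's complement conversion. */
	int64_t delta = (int64_t)udelta;

	return delta >= -(INT64_C(1) << 31) && delta < (INT64_C(1) << 31);
}
```

With a PC of 0x9000000000200000 (assumed to be in XKPRANGE), a target of
0x9000000000300000 is reachable, while a target of 0xffff800000000000
(assumed to be in XKVRANGE) is not, which is why module code needs the GOT
or the extreme code model to reach symbols in the main kernel image.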
--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University
On Wed, Aug 31, 2022 at 4:09 PM Xi Ruoyao <[email protected]> wrote:
>
> On Wed, 2022-08-31 at 14:58 +0800, Jinyang He wrote:
> > That's right. Also I am wondering why new toolchain produce .got* in
> > kernel. It's unneeded. In the past, gcc create la.global and parsed
> > to la.pcrel by gas, and kernel works well. Now it seems we lost this
> > feature in gcc. I checked the x86 asm code just now. And some info
> > follows,
> >
> > LoongArch64, ./net/ipv4/udp_diag.s, *have reloc hint*
> > pcalau12i $r4,%got_pc_hi20(udplite_table)
> > ld.d $r4,$r4,%got_pc_lo12(udplite_table)
> > b udp_dump
> >
> > x86_64, ./net/ipv4/udp_diag.s
> > movq $udplite_table, %rdi
> > jmp udp_dump
> >
> > It seems related to -fno-PIE and -cmodel=kernel on x86_64.
> > Hope new gcc with this feature now.
>
> On x86_64 -mcmodel=kernel means "all code and data are located in the
> [-2GiB, 0) range". We actually don't strictly require a "high" range as
> we're mostly a PIC-friendly architecture: note that we use a
> pcalau12i/addi.d pair for PIC addressing in [PC-2GiB, PC+2GiB), and a
> lu12i.w/addi.d pair for "non-PIC" addressing in [-2GiB, 2GiB); both are
> 2-insn sequences.
>
> If we can put the main kernel image and the modules in one 2GiB VA
> range, we can avoid GOT completely. But it's not possible for now
> because main kernel image is loaded in XKPRANGE but the modules are in
> XKVRANGE. So the best we can achieve before implementing
> CONFIG_RELOCATION is using GOT in modules, and avoid GOT in the main
> kernel image (with a new code model in GCC, which will benefit both the
> kernel and statically linked executables).
Emmm, can you implement this new code model in the near future?
Huacai
>
> --
> Xi Ruoyao <[email protected]>
> School of Aerospace Science and Technology, Xidian University
>
On Wed, 2022-08-31 at 22:40 +0800, Huacai Chen wrote:
> On Wed, Aug 31, 2022 at 4:09 PM Xi Ruoyao <[email protected]> wrote:
> >
> > On Wed, 2022-08-31 at 14:58 +0800, Jinyang He wrote:
> > > That's right. Also I am wondering why new toolchain produce .got* in
> > > kernel. It's unneeded. In the past, gcc create la.global and parsed
> > > to la.pcrel by gas, and kernel works well. Now it seems we lost this
> > > feature in gcc. I checked the x86 asm code just now. And some info
> > > follows,
> > >
> > > LoongArch64, ./net/ipv4/udp_diag.s, *have reloc hint*
> > > pcalau12i $r4,%got_pc_hi20(udplite_table)
> > > ld.d $r4,$r4,%got_pc_lo12(udplite_table)
> > > b udp_dump
> > >
> > > x86_64, ./net/ipv4/udp_diag.s
> > > movq $udplite_table, %rdi
> > > jmp udp_dump
> > >
> > > It seems related to -fno-PIE and -cmodel=kernel on x86_64.
> > > Hope new gcc with this feature now.
> >
> > > On x86_64 -mcmodel=kernel means "all code and data are located in the
> > > [-2GiB, 0) range". We actually don't strictly require a "high" range as
> > > we're mostly a PIC-friendly architecture: note that we use a
> > > pcalau12i/addi.d pair for PIC addressing in [PC-2GiB, PC+2GiB), and a
> > > lu12i.w/addi.d pair for "non-PIC" addressing in [-2GiB, 2GiB); both are
> > > 2-insn sequences.
> >
> > If we can put the main kernel image and the modules in one 2GiB VA
> > range, we can avoid GOT completely. But it's not possible for now
> > because main kernel image is loaded in XKPRANGE but the modules are in
> > XKVRANGE. So the best we can achieve before implementing
> > CONFIG_RELOCATION is using GOT in modules, and avoid GOT in the main
> > kernel image (with a new code model in GCC, which will benefit both the
> > kernel and statically linked executables).
> Emmm, can you implement this new code model in the near future?
I have a plan to make our toolchain address symbols better:
(1) https://sourceware.org/pipermail/binutils/2022-August/122682.html.
This change will allow the linker to link a main executable image
(dynamically linked or statically linked, PIE or non-PIE, kernel or
userspace) with R_LARCH_COPY instead of GOT. (Note that R_LARCH_COPY
will not show up in the kernel because we don't link to shared objects,
but GOT will be gone.)
(2) Change GCC to stop using GOT unless -fPIC. (Technically it's a one-
line change.)
(3) In the kernel, for the main kernel image the toolchain default will be
good enough (no GOT). For modules we have three options:
(a) get rid of XKPRANGE.
(b) force -mcmodel=extreme globally.
(c) use -Wl,nocopyreloc to produce GOT.
(a) is the best; the performance of (b) and (c) will be worse than (a).
I'm not sure which of (b) and (c) is better, but as (a) will be the
final solution we can just choose one of (b) and (c) "randomly" for now.
I don't want to add a new code model now, because if (1) works fine
we'll not need a new code model. (1) is also the most tricky step in
the plan (I've sent the patch but not sure if it's completely correct),
(2) and (3) should be trivial.
--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University
Hi, Ruoyao,
On Wed, Aug 31, 2022 at 11:15 PM Xi Ruoyao <[email protected]> wrote:
>
> On Wed, 2022-08-31 at 22:40 +0800, Huacai Chen wrote:
> > On Wed, Aug 31, 2022 at 4:09 PM Xi Ruoyao <[email protected]> wrote:
> > >
> > > On Wed, 2022-08-31 at 14:58 +0800, Jinyang He wrote:
> > > > That's right. Also I am wondering why new toolchain produce .got* in
> > > > kernel. It's unneeded. In the past, gcc create la.global and parsed
> > > > to la.pcrel by gas, and kernel works well. Now it seems we lost this
> > > > feature in gcc. I checked the x86 asm code just now. And some info
> > > > follows,
> > > >
> > > > LoongArch64, ./net/ipv4/udp_diag.s, *have reloc hint*
> > > > pcalau12i $r4,%got_pc_hi20(udplite_table)
> > > > ld.d $r4,$r4,%got_pc_lo12(udplite_table)
> > > > b udp_dump
> > > >
> > > > x86_64, ./net/ipv4/udp_diag.s
> > > > movq $udplite_table, %rdi
> > > > jmp udp_dump
> > > >
> > > > It seems related to -fno-PIE and -cmodel=kernel on x86_64.
> > > > Hope new gcc with this feature now.
> > >
> > > > On x86_64 -mcmodel=kernel means "all code and data are located in the
> > > > [-2GiB, 0) range". We actually don't strictly require a "high" range as
> > > > we're mostly a PIC-friendly architecture: note that we use a
> > > > pcalau12i/addi.d pair for PIC addressing in [PC-2GiB, PC+2GiB), and a
> > > > lu12i.w/addi.d pair for "non-PIC" addressing in [-2GiB, 2GiB); both are
> > > > 2-insn sequences.
> > >
> > > If we can put the main kernel image and the modules in one 2GiB VA
> > > range, we can avoid GOT completely. But it's not possible for now
> > > because main kernel image is loaded in XKPRANGE but the modules are in
> > > XKVRANGE. So the best we can achieve before implementing
> > > CONFIG_RELOCATION is using GOT in modules, and avoid GOT in the main
> > > kernel image (with a new code model in GCC, which will benefit both the
> > > kernel and statically linked executables).
>
> > Emmm, can you implement this new code model in the near future?
>
> I have a plan to make our toolchain addressing the symbols better:
>
> (1) https://sourceware.org/pipermail/binutils/2022-August/122682.html.
> This change will allow the linker to link a main executable image
> (dynamically linked or statically linked, PIE or non-PIE, kernel or
> userspace) with R_LARCH_COPY instead of GOT. (Note that R_LARCH_COPY
> will not show up in the kernel because we don't link to shared objects,
> but GOT will be gone.)
>
> (2) Change GCC to stop using GOT unless -fPIC. (Technically it's a one-
> line change.)
>
> (3) In the kernel, for the main kernel image the toolchain default will be
> good enough (no GOT). For modules we have three options:
>
> (a) get rid of XKPRANGE.
> (b) force -mcmodel=extreme globally.
> (c) use -Wl,nocopyreloc to produce GOT.
>
> (a) is the best, the performance of (b) and (c) will be worse than (a).
> I'm not sure which one in (b) and (c) is better, but as (a) will be the
> final solution we can just choose one in (b) and (c) "randomly" for now.
>
> I don't want to add a new code model now, because if (1) works fine
> we'll not need a new code model. (1) is also the most tricky step in
> the plan (I've sent the patch but not sure if it's completely correct),
> (2) and (3) should be trivial.
Now all global variable accesses go through the GOT, so I think the
performance may be much worse than before, when we didn't use
explicit-relocs. I don't know whether "a new code model" or your
"(1)(2)(3)" is easier to implement, but I think it is better to solve the
performance issue before 6.1-rc1.
Huacai
> --
> Xi Ruoyao <[email protected]>
> School of Aerospace Science and Technology, Xidian University
>
On Thu, 2022-09-01 at 10:17 +0800, Huacai Chen wrote:
> Now all global variable accesses are via got, I think the performance
> may be much worse than before when we didn't use explicit-relocs.
> I don't know whether "a new code model" or your "(1)(2)(3)" is easier
> to implement, but I think it is better to solve the performance issue
> before 6.1-rc1.
Both won't be too difficult, but I need to debate with toolchain
developers :(.
If we are running out of time:
cflags-y += $(call cc-option,-mno-explicit-relocs)
This will at least make the new toolchain work, though we cannot
benefit from the optimizations allowed by explicit relocations.
--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University
On Thu, 2022-09-01 at 10:17 +0800, Huacai Chen wrote:
> Now all global variable accesses are via got, I think the performance
> may be much worse than before when we didn't use explicit-relocs.
> I don't know whether "a new code model" or your "(1)(2)(3)" is easier
> to implement, but I think it is better to solve the performance issue
> before 6.1-rc1.
Hi Huacai,
We've added a GCC option for this at https://gcc.gnu.org/r13-2433. On
the kernel side we need a one-line change:
diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index 92c4a52c4c3e..69b39ba3a09d 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -54,6 +54,7 @@ LDFLAGS_vmlinux += -G0 -static -n -nostdlib
# upgrade the compiler or downgrade the assembler.
ifdef CONFIG_AS_HAS_EXPLICIT_RELOCS
cflags-y += -mexplicit-relocs
+KBUILD_CFLAGS_KERNEL += -mdirect-extern-access
else
cflags-y += $(call cc-option,-mno-explicit-relocs)
KBUILD_AFLAGS_KERNEL += -Wa,-mla-global-with-pcrel
And we also need a one-line change in the EFI stub patch (under review):
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 8931ed24379e..8c1225b92492 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -27,7 +27,7 @@ cflags-$(CONFIG_ARM) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
cflags-$(CONFIG_RISCV) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
-fpic
cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
- -fpic
+ -fpie
cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
(Some explanation: -fpic does not only mean "generate position-
independent code", but "generate position-independent code *suitable for
use in a shared library*". On LoongArch -mdirect-extern-access cannot
work for a shared library so the "-fpic -mdirect-extern-access"
combination is rejected deliberately.)
Not sure how to submit these changes properly... Do you prefer me to
send V8 of this series or a single patch on top of your tree on GitHub?
--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University
Hi, Ruoyao,
On Tue, Sep 6, 2022 at 8:32 AM Xi Ruoyao <[email protected]> wrote:
>
> On Thu, 2022-09-01 at 10:17 +0800, Huacai Chen wrote:
>
> > Now all global variable accesses are via got, I think the performance
> > may be much worse than before when we didn't use explicit-relocs.
> > I don't know whether "a new code model" or your "(1)(2)(3)" is easier
> > to implement, but I think it is better to solve the performance issue
> > before 6.1-rc1.
>
> Hi Huacai,
>
> We've added a GCC option for this at https://gcc.gnu.org/r13-2433. On
> the kernel side we need a one-line change:
>
> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> index 92c4a52c4c3e..69b39ba3a09d 100644
> --- a/arch/loongarch/Makefile
> +++ b/arch/loongarch/Makefile
> @@ -54,6 +54,7 @@ LDFLAGS_vmlinux += -G0 -static -n -nostdlib
> # upgrade the compiler or downgrade the assembler.
> ifdef CONFIG_AS_HAS_EXPLICIT_RELOCS
> cflags-y += -mexplicit-relocs
> +KBUILD_CFLAGS_KERNEL += -mdirect-extern-access
> else
> cflags-y += $(call cc-option,-mno-explicit-relocs)
> KBUILD_AFLAGS_KERNEL += -Wa,-mla-global-with-pcrel
>
> And we also need a one-line change in the EFI stub patch (under review):
>
> diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> index 8931ed24379e..8c1225b92492 100644
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -27,7 +27,7 @@ cflags-$(CONFIG_ARM) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> cflags-$(CONFIG_RISCV) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> -fpic
> cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> - -fpic
> + -fpie
>
> cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
>
> (Some explanation: -fpic does not only mean "generate position-
> independent code", but "generate position-independent code *suitable for
> use in a shared library*". On LoongArch -mdirect-extern-access cannot
> work for a shared library so the "-fpic -mdirect-extern-access"
> combination is rejected deliberately.)
>
> Not sure how to submit these changes properly... Do you prefer me to
> send V8 of this series or a single patch on top of your tree on GitHub?
Don't need V8, I will squash it into the previous patch myself. But
can we keep efistub as is?
Huacai
>
> --
> Xi Ruoyao <[email protected]>
> School of Aerospace Science and Technology, Xidian University
>
On Tue, 2022-09-06 at 09:52 +0800, Huacai Chen wrote:
> > cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > - -fpic
> > + -fpie
> >
> > cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> >
> > (Some explanation: -fpic does not only mean "generate position-
> > independent code", but "generate position-independent code *suitable for
> > use in a shared library*". On LoongArch -mdirect-extern-access cannot
> > work for a shared library so the "-fpic -mdirect-extern-access"
> > combination is rejected deliberately.)
> >
> > Not sure how to submit these changes properly... Do you prefer me to
> > send V8 of this series or a single patch on top of your tree on GitHub?
> Don't need V8, I will squash it into the previous patch myself. But
> can we keep efistub as is?
No, we can't allow -mdirect-extern-access -fpic on LoongArch because
without copy relocation such a combination just does not make sense (i.e.
we cannot find a sensible way to handle such a combination in GCC).
So such a combination will cause GCC to refuse to run.
Note that -fpic/-fPIC is "position-independent code *suitable for
use in a shared library*", while -fpie/-fPIE is more like just
"position-independent code". The names of those options are confusing.
(When -fpic was first invented, people mostly believed that PIC was
only for shared libraries, so it was named -fpic instead of -shlib
or something.) IMO in the EFI stub for other ports, -fpie should be
used instead of -fpic as well because the EFI stub is not similar to a
shared library by any means.
--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University
Hi, Ruoyao,
On Tue, Sep 6, 2022 at 12:27 PM Xi Ruoyao <[email protected]> wrote:
>
> On Tue, 2022-09-06 at 09:52 +0800, Huacai Chen wrote:
> > > cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > > - -fpic
> > > + -fpie
> > >
> > > cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> > >
> > > (Some explanation: -fpic does not only mean "generate position-
> > > independent code", but "generate position-independent code *suitable for
> > > use in a shared library*". On LoongArch -mdirect-extern-access cannot
> > > work for a shared library so the "-fpic -mdirect-extern-access"
> > > combination is rejected deliberately.)
> > >
> > > Not sure how to submit these changes properly... Do you prefer me to
> > > send V8 of this series or a single patch on top of your tree on GitHub?
>
> > Don't need V8, I will squash it into the previous patch myself. But
> > can we keep efistub as is?
>
> No, we can't allow -mdirect-extern-access -fpic on LoongArch because
> without copy relocation such a combination just does not make sense (i.
> e. we cannot find a sensible way to handle such a combination in GCC).
> So such a combination will cause GCC refuse to run.
>
> Note that -fpic/-fPIC is "position-independent code *suitable for
> use in a shared library*", while -fpie/-fPIE is more like just
> "position-independent code". The names of those options are confusing.
> (When -fpic was invented first time, people mostly believed "PIC had
> been only for shared libraries", so it's named -fpic instead of -shlib
> or something.) IMO in the EFI stub for other ports, -fpie should be
> used instead of -fpic as well because the EFI stub is not similar to a
> shared library in any means.
You are right, but I guess that Ard doesn't want to squash the efistub
change into the LoongArch efistub support patch. :)
Huacai
>
> --
> Xi Ruoyao <[email protected]>
> School of Aerospace Science and Technology, Xidian University
>
On Tue, 2022-09-06 at 12:43 +0800, Huacai Chen wrote:
> > Note that -fpic/-fPIC is "position-independent code *suitable for
> > use in a shared library*", while -fpie/-fPIE is more like just
> > "position-independent code". The names of those options are confusing.
> > (When -fpic was invented first time, people mostly believed "PIC had
> > been only for shared libraries", so it's named -fpic instead of -shlib
> > or something.) IMO in the EFI stub for other ports, -fpie should be
> > used instead of -fpic as well because the EFI stub is not similar to a
> > shared library in any means.
> You are right, but I guess that Ard doesn't want to squash the efistub
> change into the LoongArch efistub support patch. :)
It only changes cflags-$(CONFIG_LOONGARCH), which is LoongArch specific.
And arm64 is also using -fpie.
Should I send the one-line EFI stub change to linux-efi first?
--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University
Hi, Ruoyao,
On Tue, Sep 6, 2022 at 1:01 PM Xi Ruoyao <[email protected]> wrote:
>
> On Tue, 2022-09-06 at 12:43 +0800, Huacai Chen wrote:
> > > Note that -fpic/-fPIC is "position-independent code *suitable for
> > > use in a shared library*", while -fpie/-fPIE is more like just
> > > "position-independent code". The names of those options are confusing.
> > > (When -fpic was invented first time, people mostly believed "PIC had
> > > been only for shared libraries", so it's named -fpic instead of -shlib
> > > or something.) IMO in the EFI stub for other ports, -fpie should be
> > > used instead of -fpic as well because the EFI stub is not similar to a
> > > shared library in any means.
>
> > You are right, but I guess that Ard doesn't want to squash the efistub
> > change into the LoongArch efistub support patch. :)
>
> It only changes cflags-$(CONFIG_LOONGARCH), which is LoongArch specific.
> And arm64 is also using -fpie.
>
> Should I send the one-line EFI stub change to linux-efi first?
I know that should be changed. I just don't like having a separate
one-line patch, and hope it can be squashed into the original patch. Of
course Ard is free to decide how to handle it.
Huacai
>
> --
> Xi Ruoyao <[email protected]>
> School of Aerospace Science and Technology, Xidian University
On Tue, 6 Sept 2022 at 06:43, Huacai Chen <[email protected]> wrote:
>
> Hi, Ruoyao,
>
> On Tue, Sep 6, 2022 at 12:27 PM Xi Ruoyao <[email protected]> wrote:
> >
> > On Tue, 2022-09-06 at 09:52 +0800, Huacai Chen wrote:
> > > > cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > > > - -fpic
> > > > + -fpie
> > > >
> > > > cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> > > >
> > > > (Some explanation: -fpic does not only mean "generate position-
> > > > independent code", but "generate position-independent code *suitable for
> > > > use in a shared library*". On LoongArch -mdirect-extern-access cannot
> > > > work for a shared library so the "-fpic -mdirect-extern-access"
> > > > combination is rejected deliberately.)
> > > >
> > > > Not sure how to submit these changes properly... Do you prefer me to
> > > > send V8 of this series or a single patch on top of your tree on GitHub?
> >
> > > Don't need V8, I will squash it into the previous patch myself. But
> > > can we keep efistub as is?
> >
> > No, we can't allow -mdirect-extern-access -fpic on LoongArch because
> > without copy relocation such a combination just does not make sense (i.
> > e. we cannot find a sensible way to handle such a combination in GCC).
> > So such a combination will cause GCC refuse to run.
> >
> > Note that -fpic/-fPIC is "position-independent code *suitable for
> > use in a shared library*", while -fpie/-fPIE is more like just
> > "position-independent code". The names of those options are confusing.
> > (When -fpic was invented first time, people mostly believed "PIC had
> > been only for shared libraries", so it's named -fpic instead of -shlib
> > or something.) IMO in the EFI stub for other ports, -fpie should be
> > used instead of -fpic as well because the EFI stub is not similar to a
> > shared library in any means.
> You are right, but I guess that Ard doesn't want to squash the efistub
> change into the LoongArch efistub support patch. :)
>
I don't mind changing the stable tag at this point - I don't have
anything queued up on top of it at the moment.
But I don't see the actual patch: please send me the delta patch that
you want to apply, and I will update it. Then, you can rebase your
v6.1 tree on top of it.
On Tue, Sep 6, 2022 at 3:18 PM Ard Biesheuvel <[email protected]> wrote:
>
> On Tue, 6 Sept 2022 at 06:43, Huacai Chen <[email protected]> wrote:
> >
> > Hi, Ruoyao,
> >
> > On Tue, Sep 6, 2022 at 12:27 PM Xi Ruoyao <[email protected]> wrote:
> > >
> > > On Tue, 2022-09-06 at 09:52 +0800, Huacai Chen wrote:
> > > > > cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > > > > - -fpic
> > > > > + -fpie
> > > > >
> > > > > cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> > > > >
> > > > > (Some explanation: -fpic does not only mean "generate position-
> > > > > independent code", but "generate position-independent code *suitable for
> > > > > use in a shared library*". On LoongArch -mdirect-extern-access cannot
> > > > > work for a shared library so the "-fpic -mdirect-extern-access"
> > > > > combination is rejected deliberately.)
> > > > >
> > > > > Not sure how to submit these changes properly... Do you prefer me to
> > > > > send V8 of this series or a single patch on top of your tree on GitHub?
> > >
> > > > Don't need V8, I will squash it into the previous patch myself. But
> > > > can we keep efistub as is?
> > >
> > > No, we can't allow -mdirect-extern-access -fpic on LoongArch because
> > > without copy relocation such a combination just does not make sense (i.
> > > e. we cannot find a sensible way to handle such a combination in GCC).
> > > So such a combination will cause GCC refuse to run.
> > >
> > > Note that -fpic/-fPIC is "position-independent code *suitable for
> > > use in a shared library*", while -fpie/-fPIE is more like just
> > > "position-independent code". The names of those options are confusing.
> > > (When -fpic was invented first time, people mostly believed "PIC had
> > > been only for shared libraries", so it's named -fpic instead of -shlib
> > > or something.) IMO in the EFI stub for other ports, -fpie should be
> > > used instead of -fpic as well because the EFI stub is not similar to a
> > > shared library in any means.
> > You are right, but I guess that Ard doesn't want to squash the efistub
> > change into the LoongArch efistub support patch. :)
> >
>
> I don't mind changing the stable tag at this point - I don't have
> anything queued up on top of it at the moment.
>
> But I don't see the actual patch: please send me the delta patch that
> you want to apply, and I will update it. Then, you can rebase your
> v6.1 tree on top of it.
OK, Ruoyao, please send a patch to change the efistub cflags. Thank you.
Huacai
On Tue, Sep 6, 2022 at 4:20 PM Huacai Chen <[email protected]> wrote:
>
> On Tue, Sep 6, 2022 at 3:18 PM Ard Biesheuvel <[email protected]> wrote:
> >
> > On Tue, 6 Sept 2022 at 06:43, Huacai Chen <[email protected]> wrote:
> > >
> > > Hi, Ruoyao,
> > >
> > > On Tue, Sep 6, 2022 at 12:27 PM Xi Ruoyao <[email protected]> wrote:
> > > >
> > > > On Tue, 2022-09-06 at 09:52 +0800, Huacai Chen wrote:
> > > > > > cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > > > > > - -fpic
> > > > > > + -fpie
> > > > > >
> > > > > > cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> > > > > >
> > > > > > (Some explanation: -fpic does not only mean "generate position-
> > > > > > independent code", but "generate position-independent code *suitable for
> > > > > > use in a shared library*". On LoongArch -mdirect-extern-access cannot
> > > > > > work for a shared library so the "-fpic -mdirect-extern-access"
> > > > > > combination is rejected deliberately.)
> > > > > >
> > > > > > Not sure how to submit these changes properly... Do you prefer me to
> > > > > > send V8 of this series or a single patch on top of your tree on GitHub?
> > > >
> > > > > Don't need V8, I will squash it into the previous patch myself. But
> > > > > can we keep efistub as is?
> > > >
> > > > No, we can't allow -mdirect-extern-access -fpic on LoongArch because
> > > > without copy relocation such a combination just does not make sense (i.
> > > > e. we cannot find a sensible way to handle such a combination in GCC).
> > > > So such a combination will cause GCC refuse to run.
> > > >
> > > > Note that -fpic/-fPIC is "position-independent code *suitable for
> > > > use in a shared library*", while -fpie/-fPIE is more like just
> > > > "position-independent code". The names of those options are confusing.
> > > > (When -fpic was invented first time, people mostly believed "PIC had
> > > > been only for shared libraries", so it's named -fpic instead of -shlib
> > > > or something.) IMO in the EFI stub for other ports, -fpie should be
> > > > used instead of -fpic as well because the EFI stub is not similar to a
> > > > shared library in any means.
> > > You are right, but I guess that Ard doesn't want to squash the efistub
> > > change into the LoongArch efistub support patch. :)
> > >
> >
> > I don't mind changing the stable tag at this point - I don't have
> > anything queued up on top of it at the moment.
> >
> > But I don't see the actual patch: please send me the delta patch that
> > you want to apply, and I will update it. Then, you can rebase your
> > v6.1 tree on top of it.
> OK, Ruoyao, please send a patch to change the efistub cflags. Thank you.
Oh, I think you needn't send a patch; just showing the diff to Ard is OK. :)
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 8931ed24379e..8c1225b92492 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -27,7 +27,7 @@ cflags-$(CONFIG_ARM) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
 cflags-$(CONFIG_RISCV) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
   -fpic
 cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
- -fpic
+ -fpie

 cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
Huacai
>
> Huacai
On Tue, 6 Sept 2022 at 10:59, Huacai Chen <[email protected]> wrote:
>
> On Tue, Sep 6, 2022 at 4:20 PM Huacai Chen <[email protected]> wrote:
> >
> > On Tue, Sep 6, 2022 at 3:18 PM Ard Biesheuvel <[email protected]> wrote:
> > >
> > > On Tue, 6 Sept 2022 at 06:43, Huacai Chen <[email protected]> wrote:
> > > >
> > > > Hi, Ruoyao,
> > > >
> > > > On Tue, Sep 6, 2022 at 12:27 PM Xi Ruoyao <[email protected]> wrote:
> > > > >
> > > > > On Tue, 2022-09-06 at 09:52 +0800, Huacai Chen wrote:
> > > > > > > cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > > > > > > - -fpic
> > > > > > > + -fpie
> > > > > > >
> > > > > > > cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
> > > > > > >
> > > > > > > (Some explanation: -fpic does not only mean "generate position-
> > > > > > > independent code", but "generate position-independent code *suitable for
> > > > > > > use in a shared library*". On LoongArch -mdirect-extern-access cannot
> > > > > > > work for a shared library so the "-fpic -mdirect-extern-access"
> > > > > > > combination is rejected deliberately.)
> > > > > > >
> > > > > > > Not sure how to submit these changes properly... Do you prefer me to
> > > > > > > send V8 of this series or a single patch on top of your tree on GitHub?
> > > > >
> > > > > > Don't need V8, I will squash it into the previous patch myself. But
> > > > > > can we keep efistub as is?
> > > > >
> > > > > No, we can't allow -mdirect-extern-access -fpic on LoongArch because
> > > > > without copy relocation such a combination just does not make sense (i.
> > > > > e. we cannot find a sensible way to handle such a combination in GCC).
> > > > > So such a combination will cause GCC refuse to run.
> > > > >
> > > > > Note that -fpic/-fPIC is "position-independent code *suitable for
> > > > > use in a shared library*", while -fpie/-fPIE is more like just
> > > > > "position-independent code". The names of those options are confusing.
> > > > > (When -fpic was invented first time, people mostly believed "PIC had
> > > > > been only for shared libraries", so it's named -fpic instead of -shlib
> > > > > or something.) IMO in the EFI stub for other ports, -fpie should be
> > > > > used instead of -fpic as well because the EFI stub is not similar to a
> > > > > shared library in any means.
> > > > You are right, but I guess that Ard doesn't want to squash the efistub
> > > > change into the LoongArch efistub support patch. :)
> > > >
> > >
> > > I don't mind changing the stable tag at this point - I don't have
> > > anything queued up on top of it at the moment.
> > >
> > > But I don't see the actual patch: please send me the delta patch that
> > > you want to apply, and I will update it. Then, you can rebase your
> > > v6.1 tree on top of it.
> > OK, Ruoyao, please send a patch to change the efistub cflags. Thank you.
> Oh, I think you needn't send, just showing the diff to Ard is OK. :)
>
> diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> index 8931ed24379e..8c1225b92492 100644
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -27,7 +27,7 @@ cflags-$(CONFIG_ARM) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
>  cflags-$(CONFIG_RISCV) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
>    -fpic
>  cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> - -fpic
> + -fpie
>
>  cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
>
I have merged this into the patch, and updated the tag
efi-loongarch-for-v6.1
ead384d956345681e1ddf97890d5e15ded015f07
It should be in linux-next tomorrow, you can merge the tag now from
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git