2024-03-22 16:52:55

by Brian Gerst

Subject: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

Currently, x86-64 uses an unusual percpu layout, where the percpu section
is linked at absolute address 0. The reason behind this is that older GCC
versions placed the stack protector (if enabled) at a fixed offset from the
GS segment base. Since the GS segment is also used for percpu variables,
this forced the current layout.

GCC since version 8.1 supports a configurable location for the stack
protector value, which removes the restriction on how the percpu section
is linked. The percpu section can then be linked normally, as on other
architectures, which in turn allows removal of the code that existed only
to support the zero-based percpu section.
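
As a rough illustration (not code from this series; exact code generation
varies by compiler), the 64-bit kernel can now pass

    -mstack-protector-guard-reg=gs
    -mstack-protector-guard-symbol=__stack_chk_guard

so the canary load in a protected function becomes an ordinary RIP-relative
percpu access such as

    movq %gs:__stack_chk_guard(%rip), %rax

instead of the hardcoded %gs:40 form, and the canary no longer has to sit
at a fixed offset from the percpu base.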

v4:
- Updated to current tip tree
- Added two new cleanups made possible by the removal of IA-64.
- Small improvements to the objtool conversion code.

Brian Gerst (16):
x86/stackprotector/32: Remove stack protector test script
x86/stackprotector/64: Remove stack protector test script
x86/boot: Disable stack protector for early boot code
x86/pvh: Use fixed_percpu_data for early boot GSBASE
x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations
objtool: Allow adding relocations to an existing section
objtool: Convert fixed location stack protector accesses
x86/stackprotector/64: Convert to normal percpu variable
x86/percpu/64: Use relative percpu offsets
x86/percpu/64: Remove fixed_percpu_data
x86/boot/64: Remove inverse relocations
x86/percpu/64: Remove INIT_PER_CPU macros
percpu: Remove PER_CPU_FIRST_SECTION
percpu: Remove PERCPU_VADDR()
percpu: Remove __per_cpu_load
kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU

arch/x86/Kconfig | 16 +--
arch/x86/Makefile | 21 ++--
arch/x86/boot/compressed/misc.c | 14 +--
arch/x86/entry/entry_64.S | 2 +-
arch/x86/include/asm/desc.h | 1 -
arch/x86/include/asm/percpu.h | 22 ----
arch/x86/include/asm/processor.h | 28 +----
arch/x86/include/asm/stackprotector.h | 36 +-----
arch/x86/kernel/Makefile | 2 +
arch/x86/kernel/asm-offsets_64.c | 6 -
arch/x86/kernel/cpu/common.c | 9 +-
arch/x86/kernel/head64.c | 2 +-
arch/x86/kernel/head_64.S | 20 ++-
arch/x86/kernel/irq_64.c | 1 -
arch/x86/kernel/setup_percpu.c | 12 +-
arch/x86/kernel/vmlinux.lds.S | 35 ------
arch/x86/platform/pvh/head.S | 10 +-
arch/x86/tools/relocs.c | 143 ++--------------------
arch/x86/xen/xen-head.S | 10 +-
include/asm-generic/sections.h | 2 +-
include/asm-generic/vmlinux.lds.h | 43 +------
include/linux/percpu-defs.h | 12 --
init/Kconfig | 11 +-
kernel/kallsyms.c | 12 +-
mm/percpu.c | 4 +-
scripts/Makefile.lib | 2 +
scripts/gcc-x86_32-has-stack-protector.sh | 8 --
scripts/gcc-x86_64-has-stack-protector.sh | 4 -
scripts/kallsyms.c | 80 +++---------
scripts/link-vmlinux.sh | 4 -
tools/objtool/arch/x86/decode.c | 46 +++++++
tools/objtool/arch/x86/special.c | 91 ++++++++++++++
tools/objtool/builtin-check.c | 9 +-
tools/objtool/check.c | 14 ++-
tools/objtool/elf.c | 133 ++++++++++++++++----
tools/objtool/include/objtool/arch.h | 3 +
tools/objtool/include/objtool/builtin.h | 2 +
tools/objtool/include/objtool/elf.h | 90 +++++++++++---
38 files changed, 442 insertions(+), 518 deletions(-)
delete mode 100755 scripts/gcc-x86_32-has-stack-protector.sh
delete mode 100755 scripts/gcc-x86_64-has-stack-protector.sh


base-commit: 30052fd948a3b43506c83590eaaada12d1f2dd09
--
2.44.0



2024-03-22 16:53:04

by Brian Gerst

Subject: [PATCH v4 01/16] x86/stackprotector/32: Remove stack protector test script

Test for compiler support directly in Kconfig.
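
(For context: $(cc-option,...) is the Kconfig preprocessor macro defined in
scripts/Kconfig.include; it evaluates to 'y' when $(CC) accepts the given
flags, so the capability can be probed without a helper script.)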

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/Kconfig | 2 +-
scripts/gcc-x86_32-has-stack-protector.sh | 8 --------
2 files changed, 1 insertion(+), 9 deletions(-)
delete mode 100755 scripts/gcc-x86_32-has-stack-protector.sh

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 03483b23a009..f326903cbe67 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -412,7 +412,7 @@ config PGTABLE_LEVELS
config CC_HAS_SANE_STACKPROTECTOR
bool
default $(success,$(srctree)/scripts/gcc-x86_64-has-stack-protector.sh $(CC) $(CLANG_FLAGS)) if 64BIT
- default $(success,$(srctree)/scripts/gcc-x86_32-has-stack-protector.sh $(CC) $(CLANG_FLAGS))
+ default $(cc-option,-mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard)
help
We have to make sure stack protector is unconditionally disabled if
the compiler produces broken code or if it does not let us control
diff --git a/scripts/gcc-x86_32-has-stack-protector.sh b/scripts/gcc-x86_32-has-stack-protector.sh
deleted file mode 100755
index 825c75c5b715..000000000000
--- a/scripts/gcc-x86_32-has-stack-protector.sh
+++ /dev/null
@@ -1,8 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0
-
-# This requires GCC 8.1 or better. Specifically, we require
-# -mstack-protector-guard-reg, added by
-# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81708
-
-echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m32 -O0 -fstack-protector -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard - -o - 2> /dev/null | grep -q "%fs"
--
2.44.0


2024-03-22 16:53:59

by Brian Gerst

Subject: [PATCH v4 05/16] x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations

Clang may produce R_X86_64_REX_GOTPCRELX relocations when redefining the
stack protector location. Treat them as another type of PC-relative
relocation.
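
As an illustration (a sketch, not output captured from this series), such
an access may look like

    movq __stack_chk_guard@GOTPCREL(%rip), %rax

which is emitted with an R_X86_64_REX_GOTPCRELX relocation (a GOT load that
the linker is allowed to relax to a direct RIP-relative instruction).
Either way the patched value is PC-relative, so relocs.c can treat it the
same way as R_X86_64_PC32.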

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/tools/relocs.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index e7a44a7f617f..adf11a48ec70 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -31,6 +31,11 @@ static struct relocs relocs32;
static struct relocs relocs32neg;
static struct relocs relocs64;
#define FMT PRIu64
+
+#ifndef R_X86_64_REX_GOTPCRELX
+#define R_X86_64_REX_GOTPCRELX 42
+#endif
+
#else
#define FMT PRIu32
#endif
@@ -224,6 +229,7 @@ static const char *rel_type(unsigned type)
REL_TYPE(R_X86_64_PC16),
REL_TYPE(R_X86_64_8),
REL_TYPE(R_X86_64_PC8),
+ REL_TYPE(R_X86_64_REX_GOTPCRELX),
#else
REL_TYPE(R_386_NONE),
REL_TYPE(R_386_32),
@@ -865,6 +871,7 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,

case R_X86_64_PC32:
case R_X86_64_PLT32:
+ case R_X86_64_REX_GOTPCRELX:
/*
* PC relative relocations don't need to be adjusted unless
* referencing a percpu symbol.
--
2.44.0


2024-03-22 16:54:22

by Brian Gerst

Subject: [PATCH v4 03/16] x86/boot: Disable stack protector for early boot code

On 64-bit, this will prevent crashes when the canary access is changed
from %gs:40 to %gs:__stack_chk_guard(%rip). RIP-relative addresses from
the identity-mapped early boot code would target the wrong address with a
zero-based percpu section, and KASLR could then shift that address to an
unmapped page, causing a crash on boot.
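
(A RIP-relative displacement is a fixed link-time delta between the
referencing instruction and its target, so it only resolves correctly when
both run at their link-time relative positions - a condition the
identity-mapped early boot code cannot meet while the percpu section is
linked at address 0.)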

This early boot code runs well before userspace is active and does not
need stack protector enabled.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/kernel/Makefile | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 5d128167e2e2..9884d2c9de15 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -40,6 +40,8 @@ KMSAN_SANITIZE_sev.o := n
KCOV_INSTRUMENT_head$(BITS).o := n
KCOV_INSTRUMENT_sev.o := n

+CFLAGS_head32.o := -fno-stack-protector
+CFLAGS_head64.o := -fno-stack-protector
CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace

obj-y += head_$(BITS).o
--
2.44.0


2024-03-22 16:54:25

by Brian Gerst

Subject: [PATCH v4 07/16] objtool: Convert fixed location stack protector accesses

Older versions of GCC fixed the location of the stack protector canary
at %gs:40. Use objtool to convert these accesses to normal percpu
accesses to __stack_chk_guard.
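
Roughly, the rewrite looks like this (illustrative encoding for a canary
load into %rax; the register and the SUB/CMP variants differ, see the
decoder hunks below):

    before: 65 48 8b 04 25 28 00 00 00   mov %gs:0x28, %rax
    after:  65 48 8b 05 00 00 00 00 90   mov %gs:0(%rip), %rax ; nop

The segment prefix is kept (or switched to DS for !SMP builds), the ModRM
byte is rewritten to RIP-relative addressing, the displacement field gets a
relocation against __stack_chk_guard with addend -4, and the byte freed by
dropping the SIB byte is padded with a NOP.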

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/Kconfig | 4 ++
scripts/Makefile.lib | 2 +
tools/objtool/arch/x86/decode.c | 46 +++++++++++++
tools/objtool/arch/x86/special.c | 91 +++++++++++++++++++++++++
tools/objtool/builtin-check.c | 9 ++-
tools/objtool/check.c | 12 ++++
tools/objtool/elf.c | 34 +++++++--
tools/objtool/include/objtool/arch.h | 3 +
tools/objtool/include/objtool/builtin.h | 2 +
tools/objtool/include/objtool/elf.h | 6 ++
10 files changed, 204 insertions(+), 5 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 88d72227e3cb..121cfb9ffc0e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -417,6 +417,10 @@ config CC_HAS_SANE_STACKPROTECTOR
We have to make sure stack protector is unconditionally disabled if
the compiler does not allow control of the segment and symbol.

+config STACKPROTECTOR_OBJTOOL
+ bool
+ default n
+
menu "Processor type and features"

config SMP
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 1bd59b8db05f..6bc4c69a9e50 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -258,6 +258,8 @@ objtool := $(objtree)/tools/objtool/objtool
objtool-args-$(CONFIG_HAVE_JUMP_LABEL_HACK) += --hacks=jump_label
objtool-args-$(CONFIG_HAVE_NOINSTR_HACK) += --hacks=noinstr
objtool-args-$(CONFIG_MITIGATION_CALL_DEPTH_TRACKING) += --hacks=skylake
+objtool-args-$(CONFIG_STACKPROTECTOR_OBJTOOL) += --hacks=stackprotector
+objtool-args-$(CONFIG_SMP) += --smp
objtool-args-$(CONFIG_X86_KERNEL_IBT) += --ibt
objtool-args-$(CONFIG_FINEIBT) += --cfi
objtool-args-$(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL) += --mcount
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 3a1d80a7878d..583a16b8bf47 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -144,6 +144,18 @@ static bool has_notrack_prefix(struct insn *insn)
return false;
}

+static bool has_gs_prefix(struct insn *insn)
+{
+ int i;
+
+ for (i = 0; i < insn->prefixes.nbytes; i++) {
+ if (insn->prefixes.bytes[i] == 0x65)
+ return true;
+ }
+
+ return false;
+}
+
int arch_decode_instruction(struct objtool_file *file, const struct section *sec,
unsigned long offset, unsigned int maxlen,
struct instruction *insn)
@@ -408,10 +420,44 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec

break;

+ case 0x2b:
+ case 0x3b:
+ case 0x39:
+ if (!rex_w)
+ break;
+
+ /* sub %gs:0x28, reg */
+ /* cmp %gs:0x28, reg */
+ /* cmp reg, %gs:0x28 */
+ if (has_gs_prefix(&ins) &&
+ modrm_mod == 0 &&
+ modrm_rm == 4 &&
+ sib_index == 4 &&
+ sib_base == 5 &&
+ ins.displacement.value == 0x28)
+ {
+ insn->type = INSN_STACKPROTECTOR;
+ break;
+ }
+
+ break;
+
case 0x8b:
if (!rex_w)
break;

+ /* mov %gs:0x28, reg */
+ if (has_gs_prefix(&ins) &&
+ modrm_mod == 0 &&
+ modrm_rm == 4 &&
+ sib_index == 4 &&
+ sib_base == 5 &&
+ ins.displacement.value == 0x28)
+ {
+ insn->type = INSN_STACKPROTECTOR;
+ break;
+ }
+
if (rm_is_mem(CFI_BP)) {

/* mov disp(%rbp), reg */
diff --git a/tools/objtool/arch/x86/special.c b/tools/objtool/arch/x86/special.c
index 4134d27c696b..020b6040c487 100644
--- a/tools/objtool/arch/x86/special.c
+++ b/tools/objtool/arch/x86/special.c
@@ -3,6 +3,9 @@

#include <objtool/special.h>
#include <objtool/builtin.h>
+#include <objtool/warn.h>
+#include <objtool/check.h>
+#include <objtool/elf.h>

#define X86_FEATURE_POPCNT (4 * 32 + 23)
#define X86_FEATURE_SMAP (9 * 32 + 20)
@@ -137,3 +140,91 @@ struct reloc *arch_find_switch_table(struct objtool_file *file,

return rodata_reloc;
}
+
+/*
+ * Convert op %gs:0x28, reg -> op __stack_chk_guard(%rip), reg
+ * op is MOV, SUB, or CMP.
+ *
+ * This can be removed when the minimum supported GCC version is raised
+ * to 8.1 or later.
+ */
+int arch_hack_stackprotector(struct objtool_file *file)
+{
+ struct section *sec;
+ struct symbol *__stack_chk_guard;
+ struct instruction *insn;
+
+ int i;
+
+ __stack_chk_guard = find_symbol_by_name(file->elf, "__stack_chk_guard");
+
+ for_each_sec(file, sec) {
+ int count = 0;
+ int idx;
+ struct section *rsec = sec->rsec;
+
+ sec_for_each_insn(file, sec, insn) {
+ if (insn->type == INSN_STACKPROTECTOR)
+ count++;
+ }
+
+ if (!count)
+ continue;
+
+ if (!__stack_chk_guard)
+ __stack_chk_guard = elf_create_undef_symbol(file->elf, "__stack_chk_guard");
+
+ if (!rsec) {
+ idx = 0;
+ rsec = sec->rsec = elf_create_rela_section(file->elf, sec, count);
+ } else {
+ idx = sec_num_entries(rsec);
+ if (elf_extend_rela_section(file->elf, rsec, count))
+ return -1;
+ }
+
+ sec_for_each_insn(file, sec, insn) {
+ unsigned char *data = sec->data->d_buf + insn->offset;
+
+ if (insn->type != INSN_STACKPROTECTOR)
+ continue;
+
+ if (insn->len != 9)
+ goto invalid;
+
+ /* Convert GS prefix to DS if !SMP */
+ if (data[0] != 0x65)
+ goto invalid;
+ if (!opts.smp)
+ data[0] = 0x3e;
+
+ /* Set Mod=00, R/M=101. Preserve Reg */
+ data[3] = (data[3] & 0x38) | 5;
+
+ /* Displacement 0 */
+ data[4] = 0;
+ data[5] = 0;
+ data[6] = 0;
+ data[7] = 0;
+
+ /* Pad with NOP */
+ data[8] = 0x90;
+
+ if (!elf_init_reloc_data_sym(file->elf, sec, insn->offset + 4, idx++, __stack_chk_guard, -4))
+ return -1;
+
+ continue;
+
+invalid:
+ fprintf(stderr, "Invalid stackprotector instruction at %s+0x%lx: ", sec->name, insn->offset);
+ for (i = 0; i < insn->len; i++)
+ fprintf(stderr, "%02x ", data[i]);
+ fprintf(stderr, "\n");
+ return -1;
+ }
+
+ mark_sec_changed(file->elf, sec, true);
+ }
+
+ return 0;
+}
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 5e21cfb7661d..0ab2efb45c0e 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -62,12 +62,17 @@ static int parse_hacks(const struct option *opt, const char *str, int unset)
found = true;
}

+ if (!str || strstr(str, "stackprotector")) {
+ opts.hack_stackprotector = true;
+ found = true;
+ }
+
return found ? 0 : -1;
}

static const struct option check_options[] = {
OPT_GROUP("Actions:"),
- OPT_CALLBACK_OPTARG('h', "hacks", NULL, NULL, "jump_label,noinstr,skylake", "patch toolchain bugs/limitations", parse_hacks),
+ OPT_CALLBACK_OPTARG('h', "hacks", NULL, NULL, "jump_label,noinstr,skylake,stackprotector", "patch toolchain bugs/limitations", parse_hacks),
OPT_BOOLEAN('i', "ibt", &opts.ibt, "validate and annotate IBT"),
OPT_BOOLEAN('m', "mcount", &opts.mcount, "annotate mcount/fentry calls for ftrace"),
OPT_BOOLEAN('n', "noinstr", &opts.noinstr, "validate noinstr rules"),
@@ -94,6 +99,7 @@ static const struct option check_options[] = {
OPT_BOOLEAN(0, "sec-address", &opts.sec_address, "print section addresses in warnings"),
OPT_BOOLEAN(0, "stats", &opts.stats, "print statistics"),
OPT_BOOLEAN('v', "verbose", &opts.verbose, "verbose warnings"),
+ OPT_BOOLEAN(0, "smp", &opts.smp, "building an SMP kernel"),

OPT_END(),
};
@@ -133,6 +139,7 @@ static bool opts_valid(void)
{
if (opts.hack_jump_label ||
opts.hack_noinstr ||
+ opts.hack_stackprotector ||
opts.ibt ||
opts.mcount ||
opts.noinstr ||
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 0a2c161fc04d..0056dd99ff7f 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1315,6 +1315,11 @@ __weak bool arch_is_embedded_insn(struct symbol *sym)
return false;
}

+__weak int arch_hack_stackprotector(struct objtool_file *file)
+{
+ return 0;
+}
+
static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
{
struct reloc *reloc;
@@ -4824,6 +4829,13 @@ int check(struct objtool_file *file)
warnings += ret;
}

+ if (opts.hack_stackprotector) {
+ ret = arch_hack_stackprotector(file);
+ if (ret < 0)
+ goto out;
+ warnings += ret;
+ }
+
free_insns(file);

if (opts.verbose)
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index cfb970727c8a..2af99b2a054c 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -846,6 +846,32 @@ elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long size)
return sym;
}

+struct symbol *
+elf_create_undef_symbol(struct elf *elf, const char *sym_name)
+{
+ struct symbol *sym = calloc(1, sizeof(*sym));
+ char *name = strdup(sym_name);
+
+ if (!sym || !name) {
+ perror("malloc");
+ return NULL;
+ }
+
+ sym->name = name;
+ sym->sec = find_section_by_index(elf, 0);
+
+ sym->sym.st_name = elf_add_string(elf, NULL, name);
+ sym->sym.st_info = GELF_ST_INFO(STB_GLOBAL, STT_NOTYPE);
+ sym->sym.st_value = 0;
+ sym->sym.st_size = 0;
+
+ sym = __elf_create_symbol(elf, sym);
+ if (sym)
+ elf_add_symbol(elf, sym);
+
+ return sym;
+}
+
static struct reloc *elf_init_reloc(struct elf *elf, struct section *rsec,
unsigned int reloc_idx,
unsigned long offset, struct symbol *sym,
@@ -924,7 +950,7 @@ struct reloc *elf_init_reloc_data_sym(struct elf *elf, struct section *sec,
struct symbol *sym,
s64 addend)
{
- if (sym->sec && (sec->sh.sh_flags & SHF_EXECINSTR)) {
+ if (sym->sec && (sym->sec->sh.sh_flags & SHF_EXECINSTR)) {
WARN("bad call to %s() for text symbol %s",
__func__, sym->name);
return NULL;
@@ -1196,9 +1222,9 @@ struct section *elf_create_section(struct elf *elf, const char *name,
return sec;
}

-static struct section *elf_create_rela_section(struct elf *elf,
- struct section *sec,
- unsigned int reloc_nr)
+struct section *elf_create_rela_section(struct elf *elf,
+ struct section *sec,
+ unsigned int reloc_nr)
{
struct section *rsec;
struct reloc_block *block;
diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
index 0b303eba660e..c60fec88b3af 100644
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -28,6 +28,7 @@ enum insn_type {
INSN_CLD,
INSN_TRAP,
INSN_ENDBR,
+ INSN_STACKPROTECTOR,
INSN_OTHER,
};

@@ -96,4 +97,6 @@ int arch_rewrite_retpolines(struct objtool_file *file);

bool arch_pc_relative_reloc(struct reloc *reloc);

+int arch_hack_stackprotector(struct objtool_file *file);
+
#endif /* _ARCH_H */
diff --git a/tools/objtool/include/objtool/builtin.h b/tools/objtool/include/objtool/builtin.h
index fcca6662c8b4..5085d3135e6b 100644
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -13,6 +13,7 @@ struct opts {
bool hack_jump_label;
bool hack_noinstr;
bool hack_skylake;
+ bool hack_stackprotector;
bool ibt;
bool mcount;
bool noinstr;
@@ -38,6 +39,7 @@ struct opts {
bool sec_address;
bool stats;
bool verbose;
+ bool smp;
};

extern struct opts opts;
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index 7851467f6878..b5eec9e4a65d 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -120,6 +120,10 @@ struct elf *elf_open_read(const char *name, int flags);
struct section *elf_create_section(struct elf *elf, const char *name,
size_t entsize, unsigned int nr);

+struct section *elf_create_rela_section(struct elf *elf,
+ struct section *sec,
+ unsigned int reloc_nr);
+
int elf_extend_rela_section(struct elf *elf,
struct section *rsec,
int add_relocs);
@@ -130,6 +134,8 @@ struct section *elf_create_section_pair(struct elf *elf, const char *name,

struct symbol *elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long size);

+struct symbol *elf_create_undef_symbol(struct elf *elf, const char *sym_name);
+
struct reloc *elf_init_reloc_text_sym(struct elf *elf, struct section *sec,
unsigned long offset,
unsigned int reloc_idx,
--
2.44.0


2024-03-22 16:54:33

by Brian Gerst

Subject: [PATCH v4 08/16] x86/stackprotector/64: Convert to normal percpu variable

Older versions of GCC fixed the location of the stack protector canary
at %gs:40. This constraint forced the percpu section to be linked at
virtual address 0 so that the canary could be the first data object in
the percpu section. Supporting the zero-based percpu section requires
additional code to handle relocations for RIP-relative references to
percpu data, extra complexity to kallsyms, and workarounds for linker
bugs due to the use of absolute symbols.

Use compiler options to redefine the stack protector location if
supported, otherwise use objtool. This will remove the constraint that
the percpu section must be zero-based.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/Kconfig | 11 ++++----
arch/x86/Makefile | 21 ++++++++++------
arch/x86/entry/entry_64.S | 2 +-
arch/x86/include/asm/processor.h | 16 ++----------
arch/x86/include/asm/stackprotector.h | 36 ++++-----------------------
arch/x86/kernel/asm-offsets_64.c | 6 -----
arch/x86/kernel/cpu/common.c | 5 +---
arch/x86/kernel/head_64.S | 3 +--
arch/x86/xen/xen-head.S | 3 +--
9 files changed, 30 insertions(+), 73 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 121cfb9ffc0e..3dbefdb8a5d6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -271,7 +271,7 @@ config X86
select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_SETUP_PER_CPU_AREA
select HAVE_SOFTIRQ_ON_OWN_STACK
- select HAVE_STACKPROTECTOR if CC_HAS_SANE_STACKPROTECTOR
+ select HAVE_STACKPROTECTOR if X86_64 || CC_HAS_SANE_STACKPROTECTOR
select HAVE_STACK_VALIDATION if HAVE_OBJTOOL
select HAVE_STATIC_CALL
select HAVE_STATIC_CALL_INLINE if HAVE_OBJTOOL
@@ -411,15 +411,14 @@ config PGTABLE_LEVELS

config CC_HAS_SANE_STACKPROTECTOR
bool
- default y if 64BIT
+ default $(cc-option,-mstack-protector-guard-reg=gs -mstack-protector-guard-symbol=__stack_chk_guard) if 64BIT
default $(cc-option,-mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard)
- help
- We have to make sure stack protector is unconditionally disabled if
- the compiler does not allow control of the segment and symbol.

config STACKPROTECTOR_OBJTOOL
bool
- default n
+ depends on X86_64 && STACKPROTECTOR
+ default !CC_HAS_SANE_STACKPROTECTOR
+ prompt "Debug objtool stack protector conversion" if CC_HAS_SANE_STACKPROTECTOR && DEBUG_KERNEL

menu "Processor type and features"

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 662d9d4033e6..2a3ba1abb802 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -116,13 +116,7 @@ ifeq ($(CONFIG_X86_32),y)
# temporary until string.h is fixed
KBUILD_CFLAGS += -ffreestanding

- ifeq ($(CONFIG_STACKPROTECTOR),y)
- ifeq ($(CONFIG_SMP),y)
- KBUILD_CFLAGS += -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard
- else
- KBUILD_CFLAGS += -mstack-protector-guard=global
- endif
- endif
+ percpu_seg := fs
else
BITS := 64
UTS_MACHINE := x86_64
@@ -172,6 +166,19 @@ else
KBUILD_CFLAGS += -mcmodel=kernel
KBUILD_RUSTFLAGS += -Cno-redzone=y
KBUILD_RUSTFLAGS += -Ccode-model=kernel
+
+ percpu_seg := gs
+endif
+
+ifeq ($(CONFIG_STACKPROTECTOR),y)
+ ifneq ($(CONFIG_STACKPROTECTOR_OBJTOOL),y)
+ ifeq ($(CONFIG_SMP),y)
+ KBUILD_CFLAGS += -mstack-protector-guard-reg=$(percpu_seg)
+ KBUILD_CFLAGS += -mstack-protector-guard-symbol=__stack_chk_guard
+ else
+ KBUILD_CFLAGS += -mstack-protector-guard=global
+ endif
+ endif
endif

#
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 8af2a26b24f6..9478ff768dd0 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -191,7 +191,7 @@ SYM_FUNC_START(__switch_to_asm)

#ifdef CONFIG_STACKPROTECTOR
movq TASK_stack_canary(%rsi), %rbx
- movq %rbx, PER_CPU_VAR(fixed_percpu_data + FIXED_stack_canary)
+ movq %rbx, PER_CPU_VAR(__stack_chk_guard)
#endif

/*
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 89ed5237e79f..946bebce396f 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -387,16 +387,8 @@ struct irq_stack {

#ifdef CONFIG_X86_64
struct fixed_percpu_data {
- /*
- * GCC hardcodes the stack canary as %gs:40. Since the
- * irq_stack is the object at %gs:0, we reserve the bottom
- * 48 bytes of the irq stack for the canary.
- *
- * Once we are willing to require -mstack-protector-guard-symbol=
- * support for x86_64 stackprotector, we can get rid of this.
- */
char gs_base[40];
- unsigned long stack_canary;
+ unsigned long reserved;
};

DECLARE_PER_CPU_FIRST(struct fixed_percpu_data, fixed_percpu_data) __visible;
@@ -411,11 +403,7 @@ extern asmlinkage void entry_SYSCALL32_ignore(void);

/* Save actual FS/GS selectors and bases to current->thread */
void current_save_fsgs(void);
-#else /* X86_64 */
-#ifdef CONFIG_STACKPROTECTOR
-DECLARE_PER_CPU(unsigned long, __stack_chk_guard);
-#endif
-#endif /* !X86_64 */
+#endif /* X86_64 */

struct perf_event;

diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index 00473a650f51..d43fb589fcf6 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -2,26 +2,10 @@
/*
* GCC stack protector support.
*
- * Stack protector works by putting predefined pattern at the start of
+ * Stack protector works by putting a predefined pattern at the start of
* the stack frame and verifying that it hasn't been overwritten when
- * returning from the function. The pattern is called stack canary
- * and unfortunately gcc historically required it to be at a fixed offset
- * from the percpu segment base. On x86_64, the offset is 40 bytes.
- *
- * The same segment is shared by percpu area and stack canary. On
- * x86_64, percpu symbols are zero based and %gs (64-bit) points to the
- * base of percpu area. The first occupant of the percpu area is always
- * fixed_percpu_data which contains stack_canary at the appropriate
- * offset. On x86_32, the stack canary is just a regular percpu
- * variable.
- *
- * Putting percpu data in %fs on 32-bit is a minor optimization compared to
- * using %gs. Since 32-bit userspace normally has %fs == 0, we are likely
- * to load 0 into %fs on exit to usermode, whereas with percpu data in
- * %gs, we are likely to load a non-null %gs on return to user mode.
- *
- * Once we are willing to require GCC 8.1 or better for 64-bit stackprotector
- * support, we can remove some of this complexity.
+ * returning from the function. The pattern is called the stack canary
+ * and is a unique value for each task.
*/

#ifndef _ASM_STACKPROTECTOR_H
@@ -36,6 +20,8 @@

#include <linux/sched.h>

+DECLARE_PER_CPU(unsigned long, __stack_chk_guard);
+
/*
* Initialize the stackprotector canary value.
*
@@ -51,25 +37,13 @@ static __always_inline void boot_init_stack_canary(void)
{
unsigned long canary = get_random_canary();

-#ifdef CONFIG_X86_64
- BUILD_BUG_ON(offsetof(struct fixed_percpu_data, stack_canary) != 40);
-#endif
-
current->stack_canary = canary;
-#ifdef CONFIG_X86_64
- this_cpu_write(fixed_percpu_data.stack_canary, canary);
-#else
this_cpu_write(__stack_chk_guard, canary);
-#endif
}

static inline void cpu_init_stack_canary(int cpu, struct task_struct *idle)
{
-#ifdef CONFIG_X86_64
- per_cpu(fixed_percpu_data.stack_canary, cpu) = idle->stack_canary;
-#else
per_cpu(__stack_chk_guard, cpu) = idle->stack_canary;
-#endif
}

#else /* STACKPROTECTOR */
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index bb65371ea9df..590b6cd0eac0 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -54,11 +54,5 @@ int main(void)
BLANK();
#undef ENTRY

- BLANK();
-
-#ifdef CONFIG_STACKPROTECTOR
- OFFSET(FIXED_stack_canary, fixed_percpu_data, stack_canary);
- BLANK();
-#endif
return 0;
}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 9a34651d24e7..f49e8f5b858d 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2063,16 +2063,13 @@ void syscall_init(void)
if (!cpu_feature_enabled(X86_FEATURE_FRED))
idt_syscall_init();
}
-
-#else /* CONFIG_X86_64 */
+#endif /* CONFIG_X86_64 */

#ifdef CONFIG_STACKPROTECTOR
DEFINE_PER_CPU(unsigned long, __stack_chk_guard);
EXPORT_PER_CPU_SYMBOL(__stack_chk_guard);
#endif

-#endif /* CONFIG_X86_64 */
-
/*
* Clear all 6 debug registers:
*/
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index b11526869a40..cfbf0486d424 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -361,8 +361,7 @@ SYM_INNER_LABEL(common_startup_64, SYM_L_LOCAL)

/* Set up %gs.
*
- * The base of %gs always points to fixed_percpu_data. If the
- * stack protector canary is enabled, it is located at %gs:40.
+ * The base of %gs always points to fixed_percpu_data.
* Note that, on SMP, the boot cpu uses init data section until
* the per cpu areas are set up.
*/
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 758bcd47b72d..ae4672ea00bb 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -53,8 +53,7 @@ SYM_CODE_START(startup_xen)

/* Set up %gs.
*
- * The base of %gs always points to fixed_percpu_data. If the
- * stack protector canary is enabled, it is located at %gs:40.
+ * The base of %gs always points to fixed_percpu_data.
* Note that, on SMP, the boot cpu uses init data section until
* the per cpu areas are set up.
*/
--
2.44.0


2024-03-22 16:54:49

by Brian Gerst

Subject: [PATCH v4 02/16] x86/stackprotector/64: Remove stack protector test script

This test for the stack protector was added in 2006 to make sure the
compiler had the PR28281 patch applied. With GCC 5.1 being the minimum
supported compiler now, it is no longer necessary.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/Kconfig | 5 ++---
scripts/gcc-x86_64-has-stack-protector.sh | 4 ----
2 files changed, 2 insertions(+), 7 deletions(-)
delete mode 100755 scripts/gcc-x86_64-has-stack-protector.sh

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f326903cbe67..88d72227e3cb 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -411,12 +411,11 @@ config PGTABLE_LEVELS

config CC_HAS_SANE_STACKPROTECTOR
bool
- default $(success,$(srctree)/scripts/gcc-x86_64-has-stack-protector.sh $(CC) $(CLANG_FLAGS)) if 64BIT
+ default y if 64BIT
default $(cc-option,-mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard)
help
We have to make sure stack protector is unconditionally disabled if
- the compiler produces broken code or if it does not let us control
- the segment on 32-bit kernels.
+ the compiler does not allow control of the segment and symbol.

menu "Processor type and features"

diff --git a/scripts/gcc-x86_64-has-stack-protector.sh b/scripts/gcc-x86_64-has-stack-protector.sh
deleted file mode 100755
index 75e4e22b986a..000000000000
--- a/scripts/gcc-x86_64-has-stack-protector.sh
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0
-
-echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m64 -O0 -mcmodel=kernel -fno-PIE -fstack-protector - -o - 2> /dev/null | grep -q "%gs"
--
2.44.0


2024-03-22 16:54:55

by Brian Gerst

Subject: [PATCH v4 09/16] x86/percpu/64: Use relative percpu offsets

The percpu section is currently linked at virtual address 0, because
older compilers hardcoded the stack protector canary location at a fixed
offset from the start of the GS segment. Now that the canary is a
normal percpu variable, the percpu section does not need to be linked
at a specific virtual address. This means that x86-64 will calculate
percpu offsets like most other architectures, as the delta between the
initial percpu address and the dynamically allocated memory.
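
A minimal sketch of what this means in practice (using names from the
existing generic setup path in arch/x86/kernel/setup_percpu.c, which is an
assumption about surrounding code rather than part of this patch):

    /* delta between the linked .data..percpu image and the allocated copies */
    delta = (unsigned long)pcpu_base_addr - (unsigned long)__per_cpu_start;

    for_each_possible_cpu(cpu)
        per_cpu_offset(cpu) = delta + pcpu_unit_offsets[cpu];

A percpu access is then simply &var + per_cpu_offset(cpu), with no
zero-based linkage or boot-time offset special cases.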

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/include/asm/processor.h | 6 +++++-
arch/x86/kernel/head_64.S | 19 +++++++++----------
arch/x86/kernel/setup_percpu.c | 12 ++----------
arch/x86/kernel/vmlinux.lds.S | 29 +----------------------------
arch/x86/platform/pvh/head.S | 5 ++---
arch/x86/tools/relocs.c | 10 +++-------
arch/x86/xen/xen-head.S | 9 ++++-----
init/Kconfig | 2 +-
8 files changed, 27 insertions(+), 65 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 946bebce396f..40d6add8ff31 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -396,7 +396,11 @@ DECLARE_INIT_PER_CPU(fixed_percpu_data);

static inline unsigned long cpu_kernelmode_gs_base(int cpu)
{
- return (unsigned long)per_cpu(fixed_percpu_data.gs_base, cpu);
+#ifdef CONFIG_SMP
+ return per_cpu_offset(cpu);
+#else
+ return 0;
+#endif
}

extern asmlinkage void entry_SYSCALL32_ignore(void);
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index cfbf0486d424..5b2cc711feec 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -68,11 +68,14 @@ SYM_CODE_START_NOALIGN(startup_64)
/* Set up the stack for verify_cpu() */
leaq __top_init_kernel_stack(%rip), %rsp

- /* Setup GSBASE to allow stack canary access for C code */
+ /*
+ * Set up GSBASE.
+ * Note that, on SMP, the boot cpu uses init data section until
+ * the per cpu areas are set up.
+ */
movl $MSR_GS_BASE, %ecx
- leaq INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
- movl %edx, %eax
- shrq $32, %rdx
+ xorl %eax, %eax
+ xorl %edx, %edx
wrmsr

call startup_64_setup_gdt_idt
@@ -359,16 +362,12 @@ SYM_INNER_LABEL(common_startup_64, SYM_L_LOCAL)
movl %eax,%fs
movl %eax,%gs

- /* Set up %gs.
- *
- * The base of %gs always points to fixed_percpu_data.
+ /*
+ * Set up GSBASE.
* Note that, on SMP, the boot cpu uses init data section until
* the per cpu areas are set up.
*/
movl $MSR_GS_BASE,%ecx
-#ifndef CONFIG_SMP
- leaq INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
-#endif
movl %edx, %eax
shrq $32, %rdx
wrmsr
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index b30d6e180df7..1e7be9409aa2 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -23,18 +23,10 @@
#include <asm/cpumask.h>
#include <asm/cpu.h>

-#ifdef CONFIG_X86_64
-#define BOOT_PERCPU_OFFSET ((unsigned long)__per_cpu_load)
-#else
-#define BOOT_PERCPU_OFFSET 0
-#endif
-
-DEFINE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off) = BOOT_PERCPU_OFFSET;
+DEFINE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off);
EXPORT_PER_CPU_SYMBOL(this_cpu_off);

-unsigned long __per_cpu_offset[NR_CPUS] __ro_after_init = {
- [0 ... NR_CPUS-1] = BOOT_PERCPU_OFFSET,
-};
+unsigned long __per_cpu_offset[NR_CPUS] __ro_after_init;
EXPORT_SYMBOL(__per_cpu_offset);

/*
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 3509afc6a672..0b152f96c24e 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -99,12 +99,6 @@ const_pcpu_hot = pcpu_hot;
PHDRS {
text PT_LOAD FLAGS(5); /* R_E */
data PT_LOAD FLAGS(6); /* RW_ */
-#ifdef CONFIG_X86_64
-#ifdef CONFIG_SMP
- percpu PT_LOAD FLAGS(6); /* RW_ */
-#endif
- init PT_LOAD FLAGS(7); /* RWE */
-#endif
note PT_NOTE FLAGS(0); /* ___ */
}

@@ -222,21 +216,7 @@ SECTIONS
__init_begin = .; /* paired with __init_end */
}

-#if defined(CONFIG_X86_64) && defined(CONFIG_SMP)
- /*
- * percpu offsets are zero-based on SMP. PERCPU_VADDR() changes the
- * output PHDR, so the next output section - .init.text - should
- * start another segment - init.
- */
- PERCPU_VADDR(INTERNODE_CACHE_BYTES, 0, :percpu)
- ASSERT(SIZEOF(.data..percpu) < CONFIG_PHYSICAL_START,
- "per-CPU data too large - increase CONFIG_PHYSICAL_START")
-#endif
-
INIT_TEXT_SECTION(PAGE_SIZE)
-#ifdef CONFIG_X86_64
- :init
-#endif

/*
* Section for code used exclusively before alternatives are run. All
@@ -353,9 +333,7 @@ SECTIONS
EXIT_DATA
}

-#if !defined(CONFIG_X86_64) || !defined(CONFIG_SMP)
PERCPU_SECTION(INTERNODE_CACHE_BYTES)
-#endif

. = ALIGN(PAGE_SIZE);

@@ -493,16 +471,11 @@ SECTIONS
* Per-cpu symbols which need to be offset from __per_cpu_load
* for the boot processor.
*/
-#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x) + __per_cpu_load
+#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x)
INIT_PER_CPU(gdt_page);
INIT_PER_CPU(fixed_percpu_data);
INIT_PER_CPU(irq_stack_backing_store);

-#ifdef CONFIG_SMP
-. = ASSERT((fixed_percpu_data == 0),
- "fixed_percpu_data is not at start of per-cpu area");
-#endif
-
#ifdef CONFIG_MITIGATION_UNRET_ENTRY
. = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
#endif
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index 1f1c3230b27b..7e3e07c6170f 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -101,9 +101,8 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
* the per cpu areas are set up.
*/
mov $MSR_GS_BASE,%ecx
- lea INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
- mov %edx, %eax
- shr $32, %rdx
+ xor %eax, %eax
+ xor %edx, %edx
wrmsr

call xen_prepare_pvh
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index adf11a48ec70..016650ddaf7f 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -839,12 +839,7 @@ static void percpu_init(void)
*/
static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
{
- int shndx = sym_index(sym);
-
- return (shndx == per_cpu_shndx) &&
- strcmp(symname, "__init_begin") &&
- strcmp(symname, "__per_cpu_load") &&
- strncmp(symname, "init_per_cpu_", 13);
+ return 0;
}


@@ -1068,7 +1063,8 @@ static int cmp_relocs(const void *va, const void *vb)

static void sort_relocs(struct relocs *r)
{
- qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
+ if (r->count)
+ qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
}

static int write32(uint32_t v, FILE *f)
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index ae4672ea00bb..1796884b727d 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -51,15 +51,14 @@ SYM_CODE_START(startup_xen)

leaq __top_init_kernel_stack(%rip), %rsp

- /* Set up %gs.
- *
- * The base of %gs always points to fixed_percpu_data.
+ /*
+ * Set up GSBASE.
* Note that, on SMP, the boot cpu uses init data section until
* the per cpu areas are set up.
*/
movl $MSR_GS_BASE,%ecx
- movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
- cdq
+ xorl %eax, %eax
+ xorl %edx, %edx
wrmsr

mov %rsi, %rdi
diff --git a/init/Kconfig b/init/Kconfig
index f3ea5dea9c85..0f928f82dc7a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1784,7 +1784,7 @@ config KALLSYMS_ALL
config KALLSYMS_ABSOLUTE_PERCPU
bool
depends on KALLSYMS
- default X86_64 && SMP
+ default n

config KALLSYMS_BASE_RELATIVE
bool
--
2.44.0


2024-03-22 16:54:58

by Brian Gerst

Subject: [PATCH v4 10/16] x86/percpu/64: Remove fixed_percpu_data

Now that the stack protector canary value is a normal percpu variable,
fixed_percpu_data is unused and can be removed.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/include/asm/processor.h | 8 --------
arch/x86/kernel/cpu/common.c | 4 ----
arch/x86/kernel/vmlinux.lds.S | 1 -
arch/x86/tools/relocs.c | 1 -
4 files changed, 14 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 40d6add8ff31..1f15a7bee876 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -386,14 +386,6 @@ struct irq_stack {
} __aligned(IRQ_STACK_SIZE);

#ifdef CONFIG_X86_64
-struct fixed_percpu_data {
- char gs_base[40];
- unsigned long reserved;
-};
-
-DECLARE_PER_CPU_FIRST(struct fixed_percpu_data, fixed_percpu_data) __visible;
-DECLARE_INIT_PER_CPU(fixed_percpu_data);
-
static inline unsigned long cpu_kernelmode_gs_base(int cpu)
{
#ifdef CONFIG_SMP
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f49e8f5b858d..395a8831d507 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1997,10 +1997,6 @@ EXPORT_PER_CPU_SYMBOL(pcpu_hot);
EXPORT_PER_CPU_SYMBOL(const_pcpu_hot);

#ifdef CONFIG_X86_64
-DEFINE_PER_CPU_FIRST(struct fixed_percpu_data,
- fixed_percpu_data) __aligned(PAGE_SIZE) __visible;
-EXPORT_PER_CPU_SYMBOL_GPL(fixed_percpu_data);
-
static void wrmsrl_cstar(unsigned long val)
{
/*
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 0b152f96c24e..667202ebd37f 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -473,7 +473,6 @@ SECTIONS
*/
#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x)
INIT_PER_CPU(gdt_page);
-INIT_PER_CPU(fixed_percpu_data);
INIT_PER_CPU(irq_stack_backing_store);

#ifdef CONFIG_MITIGATION_UNRET_ENTRY
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 016650ddaf7f..65e6f3d6d890 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -834,7 +834,6 @@ static void percpu_init(void)
* __per_cpu_load
*
* The "gold" linker incorrectly associates:
- * init_per_cpu__fixed_percpu_data
* init_per_cpu__gdt_page
*/
static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
--
2.44.0


2024-03-22 16:55:31

by Brian Gerst

Subject: [PATCH v4 11/16] x86/boot/64: Remove inverse relocations

Inverse relocations were needed to offset the effects of relocation for
RIP-relative accesses to zero-based percpu data. Now that the percpu
section is linked normally as part of the kernel image, they are no
longer needed.
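
For context, a short sketch of what the inverse relocations compensated
for: a RIP-relative reference encodes disp = symbol - next_rip at link
time. When the boot code relocates the kernel by some delta, code and
normally-linked data move together, so that displacement stays valid. A
zero-based percpu symbol did not move, so such references needed the
opposite adjustment - hence the "*(int32_t *)ptr -= delta" pass in the
decompressor that is removed here. With the percpu section linked
normally, both that pass and the extra relocation table feeding it can go
away.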

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/boot/compressed/misc.c | 14 +---
arch/x86/tools/relocs.c | 126 +-------------------------------
2 files changed, 2 insertions(+), 138 deletions(-)

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b70e4a21c15f..371288202268 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -235,7 +235,7 @@ static void handle_relocations(void *output, unsigned long output_len,

/*
* Process relocations: 32 bit relocations first then 64 bit after.
- * Three sets of binary relocations are added to the end of the kernel
+ * Two sets of binary relocations are added to the end of the kernel
* before compression. Each relocation table entry is the kernel
* address of the location which needs to be updated stored as a
* 32-bit value which is sign extended to 64 bits.
@@ -245,8 +245,6 @@ static void handle_relocations(void *output, unsigned long output_len,
* kernel bits...
* 0 - zero terminator for 64 bit relocations
* 64 bit relocation repeated
- * 0 - zero terminator for inverse 32 bit relocations
- * 32 bit inverse relocation repeated
* 0 - zero terminator for 32 bit relocations
* 32 bit relocation repeated
*
@@ -263,16 +261,6 @@ static void handle_relocations(void *output, unsigned long output_len,
*(uint32_t *)ptr += delta;
}
#ifdef CONFIG_X86_64
- while (*--reloc) {
- long extended = *reloc;
- extended += map;
-
- ptr = (unsigned long)extended;
- if (ptr < min_addr || ptr > max_addr)
- error("inverse 32-bit relocation outside of kernel!\n");
-
- *(int32_t *)ptr -= delta;
- }
for (reloc--; *reloc; reloc--) {
long extended = *reloc;
extended += map;
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 65e6f3d6d890..e490297a486b 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -28,7 +28,6 @@ struct relocs {
static struct relocs relocs16;
static struct relocs relocs32;
#if ELF_BITS == 64
-static struct relocs relocs32neg;
static struct relocs relocs64;
#define FMT PRIu64

@@ -89,7 +88,6 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = {
"__initramfs_start|"
"(jiffies|jiffies_64)|"
#if ELF_BITS == 64
- "__per_cpu_load|"
"init_per_cpu__.*|"
"__end_rodata_hpage_align|"
#endif
@@ -287,33 +285,6 @@ static const char *sym_name(const char *sym_strtab, Elf_Sym *sym)
return name;
}

-static Elf_Sym *sym_lookup(const char *symname)
-{
- int i;
- for (i = 0; i < shnum; i++) {
- struct section *sec = &secs[i];
- long nsyms;
- char *strtab;
- Elf_Sym *symtab;
- Elf_Sym *sym;
-
- if (sec->shdr.sh_type != SHT_SYMTAB)
- continue;
-
- nsyms = sec->shdr.sh_size/sizeof(Elf_Sym);
- symtab = sec->symtab;
- strtab = sec->link->strtab;
-
- for (sym = symtab; --nsyms >= 0; sym++) {
- if (!sym->st_name)
- continue;
- if (strcmp(symname, strtab + sym->st_name) == 0)
- return sym;
- }
- }
- return 0;
-}
-
#if BYTE_ORDER == LITTLE_ENDIAN
#define le16_to_cpu(val) (val)
#define le32_to_cpu(val) (val)
@@ -773,75 +744,8 @@ static void walk_relocs(int (*process)(struct section *sec, Elf_Rel *rel,
}
}

-/*
- * The .data..percpu section is a special case for x86_64 SMP kernels.
- * It is used to initialize the actual per_cpu areas and to provide
- * definitions for the per_cpu variables that correspond to their offsets
- * within the percpu area. Since the values of all of the symbols need
- * to be offsets from the start of the per_cpu area the virtual address
- * (sh_addr) of .data..percpu is 0 in SMP kernels.
- *
- * This means that:
- *
- * Relocations that reference symbols in the per_cpu area do not
- * need further relocation (since the value is an offset relative
- * to the start of the per_cpu area that does not change).
- *
- * Relocations that apply to the per_cpu area need to have their
- * offset adjusted by by the value of __per_cpu_load to make them
- * point to the correct place in the loaded image (because the
- * virtual address of .data..percpu is 0).
- *
- * For non SMP kernels .data..percpu is linked as part of the normal
- * kernel data and does not require special treatment.
- *
- */
-static int per_cpu_shndx = -1;
-static Elf_Addr per_cpu_load_addr;
-
-static void percpu_init(void)
-{
- int i;
- for (i = 0; i < shnum; i++) {
- ElfW(Sym) *sym;
- if (strcmp(sec_name(i), ".data..percpu"))
- continue;
-
- if (secs[i].shdr.sh_addr != 0) /* non SMP kernel */
- return;
-
- sym = sym_lookup("__per_cpu_load");
- if (!sym)
- die("can't find __per_cpu_load\n");
-
- per_cpu_shndx = i;
- per_cpu_load_addr = sym->st_value;
- return;
- }
-}
-
#if ELF_BITS == 64

-/*
- * Check to see if a symbol lies in the .data..percpu section.
- *
- * The linker incorrectly associates some symbols with the
- * .data..percpu section so we also need to check the symbol
- * name to make sure that we classify the symbol correctly.
- *
- * The GNU linker incorrectly associates:
- * __init_begin
- * __per_cpu_load
- *
- * The "gold" linker incorrectly associates:
- * init_per_cpu__gdt_page
- */
-static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
-{
- return 0;
-}
-
-
static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
const char *symname)
{
@@ -852,12 +756,6 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
if (sym->st_shndx == SHN_UNDEF)
return 0;

- /*
- * Adjust the offset if this reloc applies to the percpu section.
- */
- if (sec->shdr.sh_info == per_cpu_shndx)
- offset += per_cpu_load_addr;
-
switch (r_type) {
case R_X86_64_NONE:
/* NONE can be ignored. */
@@ -867,33 +765,21 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
case R_X86_64_PLT32:
case R_X86_64_REX_GOTPCRELX:
/*
- * PC relative relocations don't need to be adjusted unless
- * referencing a percpu symbol.
+ * PC relative relocations don't need to be adjusted.
*
* NB: R_X86_64_PLT32 can be treated as R_X86_64_PC32.
*/
- if (is_percpu_sym(sym, symname))
- add_reloc(&relocs32neg, offset);
break;

case R_X86_64_PC64:
/*
* Only used by jump labels
*/
- if (is_percpu_sym(sym, symname))
- die("Invalid R_X86_64_PC64 relocation against per-CPU symbol %s\n",
- symname);
break;

case R_X86_64_32:
case R_X86_64_32S:
case R_X86_64_64:
- /*
- * References to the percpu area don't need to be adjusted.
- */
- if (is_percpu_sym(sym, symname))
- break;
-
if (shn_abs) {
/*
* Whitelisted absolute symbols do not require
@@ -1107,7 +993,6 @@ static void emit_relocs(int as_text, int use_real_mode)
/* Order the relocations for more efficient processing */
sort_relocs(&relocs32);
#if ELF_BITS == 64
- sort_relocs(&relocs32neg);
sort_relocs(&relocs64);
#else
sort_relocs(&relocs16);
@@ -1139,13 +1024,6 @@ static void emit_relocs(int as_text, int use_real_mode)
/* Now print each relocation */
for (i = 0; i < relocs64.count; i++)
write_reloc(relocs64.offset[i], stdout);
-
- /* Print a stop */
- write_reloc(0, stdout);
-
- /* Now print each inverse 32-bit relocation */
- for (i = 0; i < relocs32neg.count; i++)
- write_reloc(relocs32neg.offset[i], stdout);
#endif

/* Print a stop */
@@ -1196,8 +1074,6 @@ void process(FILE *fp, int use_real_mode, int as_text,
read_strtabs(fp);
read_symtabs(fp);
read_relocs(fp);
- if (ELF_BITS == 64)
- percpu_init();
if (show_absolute_syms) {
print_absolute_symbols();
return;
--
2.44.0


2024-03-22 16:55:46

by Brian Gerst

Subject: [PATCH v4 12/16] x86/percpu/64: Remove INIT_PER_CPU macros

Now that the load and link addresses of percpu variables are the same,
these macros are no longer necessary.
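
(Recap of what the macros did: INIT_PER_CPU(x) originally defined an extra
symbol, init_per_cpu__##x = ABSOLUTE(x) + __per_cpu_load, so that boot code
could reach the initial load-time copy of a percpu variable before the
per-cpu areas were set up. With load and link addresses now equal, the
plain symbol already points at that initial copy.)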

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/include/asm/desc.h | 1 -
arch/x86/include/asm/percpu.h | 22 ----------------------
arch/x86/kernel/head64.c | 2 +-
arch/x86/kernel/irq_64.c | 1 -
arch/x86/kernel/vmlinux.lds.S | 7 -------
arch/x86/tools/relocs.c | 1 -
6 files changed, 1 insertion(+), 33 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index 62dc9f59ea76..ec95fe44fa3a 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -46,7 +46,6 @@ struct gdt_page {
} __attribute__((aligned(PAGE_SIZE)));

DECLARE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page);
-DECLARE_INIT_PER_CPU(gdt_page);

/* Provide the original GDT */
static inline struct desc_struct *get_cpu_gdt_rw(unsigned int cpu)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index f6ddbaaf80bc..59d91fdfe037 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -20,12 +20,6 @@

#define PER_CPU_VAR(var) __percpu(var)__percpu_rel

-#ifdef CONFIG_X86_64_SMP
-#define INIT_PER_CPU_VAR(var) init_per_cpu__##var
-#else
-#define INIT_PER_CPU_VAR(var) var
-#endif
-
#else /* ...!ASSEMBLY */

#include <linux/build_bug.h>
@@ -101,22 +95,6 @@
#define __percpu_arg(x) __percpu_prefix "%" #x
#define __force_percpu_arg(x) __force_percpu_prefix "%" #x

-/*
- * Initialized pointers to per-cpu variables needed for the boot
- * processor need to use these macros to get the proper address
- * offset from __per_cpu_load on SMP.
- *
- * There also must be an entry in vmlinux_64.lds.S
- */
-#define DECLARE_INIT_PER_CPU(var) \
- extern typeof(var) init_per_cpu_var(var)
-
-#ifdef CONFIG_X86_64_SMP
-#define init_per_cpu_var(var) init_per_cpu__##var
-#else
-#define init_per_cpu_var(var) var
-#endif
-
/* For arch-specific code, we can use direct single-insn ops (they
* don't give an lvalue though). */

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 212e8e06aeba..5f0523610e92 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -564,7 +564,7 @@ void __head startup_64_setup_gdt_idt(void)
void *handler = NULL;

struct desc_ptr startup_gdt_descr = {
- .address = (unsigned long)&RIP_REL_REF(init_per_cpu_var(gdt_page.gdt)),
+ .address = (unsigned long)&RIP_REL_REF(gdt_page.gdt),
.size = GDT_SIZE - 1,
};

diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c
index fe0c859873d1..30424f9876bc 100644
--- a/arch/x86/kernel/irq_64.c
+++ b/arch/x86/kernel/irq_64.c
@@ -26,7 +26,6 @@
#include <asm/apic.h>

DEFINE_PER_CPU_PAGE_ALIGNED(struct irq_stack, irq_stack_backing_store) __visible;
-DECLARE_INIT_PER_CPU(irq_stack_backing_store);

#ifdef CONFIG_VMAP_STACK
/*
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 667202ebd37f..8d5bf4a71e39 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -467,13 +467,6 @@ SECTIONS
"kernel image bigger than KERNEL_IMAGE_SIZE");

#ifdef CONFIG_X86_64
-/*
- * Per-cpu symbols which need to be offset from __per_cpu_load
- * for the boot processor.
- */
-#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x)
-INIT_PER_CPU(gdt_page);
-INIT_PER_CPU(irq_stack_backing_store);

#ifdef CONFIG_MITIGATION_UNRET_ENTRY
. = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index e490297a486b..8db231affba2 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -88,7 +88,6 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = {
"__initramfs_start|"
"(jiffies|jiffies_64)|"
#if ELF_BITS == 64
- "init_per_cpu__.*|"
"__end_rodata_hpage_align|"
#endif
"__vvar_page|"
--
2.44.0


2024-03-22 16:56:02

by Brian Gerst

Subject: [PATCH v4 13/16] percpu: Remove PER_CPU_FIRST_SECTION

x86-64 was the last user.

Signed-off-by: Brian Gerst <[email protected]>
---
include/asm-generic/vmlinux.lds.h | 1 -
include/linux/percpu-defs.h | 12 ------------
2 files changed, 13 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 9752eb420ffa..74f169772778 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -1033,7 +1033,6 @@
*/
#define PERCPU_INPUT(cacheline) \
__per_cpu_start = .; \
- *(.data..percpu..first) \
. = ALIGN(PAGE_SIZE); \
*(.data..percpu..page_aligned) \
. = ALIGN(cacheline); \
diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index ec3573119923..b9ddee91e6c7 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -26,13 +26,11 @@
#define PER_CPU_SHARED_ALIGNED_SECTION "..shared_aligned"
#define PER_CPU_ALIGNED_SECTION "..shared_aligned"
#endif
-#define PER_CPU_FIRST_SECTION "..first"

#else

#define PER_CPU_SHARED_ALIGNED_SECTION ""
#define PER_CPU_ALIGNED_SECTION "..shared_aligned"
-#define PER_CPU_FIRST_SECTION ""

#endif

@@ -114,16 +112,6 @@
#define DEFINE_PER_CPU(type, name) \
DEFINE_PER_CPU_SECTION(type, name, "")

-/*
- * Declaration/definition used for per-CPU variables that must come first in
- * the set of variables.
- */
-#define DECLARE_PER_CPU_FIRST(type, name) \
- DECLARE_PER_CPU_SECTION(type, name, PER_CPU_FIRST_SECTION)
-
-#define DEFINE_PER_CPU_FIRST(type, name) \
- DEFINE_PER_CPU_SECTION(type, name, PER_CPU_FIRST_SECTION)
-
/*
* Declaration/definition used for per-CPU variables that must be cacheline
* aligned under SMP conditions so that, whilst a particular instance of the
--
2.44.0


2024-03-22 16:57:00

by Brian Gerst

Subject: [PATCH v4 16/16] kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU

x86-64 was the only user.

Signed-off-by: Brian Gerst <[email protected]>
---
init/Kconfig | 11 +-----
kernel/kallsyms.c | 12 ++-----
scripts/kallsyms.c | 80 ++++++++---------------------------------
scripts/link-vmlinux.sh | 4 ---
4 files changed, 18 insertions(+), 89 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 0f928f82dc7a..0622599ce503 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1781,11 +1781,6 @@ config KALLSYMS_ALL

Say N unless you really need all symbols, or kernel live patching.

-config KALLSYMS_ABSOLUTE_PERCPU
- bool
- depends on KALLSYMS
- default n
-
config KALLSYMS_BASE_RELATIVE
bool
depends on KALLSYMS
@@ -1793,11 +1788,7 @@ config KALLSYMS_BASE_RELATIVE
help
Instead of emitting them as absolute values in the native word size,
emit the symbol references in the kallsyms table as 32-bit entries,
- each containing a relative value in the range [base, base + U32_MAX]
- or, when KALLSYMS_ABSOLUTE_PERCPU is in effect, each containing either
- an absolute value in the range [0, S32_MAX] or a relative value in the
- range [base, base + S32_MAX], where base is the lowest relative symbol
- address encountered in the image.
+ each containing a relative value in the range [base, base + U32_MAX].

On 64-bit builds, this reduces the size of the address table by 50%,
but more importantly, it results in entries whose values are build
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 18edd57b5fe8..f4e8e531052a 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -151,16 +151,8 @@ unsigned long kallsyms_sym_address(int idx)
if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
return kallsyms_addresses[idx];

- /* values are unsigned offsets if --absolute-percpu is not in effect */
- if (!IS_ENABLED(CONFIG_KALLSYMS_ABSOLUTE_PERCPU))
- return kallsyms_relative_base + (u32)kallsyms_offsets[idx];
-
- /* ...otherwise, positive offsets are absolute values */
- if (kallsyms_offsets[idx] >= 0)
- return kallsyms_offsets[idx];
-
- /* ...and negative offsets are relative to kallsyms_relative_base - 1 */
- return kallsyms_relative_base - 1 - kallsyms_offsets[idx];
+ /* values are unsigned offsets */
+ return kallsyms_relative_base + (u32)kallsyms_offsets[idx];
}

static void cleanup_symbol_name(char *s)
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 653b92f6d4c8..501f978abf4b 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -5,8 +5,8 @@
* This software may be used and distributed according to the terms
* of the GNU General Public License, incorporated herein by reference.
*
- * Usage: kallsyms [--all-symbols] [--absolute-percpu]
- * [--base-relative] [--lto-clang] in.map > out.S
+ * Usage: kallsyms [--all-symbols] [--base-relative] [--lto-clang]
+ * in.map > out.S
*
* Table compression uses all the unused char codes on the symbols and
* maps these to the most used substrings (tokens). For instance, it might
@@ -37,7 +37,6 @@ struct sym_entry {
unsigned int len;
unsigned int seq;
unsigned int start_pos;
- unsigned int percpu_absolute;
unsigned char sym[];
};

@@ -55,14 +54,9 @@ static struct addr_range text_ranges[] = {
#define text_range_text (&text_ranges[0])
#define text_range_inittext (&text_ranges[1])

-static struct addr_range percpu_range = {
- "__per_cpu_start", "__per_cpu_end", -1ULL, 0
-};
-
static struct sym_entry **table;
static unsigned int table_size, table_cnt;
static int all_symbols;
-static int absolute_percpu;
static int base_relative;
static int lto_clang;

@@ -75,7 +69,7 @@ static unsigned char best_table_len[256];

static void usage(void)
{
- fprintf(stderr, "Usage: kallsyms [--all-symbols] [--absolute-percpu] "
+ fprintf(stderr, "Usage: kallsyms [--all-symbols] "
"[--base-relative] [--lto-clang] in.map > out.S\n");
exit(1);
}
@@ -167,7 +161,6 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len)
return NULL;

check_symbol_range(name, addr, text_ranges, ARRAY_SIZE(text_ranges));
- check_symbol_range(name, addr, &percpu_range, 1);

/* include the type field in the symbol name, so that it gets
* compressed together */
@@ -183,7 +176,6 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len)
sym->len = len;
sym->sym[0] = type;
strcpy(sym_name(sym), name);
- sym->percpu_absolute = 0;

return sym;
}
@@ -334,11 +326,6 @@ static int expand_symbol(const unsigned char *data, int len, char *result)
return total;
}

-static int symbol_absolute(const struct sym_entry *s)
-{
- return s->percpu_absolute;
-}
-
static void cleanup_symbol_name(char *s)
{
char *p;
@@ -499,30 +486,17 @@ static void write_src(void)
*/

long long offset;
- int overflow;
-
- if (!absolute_percpu) {
- offset = table[i]->addr - relative_base;
- overflow = (offset < 0 || offset > UINT_MAX);
- } else if (symbol_absolute(table[i])) {
- offset = table[i]->addr;
- overflow = (offset < 0 || offset > INT_MAX);
- } else {
- offset = relative_base - table[i]->addr - 1;
- overflow = (offset < INT_MIN || offset >= 0);
- }
- if (overflow) {
+
+ offset = table[i]->addr - relative_base;
+ if (offset < 0 || offset > UINT_MAX) {
fprintf(stderr, "kallsyms failure: "
- "%s symbol value %#llx out of range in relative mode\n",
- symbol_absolute(table[i]) ? "absolute" : "relative",
+ "symbol value %#llx out of range in relative mode\n",
table[i]->addr);
exit(EXIT_FAILURE);
}
printf("\t.long\t%#x /* %s */\n", (int)offset, table[i]->sym);
- } else if (!symbol_absolute(table[i])) {
- output_address(table[i]->addr);
} else {
- printf("\tPTR\t%#llx\n", table[i]->addr);
+ output_address(table[i]->addr);
}
}
printf("\n");
@@ -775,36 +749,15 @@ static void sort_symbols(void)
qsort(table, table_cnt, sizeof(table[0]), compare_symbols);
}

-static void make_percpus_absolute(void)
-{
- unsigned int i;
-
- for (i = 0; i < table_cnt; i++)
- if (symbol_in_range(table[i], &percpu_range, 1)) {
- /*
- * Keep the 'A' override for percpu symbols to
- * ensure consistent behavior compared to older
- * versions of this tool.
- */
- table[i]->sym[0] = 'A';
- table[i]->percpu_absolute = 1;
- }
-}
-
-/* find the minimum non-absolute symbol address */
+/* find the minimum symbol address */
static void record_relative_base(void)
{
- unsigned int i;
-
- for (i = 0; i < table_cnt; i++)
- if (!symbol_absolute(table[i])) {
- /*
- * The table is sorted by address.
- * Take the first non-absolute symbol value.
- */
- relative_base = table[i]->addr;
- return;
- }
+ /*
+ * The table is sorted by address.
+ * Take the first symbol value.
+ */
+ if (table_cnt)
+ relative_base = table[0]->addr;
}

int main(int argc, char **argv)
@@ -812,7 +765,6 @@ int main(int argc, char **argv)
while (1) {
static const struct option long_options[] = {
{"all-symbols", no_argument, &all_symbols, 1},
- {"absolute-percpu", no_argument, &absolute_percpu, 1},
{"base-relative", no_argument, &base_relative, 1},
{"lto-clang", no_argument, &lto_clang, 1},
{},
@@ -831,8 +783,6 @@ int main(int argc, char **argv)

read_map(argv[optind]);
shrink_table();
- if (absolute_percpu)
- make_percpus_absolute();
sort_symbols();
if (base_relative)
record_relative_base();
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 7862a8101747..7b0596cd5a80 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -153,10 +153,6 @@ kallsyms()
kallsymopt="${kallsymopt} --all-symbols"
fi

- if is_enabled CONFIG_KALLSYMS_ABSOLUTE_PERCPU; then
- kallsymopt="${kallsymopt} --absolute-percpu"
- fi
-
if is_enabled CONFIG_KALLSYMS_BASE_RELATIVE; then
kallsymopt="${kallsymopt} --base-relative"
fi
--
2.44.0


2024-03-22 17:02:55

by Brian Gerst

[permalink] [raw]
Subject: [PATCH v4 06/16] objtool: Allow adding relocations to an existing section

In order to add relocations to existing sections (e.g. ".rela.text"),
encapsulate the reloc array in a block header to allow chaining blocks
to add more relocs without moving and relinking the existing ones.
This adds minimal memory overhead while still allowing relocations to be
accessed easily by index.
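
Roughly, the scheme looks like this (a standalone sketch with simplified
types and a plain next pointer instead of objtool's list_head; the real
structures and helpers are in the diff below):

#include <stdlib.h>

/* Simplified stand-ins for the objtool structures changed by this patch. */
struct reloc_block;

struct reloc {
	struct reloc_block *block;	/* replaces the old reloc->sec pointer */
	/* ... remaining fields elided ... */
};

struct reloc_block {
	struct reloc_block *next;	/* objtool chains blocks with a list_head */
	int start_idx;			/* global index of relocs[0] */
	int len;			/* entries in this block */
	struct reloc relocs[];
};

struct rsec {
	struct reloc_block *blocks;	/* head of the block chain */
	int num_relocs;
};

/*
 * Growing the section only appends a new block; existing reloc entries are
 * never moved, so pointers held in hash tables etc. remain valid.
 */
static struct reloc_block *add_block(struct rsec *sec, int n)
{
	struct reloc_block *b = calloc(1, sizeof(*b) + n * sizeof(struct reloc));
	struct reloc_block **p = &sec->blocks;
	int i;

	if (!b)
		return NULL;
	b->start_idx = sec->num_relocs;
	b->len = n;
	for (i = 0; i < n; i++)
		b->relocs[i].block = b;
	while (*p)
		p = &(*p)->next;
	*p = b;
	sec->num_relocs += n;
	return b;
}

/* Index lookup walks the chain, as get_reloc_by_index() does below. */
static struct reloc *reloc_by_index(struct rsec *sec, int idx)
{
	struct reloc_block *b;

	for (b = sec->blocks; b; b = b->next) {
		if (idx < b->len)
			return &b->relocs[idx];
		idx -= b->len;
	}
	return NULL;
}

int main(void)
{
	struct rsec sec = { 0 };

	add_block(&sec, 4);	/* relocs read from the input file */
	add_block(&sec, 2);	/* relocs added later by objtool */

	return reloc_by_index(&sec, 5) ? 0 : 1;
}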

Signed-off-by: Brian Gerst <[email protected]>
---
tools/objtool/check.c | 2 +-
tools/objtool/elf.c | 99 +++++++++++++++++++++++------
tools/objtool/include/objtool/elf.h | 84 +++++++++++++++++++-----
3 files changed, 148 insertions(+), 37 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index eb7e12ebc1d0..0a2c161fc04d 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -4421,7 +4421,7 @@ static int validate_ibt_data_reloc(struct objtool_file *file,
return 0;

WARN_FUNC("data relocation to !ENDBR: %s",
- reloc->sec->base, reloc_offset(reloc),
+ reloc->block->sec->base, reloc_offset(reloc),
offstr(dest->sec, dest->offset));

return 1;
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 3d27983dc908..cfb970727c8a 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -277,7 +277,7 @@ struct reloc *find_reloc_by_dest_range(const struct elf *elf, struct section *se
for_offset_range(o, offset, offset + len) {
elf_hash_for_each_possible(reloc, reloc, hash,
sec_offset_hash(rsec, o)) {
- if (reloc->sec != rsec)
+ if (reloc->block->sec != rsec)
continue;

if (reloc_offset(reloc) >= offset &&
@@ -333,6 +333,7 @@ static int read_sections(struct elf *elf)
sec = &elf->section_data[i];

INIT_LIST_HEAD(&sec->symbol_list);
+ INIT_LIST_HEAD(&sec->reloc_list);

s = elf_getscn(elf->elf, i);
if (!s) {
@@ -850,7 +851,7 @@ static struct reloc *elf_init_reloc(struct elf *elf, struct section *rsec,
unsigned long offset, struct symbol *sym,
s64 addend, unsigned int type)
{
- struct reloc *reloc, empty = { 0 };
+ struct reloc *reloc;

if (reloc_idx >= sec_num_entries(rsec)) {
WARN("%s: bad reloc_idx %u for %s with %d relocs",
@@ -858,15 +859,18 @@ static struct reloc *elf_init_reloc(struct elf *elf, struct section *rsec,
return NULL;
}

- reloc = &rsec->relocs[reloc_idx];
+ reloc = get_reloc_by_index(rsec, reloc_idx);
+ if (!reloc) {
+ WARN("%s: %s: reloc %d out of range!",
+ __func__, rsec->name, reloc_idx);
+ return NULL;
+ }

- if (memcmp(reloc, &empty, sizeof(empty))) {
+ if (reloc->sym) {
WARN("%s: %s: reloc %d already initialized!",
__func__, rsec->name, reloc_idx);
return NULL;
}
-
- reloc->sec = rsec;
reloc->sym = sym;

set_reloc_offset(elf, reloc, offset);
@@ -930,19 +934,45 @@ struct reloc *elf_init_reloc_data_sym(struct elf *elf, struct section *sec,
elf_data_rela_type(elf));
}

+static struct reloc_block *alloc_reloc_block(struct section *rsec, size_t num_relocs)
+{
+ struct reloc_block *block;
+ size_t block_size = sizeof(struct reloc_block) + num_relocs * sizeof(struct reloc);
+ int i;
+
+ block = malloc(block_size);
+ if (!block) {
+ perror("malloc");
+ return NULL;
+ }
+
+ memset(block, 0, block_size);
+ INIT_LIST_HEAD(&block->list);
+ block->sec = rsec;
+ block->start_idx = rsec->num_relocs;
+ block->len = num_relocs;
+
+ for (i = 0; i < num_relocs; i++)
+ block->relocs[i].block = block;
+
+ rsec->num_relocs += num_relocs;
+ list_add_tail(&block->list, &rsec->reloc_list);
+
+ return block;
+}
+
static int read_relocs(struct elf *elf)
{
unsigned long nr_reloc, max_reloc = 0;
struct section *rsec;
- struct reloc *reloc;
- unsigned int symndx;
- struct symbol *sym;
int i;

if (!elf_alloc_hash(reloc, elf->num_relocs))
return -1;

list_for_each_entry(rsec, &elf->sections, list) {
+ struct reloc_block *block;
+
if (!is_reloc_sec(rsec))
continue;

@@ -956,15 +986,15 @@ static int read_relocs(struct elf *elf)
rsec->base->rsec = rsec;

nr_reloc = 0;
- rsec->relocs = calloc(sec_num_entries(rsec), sizeof(*reloc));
- if (!rsec->relocs) {
- perror("calloc");
+ block = alloc_reloc_block(rsec, sec_num_entries(rsec));
+ if (!block)
return -1;
- }
+
for (i = 0; i < sec_num_entries(rsec); i++) {
- reloc = &rsec->relocs[i];
+ struct reloc *reloc = &block->relocs[i];
+ struct symbol *sym;
+ unsigned int symndx;

- reloc->sec = rsec;
symndx = reloc_sym(reloc);
reloc->sym = sym = find_symbol_by_index(elf, symndx);
if (!reloc->sym) {
@@ -1100,6 +1130,7 @@ struct section *elf_create_section(struct elf *elf, const char *name,
memset(sec, 0, sizeof(*sec));

INIT_LIST_HEAD(&sec->symbol_list);
+ INIT_LIST_HEAD(&sec->reloc_list);

s = elf_newscn(elf->elf);
if (!s) {
@@ -1170,6 +1201,7 @@ static struct section *elf_create_rela_section(struct elf *elf,
unsigned int reloc_nr)
{
struct section *rsec;
+ struct reloc_block *block;
char *rsec_name;

rsec_name = malloc(strlen(sec->name) + strlen(".rela") + 1);
@@ -1192,11 +1224,9 @@ static struct section *elf_create_rela_section(struct elf *elf,
rsec->sh.sh_info = sec->idx;
rsec->sh.sh_flags = SHF_INFO_LINK;

- rsec->relocs = calloc(sec_num_entries(rsec), sizeof(struct reloc));
- if (!rsec->relocs) {
- perror("calloc");
+ block = alloc_reloc_block(rsec, sec_num_entries(rsec));
+ if (!block)
return NULL;
- }

sec->rsec = rsec;
rsec->base = sec;
@@ -1204,6 +1234,37 @@ static struct section *elf_create_rela_section(struct elf *elf,
return rsec;
}

+int elf_extend_rela_section(struct elf *elf,
+ struct section *rsec,
+ int add_relocs)
+{
+ int newnr = sec_num_entries(rsec) + add_relocs;
+ size_t oldsize = rsec->sh.sh_size;
+ size_t newsize = newnr * rsec->sh.sh_entsize;
+ void *buf;
+ struct reloc_block *block;
+
+ buf = realloc(rsec->data->d_buf, newnr * rsec->sh.sh_entsize);
+ if (!buf) {
+ perror("realloc");
+ return -1;
+ }
+
+ memset(buf + oldsize, 0, newsize - oldsize);
+
+ rsec->data->d_size = newsize;
+ rsec->data->d_buf = buf;
+ rsec->sh.sh_size = newsize;
+
+ mark_sec_changed(elf, rsec, true);
+
+ block = alloc_reloc_block(rsec, add_relocs);
+ if (!block)
+ return -1;
+
+ return 0;
+}
+
struct section *elf_create_section_pair(struct elf *elf, const char *name,
size_t entsize, unsigned int nr,
unsigned int reloc_nr)
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index 9f71e988eca4..7851467f6878 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -43,7 +43,8 @@ struct section {
char *name;
int idx;
bool _changed, text, rodata, noinstr, init, truncate;
- struct reloc *relocs;
+ struct list_head reloc_list;
+ int num_relocs;
};

struct symbol {
@@ -71,13 +72,23 @@ struct symbol {
struct reloc *relocs;
};

+struct reloc_block;
+
struct reloc {
struct elf_hash_node hash;
- struct section *sec;
+ struct reloc_block *block;
struct symbol *sym;
struct reloc *sym_next_reloc;
};

+struct reloc_block {
+ struct list_head list;
+ struct section *sec;
+ int start_idx;
+ int len;
+ struct reloc relocs[0];
+};
+
struct elf {
Elf *elf;
GElf_Ehdr ehdr;
@@ -108,6 +119,11 @@ struct elf *elf_open_read(const char *name, int flags);

struct section *elf_create_section(struct elf *elf, const char *name,
size_t entsize, unsigned int nr);
+
+int elf_extend_rela_section(struct elf *elf,
+ struct section *rsec,
+ int add_relocs);
+
struct section *elf_create_section_pair(struct elf *elf, const char *name,
size_t entsize, unsigned int nr,
unsigned int reloc_nr);
@@ -197,12 +213,12 @@ static inline unsigned int sec_num_entries(struct section *sec)

static inline unsigned int reloc_idx(struct reloc *reloc)
{
- return reloc - reloc->sec->relocs;
+ return reloc->block->start_idx + (reloc - &reloc->block->relocs[0]);
}

static inline void *reloc_rel(struct reloc *reloc)
{
- struct section *rsec = reloc->sec;
+ struct section *rsec = reloc->block->sec;

return rsec->data->d_buf + (reloc_idx(reloc) * rsec->sh.sh_entsize);
}
@@ -215,7 +231,7 @@ static inline bool is_32bit_reloc(struct reloc *reloc)
* Elf64_Rel: 16 bytes
* Elf64_Rela: 24 bytes
*/
- return reloc->sec->sh.sh_entsize < 16;
+ return reloc->block->sec->sh.sh_entsize < 16;
}

#define __get_reloc_field(reloc, field) \
@@ -241,7 +257,7 @@ static inline u64 reloc_offset(struct reloc *reloc)
static inline void set_reloc_offset(struct elf *elf, struct reloc *reloc, u64 offset)
{
__set_reloc_field(reloc, r_offset, offset);
- mark_sec_changed(elf, reloc->sec, true);
+ mark_sec_changed(elf, reloc->block->sec, true);
}

static inline s64 reloc_addend(struct reloc *reloc)
@@ -252,7 +268,7 @@ static inline s64 reloc_addend(struct reloc *reloc)
static inline void set_reloc_addend(struct elf *elf, struct reloc *reloc, s64 addend)
{
__set_reloc_field(reloc, r_addend, addend);
- mark_sec_changed(elf, reloc->sec, true);
+ mark_sec_changed(elf, reloc->block->sec, true);
}


@@ -282,7 +298,7 @@ static inline void set_reloc_sym(struct elf *elf, struct reloc *reloc, unsigned

__set_reloc_field(reloc, r_info, info);

- mark_sec_changed(elf, reloc->sec, true);
+ mark_sec_changed(elf, reloc->block->sec, true);
}
static inline void set_reloc_type(struct elf *elf, struct reloc *reloc, unsigned int type)
{
@@ -292,7 +308,46 @@ static inline void set_reloc_type(struct elf *elf, struct reloc *reloc, unsigned

__set_reloc_field(reloc, r_info, info);

- mark_sec_changed(elf, reloc->sec, true);
+ mark_sec_changed(elf, reloc->block->sec, true);
+}
+
+static inline struct reloc *get_reloc_by_index(struct section *rsec, int idx)
+{
+ struct reloc_block *block;
+
+ list_for_each_entry(block, &rsec->reloc_list, list) {
+ if (idx < block->len)
+ return &block->relocs[idx];
+ idx -= block->len;
+ }
+
+ return NULL;
+}
+
+static inline struct reloc *first_reloc(struct section *sec)
+{
+ struct reloc_block *block;
+
+ if (list_empty(&sec->reloc_list))
+ return NULL;
+
+ block = list_first_entry(&sec->reloc_list, struct reloc_block, list);
+ return &block->relocs[0];
+}
+
+static inline struct reloc *next_reloc(struct reloc *reloc)
+{
+ struct reloc_block *block = reloc->block;
+
+ reloc++;
+ if (reloc < &block->relocs[block->len])
+ return reloc;
+
+ if (list_is_last(&block->list, &block->sec->reloc_list))
+ return NULL;
+
+ block = list_next_entry(block, list);
+ return &block->relocs[0];
}

#define for_each_sec(file, sec) \
@@ -308,15 +363,10 @@ static inline void set_reloc_type(struct elf *elf, struct reloc *reloc, unsigned
sec_for_each_sym(__sec, sym)

#define for_each_reloc(rsec, reloc) \
- for (int __i = 0, __fake = 1; __fake; __fake = 0) \
- for (reloc = rsec->relocs; \
- __i < sec_num_entries(rsec); \
- __i++, reloc++)
+ for (reloc = first_reloc(rsec); reloc; reloc = next_reloc(reloc))

#define for_each_reloc_from(rsec, reloc) \
- for (int __i = reloc_idx(reloc); \
- __i < sec_num_entries(rsec); \
- __i++, reloc++)
+ for (; reloc; reloc = next_reloc(reloc))

#define OFFSET_STRIDE_BITS 4
#define OFFSET_STRIDE (1UL << OFFSET_STRIDE_BITS)
@@ -344,7 +394,7 @@ static inline u32 sec_offset_hash(struct section *sec, unsigned long offset)

static inline u32 reloc_hash(struct reloc *reloc)
{
- return sec_offset_hash(reloc->sec, reloc_offset(reloc));
+ return sec_offset_hash(reloc->block->sec, reloc_offset(reloc));
}

#endif /* _OBJTOOL_ELF_H */
--
2.44.0


2024-03-22 17:03:31

by Brian Gerst

[permalink] [raw]
Subject: [PATCH v4 04/16] x86/pvh: Use fixed_percpu_data for early boot GSBASE

Instead of having a private area for the stack canary, use
fixed_percpu_data for GSBASE like the native kernel.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/platform/pvh/head.S | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index f7235ef87bc3..1f1c3230b27b 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -95,10 +95,15 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
/* 64-bit entry point. */
.code64
1:
- /* Set base address in stack canary descriptor. */
+ /*
+ * Set up GSBASE.
+ * Note that, on SMP, the boot cpu uses init data section until
+ * the per cpu areas are set up.
+ */
mov $MSR_GS_BASE,%ecx
- mov $_pa(canary), %eax
- xor %edx, %edx
+ lea INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
+ mov %edx, %eax
+ shr $32, %rdx
wrmsr

call xen_prepare_pvh
@@ -157,8 +162,6 @@ SYM_DATA_START_LOCAL(gdt_start)
SYM_DATA_END_LABEL(gdt_start, SYM_L_LOCAL, gdt_end)

.balign 16
-SYM_DATA_LOCAL(canary, .fill 48, 1, 0)
-
SYM_DATA_START_LOCAL(early_stack)
.fill BOOT_STACK_SIZE, 1, 0
SYM_DATA_END_LABEL(early_stack, SYM_L_LOCAL, early_stack_end)
--
2.44.0


2024-03-22 17:03:50

by Brian Gerst

[permalink] [raw]
Subject: [PATCH v4 15/16] percpu: Remove __per_cpu_load

__per_cpu_load is now always equal to __per_cpu_start.

Signed-off-by: Brian Gerst <[email protected]>
---
include/asm-generic/sections.h | 2 +-
include/asm-generic/vmlinux.lds.h | 1 -
mm/percpu.c | 4 ++--
3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
index c768de6f19a9..0755bc39b0d8 100644
--- a/include/asm-generic/sections.h
+++ b/include/asm-generic/sections.h
@@ -39,7 +39,7 @@ extern char __init_begin[], __init_end[];
extern char _sinittext[], _einittext[];
extern char __start_ro_after_init[], __end_ro_after_init[];
extern char _end[];
-extern char __per_cpu_load[], __per_cpu_start[], __per_cpu_end[];
+extern char __per_cpu_start[], __per_cpu_end[];
extern char __kprobes_text_start[], __kprobes_text_end[];
extern char __entry_text_start[], __entry_text_end[];
extern char __start_rodata[], __end_rodata[];
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 430b2ed2aa58..a67452ea94d3 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -1052,7 +1052,6 @@
#define PERCPU_SECTION(cacheline) \
. = ALIGN(PAGE_SIZE); \
.data..percpu : AT(ADDR(.data..percpu) - LOAD_OFFSET) { \
- __per_cpu_load = .; \
PERCPU_INPUT(cacheline) \
}

diff --git a/mm/percpu.c b/mm/percpu.c
index 4e11fc1e6def..a584b5a78f72 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -3153,7 +3153,7 @@ int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size,
continue;
}
/* copy and return the unused part */
- memcpy(ptr, __per_cpu_load, ai->static_size);
+ memcpy(ptr, __per_cpu_start, ai->static_size);
pcpu_fc_free(ptr + size_sum, ai->unit_size - size_sum);
}
}
@@ -3336,7 +3336,7 @@ int __init pcpu_page_first_chunk(size_t reserved_size, pcpu_fc_cpu_to_node_fn_t
flush_cache_vmap_early(unit_addr, unit_addr + ai->unit_size);

/* copy static data */
- memcpy((void *)unit_addr, __per_cpu_load, ai->static_size);
+ memcpy((void *)unit_addr, __per_cpu_start, ai->static_size);
}

/* we're ready, commit */
--
2.44.0


2024-03-22 17:04:12

by Brian Gerst

[permalink] [raw]
Subject: [PATCH v4 14/16] percpu: Remove PERCPU_VADDR()

x86-64 was the last user.

Signed-off-by: Brian Gerst <[email protected]>
---
include/asm-generic/vmlinux.lds.h | 41 ++-----------------------------
1 file changed, 2 insertions(+), 39 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 74f169772778..430b2ed2aa58 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -1044,47 +1044,10 @@
__per_cpu_end = .;

/**
- * PERCPU_VADDR - define output section for percpu area
+ * PERCPU_SECTION - define output section for percpu area
* @cacheline: cacheline size
- * @vaddr: explicit base address (optional)
- * @phdr: destination PHDR (optional)
*
- * Macro which expands to output section for percpu area.
- *
- * @cacheline is used to align subsections to avoid false cacheline
- * sharing between subsections for different purposes.
- *
- * If @vaddr is not blank, it specifies explicit base address and all
- * percpu symbols will be offset from the given address. If blank,
- * @vaddr always equals @laddr + LOAD_OFFSET.
- *
- * @phdr defines the output PHDR to use if not blank. Be warned that
- * output PHDR is sticky. If @phdr is specified, the next output
- * section in the linker script will go there too. @phdr should have
- * a leading colon.
- *
- * Note that this macros defines __per_cpu_load as an absolute symbol.
- * If there is no need to put the percpu section at a predetermined
- * address, use PERCPU_SECTION.
- */
-#define PERCPU_VADDR(cacheline, vaddr, phdr) \
- __per_cpu_load = .; \
- .data..percpu vaddr : AT(__per_cpu_load - LOAD_OFFSET) { \
- PERCPU_INPUT(cacheline) \
- } phdr \
- . = __per_cpu_load + SIZEOF(.data..percpu);
-
-/**
- * PERCPU_SECTION - define output section for percpu area, simple version
- * @cacheline: cacheline size
- *
- * Align to PAGE_SIZE and outputs output section for percpu area. This
- * macro doesn't manipulate @vaddr or @phdr and __per_cpu_load and
- * __per_cpu_start will be identical.
- *
- * This macro is equivalent to ALIGN(PAGE_SIZE); PERCPU_VADDR(@cacheline,,)
- * except that __per_cpu_load is defined as a relative symbol against
- * .data..percpu which is required for relocatable x86_32 configuration.
+ * Align to PAGE_SIZE and output the section for the percpu area.
*/
#define PERCPU_SECTION(cacheline) \
. = ALIGN(PAGE_SIZE); \
--
2.44.0


2024-03-23 11:39:21

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
>
> Currently, x86-64 uses an unusual percpu layout, where the percpu section
> is linked at absolute address 0. The reason behind this is that older GCC
> versions placed the stack protector (if enabled) at a fixed offset from the
> GS segment base. Since the GS segement is also used for percpu variables,
> this forced the current layout.
>
> GCC since version 8.1 supports a configurable location for the stack
> protector value, which allows removal of the restriction on how the percpu
> section is linked. This allows the percpu section to be linked normally,
> like other architectures. In turn, this allows removal of code that was
> needed to support the zero-based percpu section.

The number of simplifications throughout the code, enabled by this
patch set, is really impressive, and it reflects the number of
workarounds needed to enable a feature that was not originally designed
for kernel usage. As noted above, this issue was recognized in the GCC
compiler, and the stack protector support was generalized by adding a
configurable location for the stack protector value [1,2].

The improved stack protector support was implemented in gcc-8.1,
released on May 2, 2018, when linux 4.17 was in development. In light
of this fact, and 5 (soon 6) GCC major releases later, I'd like to ask
if the objtool support to fixup earlier compilers is really necessary.
Please note that years ago x86_32 simply dropped stack protector
support with earlier compilers and IMO, we should follow this example
also with x86_64, because:

a) There are currently 5 (soon 6) GCC major releases that support
configurable location for stack protector value. GCC 10 is already out
of active maintenance, and GCC 7 is considered an ancient release at
this time. For x86_32, it was advised to drop the support for stack
protector entirely with too old compilers to somehow force users to
upgrade the compiler.

b) Stack protector is not a core feature - the kernel will still boot
without stack protector. So, if someone really has the urge to use
ancient compilers with the bleeding edge kernel, it is still possible
to create a bootable image. I do not think using ancient compilers to
compile bleeding edge kernels makes any sense at all.

c) Maintenance burden - an objtool feature will have to be maintained
until gcc-8.1 is the minimum required compiler version. This feature
will IMO be seldom used and thus prone to bitrot.

d) Discrepancy between x86_32 and x86_64 - either both targets should
use objtool fixups for stack protector, or none at all. As shown by
x86_32 approach in the past, removing stack protector support with
ancient compilers was not problematic at all.

That said, the whole series is heartily Acked-by: Uros Bizjak
<[email protected]>

[1] https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=e1769bdd4cef522ada32aec863feba41116b183a
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81708

Thanks,
Uros.

>
> v4:
> - Updated to current tip tree
> - Added two new cleanups made possible by the removal of IA-64.
> - Small improvements to the objtool conversion code.
>
> Brian Gerst (16):
> x86/stackprotector/32: Remove stack protector test script
> x86/stackprotector/64: Remove stack protector test script
> x86/boot: Disable stack protector for early boot code
> x86/pvh: Use fixed_percpu_data for early boot GSBASE
> x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations
> objtool: Allow adding relocations to an existing section
> objtool: Convert fixed location stack protector accesses
> x86/stackprotector/64: Convert to normal percpu variable
> x86/percpu/64: Use relative percpu offsets
> x86/percpu/64: Remove fixed_percpu_data
> x86/boot/64: Remove inverse relocations
> x86/percpu/64: Remove INIT_PER_CPU macros
> percpu: Remove PER_CPU_FIRST_SECTION
> percpu: Remove PERCPU_VADDR()
> percpu: Remove __per_cpu_load
> kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU
>
> arch/x86/Kconfig | 16 +--
> arch/x86/Makefile | 21 ++--
> arch/x86/boot/compressed/misc.c | 14 +--
> arch/x86/entry/entry_64.S | 2 +-
> arch/x86/include/asm/desc.h | 1 -
> arch/x86/include/asm/percpu.h | 22 ----
> arch/x86/include/asm/processor.h | 28 +----
> arch/x86/include/asm/stackprotector.h | 36 +-----
> arch/x86/kernel/Makefile | 2 +
> arch/x86/kernel/asm-offsets_64.c | 6 -
> arch/x86/kernel/cpu/common.c | 9 +-
> arch/x86/kernel/head64.c | 2 +-
> arch/x86/kernel/head_64.S | 20 ++-
> arch/x86/kernel/irq_64.c | 1 -
> arch/x86/kernel/setup_percpu.c | 12 +-
> arch/x86/kernel/vmlinux.lds.S | 35 ------
> arch/x86/platform/pvh/head.S | 10 +-
> arch/x86/tools/relocs.c | 143 ++--------------------
> arch/x86/xen/xen-head.S | 10 +-
> include/asm-generic/sections.h | 2 +-
> include/asm-generic/vmlinux.lds.h | 43 +------
> include/linux/percpu-defs.h | 12 --
> init/Kconfig | 11 +-
> kernel/kallsyms.c | 12 +-
> mm/percpu.c | 4 +-
> scripts/Makefile.lib | 2 +
> scripts/gcc-x86_32-has-stack-protector.sh | 8 --
> scripts/gcc-x86_64-has-stack-protector.sh | 4 -
> scripts/kallsyms.c | 80 +++---------
> scripts/link-vmlinux.sh | 4 -
> tools/objtool/arch/x86/decode.c | 46 +++++++
> tools/objtool/arch/x86/special.c | 91 ++++++++++++++
> tools/objtool/builtin-check.c | 9 +-
> tools/objtool/check.c | 14 ++-
> tools/objtool/elf.c | 133 ++++++++++++++++----
> tools/objtool/include/objtool/arch.h | 3 +
> tools/objtool/include/objtool/builtin.h | 2 +
> tools/objtool/include/objtool/elf.h | 90 +++++++++++---
> 38 files changed, 442 insertions(+), 518 deletions(-)
> delete mode 100755 scripts/gcc-x86_32-has-stack-protector.sh
> delete mode 100755 scripts/gcc-x86_64-has-stack-protector.sh
>
>
> base-commit: 30052fd948a3b43506c83590eaaada12d1f2dd09
> --
> 2.44.0
>

2024-03-23 13:23:11

by Brian Gerst

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Sat, Mar 23, 2024 at 7:39 AM Uros Bizjak <[email protected]> wrote:
>
> On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
> >
> > Currently, x86-64 uses an unusual percpu layout, where the percpu section
> > is linked at absolute address 0. The reason behind this is that older GCC
> > versions placed the stack protector (if enabled) at a fixed offset from the
> > GS segment base. Since the GS segement is also used for percpu variables,
> > this forced the current layout.
> >
> > GCC since version 8.1 supports a configurable location for the stack
> > protector value, which allows removal of the restriction on how the percpu
> > section is linked. This allows the percpu section to be linked normally,
> > like other architectures. In turn, this allows removal of code that was
> > needed to support the zero-based percpu section.
>
> The number of simplifications throughout the code, enabled by this
> patch set, is really impressive, and it reflects the number of
> workarounds needed to enable a feature that was not originally designed
> for kernel usage. As noted above, this issue was recognized in the GCC
> compiler, and the stack protector support was generalized by adding a
> configurable location for the stack protector value [1,2].
>
> The improved stack protector support was implemented in gcc-8.1,
> released on May 2, 2018, when linux 4.17 was in development. In light
> of this fact, and 5 (soon 6) GCC major releases later, I'd like to ask
> if the objtool support to fixup earlier compilers is really necessary.
> Please note that years ago x86_32 simply dropped stack protector
> support with earlier compilers and IMO, we should follow this example
> also with x86_64, because:
>
> a) There are currently 5 (soon 6) GCC major releases that support
> configurable location for stack protector value. GCC 10 is already out
> of active maintenance, and GCC 7 is considered an ancient release at
> this time. For x86_32, it was advised to drop the support for stack
> protector entirely with too old compilers to somehow force users to
> upgrade the compiler.

At least one developer claimed to be using an older compiler. I asked
for more details on why, but got no response.

> b) Stack protector is not a core feature - the kernel will still boot
> without stack protector. So, if someone really has the urge to use
> ancient compilers with the bleeding edge kernel, it is still possible
> to create a bootable image. I do not think using ancient compilers to
> compile bleeding edge kernels makes any sense at all.

One small issue is that Kconfig would silently disable stackprotector
if the compiler doesn't support the new options. That said, the
number of people that this would affect is very small, as just about
any modern distribution ships a compiler newer than 8.1.

I'm all in favor of only supporting compilers that are supported upstream.

> c) Maintenance burden - an objtool feature will have to be maintained
> until gcc-8.1 is the minimum required compiler version. This feature
> will IMO be seldom used and thus prone to bitrot.

That's the reason I added a config option to allow testing objtool
even if the compiler has support.

> d) Discrepancy between x86_32 and x86_64 - either both targets should
> use objtool fixups for stack protector, or none at all. As shown by
> x86_32 approach in the past, removing stack protector support with
> ancient compilers was not problematic at all.

Objtool doesn't support x86-32, mostly because working with REL type
relocations is a pain.

> That said, the whole series is heartily Acked-by: Uros Bizjak
> <[email protected]>
>
> [1] https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=e1769bdd4cef522ada32aec863feba41116b183a
> [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81708
>
> Thanks,
> Uros.

Brian Gerst

2024-03-23 16:17:15

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Sat, 23 Mar 2024 at 06:23, Brian Gerst <[email protected]> wrote:
>
> One small issue is that Kconfig would silently disable stackprotector
> if the compiler doesn't support the new options. That said, the
> number of people that this would affect is very small, as just about
> any modern distribution ships a compiler newer than 8.1.

Yes, let's make the rule be that you can still compile the kernel with
gcc-5.1+, but you can't get stackprotector support unless you have
gcc-8.1+.

I'd hate to add the objtool support for an old compiler - this is a
hardening feature, not a core feature, and anybody who insists on old
compilers just won't get it.

And we have other cases like this where various debug features depend
on the gcc version, eg

config CC_HAS_WORKING_NOSANITIZE_ADDRESS
def_bool !CC_IS_GCC || GCC_VERSION >= 80300

so we could easily do the same for stack protector support.

And we might as well also do the semi-yearly compiler version review.
We raised the minimum to 4.9 almost four years ago, and then the jump
to 5.1 was first for arm64 due to a serious gcc code generation bug
and then globally in Sept 2021.

So it's probably time to think about that anyway,

That said, we don't actually have all that many gcc version checks
around any more, so I think the jump to 5.1 got rid of the worst of
the horrors. Most of the GCC_VERSION checks are either in gcc-plugins
(which we should just remove, imnsho - not the version checks, the
plugins entirely), or for various random minor details (warnign
enablement and the asm goto workaround).

So there doesn't seem to be a major reason to up the versioning, since
the stack protector thing can just be disabled for older versions.

But maybe even enterprise distros have upgraded anyway, and we should
be proactive.

Cc'ing Arnd, who has historically been one of the people pushing this.
He may no longer care because we haven't had huge issues.

Linus

2024-03-23 17:00:25

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v4 01/16] x86/stackprotector/32: Remove stack protector test script

On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
>
> Test for compiler support directly in Kconfig.
>
> Signed-off-by: Brian Gerst <[email protected]>

Reviewed-by: Uros Bizjak <[email protected]>

> ---
> arch/x86/Kconfig | 2 +-
> scripts/gcc-x86_32-has-stack-protector.sh | 8 --------
> 2 files changed, 1 insertion(+), 9 deletions(-)
> delete mode 100755 scripts/gcc-x86_32-has-stack-protector.sh
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 03483b23a009..f326903cbe67 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -412,7 +412,7 @@ config PGTABLE_LEVELS
> config CC_HAS_SANE_STACKPROTECTOR
> bool
> default $(success,$(srctree)/scripts/gcc-x86_64-has-stack-protector.sh $(CC) $(CLANG_FLAGS)) if 64BIT
> - default $(success,$(srctree)/scripts/gcc-x86_32-has-stack-protector.sh $(CC) $(CLANG_FLAGS))
> + default $(cc-option,-mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard)
> help
> We have to make sure stack protector is unconditionally disabled if
> the compiler produces broken code or if it does not let us control
> diff --git a/scripts/gcc-x86_32-has-stack-protector.sh b/scripts/gcc-x86_32-has-stack-protector.sh
> deleted file mode 100755
> index 825c75c5b715..000000000000
> --- a/scripts/gcc-x86_32-has-stack-protector.sh
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -#!/bin/sh
> -# SPDX-License-Identifier: GPL-2.0
> -
> -# This requires GCC 8.1 or better. Specifically, we require
> -# -mstack-protector-guard-reg, added by
> -# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81708
> -
> -echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m32 -O0 -fstack-protector -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard - -o - 2> /dev/null | grep -q "%fs"
> --
> 2.44.0
>

2024-03-23 17:01:28

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v4 02/16] x86/stackprotector/64: Remove stack protector test script

On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
>
> This test for the stack protector was added in 2006 to make sure the
> compiler had the PR28281 patch applied. With GCC 5.1 being the minimum
> supported compiler now, it is no longer necessary.
>
> Signed-off-by: Brian Gerst <[email protected]>

Reviewed-by: Uros Bizjak <[email protected]>

> ---
> arch/x86/Kconfig | 5 ++---
> scripts/gcc-x86_64-has-stack-protector.sh | 4 ----
> 2 files changed, 2 insertions(+), 7 deletions(-)
> delete mode 100755 scripts/gcc-x86_64-has-stack-protector.sh
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index f326903cbe67..88d72227e3cb 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -411,12 +411,11 @@ config PGTABLE_LEVELS
>
> config CC_HAS_SANE_STACKPROTECTOR
> bool
> - default $(success,$(srctree)/scripts/gcc-x86_64-has-stack-protector.sh $(CC) $(CLANG_FLAGS)) if 64BIT
> + default y if 64BIT
> default $(cc-option,-mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard)
> help
> We have to make sure stack protector is unconditionally disabled if
> - the compiler produces broken code or if it does not let us control
> - the segment on 32-bit kernels.
> + the compiler does not allow control of the segment and symbol.
>
> menu "Processor type and features"
>
> diff --git a/scripts/gcc-x86_64-has-stack-protector.sh b/scripts/gcc-x86_64-has-stack-protector.sh
> deleted file mode 100755
> index 75e4e22b986a..000000000000
> --- a/scripts/gcc-x86_64-has-stack-protector.sh
> +++ /dev/null
> @@ -1,4 +0,0 @@
> -#!/bin/sh
> -# SPDX-License-Identifier: GPL-2.0
> -
> -echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m64 -O0 -mcmodel=kernel -fno-PIE -fstack-protector - -o - 2> /dev/null | grep -q "%gs"
> --
> 2.44.0
>

2024-03-23 17:07:01

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Sat, 23 Mar 2024 at 09:16, Linus Torvalds
<[email protected]> wrote:
>
> And we might as well also do the semi-yearly compiler version review.
> We raised the minimum to 4.9 almost four years ago, and then the jump
> to 5.1 was first for arm64 due to a serious gcc code generation bug
> and then globally in Sept 2021.

Looking at RHEL, I find a page that claims

RHEL9 : gcc 11.x in app stream
RHEL8 : gcc 8.x or gcc 9.x in app stream.
RHEL7 : gcc 4.8.x

so RHEL7 is already immaterial from a kernel compiler standpoint, and
so it looks like at least as far as RHEL is concerned, we could just
jump to gcc 8.1 as a minimum.

RHEL also has a "Developer Toolset" that allows you to pick a compiler
upgrade, so it's not *quite* as black-and-white as that, but it does
seem like we could at some point just pick gcc-8 as a new minimum with
very little pain on that front.

The SLES situation seems somewhat similar, with SLES12 being 4.8.x and
SLES15 being 7.3. But again with a "Development Tools Module" setup.
So that *might* argue for 7.3.

I can't make sense of Debian releases. There's "stable" (bookworm)
that comes with gcc-12.2, but there's oldstable, oldoldstable, and
various "archived" releases still under LTS. I can't even begin to
guess what may be relevant.

I don't think we care that deeply on the kernel side, other than a
"maybe we should be a bit more proactive about raising gcc version
requirements". I don't think we have any huge issues right now with
old gcc versions.

Linus

2024-03-23 17:12:05

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v4 08/16] x86/stackprotector/64: Convert to normal percpu variable

On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
>
> Older versions of GCC fixed the location of the stack protector canary
> at %gs:40. This constraint forced the percpu section to be linked at
> virtual address 0 so that the canary could be the first data object in
> the percpu section. Supporting the zero-based percpu section requires
> additional code to handle relocations for RIP-relative references to
> percpu data, extra complexity to kallsyms, and workarounds for linker
> bugs due to the use of absolute symbols.
>
> Use compiler options to redefine the stack protector location if
> supported, otherwise use objtool. This will remove the constraint that
> the percpu section must be zero-based.
>
> Signed-off-by: Brian Gerst <[email protected]>

HAVE_STACKPROTECTOR change is not needed without STACKPROTECTOR_OBJTOOL support.

With the above remark:

Reviewed-by: Uros Bizjak <[email protected]>

> ---
> arch/x86/Kconfig | 11 ++++----
> arch/x86/Makefile | 21 ++++++++++------
> arch/x86/entry/entry_64.S | 2 +-
> arch/x86/include/asm/processor.h | 16 ++----------
> arch/x86/include/asm/stackprotector.h | 36 ++++-----------------------
> arch/x86/kernel/asm-offsets_64.c | 6 -----
> arch/x86/kernel/cpu/common.c | 5 +---
> arch/x86/kernel/head_64.S | 3 +--
> arch/x86/xen/xen-head.S | 3 +--
> 9 files changed, 30 insertions(+), 73 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 121cfb9ffc0e..3dbefdb8a5d6 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -271,7 +271,7 @@ config X86
> select HAVE_FUNCTION_ARG_ACCESS_API
> select HAVE_SETUP_PER_CPU_AREA
> select HAVE_SOFTIRQ_ON_OWN_STACK
> - select HAVE_STACKPROTECTOR if CC_HAS_SANE_STACKPROTECTOR
> + select HAVE_STACKPROTECTOR if X86_64 || CC_HAS_SANE_STACKPROTECTOR
> select HAVE_STACK_VALIDATION if HAVE_OBJTOOL
> select HAVE_STATIC_CALL
> select HAVE_STATIC_CALL_INLINE if HAVE_OBJTOOL
> @@ -411,15 +411,14 @@ config PGTABLE_LEVELS
>
> config CC_HAS_SANE_STACKPROTECTOR
> bool
> - default y if 64BIT
> + default $(cc-option,-mstack-protector-guard-reg=gs -mstack-protector-guard-symbol=__stack_chk_guard) if 64BIT
> default $(cc-option,-mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard)
> - help
> - We have to make sure stack protector is unconditionally disabled if
> - the compiler does not allow control of the segment and symbol.
>
> config STACKPROTECTOR_OBJTOOL
> bool
> - default n
> + depends on X86_64 && STACKPROTECTOR
> + default !CC_HAS_SANE_STACKPROTECTOR
> + prompt "Debug objtool stack protector conversion" if CC_HAS_SANE_STACKPROTECTOR && DEBUG_KERNEL
>
> menu "Processor type and features"
>
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 662d9d4033e6..2a3ba1abb802 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -116,13 +116,7 @@ ifeq ($(CONFIG_X86_32),y)
> # temporary until string.h is fixed
> KBUILD_CFLAGS += -ffreestanding
>
> - ifeq ($(CONFIG_STACKPROTECTOR),y)
> - ifeq ($(CONFIG_SMP),y)
> - KBUILD_CFLAGS += -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard
> - else
> - KBUILD_CFLAGS += -mstack-protector-guard=global
> - endif
> - endif
> + percpu_seg := fs
> else
> BITS := 64
> UTS_MACHINE := x86_64
> @@ -172,6 +166,19 @@ else
> KBUILD_CFLAGS += -mcmodel=kernel
> KBUILD_RUSTFLAGS += -Cno-redzone=y
> KBUILD_RUSTFLAGS += -Ccode-model=kernel
> +
> + percpu_seg := gs
> +endif
> +
> +ifeq ($(CONFIG_STACKPROTECTOR),y)
> + ifneq ($(CONFIG_STACKPROTECTOR_OBJTOOL),y)
> + ifeq ($(CONFIG_SMP),y)
> + KBUILD_CFLAGS += -mstack-protector-guard-reg=$(percpu_seg)
> + KBUILD_CFLAGS += -mstack-protector-guard-symbol=__stack_chk_guard
> + else
> + KBUILD_CFLAGS += -mstack-protector-guard=global
> + endif
> + endif
> endif
>
> #
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 8af2a26b24f6..9478ff768dd0 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -191,7 +191,7 @@ SYM_FUNC_START(__switch_to_asm)
>
> #ifdef CONFIG_STACKPROTECTOR
> movq TASK_stack_canary(%rsi), %rbx
> - movq %rbx, PER_CPU_VAR(fixed_percpu_data + FIXED_stack_canary)
> + movq %rbx, PER_CPU_VAR(__stack_chk_guard)
> #endif
>
> /*
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 89ed5237e79f..946bebce396f 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -387,16 +387,8 @@ struct irq_stack {
>
> #ifdef CONFIG_X86_64
> struct fixed_percpu_data {
> - /*
> - * GCC hardcodes the stack canary as %gs:40. Since the
> - * irq_stack is the object at %gs:0, we reserve the bottom
> - * 48 bytes of the irq stack for the canary.
> - *
> - * Once we are willing to require -mstack-protector-guard-symbol=
> - * support for x86_64 stackprotector, we can get rid of this.
> - */
> char gs_base[40];
> - unsigned long stack_canary;
> + unsigned long reserved;
> };
>
> DECLARE_PER_CPU_FIRST(struct fixed_percpu_data, fixed_percpu_data) __visible;
> @@ -411,11 +403,7 @@ extern asmlinkage void entry_SYSCALL32_ignore(void);
>
> /* Save actual FS/GS selectors and bases to current->thread */
> void current_save_fsgs(void);
> -#else /* X86_64 */
> -#ifdef CONFIG_STACKPROTECTOR
> -DECLARE_PER_CPU(unsigned long, __stack_chk_guard);
> -#endif
> -#endif /* !X86_64 */
> +#endif /* X86_64 */
>
> struct perf_event;
>
> diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
> index 00473a650f51..d43fb589fcf6 100644
> --- a/arch/x86/include/asm/stackprotector.h
> +++ b/arch/x86/include/asm/stackprotector.h
> @@ -2,26 +2,10 @@
> /*
> * GCC stack protector support.
> *
> - * Stack protector works by putting predefined pattern at the start of
> + * Stack protector works by putting a predefined pattern at the start of
> * the stack frame and verifying that it hasn't been overwritten when
> - * returning from the function. The pattern is called stack canary
> - * and unfortunately gcc historically required it to be at a fixed offset
> - * from the percpu segment base. On x86_64, the offset is 40 bytes.
> - *
> - * The same segment is shared by percpu area and stack canary. On
> - * x86_64, percpu symbols are zero based and %gs (64-bit) points to the
> - * base of percpu area. The first occupant of the percpu area is always
> - * fixed_percpu_data which contains stack_canary at the appropriate
> - * offset. On x86_32, the stack canary is just a regular percpu
> - * variable.
> - *
> - * Putting percpu data in %fs on 32-bit is a minor optimization compared to
> - * using %gs. Since 32-bit userspace normally has %fs == 0, we are likely
> - * to load 0 into %fs on exit to usermode, whereas with percpu data in
> - * %gs, we are likely to load a non-null %gs on return to user mode.
> - *
> - * Once we are willing to require GCC 8.1 or better for 64-bit stackprotector
> - * support, we can remove some of this complexity.
> + * returning from the function. The pattern is called the stack canary
> + * and is a unique value for each task.
> */
>
> #ifndef _ASM_STACKPROTECTOR_H
> @@ -36,6 +20,8 @@
>
> #include <linux/sched.h>
>
> +DECLARE_PER_CPU(unsigned long, __stack_chk_guard);
> +
> /*
> * Initialize the stackprotector canary value.
> *
> @@ -51,25 +37,13 @@ static __always_inline void boot_init_stack_canary(void)
> {
> unsigned long canary = get_random_canary();
>
> -#ifdef CONFIG_X86_64
> - BUILD_BUG_ON(offsetof(struct fixed_percpu_data, stack_canary) != 40);
> -#endif
> -
> current->stack_canary = canary;
> -#ifdef CONFIG_X86_64
> - this_cpu_write(fixed_percpu_data.stack_canary, canary);
> -#else
> this_cpu_write(__stack_chk_guard, canary);
> -#endif
> }
>
> static inline void cpu_init_stack_canary(int cpu, struct task_struct *idle)
> {
> -#ifdef CONFIG_X86_64
> - per_cpu(fixed_percpu_data.stack_canary, cpu) = idle->stack_canary;
> -#else
> per_cpu(__stack_chk_guard, cpu) = idle->stack_canary;
> -#endif
> }
>
> #else /* STACKPROTECTOR */
> diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
> index bb65371ea9df..590b6cd0eac0 100644
> --- a/arch/x86/kernel/asm-offsets_64.c
> +++ b/arch/x86/kernel/asm-offsets_64.c
> @@ -54,11 +54,5 @@ int main(void)
> BLANK();
> #undef ENTRY
>
> - BLANK();
> -
> -#ifdef CONFIG_STACKPROTECTOR
> - OFFSET(FIXED_stack_canary, fixed_percpu_data, stack_canary);
> - BLANK();
> -#endif
> return 0;
> }
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 9a34651d24e7..f49e8f5b858d 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -2063,16 +2063,13 @@ void syscall_init(void)
> if (!cpu_feature_enabled(X86_FEATURE_FRED))
> idt_syscall_init();
> }
> -
> -#else /* CONFIG_X86_64 */
> +#endif /* CONFIG_X86_64 */
>
> #ifdef CONFIG_STACKPROTECTOR
> DEFINE_PER_CPU(unsigned long, __stack_chk_guard);
> EXPORT_PER_CPU_SYMBOL(__stack_chk_guard);
> #endif
>
> -#endif /* CONFIG_X86_64 */
> -
> /*
> * Clear all 6 debug registers:
> */
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index b11526869a40..cfbf0486d424 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -361,8 +361,7 @@ SYM_INNER_LABEL(common_startup_64, SYM_L_LOCAL)
>
> /* Set up %gs.
> *
> - * The base of %gs always points to fixed_percpu_data. If the
> - * stack protector canary is enabled, it is located at %gs:40.
> + * The base of %gs always points to fixed_percpu_data.
> * Note that, on SMP, the boot cpu uses init data section until
> * the per cpu areas are set up.
> */
> diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
> index 758bcd47b72d..ae4672ea00bb 100644
> --- a/arch/x86/xen/xen-head.S
> +++ b/arch/x86/xen/xen-head.S
> @@ -53,8 +53,7 @@ SYM_CODE_START(startup_xen)
>
> /* Set up %gs.
> *
> - * The base of %gs always points to fixed_percpu_data. If the
> - * stack protector canary is enabled, it is located at %gs:40.
> + * The base of %gs always points to fixed_percpu_data.
> * Note that, on SMP, the boot cpu uses init data section until
> * the per cpu areas are set up.
> */
> --
> 2.44.0
>

2024-03-23 17:14:35

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v4 09/16] x86/percpu/64: Use relative percpu offsets

On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
>
> The percpu section is currently linked at virtual address 0, because
> older compilers hardcoded the stack protector canary value at a fixed
> offset from the start of the GS segment. Now that the canary is a
> normal percpu variable, the percpu section does not need to be linked
> at a specific virtual address. This means that x86-64 will calculate
> percpu offsets like most other architectures, as the delta between the
> initial percpu address and the dynamically allocated memory.
>
> Signed-off-by: Brian Gerst <[email protected]>

Reviewed-by: Uros Bizjak <[email protected]>

> ---
> arch/x86/include/asm/processor.h | 6 +++++-
> arch/x86/kernel/head_64.S | 19 +++++++++----------
> arch/x86/kernel/setup_percpu.c | 12 ++----------
> arch/x86/kernel/vmlinux.lds.S | 29 +----------------------------
> arch/x86/platform/pvh/head.S | 5 ++---
> arch/x86/tools/relocs.c | 10 +++-------
> arch/x86/xen/xen-head.S | 9 ++++-----
> init/Kconfig | 2 +-
> 8 files changed, 27 insertions(+), 65 deletions(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 946bebce396f..40d6add8ff31 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -396,7 +396,11 @@ DECLARE_INIT_PER_CPU(fixed_percpu_data);
>
> static inline unsigned long cpu_kernelmode_gs_base(int cpu)
> {
> - return (unsigned long)per_cpu(fixed_percpu_data.gs_base, cpu);
> +#ifdef CONFIG_SMP
> + return per_cpu_offset(cpu);
> +#else
> + return 0;
> +#endif
> }
>
> extern asmlinkage void entry_SYSCALL32_ignore(void);
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index cfbf0486d424..5b2cc711feec 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -68,11 +68,14 @@ SYM_CODE_START_NOALIGN(startup_64)
> /* Set up the stack for verify_cpu() */
> leaq __top_init_kernel_stack(%rip), %rsp
>
> - /* Setup GSBASE to allow stack canary access for C code */
> + /*
> + * Set up GSBASE.
> + * Note that, on SMP, the boot cpu uses init data section until
> + * the per cpu areas are set up.
> + */
> movl $MSR_GS_BASE, %ecx
> - leaq INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
> - movl %edx, %eax
> - shrq $32, %rdx
> + xorl %eax, %eax
> + xorl %edx, %edx
> wrmsr
>
> call startup_64_setup_gdt_idt
> @@ -359,16 +362,12 @@ SYM_INNER_LABEL(common_startup_64, SYM_L_LOCAL)
> movl %eax,%fs
> movl %eax,%gs
>
> - /* Set up %gs.
> - *
> - * The base of %gs always points to fixed_percpu_data.
> + /*
> + * Set up GSBASE.
> * Note that, on SMP, the boot cpu uses init data section until
> * the per cpu areas are set up.
> */
> movl $MSR_GS_BASE,%ecx
> -#ifndef CONFIG_SMP
> - leaq INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
> -#endif
> movl %edx, %eax
> shrq $32, %rdx
> wrmsr
> diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
> index b30d6e180df7..1e7be9409aa2 100644
> --- a/arch/x86/kernel/setup_percpu.c
> +++ b/arch/x86/kernel/setup_percpu.c
> @@ -23,18 +23,10 @@
> #include <asm/cpumask.h>
> #include <asm/cpu.h>
>
> -#ifdef CONFIG_X86_64
> -#define BOOT_PERCPU_OFFSET ((unsigned long)__per_cpu_load)
> -#else
> -#define BOOT_PERCPU_OFFSET 0
> -#endif
> -
> -DEFINE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off) = BOOT_PERCPU_OFFSET;
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off);
> EXPORT_PER_CPU_SYMBOL(this_cpu_off);
>
> -unsigned long __per_cpu_offset[NR_CPUS] __ro_after_init = {
> - [0 ... NR_CPUS-1] = BOOT_PERCPU_OFFSET,
> -};
> +unsigned long __per_cpu_offset[NR_CPUS] __ro_after_init;
> EXPORT_SYMBOL(__per_cpu_offset);
>
> /*
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 3509afc6a672..0b152f96c24e 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -99,12 +99,6 @@ const_pcpu_hot = pcpu_hot;
> PHDRS {
> text PT_LOAD FLAGS(5); /* R_E */
> data PT_LOAD FLAGS(6); /* RW_ */
> -#ifdef CONFIG_X86_64
> -#ifdef CONFIG_SMP
> - percpu PT_LOAD FLAGS(6); /* RW_ */
> -#endif
> - init PT_LOAD FLAGS(7); /* RWE */
> -#endif
> note PT_NOTE FLAGS(0); /* ___ */
> }
>
> @@ -222,21 +216,7 @@ SECTIONS
> __init_begin = .; /* paired with __init_end */
> }
>
> -#if defined(CONFIG_X86_64) && defined(CONFIG_SMP)
> - /*
> - * percpu offsets are zero-based on SMP. PERCPU_VADDR() changes the
> - * output PHDR, so the next output section - .init.text - should
> - * start another segment - init.
> - */
> - PERCPU_VADDR(INTERNODE_CACHE_BYTES, 0, :percpu)
> - ASSERT(SIZEOF(.data..percpu) < CONFIG_PHYSICAL_START,
> - "per-CPU data too large - increase CONFIG_PHYSICAL_START")
> -#endif
> -
> INIT_TEXT_SECTION(PAGE_SIZE)
> -#ifdef CONFIG_X86_64
> - :init
> -#endif
>
> /*
> * Section for code used exclusively before alternatives are run. All
> @@ -353,9 +333,7 @@ SECTIONS
> EXIT_DATA
> }
>
> -#if !defined(CONFIG_X86_64) || !defined(CONFIG_SMP)
> PERCPU_SECTION(INTERNODE_CACHE_BYTES)
> -#endif
>
> . = ALIGN(PAGE_SIZE);
>
> @@ -493,16 +471,11 @@ SECTIONS
> * Per-cpu symbols which need to be offset from __per_cpu_load
> * for the boot processor.
> */
> -#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x) + __per_cpu_load
> +#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x)
> INIT_PER_CPU(gdt_page);
> INIT_PER_CPU(fixed_percpu_data);
> INIT_PER_CPU(irq_stack_backing_store);
>
> -#ifdef CONFIG_SMP
> -. = ASSERT((fixed_percpu_data == 0),
> - "fixed_percpu_data is not at start of per-cpu area");
> -#endif
> -
> #ifdef CONFIG_MITIGATION_UNRET_ENTRY
> . = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
> #endif
> diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
> index 1f1c3230b27b..7e3e07c6170f 100644
> --- a/arch/x86/platform/pvh/head.S
> +++ b/arch/x86/platform/pvh/head.S
> @@ -101,9 +101,8 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
> * the per cpu areas are set up.
> */
> mov $MSR_GS_BASE,%ecx
> - lea INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
> - mov %edx, %eax
> - shr $32, %rdx
> + xor %eax, %eax
> + xor %edx, %edx
> wrmsr
>
> call xen_prepare_pvh
> diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
> index adf11a48ec70..016650ddaf7f 100644
> --- a/arch/x86/tools/relocs.c
> +++ b/arch/x86/tools/relocs.c
> @@ -839,12 +839,7 @@ static void percpu_init(void)
> */
> static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
> {
> - int shndx = sym_index(sym);
> -
> - return (shndx == per_cpu_shndx) &&
> - strcmp(symname, "__init_begin") &&
> - strcmp(symname, "__per_cpu_load") &&
> - strncmp(symname, "init_per_cpu_", 13);
> + return 0;
> }
>
>
> @@ -1068,7 +1063,8 @@ static int cmp_relocs(const void *va, const void *vb)
>
> static void sort_relocs(struct relocs *r)
> {
> - qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
> + if (r->count)
> + qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
> }
>
> static int write32(uint32_t v, FILE *f)
> diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
> index ae4672ea00bb..1796884b727d 100644
> --- a/arch/x86/xen/xen-head.S
> +++ b/arch/x86/xen/xen-head.S
> @@ -51,15 +51,14 @@ SYM_CODE_START(startup_xen)
>
> leaq __top_init_kernel_stack(%rip), %rsp
>
> - /* Set up %gs.
> - *
> - * The base of %gs always points to fixed_percpu_data.
> + /*
> + * Set up GSBASE.
> * Note that, on SMP, the boot cpu uses init data section until
> * the per cpu areas are set up.
> */
> movl $MSR_GS_BASE,%ecx
> - movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
> - cdq
> + xorl %eax, %eax
> + xorl %edx, %edx
> wrmsr
>
> mov %rsi, %rdi
> diff --git a/init/Kconfig b/init/Kconfig
> index f3ea5dea9c85..0f928f82dc7a 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1784,7 +1784,7 @@ config KALLSYMS_ALL
> config KALLSYMS_ABSOLUTE_PERCPU
> bool
> depends on KALLSYMS
> - default X86_64 && SMP
> + default n
>
> config KALLSYMS_BASE_RELATIVE
> bool
> --
> 2.44.0
>

2024-03-23 17:15:57

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v4 10/16] x86/percpu/64: Remove fixed_percpu_data

On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
>
> Now that the stack protector canary value is a normal percpu variable,
> fixed_percpu_data is unused and can be removed.
>
> Signed-off-by: Brian Gerst <[email protected]>

Reviewed-by: Uros Bizjak <[email protected]>

> ---
> arch/x86/include/asm/processor.h | 8 --------
> arch/x86/kernel/cpu/common.c | 4 ----
> arch/x86/kernel/vmlinux.lds.S | 1 -
> arch/x86/tools/relocs.c | 1 -
> 4 files changed, 14 deletions(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 40d6add8ff31..1f15a7bee876 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -386,14 +386,6 @@ struct irq_stack {
> } __aligned(IRQ_STACK_SIZE);
>
> #ifdef CONFIG_X86_64
> -struct fixed_percpu_data {
> - char gs_base[40];
> - unsigned long reserved;
> -};
> -
> -DECLARE_PER_CPU_FIRST(struct fixed_percpu_data, fixed_percpu_data) __visible;
> -DECLARE_INIT_PER_CPU(fixed_percpu_data);
> -
> static inline unsigned long cpu_kernelmode_gs_base(int cpu)
> {
> #ifdef CONFIG_SMP
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index f49e8f5b858d..395a8831d507 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1997,10 +1997,6 @@ EXPORT_PER_CPU_SYMBOL(pcpu_hot);
> EXPORT_PER_CPU_SYMBOL(const_pcpu_hot);
>
> #ifdef CONFIG_X86_64
> -DEFINE_PER_CPU_FIRST(struct fixed_percpu_data,
> - fixed_percpu_data) __aligned(PAGE_SIZE) __visible;
> -EXPORT_PER_CPU_SYMBOL_GPL(fixed_percpu_data);
> -
> static void wrmsrl_cstar(unsigned long val)
> {
> /*
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 0b152f96c24e..667202ebd37f 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -473,7 +473,6 @@ SECTIONS
> */
> #define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x)
> INIT_PER_CPU(gdt_page);
> -INIT_PER_CPU(fixed_percpu_data);
> INIT_PER_CPU(irq_stack_backing_store);
>
> #ifdef CONFIG_MITIGATION_UNRET_ENTRY
> diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
> index 016650ddaf7f..65e6f3d6d890 100644
> --- a/arch/x86/tools/relocs.c
> +++ b/arch/x86/tools/relocs.c
> @@ -834,7 +834,6 @@ static void percpu_init(void)
> * __per_cpu_load
> *
> * The "gold" linker incorrectly associates:
> - * init_per_cpu__fixed_percpu_data
> * init_per_cpu__gdt_page
> */
> static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
> --
> 2.44.0
>

2024-03-23 17:16:24

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v4 12/16] x86/percpu/64: Remove INIT_PER_CPU macros

On Fri, Mar 22, 2024 at 5:53 PM Brian Gerst <[email protected]> wrote:
>
> Now that the load and link addresses of percpu variables are the same,
> these macros are no longer necessary.
>
> Signed-off-by: Brian Gerst <[email protected]>

Reviewed-by: Uros Bizjak <[email protected]>

> ---
> arch/x86/include/asm/desc.h | 1 -
> arch/x86/include/asm/percpu.h | 22 ----------------------
> arch/x86/kernel/head64.c | 2 +-
> arch/x86/kernel/irq_64.c | 1 -
> arch/x86/kernel/vmlinux.lds.S | 7 -------
> arch/x86/tools/relocs.c | 1 -
> 6 files changed, 1 insertion(+), 33 deletions(-)
>
> diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
> index 62dc9f59ea76..ec95fe44fa3a 100644
> --- a/arch/x86/include/asm/desc.h
> +++ b/arch/x86/include/asm/desc.h
> @@ -46,7 +46,6 @@ struct gdt_page {
> } __attribute__((aligned(PAGE_SIZE)));
>
> DECLARE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page);
> -DECLARE_INIT_PER_CPU(gdt_page);
>
> /* Provide the original GDT */
> static inline struct desc_struct *get_cpu_gdt_rw(unsigned int cpu)
> diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
> index f6ddbaaf80bc..59d91fdfe037 100644
> --- a/arch/x86/include/asm/percpu.h
> +++ b/arch/x86/include/asm/percpu.h
> @@ -20,12 +20,6 @@
>
> #define PER_CPU_VAR(var) __percpu(var)__percpu_rel
>
> -#ifdef CONFIG_X86_64_SMP
> -#define INIT_PER_CPU_VAR(var) init_per_cpu__##var
> -#else
> -#define INIT_PER_CPU_VAR(var) var
> -#endif
> -
> #else /* ...!ASSEMBLY */
>
> #include <linux/build_bug.h>
> @@ -101,22 +95,6 @@
> #define __percpu_arg(x) __percpu_prefix "%" #x
> #define __force_percpu_arg(x) __force_percpu_prefix "%" #x
>
> -/*
> - * Initialized pointers to per-cpu variables needed for the boot
> - * processor need to use these macros to get the proper address
> - * offset from __per_cpu_load on SMP.
> - *
> - * There also must be an entry in vmlinux_64.lds.S
> - */
> -#define DECLARE_INIT_PER_CPU(var) \
> - extern typeof(var) init_per_cpu_var(var)
> -
> -#ifdef CONFIG_X86_64_SMP
> -#define init_per_cpu_var(var) init_per_cpu__##var
> -#else
> -#define init_per_cpu_var(var) var
> -#endif
> -
> /* For arch-specific code, we can use direct single-insn ops (they
> * don't give an lvalue though). */
>
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 212e8e06aeba..5f0523610e92 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -564,7 +564,7 @@ void __head startup_64_setup_gdt_idt(void)
> void *handler = NULL;
>
> struct desc_ptr startup_gdt_descr = {
> - .address = (unsigned long)&RIP_REL_REF(init_per_cpu_var(gdt_page.gdt)),
> + .address = (unsigned long)&RIP_REL_REF(gdt_page.gdt),
> .size = GDT_SIZE - 1,
> };
>
> diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c
> index fe0c859873d1..30424f9876bc 100644
> --- a/arch/x86/kernel/irq_64.c
> +++ b/arch/x86/kernel/irq_64.c
> @@ -26,7 +26,6 @@
> #include <asm/apic.h>
>
> DEFINE_PER_CPU_PAGE_ALIGNED(struct irq_stack, irq_stack_backing_store) __visible;
> -DECLARE_INIT_PER_CPU(irq_stack_backing_store);
>
> #ifdef CONFIG_VMAP_STACK
> /*
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 667202ebd37f..8d5bf4a71e39 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -467,13 +467,6 @@ SECTIONS
> "kernel image bigger than KERNEL_IMAGE_SIZE");
>
> #ifdef CONFIG_X86_64
> -/*
> - * Per-cpu symbols which need to be offset from __per_cpu_load
> - * for the boot processor.
> - */
> -#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x)
> -INIT_PER_CPU(gdt_page);
> -INIT_PER_CPU(irq_stack_backing_store);
>
> #ifdef CONFIG_MITIGATION_UNRET_ENTRY
> . = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
> diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
> index e490297a486b..8db231affba2 100644
> --- a/arch/x86/tools/relocs.c
> +++ b/arch/x86/tools/relocs.c
> @@ -88,7 +88,6 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = {
> "__initramfs_start|"
> "(jiffies|jiffies_64)|"
> #if ELF_BITS == 64
> - "init_per_cpu__.*|"
> "__end_rodata_hpage_align|"
> #endif
> "__vvar_page|"
> --
> 2.44.0
>

2024-03-23 17:18:31

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v4 13/16] percpu: Remove PER_CPU_FIRST_SECTION

On Fri, Mar 22, 2024 at 5:53 PM Brian Gerst <[email protected]> wrote:
>
> x86-64 was the last user.
>
> Signed-off-by: Brian Gerst <[email protected]>

Reviewed-by: Uros Bizjak <[email protected]>

> ---
> include/asm-generic/vmlinux.lds.h | 1 -
> include/linux/percpu-defs.h | 12 ------------
> 2 files changed, 13 deletions(-)
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 9752eb420ffa..74f169772778 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -1033,7 +1033,6 @@
> */
> #define PERCPU_INPUT(cacheline) \
> __per_cpu_start = .; \
> - *(.data..percpu..first) \
> . = ALIGN(PAGE_SIZE); \
> *(.data..percpu..page_aligned) \
> . = ALIGN(cacheline); \
> diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
> index ec3573119923..b9ddee91e6c7 100644
> --- a/include/linux/percpu-defs.h
> +++ b/include/linux/percpu-defs.h
> @@ -26,13 +26,11 @@
> #define PER_CPU_SHARED_ALIGNED_SECTION "..shared_aligned"
> #define PER_CPU_ALIGNED_SECTION "..shared_aligned"
> #endif
> -#define PER_CPU_FIRST_SECTION "..first"
>
> #else
>
> #define PER_CPU_SHARED_ALIGNED_SECTION ""
> #define PER_CPU_ALIGNED_SECTION "..shared_aligned"
> -#define PER_CPU_FIRST_SECTION ""
>
> #endif
>
> @@ -114,16 +112,6 @@
> #define DEFINE_PER_CPU(type, name) \
> DEFINE_PER_CPU_SECTION(type, name, "")
>
> -/*
> - * Declaration/definition used for per-CPU variables that must come first in
> - * the set of variables.
> - */
> -#define DECLARE_PER_CPU_FIRST(type, name) \
> - DECLARE_PER_CPU_SECTION(type, name, PER_CPU_FIRST_SECTION)
> -
> -#define DEFINE_PER_CPU_FIRST(type, name) \
> - DEFINE_PER_CPU_SECTION(type, name, PER_CPU_FIRST_SECTION)
> -
> /*
> * Declaration/definition used for per-CPU variables that must be cacheline
> * aligned under SMP conditions so that, whilst a particular instance of the
> --
> 2.44.0
>

2024-03-23 22:55:57

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Sat, Mar 23, 2024, at 17:16, Linus Torvalds wrote:
>
> That said, we don't actually have all that many gcc version checks
> around any more, so I think the jump to 5.1 got rid of the worst of
> the horrors. Most of the GCC_VERSION checks are either in gcc-plugins
> (which we should just remove, imnsho - not the version checks, the
> plugins entirely), or for various random minor details (warning
> enablement and the asm goto workaround).

There are also a bunch of feature checks in Kconfig that don't
check the version number but instead test whether something works.
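
For the stack protector options discussed in this thread, such a probe
would look roughly like the following (the config symbol name is made up
purely for illustration):

config CC_HAS_NAMED_STACKPROTECTOR_GUARD
        def_bool $(cc-option,-mstack-protector-guard-reg=gs -mstack-protector-guard-symbol=__stack_chk_guard)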

At least three of the six plugins can be removed with sufficiently
new compilers: GCC_PLUGIN_SANCOV after gcc-6, and both
GCC_PLUGIN_STRUCTLEAK and GCC_PLUGIN_ARM_SSP_PER_TASK after
gcc-12.

> So there doesn't seem to be a major reason to up the versioning, since
> the stack protector thing can just be disabled for older versions.
>
> But maybe even enterprise distros have upgraded anyway, and we should
> be proactive.
>
> Cc'ing Arnd, who has historically been one of the people pushing this.
> He may no longer care because we haven't had huge issues.

I'm not aware of any major issues, but it keeps coming up and
a number of people have asked me about it because of smaller
ones. Unfortunately I didn't write down what the problems
were.

I think based on what compiler versions are shipped
by LTS distros, gcc-8.1 is the next logical step when we
do it, covering Debian 10, RHEL 8 and Ubuntu 20.04, which
are probably the oldest we still need to support.

RHEL 7 and SLES 12 are still technically supported distros
as well, but those shipped with gcc-4.8, so we dropped them
in 2020 with the move to gcc-4.9.

So in short, not a lot to gain or lose from raising the
minimum to 8.1, but it would be nice to reduce the possible
test matrix from 10 gcc versions back to the 7 we had in
the past, as well as clean up a few version dependencies.
Similarly we can probably raise the oldest binutils version
to 2.30, as that seems to be the earliest version that was
paired with gcc-8 (in RHEL-8).

Arnd

2024-03-24 02:25:50

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements


* Uros Bizjak <[email protected]> wrote:

> On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
> >
> > Currently, x86-64 uses an unusual percpu layout, where the percpu section
> > is linked at absolute address 0. The reason behind this is that older GCC
> > versions placed the stack protector (if enabled) at a fixed offset from the
> > GS segment base. Since the GS segement is also used for percpu variables,
> > this forced the current layout.
> >
> > GCC since version 8.1 supports a configurable location for the stack
> > protector value, which allows removal of the restriction on how the percpu
> > section is linked. This allows the percpu section to be linked normally,
> > like other architectures. In turn, this allows removal of code that was
> > needed to support the zero-based percpu section.
>
> The number of simplifications throughout the code, enabled by this
> patch set, is really impressive, and it reflects the number of
> workarounds to enable the feature that was originally not designed for
> the kernel usage. As noted above, this issue was recognized in the GCC
> compiler and the stack protector support was generalized by adding
> configurable location for the stack protector value [1,2].
>
> The improved stack protector support was implemented in gcc-8.1,
> released on May 2, 2018, when linux 4.17 was in development. In light
> of this fact, and 5 (soon 6) GCC major releases later, I'd like to ask
> if the objtool support to fixup earlier compilers is really necessary.
> Please note that years ago x86_32 simply dropped stack protector
> support with earlier compilers and IMO, we should follow this example
> also with x86_64, because:

Ack on raising the minimum version requirement for x86-64
stackprotector to 8.1 or so - this causes no real pain on the distro
side: when *this* new kernel of ours is picked by a distro, it almost
always goes hand in hand with a compiler version upgrade.

We should be careful with fixes marked for -stable backport, but other
than that, new improvements like Brian's series are a fair game to
tweak compiler version requirements.

But please emit a (single) prominent build-time warning if a feature is
disabled though, even if there are no functional side-effects, such as
for hardening features.

In general distro kernel developers & maintainers like seeing the
performance (and other) effects of their compiler version choices, but
we are not very transparent about this: our fallbacks are way too
opaque right now.

Thanks,

Ingo

2024-03-24 03:51:31

by Brian Gerst

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Sat, Mar 23, 2024 at 10:25 PM Ingo Molnar <[email protected]> wrote:
>
>
> * Uros Bizjak <[email protected]> wrote:
>
> > On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
> > >
> > > Currently, x86-64 uses an unusual percpu layout, where the percpu section
> > > is linked at absolute address 0. The reason behind this is that older GCC
> > > versions placed the stack protector (if enabled) at a fixed offset from the
> > > GS segment base. Since the GS segement is also used for percpu variables,
> > > this forced the current layout.
> > >
> > > GCC since version 8.1 supports a configurable location for the stack
> > > protector value, which allows removal of the restriction on how the percpu
> > > section is linked. This allows the percpu section to be linked normally,
> > > like other architectures. In turn, this allows removal of code that was
> > > needed to support the zero-based percpu section.
> >
> > The number of simplifications throughout the code, enabled by this
> > patch set, is really impressive, and it reflects the number of
> > workarounds to enable the feature that was originally not designed for
> > the kernel usage. As noted above, this issue was recognized in the GCC
> > compiler and the stack protector support was generalized by adding
> > configurable location for the stack protector value [1,2].
> >
> > The improved stack protector support was implemented in gcc-8.1,
> > released on May 2, 2018, when linux 4.17 was in development. In light
> > of this fact, and 5 (soon 6) GCC major releases later, I'd like to ask
> > if the objtool support to fixup earlier compilers is really necessary.
> > Please note that years ago x86_32 simply dropped stack protector
> > support with earlier compilers and IMO, we should follow this example
> > also with x86_64, because:
>
> Ack on raising the minimum version requirement for x86-64
> stackprotector to 8.1 or so - this causes no real pain on the distro
> side: when *this* new kernel of ours is picked by a distro, it almost
> always goes hand in hand with a compiler version upgrade.
>
> We should be careful with fixes marked for -stable backport, but other
> than that, new improvements like Brian's series are a fair game to
> tweak compiler version requirements.
>
> But please emit a (single) prominent build-time warning if a feature is
> disabled though, even if there are no functional side-effects, such as
> for hardening features.

Disabled for any reason or only if the compiler lacks support?

Brian Gerst

2024-03-24 04:05:48

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements


* Brian Gerst <[email protected]> wrote:

> On Sat, Mar 23, 2024 at 10:25 PM Ingo Molnar <[email protected]> wrote:
> >
> >
> > * Uros Bizjak <[email protected]> wrote:
> >
> > > On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
> > > >
> > > > Currently, x86-64 uses an unusual percpu layout, where the percpu section
> > > > is linked at absolute address 0. The reason behind this is that older GCC
> > > > versions placed the stack protector (if enabled) at a fixed offset from the
> > > > GS segment base. Since the GS segement is also used for percpu variables,
> > > > this forced the current layout.
> > > >
> > > > GCC since version 8.1 supports a configurable location for the stack
> > > > protector value, which allows removal of the restriction on how the percpu
> > > > section is linked. This allows the percpu section to be linked normally,
> > > > like other architectures. In turn, this allows removal of code that was
> > > > needed to support the zero-based percpu section.
> > >
> > > The number of simplifications throughout the code, enabled by this
> > > patch set, is really impressive, and it reflects the number of
> > > workarounds to enable the feature that was originally not designed for
> > > the kernel usage. As noted above, this issue was recognized in the GCC
> > > compiler and the stack protector support was generalized by adding
> > > configurable location for the stack protector value [1,2].
> > >
> > > The improved stack protector support was implemented in gcc-8.1,
> > > released on May 2, 2018, when linux 4.17 was in development. In light
> > > of this fact, and 5 (soon 6) GCC major releases later, I'd like to ask
> > > if the objtool support to fixup earlier compilers is really necessary.
> > > Please note that years ago x86_32 simply dropped stack protector
> > > support with earlier compilers and IMO, we should follow this example
> > > also with x86_64, because:
> >
> > Ack on raising the minimum version requirement for x86-64
> > stackprotector to 8.1 or so - this causes no real pain on the distro
> > side: when *this* new kernel of ours is picked by a distro, it almost
> > always goes hand in hand with a compiler version upgrade.
> >
> > We should be careful with fixes marked for -stable backport, but other
> > than that, new improvements like Brian's series are a fair game to
> > tweak compiler version requirements.
> >
> > But please emit a (single) prominent build-time warning if a feature is
> > disabled though, even if there are no functional side-effects, such as
> > for hardening features.
>
> Disabled for any reason or only if the compiler lacks support?

Only if the user desires to have it enabled, but it's not possible due
to compiler (or other build environment) reasons. Ie. if something
unexpected happens from the user's perspective.

The .config option is preserved even if the compiler doesn't support
it, right?

I suspect this should also cover features that get select-ed by a
feature that the user enables. (Not sure about architecture level
select-ed options.)

Thanks,

Ingo

2024-03-24 05:44:24

by Brian Gerst

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Sun, Mar 24, 2024 at 12:05 AM Ingo Molnar <[email protected]> wrote:
>
>
> * Brian Gerst <[email protected]> wrote:
>
> > On Sat, Mar 23, 2024 at 10:25 PM Ingo Molnar <[email protected]> wrote:
> > >
> > >
> > > * Uros Bizjak <[email protected]> wrote:
> > >
> > > > On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
> > > > >
> > > > > Currently, x86-64 uses an unusual percpu layout, where the percpu section
> > > > > is linked at absolute address 0. The reason behind this is that older GCC
> > > > > versions placed the stack protector (if enabled) at a fixed offset from the
> > > > > GS segment base. Since the GS segement is also used for percpu variables,
> > > > > this forced the current layout.
> > > > >
> > > > > GCC since version 8.1 supports a configurable location for the stack
> > > > > protector value, which allows removal of the restriction on how the percpu
> > > > > section is linked. This allows the percpu section to be linked normally,
> > > > > like other architectures. In turn, this allows removal of code that was
> > > > > needed to support the zero-based percpu section.
> > > >
> > > > The number of simplifications throughout the code, enabled by this
> > > > patch set, is really impressive, and it reflects the number of
> > > > workarounds to enable the feature that was originally not designed for
> > > > the kernel usage. As noted above, this issue was recognized in the GCC
> > > > compiler and the stack protector support was generalized by adding
> > > > configurable location for the stack protector value [1,2].
> > > >
> > > > The improved stack protector support was implemented in gcc-8.1,
> > > > released on May 2, 2018, when linux 4.17 was in development. In light
> > > > of this fact, and 5 (soon 6) GCC major releases later, I'd like to ask
> > > > if the objtool support to fixup earlier compilers is really necessary.
> > > > Please note that years ago x86_32 simply dropped stack protector
> > > > support with earlier compilers and IMO, we should follow this example
> > > > also with x86_64, because:
> > >
> > > Ack on raising the minimum version requirement for x86-64
> > > stackprotector to 8.1 or so - this causes no real pain on the distro
> > > side: when *this* new kernel of ours is picked by a distro, it almost
> > > always goes hand in hand with a compiler version upgrade.
> > >
> > > We should be careful with fixes marked for -stable backport, but other
> > > than that, new improvements like Brian's series are a fair game to
> > > tweak compiler version requirements.
> > >
> > > But please emit a (single) prominent build-time warning if a feature is
> > > disabled though, even if there are no functional side-effects, such as
> > > for hardening features.
> >
> > Disabled for any reason or only if the compiler lacks support?
>
> Only if the user desires to have it enabled, but it's not possible due
> to compiler (or other build environment) reasons. Ie. if something
> unexpected happens from the user's perspective.
>
> The .config option is preserved even if the compiler doesn't support
> it, right?
>
> I suspect this should also cover features that get select-ed by a
> feature that the user enables. (Not sure about architecture level
> select-ed options.)
>
> Thanks,
>
> Ingo

I could add something like:

comment "Stack protector is not supported by the architecture or compiler"
depends on !HAVE_STACKPROTECTOR

But, "make oldconfig" will still silently disable stack protector if
the compiler doesn't support the new options. It does put the comment
into the .config file though, so that may be enough.

Brian Gerst

2024-03-24 10:53:18

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements


* Brian Gerst <[email protected]> wrote:

> On Sun, Mar 24, 2024 at 12:05 AM Ingo Molnar <[email protected]> wrote:
> >
> >
> > * Brian Gerst <[email protected]> wrote:
> >
> > > On Sat, Mar 23, 2024 at 10:25 PM Ingo Molnar <[email protected]> wrote:
> > > >
> > > >
> > > > * Uros Bizjak <[email protected]> wrote:
> > > >
> > > > > On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
> > > > > >
> > > > > > Currently, x86-64 uses an unusual percpu layout, where the percpu section
> > > > > > is linked at absolute address 0. The reason behind this is that older GCC
> > > > > > versions placed the stack protector (if enabled) at a fixed offset from the
> > > > > > GS segment base. Since the GS segement is also used for percpu variables,
> > > > > > this forced the current layout.
> > > > > >
> > > > > > GCC since version 8.1 supports a configurable location for the stack
> > > > > > protector value, which allows removal of the restriction on how the percpu
> > > > > > section is linked. This allows the percpu section to be linked normally,
> > > > > > like other architectures. In turn, this allows removal of code that was
> > > > > > needed to support the zero-based percpu section.
> > > > >
> > > > > The number of simplifications throughout the code, enabled by this
> > > > > patch set, is really impressive, and it reflects the number of
> > > > > workarounds to enable the feature that was originally not designed for
> > > > > the kernel usage. As noted above, this issue was recognized in the GCC
> > > > > compiler and the stack protector support was generalized by adding
> > > > > configurable location for the stack protector value [1,2].
> > > > >
> > > > > The improved stack protector support was implemented in gcc-8.1,
> > > > > released on May 2, 2018, when linux 4.17 was in development. In light
> > > > > of this fact, and 5 (soon 6) GCC major releases later, I'd like to ask
> > > > > if the objtool support to fixup earlier compilers is really necessary.
> > > > > Please note that years ago x86_32 simply dropped stack protector
> > > > > support with earlier compilers and IMO, we should follow this example
> > > > > also with x86_64, because:
> > > >
> > > > Ack on raising the minimum version requirement for x86-64
> > > > stackprotector to 8.1 or so - this causes no real pain on the distro
> > > > side: when *this* new kernel of ours is picked by a distro, it almost
> > > > always goes hand in hand with a compiler version upgrade.
> > > >
> > > > We should be careful with fixes marked for -stable backport, but other
> > > > than that, new improvements like Brian's series are a fair game to
> > > > tweak compiler version requirements.
> > > >
> > > > But please emit a (single) prominent build-time warning if a feature is
> > > > disabled though, even if there are no functional side-effects, such as
> > > > for hardening features.
> > >
> > > Disabled for any reason or only if the compiler lacks support?
> >
> > Only if the user desires to have it enabled, but it's not possible due
> > to compiler (or other build environment) reasons. Ie. if something
> > unexpected happens from the user's perspective.
> >
> > The .config option is preserved even if the compiler doesn't support
> > it, right?
> >
> > I suspect this should also cover features that get select-ed by a
> > feature that the user enables. (Not sure about architecture level
> > select-ed options.)
> >
> > Thanks,
> >
> > Ingo
>
> I could add something like:
>
> comment "Stack protector is not supported by the architecture or compiler"
> depends on !HAVE_STACKPROTECTOR
>
> But, "make oldconfig" will still silently disable stack protector if
> the compiler doesn't support the new options. It does put the
> comment into the .config file though, so that may be enough.

So I was thinking more along the lines of emitting an actual warning to
the build log, every time the compiler check is executed and fails to
detect [certain] essential or good-to-have compiler features.

A bit like the red '[ OFF ]' build lines during the perf build:

Auto-detecting system features:

... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... libbfd: [ on ]
... libbfd-buildid: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... libcapstone: [ OFF ] <========
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]

.. or something like that.

Thanks,

Ingo

2024-03-24 12:34:24

by Brian Gerst

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Sun, Mar 24, 2024 at 6:53 AM Ingo Molnar <[email protected]> wrote:
>
>
> * Brian Gerst <[email protected]> wrote:
>
> > On Sun, Mar 24, 2024 at 12:05 AM Ingo Molnar <[email protected]> wrote:
> > >
> > >
> > > * Brian Gerst <[email protected]> wrote:
> > >
> > > > On Sat, Mar 23, 2024 at 10:25 PM Ingo Molnar <[email protected]> wrote:
> > > > >
> > > > >
> > > > > * Uros Bizjak <[email protected]> wrote:
> > > > >
> > > > > > On Fri, Mar 22, 2024 at 5:52 PM Brian Gerst <[email protected]> wrote:
> > > > > > >
> > > > > > > Currently, x86-64 uses an unusual percpu layout, where the percpu section
> > > > > > > is linked at absolute address 0. The reason behind this is that older GCC
> > > > > > > versions placed the stack protector (if enabled) at a fixed offset from the
> > > > > > > GS segment base. Since the GS segement is also used for percpu variables,
> > > > > > > this forced the current layout.
> > > > > > >
> > > > > > > GCC since version 8.1 supports a configurable location for the stack
> > > > > > > protector value, which allows removal of the restriction on how the percpu
> > > > > > > section is linked. This allows the percpu section to be linked normally,
> > > > > > > like other architectures. In turn, this allows removal of code that was
> > > > > > > needed to support the zero-based percpu section.
> > > > > >
> > > > > > The number of simplifications throughout the code, enabled by this
> > > > > > patch set, is really impressive, and it reflects the number of
> > > > > > workarounds to enable the feature that was originally not designed for
> > > > > > the kernel usage. As noted above, this issue was recognized in the GCC
> > > > > > compiler and the stack protector support was generalized by adding
> > > > > > configurable location for the stack protector value [1,2].
> > > > > >
> > > > > > The improved stack protector support was implemented in gcc-8.1,
> > > > > > released on May 2, 2018, when linux 4.17 was in development. In light
> > > > > > of this fact, and 5 (soon 6) GCC major releases later, I'd like to ask
> > > > > > if the objtool support to fixup earlier compilers is really necessary.
> > > > > > Please note that years ago x86_32 simply dropped stack protector
> > > > > > support with earlier compilers and IMO, we should follow this example
> > > > > > also with x86_64, because:
> > > > >
> > > > > Ack on raising the minimum version requirement for x86-64
> > > > > stackprotector to 8.1 or so - this causes no real pain on the distro
> > > > > side: when *this* new kernel of ours is picked by a distro, it almost
> > > > > always goes hand in hand with a compiler version upgrade.
> > > > >
> > > > > We should be careful with fixes marked for -stable backport, but other
> > > > > than that, new improvements like Brian's series are a fair game to
> > > > > tweak compiler version requirements.
> > > > >
> > > > > But please emit a (single) prominent build-time warning if a feature is
> > > > > disabled though, even if there are no functional side-effects, such as
> > > > > for hardening features.
> > > >
> > > > Disabled for any reason or only if the compiler lacks support?
> > >
> > > Only if the user desires to have it enabled, but it's not possible due
> > > to compiler (or other build environment) reasons. Ie. if something
> > > unexpected happens from the user's perspective.
> > >
> > > The .config option is preserved even if the compiler doesn't support
> > > it, right?
> > >
> > > I suspect this should also cover features that get select-ed by a
> > > feature that the user enables. (Not sure about architecture level
> > > select-ed options.)
> > >
> > > Thanks,
> > >
> > > Ingo
> >
> > I could add something like:
> >
> > comment "Stack protector is not supported by the architecture or compiler"
> > depends on !HAVE_STACKPROTECTOR
> >
> > But, "make oldconfig" will still silently disable stack protector if
> > the compiler doesn't support the new options. It does put the
> > comment into the .config file though, so that may be enough.
>
> So I was thinking more along the lines of emitting an actual warning to
> the build log, every time the compiler check is executed and fails to
> detect [certain] essential or good-to-have compiler features.
>
> A bit like the red '[ OFF ]' build lines during the perf build:
>
> Auto-detecting system features:
>
> ... dwarf: [ on ]
> ... dwarf_getlocations: [ on ]
> ... glibc: [ on ]
> ... libbfd: [ on ]
> ... libbfd-buildid: [ on ]
> ... libcap: [ on ]
> ... libelf: [ on ]
> ... libnuma: [ on ]
> ... numa_num_possible_cpus: [ on ]
> ... libperl: [ on ]
> ... libpython: [ on ]
> ... libcrypto: [ on ]
> ... libunwind: [ on ]
> ... libdw-dwarf-unwind: [ on ]
> ... libcapstone: [ OFF ] <========
> ... zlib: [ on ]
> ... lzma: [ on ]
> ... get_cpuid: [ on ]
> ... bpf: [ on ]
> ... libaio: [ on ]
> ... libzstd: [ on ]
>
> ... or something like that.
>
> Thanks,
>
> Ingo

That list comes from the perf tool itself
(tools/perf/builtin-version.c), not the kernel config or build system.
Something like that could be added to the main kernel build. But it
should be a separate patch series as it will likely need a lot of
design iteration.
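
Just to sketch the shape of it (hypothetical config symbol name, and a
real version would need to print only once rather than on every sub-make
invocation), the simplest form would be something like this in the arch
Makefile:

  ifneq ($(CONFIG_CC_HAS_NAMED_STACKPROTECTOR_GUARD),y)
    $(warning stack protector unavailable: $(CC) lacks -mstack-protector-guard-symbol=)
  endif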

Brian Gerst

2024-03-24 18:14:49

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements


* Brian Gerst <[email protected]> wrote:

> > > But, "make oldconfig" will still silently disable stack protector if
> > > the compiler doesn't support the new options. It does put the
> > > comment into the .config file though, so that may be enough.
> >
> > So I was thinking more along the lines of emitting an actual warning to
> > the build log, every time the compiler check is executed and fails to
> > detect [certain] essential or good-to-have compiler features.
> >
> > A bit like the red '[ OFF ]' build lines during the perf build:
> >
> > Auto-detecting system features:
> >
> > ... dwarf: [ on ]
> > ... dwarf_getlocations: [ on ]
> > ... glibc: [ on ]
> > ... libbfd: [ on ]
> > ... libbfd-buildid: [ on ]
> > ... libcap: [ on ]
> > ... libelf: [ on ]
> > ... libnuma: [ on ]
> > ... numa_num_possible_cpus: [ on ]
> > ... libperl: [ on ]
> > ... libpython: [ on ]
> > ... libcrypto: [ on ]
> > ... libunwind: [ on ]
> > ... libdw-dwarf-unwind: [ on ]
> > ... libcapstone: [ OFF ] <========
> > ... zlib: [ on ]
> > ... lzma: [ on ]
> > ... get_cpuid: [ on ]
> > ... bpf: [ on ]
> > ... libaio: [ on ]
> > ... libzstd: [ on ]
> >
> > ... or something like that.
> >
> > Thanks,
> >
> > Ingo
>
> That list comes from the perf tool itself
> (tools/perf/builtin-version.c), not the kernel config or build system.

Yeah, I know, I wrote the initial version. ;-)

( See upstream commits b6aa9979416e~1..4cc9117a35b2 )

> Something like that could be added to the main kernel build. But it
> should be a separate patch series as it will likely need a lot of design
> iteration.

Doesn't have to be complicated really, but obviously not a requirement for
this series.

Thanks,

Ingo

2024-03-24 19:10:25

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

From: Linus Torvalds
> Sent: 23 March 2024 17:07
>
> On Sat, 23 Mar 2024 at 09:16, Linus Torvalds
> <[email protected]> wrote:
> >
> > And we might as well also do the semi-yearly compiler version review.
> > We raised the minimum to 4.9 almost four years ago, and then the jump
> > to 5.1 was first for arm64 due to a serious gcc code generation bug
> > and then globally in Sept 2021.
>
> Looking at RHEL, I find a page that claims
>
> RHEL9 : gcc 11.x in app stream
> RHEL8 : gcc 8.x or gcc 9.x in app stream.
> RHEL7 : gcc 4.8.x

I'm running Ubuntu 18.04 LTS (and still getting updates) and that
has gcc 7.3.0.
Clearly it might be time to update/reinstall that system :-)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2024-03-25 16:41:20

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Sat, Mar 23, 2024, at 18:06, Linus Torvalds wrote:
> On Sat, 23 Mar 2024 at 09:16, Linus Torvalds <[email protected]> wrote:
> The SLES situation seems somewhat similar, with SLES12 being 4.8.x and
> SLES15 being 7.3. But again with a "Development Tools Module" setup.
> So that *might* argue for 7.3.

According to https://distrowatch.com/table.php?distribution=sle, they
also provide gcc-12.2.1 with the sp5 update, so we're probably fine.

On the other hand, I can see that OpenSUSE Leap 15.6 contains
a fairly modern kernel (6.4.x) built with the gcc-7.3 system
compiler, and I think this is the same one as in SLES.

Not sure if they plan to update the kernel release beyond that,
or how inconvenient it would be for them to require
using the other compiler for future updates, so I've added
the developers that last touched the OpenSUSE kernel RPM
package to Cc here.

Arnd

2024-03-25 16:58:42

by Takashi Iwai

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Mon, 25 Mar 2024 15:51:30 +0100,
Arnd Bergmann wrote:
>
> On Sat, Mar 23, 2024, at 18:06, Linus Torvalds wrote:
> > On Sat, 23 Mar 2024 at 09:16, Linus Torvalds <[email protected]> wrote:
> > The SLES situation seems somewhat similar, with SLES12 being 4.8.x and
> > SLES15 being 7.3. But again with a "Development Tools Module" setup.
> > So that *might* argue for 7.3.
>
> According to https://distrowatch.com/table.php?distribution=sle, they
> also provide gcc-12.2.1 with the sp5 update, so we're probably fine.
>
> On the other hand, I can see that OpenSUSE Leap 15.6 contains
> a fairly modern kernel (6.4.x) built with the gcc-7.3 system
> compiler, and I think this is the same one as in SLES.
>
> Not sure if they plan to update the kernel release beyond that,
> or how inconvenient it would be for them to require
> using the other compiler for future updates, so I've added
> the developers that last touched the OpenSUSE kernel RPM
> package to Cc here.

SLE15-SP6 kernel (based on 6.4.x) is still built with gcc7, currently
gcc 7.5, indeed. openSUSE Leap shares the very same kernel, so it's
with gcc 7.5, too. Even though gcc-13 is provided as additional
compiler package, it's not used for the kernel package build.

AFAIK, it's not decided yet about SP7 kernel. But since we take a
conservative approach for SLE, I guess SLE15-SP7 will be likely
sticking with the old gcc, unless forced to change by some reason.

SLE12 is built with the old gcc 4.8, and SLE12-SP5 (based on 4.12) is
still actively maintained, but only for a few months until October
2024.

The next generation of SLE is built with the latest gcc (gcc-13 for
now). So SLE16 will be a totally different story.

openSUSE Tumbleweed always uses a bleeding edge compiler (gcc-13),
too.


Takashi

2024-03-25 17:02:46

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Sun, 24 Mar 2024 at 00:55, Arnd Bergmann <[email protected]> wrote:
>
> On Sat, Mar 23, 2024, at 17:16, Linus Torvalds wrote:
> >
..
> > So there doesn't seem to be a major reason to up the versioning, since
> > the stack protector thing can just be disabled for older versions.
> >
> > But maybe even enterprise distros have upgraded anyway, and we should
> > be proactive.
> >
> > Cc'ing Arnd, who has historically been one of the people pushing this.
> > He may no longer care because we haven't had huge issues.
>
> I'm not aware of any major issues, but it keeps coming up and
> a number of people have asked me about it because of smaller
> ones. Unfortunately I didn't write down what the problems
> were.
>
> I think based on what compiler versions are shipped
> by LTS distros, gcc-8.1 is the next logical step when we
> do it, covering Debian 10, RHEL 8 and Ubuntu 20.04, which
> are probably the oldest we still need to support.
>
> RHEL 7 and SLES 12 are still technically supported distros
> as well, but those shipped with gcc-4.8, so we dropped them
> in 2020 with the move to gcc-4.9.
>
> So in short, not a lot to gain or lose from raising the
> minimum to 8.1, but it would be nice to reduce the possible
> test matrix from 10 gcc versions back to the 7 we had in
> the past, as well as clean up a few version dependencies.
> Similarly we can probably raise the oldest binutils version
> to 2.30, as that seems to be the earliest version that was
> paired with gcc-8 (in RHEL-8).
>

x86_64/SMP uses a pile of hacks to create a runtime relocatable
kernel, one of which is a workaround for the offset based addressing
of per-CPU variables. This requires RIP-relative per-CPU references,
e.g.,

leal %gs:foo(%rip), %reg

to be fixed up in the opposite direction (displacement subtracted
rather than added) in the decompressor. This scheme is used because
older GCCs can only access the stack protector cookie via a fixed
offset of GS+40, and so GS must carry the address of the start of the
per-CPU region rather than an arbitrary relative offset between the
per-CPU region in vmlinux and the one belonging to a CPU.

GCC 8.1 and later allow the cookie to be specified using a symbol, and
this would allow us to revert to the ordinary per-CPU addressing,
where the base is the vmlinux copy of a symbol, and each CPU carries a
different offset in GS that produces the address of its respective
private copy. [0]
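
Concretely (treating the symbol name and the exact addressing form as
illustrative, not necessarily what the series ends up using), the cookie
access goes from the fixed-layout form

  movq %gs:40, %rax

to an ordinary percpu variable reference such as

  movq %gs:__stack_chk_guard, %rax

with the compiler told where to find it via

  -mstack-protector-guard-reg=gs
  -mstack-protector-guard-symbol=__stack_chk_guard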

With that out of the way, we could get rid of the weird relocation
format and just use the linker to link vmlinux in PIE mode (like other
architectures), using the condensed RELR format which only takes a
fraction of the space. Using PIC codegen and PIE linking also brings
us closer to what toolchains expect, and so fewer quirks/surprises
when moving to newer versions. (Currently on x86, we do a position
dependent link of vmlinux, and rely on the static relocations produced
by --emit-relocs to create the metadata we need to perform the
relocation fixups. Static relocations cannot describe all the
transformations and relaxations that the linker might apply, and so
static relocations might go out of sync with the actual code.)
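
(Mechanically, and only as a sketch since the exact flags differ between
GNU ld and LLD, that would mean linking vmlinux with something like

  -pie --no-dynamic-linker -z pack-relative-relocs

instead of relying on --emit-relocs; -z pack-relative-relocs produces the
compact RELR table, which LLD also spells --pack-dyn-relocs=relr.)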

Another minor cleanup is __GCC_ASM_FLAG_OUTPUTS__, which would always
be supported if we require 8.1 or newer.
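
That macro advertises the asm flag-output constraints, which let inline
asm hand a condition flag straight back to the compiler instead of
needing an extra setcc instruction. A minimal sketch (function name
invented):

  static inline int test_and_set_bit0(unsigned long *p)
  {
          int oldbit;

          /* bts stores the old value of bit 0 in the carry flag;
             "=@ccc" returns CF as the output operand. */
          asm volatile("btsq $0, %[val]"
                       : "=@ccc" (oldbit), [val] "+m" (*p));
          return oldbit;
  }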

None of this is high priority, though, so not a reason in itself to
deprecate GCC 7 and older, more of a nice bonus when we do get there.

--
Ard.



[0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=x86-remove-absolute-percpu

2024-03-25 18:43:15

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Mon, Mar 25, 2024, at 16:26, Takashi Iwai wrote:
> On Mon, 25 Mar 2024 15:51:30 +0100, Arnd Bergmann wrote:
>
>> Not sure if they plan to update the kernel release beyond that,
>> or how inconvenient it would be for them to require
>> using the other compiler for future updates, so I've added
>> the developers that last touched the OpenSUSE kernel RPM
>> package to Cc here.
>
> SLE15-SP6 kernel (based on 6.4.x) is still built with gcc7, currently
> gcc 7.5, indeed. openSUSE Leap shares the very same kernel, so it's
> with gcc 7.5, too. Even though gcc-13 is provided as additional
> compiler package, it's not used for the kernel package build.

Ok, so for SP6 there won't be a problem because it won't
update the kernel.

> AFAIK, it's not decided yet about SP7 kernel. But since we take a
> conservative approach for SLE, I guess SLE15-SP7 will be likely
> sticking with the old gcc, unless forced to change by some reason.

From https://www.suse.com/support/kb/doc/?id=000019587, it
looks like kernel versions are historically only updated for
even numbered updates (SP2, SP4, SP6), so I guess there is a
good chance that SP7 won't upgrade either, even if that decision
is still to be made.

I would assume that if SP7 does get a new kernel, it would
not be any later than the next LTS kernel (6.12 according
to phb-crystal-ball), and we could just decide to move to
gcc-8 only after that is out.

> SLE12 is built with the old gcc 4.8, and SLE12-SP5 (based on 4.12) is
> still actively maintained, but only for a few months until October
> 2024.

Right, and it also never updated beyond linux-4.12, which still
supported gcc-4.8. gcc-4.9 only became a requirement in linux-5.9.

> The next generation of SLE is built with the latest gcc (gcc-13 for
> now). So SLE16 will be a totally different story.
>
> openSUSE Tumbleweed always uses a bleeding edge compiler (gcc-13),
> too.

Thanks for the detailed reply!

Arnd

2024-03-26 07:02:59

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v4 00/16] x86-64: Stack protector and percpu improvements

On Mon, Mar 25, 2024 at 7:08 PM Arnd Bergmann <[email protected]> wrote:
>
> On Mon, Mar 25, 2024, at 16:26, Takashi Iwai wrote:
> > On Mon, 25 Mar 2024 15:51:30 +0100, Arnd Bergmann wrote:
> >
> >> Not sure if they plan to update the kernel release beyond that,
> >> or how inconvenient it would be for them to require
> >> using the other compiler for future updates, so I've added
> >> the developers that last touched the OpenSUSE kernel RPM
> >> package to Cc here.
> >
> > SLE15-SP6 kernel (based on 6.4.x) is still built with gcc7, currently
> > gcc 7.5, indeed. openSUSE Leap shares the very same kernel, so it's
> > with gcc 7.5, too. Even though gcc-13 is provided as additional
> > compiler package, it's not used for the kernel package build.
>
> Ok, so for SP6 there won't be a problem because it won't
> update the kernel.
>
> > AFAIK, it's not decided yet about SP7 kernel. But since we take a
> > conservative approach for SLE, I guess SLE15-SP7 will be likely
> > sticking with the old gcc, unless forced to change by some reason.
>
> From https://www.suse.com/support/kb/doc/?id=000019587, it
> looks like kernel versions are historically only updated for
> even numbered updates (SP2, SP4, SP6), so I guess there is a
> good chance that SP7 won't upgrade either, even if that decision
> is still to be made.
>
> I would assume that if SP7 does get a new kernel, it would
> not be any later than the next LTS kernel (6.12 according
> to phb-crystal-ball), and we could just decide to move to
> gcc-8 only after that is out.

I think that raising the minimum compiler version to 8.1 after the 6.12
LTS release is a good plan.

Following that plan also means that the original stack protector
improvement series, including the objtool changes, is the way to go
forward. The stack-protector-specific part can then be removed from
objtool once the minimum compiler version is raised.

Thanks,
Uros.