2023-11-15 17:38:19

by Brian Gerst

Subject: [PATCH v3 00/14] x86-64: Stack protector and percpu improvements

Currently, x86-64 uses an unusual percpu layout, where the percpu section
is linked at absolute address 0. The reason behind this is that older GCC
versions placed the stack protector (if enabled) at a fixed offset from the
GS segment base. Since the GS segment is also used for percpu variables,
this forced the current layout.

GCC since version 8.1 supports a configurable location for the stack
protector value, which allows removal of the restriction on how the percpu
section is linked. This allows the percpu section to be linked
normally, like most other architectures. In turn, this allows removal
of code that was needed to support the zero-based percpu section.

The major change with this iteration is adding support to objtool for
older compilers that can't change the location of the stackprotector
canary value.

v3:
- Add objtool support to convert stackprotector code from older compilers.
- Handle R_X86_64_REX_GOTPCRELX relocations from clang.

v2:
- Include PVH boot in GSBASE changes.
- Split out removal of 64-bit test script to give full context on why
it's not needed anymore.
- Formatting and comment cleanups.

Brian Gerst (14):
x86/stackprotector/32: Remove stack protector test script
x86/stackprotector/64: Remove stack protector test script
x86/boot: Disable stack protector for early boot code
x86/pvh: Use fixed_percpu_data for early boot GSBASE
x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations
objtool: Allow adding relocations to an existing section
objtool: Convert fixed location stack protector accesses
x86/stackprotector/64: Convert to normal percpu variable
x86/percpu/64: Use relative percpu offsets
x86/percpu/64: Remove fixed_percpu_data
x86/boot/64: Remove inverse relocations
x86/percpu/64: Remove INIT_PER_CPU macros
percpu: Remove PER_CPU_FIRST_SECTION
kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU

arch/x86/Kconfig | 16 +--
arch/x86/Makefile | 21 ++--
arch/x86/boot/compressed/misc.c | 14 +--
arch/x86/entry/entry_64.S | 2 +-
arch/x86/include/asm/percpu.h | 22 ----
arch/x86/include/asm/processor.h | 28 +----
arch/x86/include/asm/stackprotector.h | 36 +-----
arch/x86/kernel/Makefile | 2 +
arch/x86/kernel/asm-offsets_64.c | 6 -
arch/x86/kernel/cpu/common.c | 8 +-
arch/x86/kernel/head_64.S | 20 ++-
arch/x86/kernel/irq_64.c | 1 -
arch/x86/kernel/setup_percpu.c | 12 +-
arch/x86/kernel/vmlinux.lds.S | 35 ------
arch/x86/platform/pvh/head.S | 10 +-
arch/x86/tools/relocs.c | 143 ++--------------------
arch/x86/xen/xen-head.S | 10 +-
include/asm-generic/vmlinux.lds.h | 1 -
include/linux/percpu-defs.h | 12 --
init/Kconfig | 11 +-
kernel/kallsyms.c | 12 +-
scripts/Makefile.lib | 2 +
scripts/gcc-x86_32-has-stack-protector.sh | 8 --
scripts/gcc-x86_64-has-stack-protector.sh | 4 -
scripts/kallsyms.c | 80 +++---------
scripts/link-vmlinux.sh | 4 -
tools/objtool/arch/x86/decode.c | 46 +++++++
tools/objtool/arch/x86/special.c | 88 +++++++++++++
tools/objtool/builtin-check.c | 9 +-
tools/objtool/check.c | 14 ++-
tools/objtool/elf.c | 133 ++++++++++++++++----
tools/objtool/include/objtool/arch.h | 3 +
tools/objtool/include/objtool/builtin.h | 2 +
tools/objtool/include/objtool/elf.h | 90 +++++++++++---
34 files changed, 433 insertions(+), 472 deletions(-)
delete mode 100755 scripts/gcc-x86_32-has-stack-protector.sh
delete mode 100755 scripts/gcc-x86_64-has-stack-protector.sh

--
2.41.0


2023-11-15 17:38:33

by Brian Gerst

Subject: [PATCH v3 02/14] x86/stackprotector/64: Remove stack protector test script

This test for the stack protector was added in 2006 to make sure the
compiler had the PR28281 patch applied. With GCC 5.1 being the minimum
supported compiler now, it is no longer necessary.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/Kconfig | 5 ++---
scripts/gcc-x86_64-has-stack-protector.sh | 4 ----
2 files changed, 2 insertions(+), 7 deletions(-)
delete mode 100755 scripts/gcc-x86_64-has-stack-protector.sh

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 46c55fd7ca86..a1d2f7fe42bb 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -410,12 +410,11 @@ config PGTABLE_LEVELS

config CC_HAS_SANE_STACKPROTECTOR
bool
- default $(success,$(srctree)/scripts/gcc-x86_64-has-stack-protector.sh $(CC) $(CLANG_FLAGS)) if 64BIT
+ default y if 64BIT
default $(cc-option,-mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard)
help
We have to make sure stack protector is unconditionally disabled if
- the compiler produces broken code or if it does not let us control
- the segment on 32-bit kernels.
+ the compiler does not allow control of the segment and symbol.

menu "Processor type and features"

diff --git a/scripts/gcc-x86_64-has-stack-protector.sh b/scripts/gcc-x86_64-has-stack-protector.sh
deleted file mode 100755
index 75e4e22b986a..000000000000
--- a/scripts/gcc-x86_64-has-stack-protector.sh
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0
-
-echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m64 -O0 -mcmodel=kernel -fno-PIE -fstack-protector - -o - 2> /dev/null | grep -q "%gs"
--
2.41.0

2023-11-15 17:38:53

by Brian Gerst

Subject: [PATCH v3 06/14] objtool: Allow adding relocations to an existing section

In order to add relocations to existing sections (e.g. ".rela.text"),
encapsulate the reloc array in a block header to allow chaining blocks
to add more relocs without moving and relinking the existing ones.
This adds minimal memory overhead, while still being able to easily
access the arrays by index.

Signed-off-by: Brian Gerst <[email protected]>
---
tools/objtool/check.c | 2 +-
tools/objtool/elf.c | 99 +++++++++++++++++++++++------
tools/objtool/include/objtool/elf.h | 84 +++++++++++++++++++-----
3 files changed, 148 insertions(+), 37 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e94756e09ca9..ac304140c395 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -4409,7 +4409,7 @@ static int validate_ibt_data_reloc(struct objtool_file *file,
return 0;

WARN_FUNC("data relocation to !ENDBR: %s",
- reloc->sec->base, reloc_offset(reloc),
+ reloc->block->sec->base, reloc_offset(reloc),
offstr(dest->sec, dest->offset));

return 1;
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 3d27983dc908..cfb970727c8a 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -277,7 +277,7 @@ struct reloc *find_reloc_by_dest_range(const struct elf *elf, struct section *se
for_offset_range(o, offset, offset + len) {
elf_hash_for_each_possible(reloc, reloc, hash,
sec_offset_hash(rsec, o)) {
- if (reloc->sec != rsec)
+ if (reloc->block->sec != rsec)
continue;

if (reloc_offset(reloc) >= offset &&
@@ -333,6 +333,7 @@ static int read_sections(struct elf *elf)
sec = &elf->section_data[i];

INIT_LIST_HEAD(&sec->symbol_list);
+ INIT_LIST_HEAD(&sec->reloc_list);

s = elf_getscn(elf->elf, i);
if (!s) {
@@ -850,7 +851,7 @@ static struct reloc *elf_init_reloc(struct elf *elf, struct section *rsec,
unsigned long offset, struct symbol *sym,
s64 addend, unsigned int type)
{
- struct reloc *reloc, empty = { 0 };
+ struct reloc *reloc;

if (reloc_idx >= sec_num_entries(rsec)) {
WARN("%s: bad reloc_idx %u for %s with %d relocs",
@@ -858,15 +859,18 @@ static struct reloc *elf_init_reloc(struct elf *elf, struct section *rsec,
return NULL;
}

- reloc = &rsec->relocs[reloc_idx];
+ reloc = get_reloc_by_index(rsec, reloc_idx);
+ if (!reloc) {
+ WARN("%s: %s: reloc %d out of range!",
+ __func__, rsec->name, reloc_idx);
+ return NULL;
+ }

- if (memcmp(reloc, &empty, sizeof(empty))) {
+ if (reloc->sym) {
WARN("%s: %s: reloc %d already initialized!",
__func__, rsec->name, reloc_idx);
return NULL;
}
-
- reloc->sec = rsec;
reloc->sym = sym;

set_reloc_offset(elf, reloc, offset);
@@ -930,19 +934,45 @@ struct reloc *elf_init_reloc_data_sym(struct elf *elf, struct section *sec,
elf_data_rela_type(elf));
}

+static struct reloc_block *alloc_reloc_block(struct section *rsec, size_t num_relocs)
+{
+ struct reloc_block *block;
+ size_t block_size = sizeof(struct reloc_block) + sec_num_entries(rsec) * sizeof(struct reloc);
+ int i;
+
+ block = malloc(block_size);
+ if (!block) {
+ perror("malloc");
+ return NULL;
+ }
+
+ memset(block, 0, block_size);
+ INIT_LIST_HEAD(&block->list);
+ block->sec = rsec;
+ block->start_idx = rsec->num_relocs;
+ block->len = num_relocs;
+
+ for (i = 0; i < num_relocs; i++)
+ block->relocs[i].block = block;
+
+ rsec->num_relocs += num_relocs;
+ list_add_tail(&block->list, &rsec->reloc_list);
+
+ return block;
+}
+
static int read_relocs(struct elf *elf)
{
unsigned long nr_reloc, max_reloc = 0;
struct section *rsec;
- struct reloc *reloc;
- unsigned int symndx;
- struct symbol *sym;
int i;

if (!elf_alloc_hash(reloc, elf->num_relocs))
return -1;

list_for_each_entry(rsec, &elf->sections, list) {
+ struct reloc_block *block;
+
if (!is_reloc_sec(rsec))
continue;

@@ -956,15 +986,15 @@ static int read_relocs(struct elf *elf)
rsec->base->rsec = rsec;

nr_reloc = 0;
- rsec->relocs = calloc(sec_num_entries(rsec), sizeof(*reloc));
- if (!rsec->relocs) {
- perror("calloc");
+ block = alloc_reloc_block(rsec, sec_num_entries(rsec));
+ if (!block)
return -1;
- }
+
for (i = 0; i < sec_num_entries(rsec); i++) {
- reloc = &rsec->relocs[i];
+ struct reloc *reloc = &block->relocs[i];
+ struct symbol *sym;
+ unsigned int symndx;

- reloc->sec = rsec;
symndx = reloc_sym(reloc);
reloc->sym = sym = find_symbol_by_index(elf, symndx);
if (!reloc->sym) {
@@ -1100,6 +1130,7 @@ struct section *elf_create_section(struct elf *elf, const char *name,
memset(sec, 0, sizeof(*sec));

INIT_LIST_HEAD(&sec->symbol_list);
+ INIT_LIST_HEAD(&sec->reloc_list);

s = elf_newscn(elf->elf);
if (!s) {
@@ -1170,6 +1201,7 @@ static struct section *elf_create_rela_section(struct elf *elf,
unsigned int reloc_nr)
{
struct section *rsec;
+ struct reloc_block *block;
char *rsec_name;

rsec_name = malloc(strlen(sec->name) + strlen(".rela") + 1);
@@ -1192,11 +1224,9 @@ static struct section *elf_create_rela_section(struct elf *elf,
rsec->sh.sh_info = sec->idx;
rsec->sh.sh_flags = SHF_INFO_LINK;

- rsec->relocs = calloc(sec_num_entries(rsec), sizeof(struct reloc));
- if (!rsec->relocs) {
- perror("calloc");
+ block = alloc_reloc_block(rsec, sec_num_entries(rsec));
+ if (!block)
return NULL;
- }

sec->rsec = rsec;
rsec->base = sec;
@@ -1204,6 +1234,37 @@ static struct section *elf_create_rela_section(struct elf *elf,
return rsec;
}

+int elf_extend_rela_section(struct elf *elf,
+ struct section *rsec,
+ int add_relocs)
+{
+ int newnr = sec_num_entries(rsec) + add_relocs;
+ size_t oldsize = rsec->sh.sh_size;
+ size_t newsize = newnr * rsec->sh.sh_entsize;
+ void *buf;
+ struct reloc_block *block;
+
+ buf = realloc(rsec->data->d_buf, newnr * rsec->sh.sh_entsize);
+ if (!buf) {
+ perror("realloc");
+ return -1;
+ }
+
+ memset(buf + oldsize, 0, newsize - oldsize);
+
+ rsec->data->d_size = newsize;
+ rsec->data->d_buf = buf;
+ rsec->sh.sh_size = newsize;
+
+ mark_sec_changed(elf, rsec, true);
+
+ block = alloc_reloc_block(rsec, add_relocs);
+ if (!block)
+ return -1;
+
+ return 0;
+}
+
struct section *elf_create_section_pair(struct elf *elf, const char *name,
size_t entsize, unsigned int nr,
unsigned int reloc_nr)
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index 9f71e988eca4..7851467f6878 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -43,7 +43,8 @@ struct section {
char *name;
int idx;
bool _changed, text, rodata, noinstr, init, truncate;
- struct reloc *relocs;
+ struct list_head reloc_list;
+ int num_relocs;
};

struct symbol {
@@ -71,13 +72,23 @@ struct symbol {
struct reloc *relocs;
};

+struct reloc_block;
+
struct reloc {
struct elf_hash_node hash;
- struct section *sec;
+ struct reloc_block *block;
struct symbol *sym;
struct reloc *sym_next_reloc;
};

+struct reloc_block {
+ struct list_head list;
+ struct section *sec;
+ int start_idx;
+ int len;
+ struct reloc relocs[0];
+};
+
struct elf {
Elf *elf;
GElf_Ehdr ehdr;
@@ -108,6 +119,11 @@ struct elf *elf_open_read(const char *name, int flags);

struct section *elf_create_section(struct elf *elf, const char *name,
size_t entsize, unsigned int nr);
+
+int elf_extend_rela_section(struct elf *elf,
+ struct section *rsec,
+ int add_relocs);
+
struct section *elf_create_section_pair(struct elf *elf, const char *name,
size_t entsize, unsigned int nr,
unsigned int reloc_nr);
@@ -197,12 +213,12 @@ static inline unsigned int sec_num_entries(struct section *sec)

static inline unsigned int reloc_idx(struct reloc *reloc)
{
- return reloc - reloc->sec->relocs;
+ return reloc->block->start_idx + (reloc - &reloc->block->relocs[0]);
}

static inline void *reloc_rel(struct reloc *reloc)
{
- struct section *rsec = reloc->sec;
+ struct section *rsec = reloc->block->sec;

return rsec->data->d_buf + (reloc_idx(reloc) * rsec->sh.sh_entsize);
}
@@ -215,7 +231,7 @@ static inline bool is_32bit_reloc(struct reloc *reloc)
* Elf64_Rel: 16 bytes
* Elf64_Rela: 24 bytes
*/
- return reloc->sec->sh.sh_entsize < 16;
+ return reloc->block->sec->sh.sh_entsize < 16;
}

#define __get_reloc_field(reloc, field) \
@@ -241,7 +257,7 @@ static inline u64 reloc_offset(struct reloc *reloc)
static inline void set_reloc_offset(struct elf *elf, struct reloc *reloc, u64 offset)
{
__set_reloc_field(reloc, r_offset, offset);
- mark_sec_changed(elf, reloc->sec, true);
+ mark_sec_changed(elf, reloc->block->sec, true);
}

static inline s64 reloc_addend(struct reloc *reloc)
@@ -252,7 +268,7 @@ static inline s64 reloc_addend(struct reloc *reloc)
static inline void set_reloc_addend(struct elf *elf, struct reloc *reloc, s64 addend)
{
__set_reloc_field(reloc, r_addend, addend);
- mark_sec_changed(elf, reloc->sec, true);
+ mark_sec_changed(elf, reloc->block->sec, true);
}


@@ -282,7 +298,7 @@ static inline void set_reloc_sym(struct elf *elf, struct reloc *reloc, unsigned

__set_reloc_field(reloc, r_info, info);

- mark_sec_changed(elf, reloc->sec, true);
+ mark_sec_changed(elf, reloc->block->sec, true);
}
static inline void set_reloc_type(struct elf *elf, struct reloc *reloc, unsigned int type)
{
@@ -292,7 +308,46 @@ static inline void set_reloc_type(struct elf *elf, struct reloc *reloc, unsigned

__set_reloc_field(reloc, r_info, info);

- mark_sec_changed(elf, reloc->sec, true);
+ mark_sec_changed(elf, reloc->block->sec, true);
+}
+
+static inline struct reloc *get_reloc_by_index(struct section *rsec, int idx)
+{
+ struct reloc_block *block;
+
+ list_for_each_entry(block, &rsec->reloc_list, list) {
+ if (idx < block->len)
+ return &block->relocs[idx];
+ idx -= block->len;
+ }
+
+ return NULL;
+}
+
+static inline struct reloc *first_reloc(struct section *sec)
+{
+ struct reloc_block *block;
+
+ if (list_empty(&sec->reloc_list))
+ return NULL;
+
+ block = list_first_entry(&sec->reloc_list, struct reloc_block, list);
+ return &block->relocs[0];
+}
+
+static inline struct reloc *next_reloc(struct reloc *reloc)
+{
+ struct reloc_block *block = reloc->block;
+
+ reloc++;
+ if (reloc < &block->relocs[block->len])
+ return reloc;
+
+ if (list_is_last(&block->list, &block->sec->reloc_list))
+ return NULL;
+
+ block = list_next_entry(block, list);
+ return &block->relocs[0];
}

#define for_each_sec(file, sec) \
@@ -308,15 +363,10 @@ static inline void set_reloc_type(struct elf *elf, struct reloc *reloc, unsigned
sec_for_each_sym(__sec, sym)

#define for_each_reloc(rsec, reloc) \
- for (int __i = 0, __fake = 1; __fake; __fake = 0) \
- for (reloc = rsec->relocs; \
- __i < sec_num_entries(rsec); \
- __i++, reloc++)
+ for (reloc = first_reloc(rsec); reloc; reloc = next_reloc(reloc))

#define for_each_reloc_from(rsec, reloc) \
- for (int __i = reloc_idx(reloc); \
- __i < sec_num_entries(rsec); \
- __i++, reloc++)
+ for (; reloc; reloc = next_reloc(reloc))

#define OFFSET_STRIDE_BITS 4
#define OFFSET_STRIDE (1UL << OFFSET_STRIDE_BITS)
@@ -344,7 +394,7 @@ static inline u32 sec_offset_hash(struct section *sec, unsigned long offset)

static inline u32 reloc_hash(struct reloc *reloc)
{
- return sec_offset_hash(reloc->sec, reloc_offset(reloc));
+ return sec_offset_hash(reloc->block->sec, reloc_offset(reloc));
}

#endif /* _OBJTOOL_ELF_H */
--
2.41.0

2023-11-15 17:39:08

by Brian Gerst

Subject: [PATCH v3 03/14] x86/boot: Disable stack protector for early boot code

On 64-bit, this will prevent crashes when the canary access is changed
from %gs:40 to %gs:__stack_chk_guard(%rip). RIP-relative addresses from
the identity-mapped early boot code will target the wrong address with
zero-based percpu. KASLR could then shift that address to an unmapped
page, causing a crash on boot.

This early boot code runs well before userspace is active and does not
need stack protector enabled.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/kernel/Makefile | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 0000325ab98f..aff619054e17 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -39,6 +39,8 @@ KMSAN_SANITIZE_nmi.o := n
KCOV_INSTRUMENT_head$(BITS).o := n
KCOV_INSTRUMENT_sev.o := n

+CFLAGS_head32.o := -fno-stack-protector
+CFLAGS_head64.o := -fno-stack-protector
CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace

obj-y += head_$(BITS).o
--
2.41.0

2023-11-15 17:39:11

by Brian Gerst

Subject: [PATCH v3 11/14] x86/boot/64: Remove inverse relocations

Now that the percpu section is not at a fixed virtual address, inverse
relocations, which were needed to offset the effects of relocation on
RIP-relative percpu references, are no longer needed.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/boot/compressed/misc.c | 14 +---
arch/x86/tools/relocs.c | 126 +-------------------------------
2 files changed, 2 insertions(+), 138 deletions(-)

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b99e08e6815b..2de345a236c0 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -221,7 +221,7 @@ static void handle_relocations(void *output, unsigned long output_len,

/*
* Process relocations: 32 bit relocations first then 64 bit after.
- * Three sets of binary relocations are added to the end of the kernel
+ * Two sets of binary relocations are added to the end of the kernel
* before compression. Each relocation table entry is the kernel
* address of the location which needs to be updated stored as a
* 32-bit value which is sign extended to 64 bits.
@@ -231,8 +231,6 @@ static void handle_relocations(void *output, unsigned long output_len,
* kernel bits...
* 0 - zero terminator for 64 bit relocations
* 64 bit relocation repeated
- * 0 - zero terminator for inverse 32 bit relocations
- * 32 bit inverse relocation repeated
* 0 - zero terminator for 32 bit relocations
* 32 bit relocation repeated
*
@@ -249,16 +247,6 @@ static void handle_relocations(void *output, unsigned long output_len,
*(uint32_t *)ptr += delta;
}
#ifdef CONFIG_X86_64
- while (*--reloc) {
- long extended = *reloc;
- extended += map;
-
- ptr = (unsigned long)extended;
- if (ptr < min_addr || ptr > max_addr)
- error("inverse 32-bit relocation outside of kernel!\n");
-
- *(int32_t *)ptr -= delta;
- }
for (reloc--; *reloc; reloc--) {
long extended = *reloc;
extended += map;
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 3b0cfddd8b27..ae9bbf634826 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -28,7 +28,6 @@ struct relocs {
static struct relocs relocs16;
static struct relocs relocs32;
#if ELF_BITS == 64
-static struct relocs relocs32neg;
static struct relocs relocs64;
#define FMT PRIu64

@@ -89,7 +88,6 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = {
"__initramfs_start|"
"(jiffies|jiffies_64)|"
#if ELF_BITS == 64
- "__per_cpu_load|"
"init_per_cpu__.*|"
"__end_rodata_hpage_align|"
#endif
@@ -287,33 +285,6 @@ static const char *sym_name(const char *sym_strtab, Elf_Sym *sym)
return name;
}

-static Elf_Sym *sym_lookup(const char *symname)
-{
- int i;
- for (i = 0; i < shnum; i++) {
- struct section *sec = &secs[i];
- long nsyms;
- char *strtab;
- Elf_Sym *symtab;
- Elf_Sym *sym;
-
- if (sec->shdr.sh_type != SHT_SYMTAB)
- continue;
-
- nsyms = sec->shdr.sh_size/sizeof(Elf_Sym);
- symtab = sec->symtab;
- strtab = sec->link->strtab;
-
- for (sym = symtab; --nsyms >= 0; sym++) {
- if (!sym->st_name)
- continue;
- if (strcmp(symname, strtab + sym->st_name) == 0)
- return sym;
- }
- }
- return 0;
-}
-
#if BYTE_ORDER == LITTLE_ENDIAN
#define le16_to_cpu(val) (val)
#define le32_to_cpu(val) (val)
@@ -756,75 +727,8 @@ static void walk_relocs(int (*process)(struct section *sec, Elf_Rel *rel,
}
}

-/*
- * The .data..percpu section is a special case for x86_64 SMP kernels.
- * It is used to initialize the actual per_cpu areas and to provide
- * definitions for the per_cpu variables that correspond to their offsets
- * within the percpu area. Since the values of all of the symbols need
- * to be offsets from the start of the per_cpu area the virtual address
- * (sh_addr) of .data..percpu is 0 in SMP kernels.
- *
- * This means that:
- *
- * Relocations that reference symbols in the per_cpu area do not
- * need further relocation (since the value is an offset relative
- * to the start of the per_cpu area that does not change).
- *
- * Relocations that apply to the per_cpu area need to have their
- * offset adjusted by by the value of __per_cpu_load to make them
- * point to the correct place in the loaded image (because the
- * virtual address of .data..percpu is 0).
- *
- * For non SMP kernels .data..percpu is linked as part of the normal
- * kernel data and does not require special treatment.
- *
- */
-static int per_cpu_shndx = -1;
-static Elf_Addr per_cpu_load_addr;
-
-static void percpu_init(void)
-{
- int i;
- for (i = 0; i < shnum; i++) {
- ElfW(Sym) *sym;
- if (strcmp(sec_name(i), ".data..percpu"))
- continue;
-
- if (secs[i].shdr.sh_addr != 0) /* non SMP kernel */
- return;
-
- sym = sym_lookup("__per_cpu_load");
- if (!sym)
- die("can't find __per_cpu_load\n");
-
- per_cpu_shndx = i;
- per_cpu_load_addr = sym->st_value;
- return;
- }
-}
-
#if ELF_BITS == 64

-/*
- * Check to see if a symbol lies in the .data..percpu section.
- *
- * The linker incorrectly associates some symbols with the
- * .data..percpu section so we also need to check the symbol
- * name to make sure that we classify the symbol correctly.
- *
- * The GNU linker incorrectly associates:
- * __init_begin
- * __per_cpu_load
- *
- * The "gold" linker incorrectly associates:
- * init_per_cpu__gdt_page
- */
-static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
-{
- return 0;
-}
-
-
static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
const char *symname)
{
@@ -835,12 +739,6 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
if (sym->st_shndx == SHN_UNDEF)
return 0;

- /*
- * Adjust the offset if this reloc applies to the percpu section.
- */
- if (sec->shdr.sh_info == per_cpu_shndx)
- offset += per_cpu_load_addr;
-
switch (r_type) {
case R_X86_64_NONE:
/* NONE can be ignored. */
@@ -850,33 +748,21 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
case R_X86_64_PLT32:
case R_X86_64_REX_GOTPCRELX:
/*
- * PC relative relocations don't need to be adjusted unless
- * referencing a percpu symbol.
+ * PC relative relocations don't need to be adjusted.
*
* NB: R_X86_64_PLT32 can be treated as R_X86_64_PC32.
*/
- if (is_percpu_sym(sym, symname))
- add_reloc(&relocs32neg, offset);
break;

case R_X86_64_PC64:
/*
* Only used by jump labels
*/
- if (is_percpu_sym(sym, symname))
- die("Invalid R_X86_64_PC64 relocation against per-CPU symbol %s\n",
- symname);
break;

case R_X86_64_32:
case R_X86_64_32S:
case R_X86_64_64:
- /*
- * References to the percpu area don't need to be adjusted.
- */
- if (is_percpu_sym(sym, symname))
- break;
-
if (shn_abs) {
/*
* Whitelisted absolute symbols do not require
@@ -1090,7 +976,6 @@ static void emit_relocs(int as_text, int use_real_mode)
/* Order the relocations for more efficient processing */
sort_relocs(&relocs32);
#if ELF_BITS == 64
- sort_relocs(&relocs32neg);
sort_relocs(&relocs64);
#else
sort_relocs(&relocs16);
@@ -1122,13 +1007,6 @@ static void emit_relocs(int as_text, int use_real_mode)
/* Now print each relocation */
for (i = 0; i < relocs64.count; i++)
write_reloc(relocs64.offset[i], stdout);
-
- /* Print a stop */
- write_reloc(0, stdout);
-
- /* Now print each inverse 32-bit relocation */
- for (i = 0; i < relocs32neg.count; i++)
- write_reloc(relocs32neg.offset[i], stdout);
#endif

/* Print a stop */
@@ -1179,8 +1057,6 @@ void process(FILE *fp, int use_real_mode, int as_text,
read_strtabs(fp);
read_symtabs(fp);
read_relocs(fp);
- if (ELF_BITS == 64)
- percpu_init();
if (show_absolute_syms) {
print_absolute_symbols();
return;
--
2.41.0

2023-11-15 17:39:16

by Brian Gerst

Subject: [PATCH v3 07/14] objtool: Convert fixed location stack protector accesses

Older versions of GCC fixed the location of the stack protector canary
at %gs:40. Use objtool to convert these accesses to normal percpu
accesses to __stack_chk_guard.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/Kconfig | 4 ++
scripts/Makefile.lib | 2 +
tools/objtool/arch/x86/decode.c | 46 +++++++++++++
tools/objtool/arch/x86/special.c | 88 +++++++++++++++++++++++++
tools/objtool/builtin-check.c | 9 ++-
tools/objtool/check.c | 12 ++++
tools/objtool/elf.c | 34 ++++++++--
tools/objtool/include/objtool/arch.h | 3 +
tools/objtool/include/objtool/builtin.h | 2 +
tools/objtool/include/objtool/elf.h | 6 ++
10 files changed, 201 insertions(+), 5 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a1d2f7fe42bb..6cee46127fd2 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -416,6 +416,10 @@ config CC_HAS_SANE_STACKPROTECTOR
We have to make sure stack protector is unconditionally disabled if
the compiler does not allow control of the segment and symbol.

+config STACKPROTECTOR_OBJTOOL
+ bool
+ default n
+
menu "Processor type and features"

config SMP
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 68d0134bdbf9..a15dc76c19d0 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -255,6 +255,8 @@ objtool := $(objtree)/tools/objtool/objtool
objtool-args-$(CONFIG_HAVE_JUMP_LABEL_HACK) += --hacks=jump_label
objtool-args-$(CONFIG_HAVE_NOINSTR_HACK) += --hacks=noinstr
objtool-args-$(CONFIG_CALL_DEPTH_TRACKING) += --hacks=skylake
+objtool-args-$(CONFIG_STACKPROTECTOR_OBJTOOL) += --hacks=stackprotector
+objtool-args-$(CONFIG_SMP) += --smp
objtool-args-$(CONFIG_X86_KERNEL_IBT) += --ibt
objtool-args-$(CONFIG_FINEIBT) += --cfi
objtool-args-$(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL) += --mcount
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index e327cd827135..53f3d7323259 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -144,6 +144,18 @@ static bool has_notrack_prefix(struct insn *insn)
return false;
}

+static bool has_gs_prefix(struct insn *insn)
+{
+ int i;
+
+ for (i = 0; i < insn->prefixes.nbytes; i++) {
+ if (insn->prefixes.bytes[i] == 0x65)
+ return true;
+ }
+
+ return false;
+}
+
int arch_decode_instruction(struct objtool_file *file, const struct section *sec,
unsigned long offset, unsigned int maxlen,
struct instruction *insn)
@@ -408,10 +420,44 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec

break;

+ case 0x2b:
+ case 0x3b:
+ case 0x39:
+ if (!rex_w)
+ break;
+
+ /* sub %gs:0x28, reg */
+ /* cmp %gs:0x28, reg */
+ /* cmp reg, %gs:0x28 */
+ if (has_gs_prefix(&ins) &&
+ modrm_mod == 0 &&
+ modrm_rm == 4 &&
+ sib_index == 4 &&
+ sib_base == 5 &&
+ ins.displacement.value == 0x28)
+ {
+ insn->type = INSN_STACKPROTECTOR;
+ break;
+ }
+
+ break;
+
case 0x8b:
if (!rex_w)
break;

+ /* mov %gs:0x28, reg */
+ if (has_gs_prefix(&ins) &&
+ modrm_mod == 0 &&
+ modrm_rm == 4 &&
+ sib_index == 4 &&
+ sib_base == 5 &&
+ ins.displacement.value == 0x28)
+ {
+ insn->type = INSN_STACKPROTECTOR;
+ break;
+ }
+
if (rm_is_mem(CFI_BP)) {

/* mov disp(%rbp), reg */
diff --git a/tools/objtool/arch/x86/special.c b/tools/objtool/arch/x86/special.c
index 29e949579ede..47c17452c899 100644
--- a/tools/objtool/arch/x86/special.c
+++ b/tools/objtool/arch/x86/special.c
@@ -3,6 +3,9 @@

#include <objtool/special.h>
#include <objtool/builtin.h>
+#include <objtool/warn.h>
+#include <objtool/check.h>
+#include <objtool/elf.h>

#define X86_FEATURE_POPCNT (4 * 32 + 23)
#define X86_FEATURE_SMAP (9 * 32 + 20)
@@ -137,3 +140,88 @@ struct reloc *arch_find_switch_table(struct objtool_file *file,

return rodata_reloc;
}
+
+/*
+ * Convert op %gs:0x28, reg -> op __stack_chk_guard(%rip), %reg
+ * op is mov, sub, or cmp.
+ */
+int arch_hack_stackprotector(struct objtool_file *file)
+{
+ struct section *sec;
+ struct symbol *__stack_chk_guard;
+ struct instruction *insn;
+
+ int i;
+
+ __stack_chk_guard = find_symbol_by_name(file->elf, "__stack_chk_guard");
+
+ for_each_sec(file, sec) {
+ int count = 0;
+ int idx;
+ struct section *rsec = sec->rsec;
+
+ sec_for_each_insn(file, sec, insn) {
+ if (insn->type == INSN_STACKPROTECTOR)
+ count++;
+ }
+
+ if (!count)
+ continue;
+
+ if (!__stack_chk_guard)
+ __stack_chk_guard = elf_create_undef_symbol(file->elf, "__stack_chk_guard");
+
+ if (!sec->rsec) {
+ idx = 0;
+ rsec = sec->rsec = elf_create_rela_section(file->elf, sec, count);
+ } else {
+ idx = sec_num_entries(rsec);
+ if (elf_extend_rela_section(file->elf, rsec, count))
+ return -1;
+ }
+
+ sec_for_each_insn(file, sec, insn) {
+ unsigned char *data = insn->sec->data->d_buf + insn->offset;
+
+ if (insn->type != INSN_STACKPROTECTOR)
+ continue;
+
+ if (insn->len != 9)
+ goto invalid;
+
+ /* Remove GS prefix if !SMP */
+ if (data[0] != 0x65)
+ goto invalid;
+ if (!opts.smp)
+ data[0] = 0x90;
+
+ /* Set Mod=00, R/M=101. Preserve Reg */
+ data[3] = (data[3] & 0x38) | 5;
+
+ /* Displacement 0 */
+ data[4] = 0;
+ data[5] = 0;
+ data[6] = 0;
+ data[7] = 0;
+
+ /* Pad with NOP */
+ data[8] = 0x90;
+
+ mark_sec_changed(file->elf, insn->sec, true);
+
+ if (!elf_init_reloc_data_sym(file->elf, insn->sec, insn->offset + 4, idx++, __stack_chk_guard, -4))
+ return -1;
+
+ continue;
+
+invalid:
+ fprintf(stderr, "Invalid stackprotector instruction at %s+0x%lx: ", insn->sec->name, insn->offset);
+ for (i = 0; i < insn->len; i++)
+ fprintf(stderr, "%02x ", data[i]);
+ fprintf(stderr, "\n");
+ return -1;
+ }
+ }
+
+ return 0;
+}
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 5e21cfb7661d..0ab2efb45c0e 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -62,12 +62,17 @@ static int parse_hacks(const struct option *opt, const char *str, int unset)
found = true;
}

+ if (!str || strstr(str, "stackprotector")) {
+ opts.hack_stackprotector = true;
+ found = true;
+ }
+
return found ? 0 : -1;
}

static const struct option check_options[] = {
OPT_GROUP("Actions:"),
- OPT_CALLBACK_OPTARG('h', "hacks", NULL, NULL, "jump_label,noinstr,skylake", "patch toolchain bugs/limitations", parse_hacks),
+ OPT_CALLBACK_OPTARG('h', "hacks", NULL, NULL, "jump_label,noinstr,skylake,stackprotector", "patch toolchain bugs/limitations", parse_hacks),
OPT_BOOLEAN('i', "ibt", &opts.ibt, "validate and annotate IBT"),
OPT_BOOLEAN('m', "mcount", &opts.mcount, "annotate mcount/fentry calls for ftrace"),
OPT_BOOLEAN('n', "noinstr", &opts.noinstr, "validate noinstr rules"),
@@ -94,6 +99,7 @@ static const struct option check_options[] = {
OPT_BOOLEAN(0, "sec-address", &opts.sec_address, "print section addresses in warnings"),
OPT_BOOLEAN(0, "stats", &opts.stats, "print statistics"),
OPT_BOOLEAN('v', "verbose", &opts.verbose, "verbose warnings"),
+ OPT_BOOLEAN(0, "smp", &opts.smp, "building an SMP kernel"),

OPT_END(),
};
@@ -133,6 +139,7 @@ static bool opts_valid(void)
{
if (opts.hack_jump_label ||
opts.hack_noinstr ||
+ opts.hack_stackprotector ||
opts.ibt ||
opts.mcount ||
opts.noinstr ||
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index ac304140c395..57f080ca0195 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1315,6 +1315,11 @@ __weak bool arch_is_embedded_insn(struct symbol *sym)
return false;
}

+__weak int arch_hack_stackprotector(struct objtool_file *file)
+{
+ return 0;
+}
+
static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
{
struct reloc *reloc;
@@ -4812,6 +4817,13 @@ int check(struct objtool_file *file)
warnings += ret;
}

+ if (opts.hack_stackprotector) {
+ ret = arch_hack_stackprotector(file);
+ if (ret < 0)
+ goto out;
+ warnings += ret;
+ }
+
free_insns(file);

if (opts.verbose)
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index cfb970727c8a..2af99b2a054c 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -846,6 +846,32 @@ elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long size)
return sym;
}

+struct symbol *
+elf_create_undef_symbol(struct elf *elf, const char *sym_name)
+{
+ struct symbol *sym = calloc(1, sizeof(*sym));
+ char *name = strdup(sym_name);
+
+ if (!sym || !name) {
+ perror("malloc");
+ return NULL;
+ }
+
+ sym->name = name;
+ sym->sec = find_section_by_index(elf, 0);
+
+ sym->sym.st_name = elf_add_string(elf, NULL, name);
+ sym->sym.st_info = GELF_ST_INFO(STB_GLOBAL, STT_NOTYPE);
+ sym->sym.st_value = 0;
+ sym->sym.st_size = 0;
+
+ sym = __elf_create_symbol(elf, sym);
+ if (sym)
+ elf_add_symbol(elf, sym);
+
+ return sym;
+}
+
static struct reloc *elf_init_reloc(struct elf *elf, struct section *rsec,
unsigned int reloc_idx,
unsigned long offset, struct symbol *sym,
@@ -924,7 +950,7 @@ struct reloc *elf_init_reloc_data_sym(struct elf *elf, struct section *sec,
struct symbol *sym,
s64 addend)
{
- if (sym->sec && (sec->sh.sh_flags & SHF_EXECINSTR)) {
+ if (sym->sec && (sym->sec->sh.sh_flags & SHF_EXECINSTR)) {
WARN("bad call to %s() for text symbol %s",
__func__, sym->name);
return NULL;
@@ -1196,9 +1222,9 @@ struct section *elf_create_section(struct elf *elf, const char *name,
return sec;
}

-static struct section *elf_create_rela_section(struct elf *elf,
- struct section *sec,
- unsigned int reloc_nr)
+struct section *elf_create_rela_section(struct elf *elf,
+ struct section *sec,
+ unsigned int reloc_nr)
{
struct section *rsec;
struct reloc_block *block;
diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
index 0b303eba660e..c60fec88b3af 100644
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -28,6 +28,7 @@ enum insn_type {
INSN_CLD,
INSN_TRAP,
INSN_ENDBR,
+ INSN_STACKPROTECTOR,
INSN_OTHER,
};

@@ -96,4 +97,6 @@ int arch_rewrite_retpolines(struct objtool_file *file);

bool arch_pc_relative_reloc(struct reloc *reloc);

+int arch_hack_stackprotector(struct objtool_file *file);
+
#endif /* _ARCH_H */
diff --git a/tools/objtool/include/objtool/builtin.h b/tools/objtool/include/objtool/builtin.h
index fcca6662c8b4..5085d3135e6b 100644
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -13,6 +13,7 @@ struct opts {
bool hack_jump_label;
bool hack_noinstr;
bool hack_skylake;
+ bool hack_stackprotector;
bool ibt;
bool mcount;
bool noinstr;
@@ -38,6 +39,7 @@ struct opts {
bool sec_address;
bool stats;
bool verbose;
+ bool smp;
};

extern struct opts opts;
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index 7851467f6878..b5eec9e4a65d 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -120,6 +120,10 @@ struct elf *elf_open_read(const char *name, int flags);
struct section *elf_create_section(struct elf *elf, const char *name,
size_t entsize, unsigned int nr);

+struct section *elf_create_rela_section(struct elf *elf,
+ struct section *sec,
+ unsigned int reloc_nr);
+
int elf_extend_rela_section(struct elf *elf,
struct section *rsec,
int add_relocs);
@@ -130,6 +134,8 @@ struct section *elf_create_section_pair(struct elf *elf, const char *name,

struct symbol *elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long size);

+struct symbol *elf_create_undef_symbol(struct elf *elf, const char *sym_name);
+
struct reloc *elf_init_reloc_text_sym(struct elf *elf, struct section *sec,
unsigned long offset,
unsigned int reloc_idx,
--
2.41.0

2023-11-15 17:39:21

by Brian Gerst

Subject: [PATCH v3 09/14] x86/percpu/64: Use relative percpu offsets

The percpu section is currently linked at virtual address 0, because
older compilers hardcoded the stack protector canary value at a fixed
offset from the start of the GS segment. Now that the canary is a
normal percpu variable, the percpu section can be linked normally.
This means that x86-64 will calculate percpu offsets like most other
architectures, as the delta between the initial percpu address and the
dynamically allocated memory.
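
As a rough userspace illustration of "offset = allocated area minus initial
percpu address", the sketch below copies a template image once per CPU and
records the delta. All demo_* names are hypothetical stand-ins and are not
kernel symbols; the real setup lives in setup_per_cpu_areas().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NR_CPUS 4

/* Hypothetical stand-in for the linked (initial) percpu section. */
struct demo_percpu_image {
	unsigned long counter;
	unsigned long stack_guard;
};

static struct demo_percpu_image demo_percpu_template = { .counter = 7 };

/* Per-CPU delta between the allocated copy and the linked image. */
static uintptr_t demo_per_cpu_offset[DEMO_NR_CPUS];

/* Allocate one copy of the image per CPU and record its offset. */
static int demo_setup_per_cpu_areas(void)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		void *area = malloc(sizeof(demo_percpu_template));

		if (!area)
			return -1;
		memcpy(area, &demo_percpu_template, sizeof(demo_percpu_template));
		/* Unsigned wraparound makes the delta valid in either direction. */
		demo_per_cpu_offset[cpu] = (uintptr_t)area -
					   (uintptr_t)&demo_percpu_template;
	}
	return 0;
}

/* Translate the linked address of a variable into a given CPU's copy. */
static unsigned long *demo_per_cpu_ptr(unsigned long *linked_addr, int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return (unsigned long *)((uintptr_t)linked_addr +
				 demo_per_cpu_offset[cpu]);
}
```

With the offsets at zero, as they are in this patch before the per-CPU areas
are set up, the same translation simply resolves to the linked image itself.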

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/include/asm/processor.h | 6 +++++-
arch/x86/kernel/head_64.S | 19 +++++++++----------
arch/x86/kernel/setup_percpu.c | 12 ++----------
arch/x86/kernel/vmlinux.lds.S | 29 +----------------------------
arch/x86/platform/pvh/head.S | 5 ++---
arch/x86/tools/relocs.c | 10 +++-------
arch/x86/xen/xen-head.S | 9 ++++-----
init/Kconfig | 2 +-
8 files changed, 27 insertions(+), 65 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 3ee091225904..73fa9d4d2e16 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -403,7 +403,11 @@ DECLARE_INIT_PER_CPU(fixed_percpu_data);

static inline unsigned long cpu_kernelmode_gs_base(int cpu)
{
- return (unsigned long)per_cpu(fixed_percpu_data.gs_base, cpu);
+#ifdef CONFIG_SMP
+ return per_cpu_offset(cpu);
+#else
+ return 0;
+#endif
}

extern asmlinkage void entry_SYSCALL32_ignore(void);
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 0d94d2a091fe..fe73e1c4cc5d 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -70,11 +70,14 @@ SYM_CODE_START_NOALIGN(startup_64)

leaq _text(%rip), %rdi

- /* Setup GSBASE to allow stack canary access for C code */
+ /*
+ * Set up GSBASE.
+ * Note that, on SMP, the boot cpu uses init data section until
+ * the per cpu areas are set up.
+ */
movl $MSR_GS_BASE, %ecx
- leaq INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
- movl %edx, %eax
- shrq $32, %rdx
+ xorl %eax, %eax
+ xorl %edx, %edx
wrmsr

call startup_64_setup_env
@@ -343,16 +346,12 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
movl %eax,%fs
movl %eax,%gs

- /* Set up %gs.
- *
- * The base of %gs always points to fixed_percpu_data.
+ /*
+ * Set up GSBASE.
* Note that, on SMP, the boot cpu uses init data section until
* the per cpu areas are set up.
*/
movl $MSR_GS_BASE,%ecx
-#ifndef CONFIG_SMP
- leaq INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
-#endif
movl %edx, %eax
shrq $32, %rdx
wrmsr
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index 2c97bf7b56ae..8707dd07b9ce 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -23,18 +23,10 @@
#include <asm/cpumask.h>
#include <asm/cpu.h>

-#ifdef CONFIG_X86_64
-#define BOOT_PERCPU_OFFSET ((unsigned long)__per_cpu_load)
-#else
-#define BOOT_PERCPU_OFFSET 0
-#endif
-
-DEFINE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off) = BOOT_PERCPU_OFFSET;
+DEFINE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off);
EXPORT_PER_CPU_SYMBOL(this_cpu_off);

-unsigned long __per_cpu_offset[NR_CPUS] __ro_after_init = {
- [0 ... NR_CPUS-1] = BOOT_PERCPU_OFFSET,
-};
+unsigned long __per_cpu_offset[NR_CPUS] __ro_after_init;
EXPORT_SYMBOL(__per_cpu_offset);

/*
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 1239be7cc8d8..57a83fb2d8a0 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -103,12 +103,6 @@ const_pcpu_hot = pcpu_hot;
PHDRS {
text PT_LOAD FLAGS(5); /* R_E */
data PT_LOAD FLAGS(6); /* RW_ */
-#ifdef CONFIG_X86_64
-#ifdef CONFIG_SMP
- percpu PT_LOAD FLAGS(6); /* RW_ */
-#endif
- init PT_LOAD FLAGS(7); /* RWE */
-#endif
note PT_NOTE FLAGS(0); /* ___ */
}

@@ -224,21 +218,7 @@ SECTIONS
__init_begin = .; /* paired with __init_end */
}

-#if defined(CONFIG_X86_64) && defined(CONFIG_SMP)
- /*
- * percpu offsets are zero-based on SMP. PERCPU_VADDR() changes the
- * output PHDR, so the next output section - .init.text - should
- * start another segment - init.
- */
- PERCPU_VADDR(INTERNODE_CACHE_BYTES, 0, :percpu)
- ASSERT(SIZEOF(.data..percpu) < CONFIG_PHYSICAL_START,
- "per-CPU data too large - increase CONFIG_PHYSICAL_START")
-#endif
-
INIT_TEXT_SECTION(PAGE_SIZE)
-#ifdef CONFIG_X86_64
- :init
-#endif

/*
* Section for code used exclusively before alternatives are run. All
@@ -368,9 +348,7 @@ SECTIONS
EXIT_DATA
}

-#if !defined(CONFIG_X86_64) || !defined(CONFIG_SMP)
PERCPU_SECTION(INTERNODE_CACHE_BYTES)
-#endif

. = ALIGN(PAGE_SIZE);

@@ -508,16 +486,11 @@ SECTIONS
* Per-cpu symbols which need to be offset from __per_cpu_load
* for the boot processor.
*/
-#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x) + __per_cpu_load
+#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x)
INIT_PER_CPU(gdt_page);
INIT_PER_CPU(fixed_percpu_data);
INIT_PER_CPU(irq_stack_backing_store);

-#ifdef CONFIG_SMP
-. = ASSERT((fixed_percpu_data == 0),
- "fixed_percpu_data is not at start of per-cpu area");
-#endif
-
#ifdef CONFIG_CPU_UNRET_ENTRY
. = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
#endif
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index fab90368481f..2ce07dffc314 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -100,9 +100,8 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
* the per cpu areas are set up.
*/
mov $MSR_GS_BASE,%ecx
- lea INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
- mov %edx, %eax
- shr $32, %rdx
+ xor %eax, %eax
+ xor %edx, %edx
wrmsr

call xen_prepare_pvh
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 24ad10c62840..ef355242a8d8 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -822,12 +822,7 @@ static void percpu_init(void)
*/
static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
{
- int shndx = sym_index(sym);
-
- return (shndx == per_cpu_shndx) &&
- strcmp(symname, "__init_begin") &&
- strcmp(symname, "__per_cpu_load") &&
- strncmp(symname, "init_per_cpu_", 13);
+ return 0;
}


@@ -1051,7 +1046,8 @@ static int cmp_relocs(const void *va, const void *vb)

static void sort_relocs(struct relocs *r)
{
- qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
+ if (r->count)
+ qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
}

static int write32(uint32_t v, FILE *f)
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 30f27e757354..7e8754c5fa1d 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -51,15 +51,14 @@ SYM_CODE_START(startup_xen)

leaq (__end_init_task - PTREGS_SIZE)(%rip), %rsp

- /* Set up %gs.
- *
- * The base of %gs always points to fixed_percpu_data.
+ /*
+ * Set up GSBASE.
* Note that, on SMP, the boot cpu uses init data section until
* the per cpu areas are set up.
*/
movl $MSR_GS_BASE,%ecx
- movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
- cdq
+ xorl %eax, %eax
+ xorl %edx, %edx
wrmsr

mov %rsi, %rdi
diff --git a/init/Kconfig b/init/Kconfig
index 9ffb103fc927..5f2c1f4a16aa 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1718,7 +1718,7 @@ config KALLSYMS_ALL
config KALLSYMS_ABSOLUTE_PERCPU
bool
depends on KALLSYMS
- default X86_64 && SMP
+ default n

config KALLSYMS_BASE_RELATIVE
bool
--
2.41.0

2023-11-15 17:39:23

by Brian Gerst

Subject: [PATCH v3 10/14] x86/percpu/64: Remove fixed_percpu_data

Now that the stack protector canary value is a normal percpu variable,
fixed_percpu_data is unused and can be removed.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/include/asm/processor.h | 8 --------
arch/x86/kernel/cpu/common.c | 4 ----
arch/x86/kernel/vmlinux.lds.S | 1 -
arch/x86/tools/relocs.c | 1 -
4 files changed, 14 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 73fa9d4d2e16..f84c8d3ca75d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -393,14 +393,6 @@ struct irq_stack {
} __aligned(IRQ_STACK_SIZE);

#ifdef CONFIG_X86_64
-struct fixed_percpu_data {
- char gs_base[40];
- unsigned long reserved;
-};
-
-DECLARE_PER_CPU_FIRST(struct fixed_percpu_data, fixed_percpu_data) __visible;
-DECLARE_INIT_PER_CPU(fixed_percpu_data);
-
static inline unsigned long cpu_kernelmode_gs_base(int cpu)
{
#ifdef CONFIG_SMP
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index b5b1d95b1399..a7792479ebe1 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2059,10 +2059,6 @@ EXPORT_PER_CPU_SYMBOL(pcpu_hot);
EXPORT_PER_CPU_SYMBOL(const_pcpu_hot);

#ifdef CONFIG_X86_64
-DEFINE_PER_CPU_FIRST(struct fixed_percpu_data,
- fixed_percpu_data) __aligned(PAGE_SIZE) __visible;
-EXPORT_PER_CPU_SYMBOL_GPL(fixed_percpu_data);
-
static void wrmsrl_cstar(unsigned long val)
{
/*
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 57a83fb2d8a0..efa4885060b5 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -488,7 +488,6 @@ SECTIONS
*/
#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x)
INIT_PER_CPU(gdt_page);
-INIT_PER_CPU(fixed_percpu_data);
INIT_PER_CPU(irq_stack_backing_store);

#ifdef CONFIG_CPU_UNRET_ENTRY
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index ef355242a8d8..3b0cfddd8b27 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -817,7 +817,6 @@ static void percpu_init(void)
* __per_cpu_load
*
* The "gold" linker incorrectly associates:
- * init_per_cpu__fixed_percpu_data
* init_per_cpu__gdt_page
*/
static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
--
2.41.0

2023-11-15 17:39:22

by Brian Gerst

Subject: [PATCH v3 08/14] x86/stackprotector/64: Convert to normal percpu variable

Older versions of GCC fixed the location of the stack protector canary
at %gs:40. This constraint forced the percpu section to be linked at
virtual address 0 so that the canary could be the first data object in
the percpu section. Supporting the zero-based percpu section requires
additional code to handle relocations for RIP-relative references to
percpu data, extra complexity to kallsyms, and workarounds for linker
bugs due to the use of absolute symbols.

Use compiler options to redefine the stack protector location if
supported, otherwise use objtool. This will remove the constraint that
the percpu section must be zero-based.
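
For the objtool fallback, the byte-level rewrite can be sketched as follows.
This is a standalone approximation of the conversion in the objtool patch of
this series, not the objtool code itself: demo_convert_canary_load() is a
hypothetical name, and the input validation (9-byte instruction, SIB byte
0x25, displacement 40) is an assumption about what a fixed %gs:40 access
looks like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert "op %gs:40, %reg" (absolute disp32 addressing via a SIB byte)
 * into "op %gs:0(%rip), %reg" followed by a one-byte NOP.
 *
 * Returns the offset inside the instruction where a relocation against
 * __stack_chk_guard (addend -4) would be placed, or -1 if the bytes do
 * not match the expected pattern.
 */
static int demo_convert_canary_load(uint8_t *data, size_t len, bool smp)
{
	if (len != 9)
		return -1;
	if (data[0] != 0x65)			/* GS segment override prefix */
		return -1;
	/* Assumed checks: ModRM Mod=00, R/M=100 (SIB follows), SIB=0x25 */
	if ((data[3] & 0xc7) != 0x04 || data[4] != 0x25)
		return -1;
	/* Assumed check: 32-bit displacement equals 40 */
	if (data[5] != 0x28 || data[6] || data[7] || data[8])
		return -1;

	if (!smp)
		data[0] = 0x90;			/* no GS prefix needed on UP */

	/* Set Mod=00, R/M=101 (RIP-relative). Preserve Reg. */
	data[3] = (data[3] & 0x38) | 5;

	/* Zero displacement, to be filled in through the relocation */
	data[4] = data[5] = data[6] = data[7] = 0;

	/* The new encoding is one byte shorter, so pad with a NOP */
	data[8] = 0x90;

	return 4;
}
```

The addend of -4 follows from RIP-relative addressing: the displacement is
measured from the end of the 8-byte instruction, which is 4 bytes past the
start of the displacement field.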

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/Kconfig | 11 ++++----
arch/x86/Makefile | 21 ++++++++++------
arch/x86/entry/entry_64.S | 2 +-
arch/x86/include/asm/processor.h | 16 ++----------
arch/x86/include/asm/stackprotector.h | 36 ++++-----------------------
arch/x86/kernel/asm-offsets_64.c | 6 -----
arch/x86/kernel/cpu/common.c | 4 +--
arch/x86/kernel/head_64.S | 3 +--
arch/x86/xen/xen-head.S | 3 +--
9 files changed, 30 insertions(+), 72 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6cee46127fd2..83404b741c0a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -266,7 +266,7 @@ config X86
select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_SETUP_PER_CPU_AREA
select HAVE_SOFTIRQ_ON_OWN_STACK
- select HAVE_STACKPROTECTOR if CC_HAS_SANE_STACKPROTECTOR
+ select HAVE_STACKPROTECTOR if X86_64 || CC_HAS_SANE_STACKPROTECTOR
select HAVE_STACK_VALIDATION if HAVE_OBJTOOL
select HAVE_STATIC_CALL
select HAVE_STATIC_CALL_INLINE if HAVE_OBJTOOL
@@ -410,15 +410,14 @@ config PGTABLE_LEVELS

config CC_HAS_SANE_STACKPROTECTOR
bool
- default y if 64BIT
+ default $(cc-option,-mstack-protector-guard-reg=gs -mstack-protector-guard-symbol=__stack_chk_guard) if 64BIT
default $(cc-option,-mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard)
- help
- We have to make sure stack protector is unconditionally disabled if
- the compiler does not allow control of the segment and symbol.

config STACKPROTECTOR_OBJTOOL
bool
- default n
+ depends on X86_64 && STACKPROTECTOR
+ default !CC_HAS_SANE_STACKPROTECTOR
+ prompt "Debug objtool stack protector conversion" if CC_HAS_SANE_STACKPROTECTOR && DEBUG_KERNEL

menu "Processor type and features"

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 1a068de12a56..06a79361e88f 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -112,13 +112,7 @@ ifeq ($(CONFIG_X86_32),y)
# temporary until string.h is fixed
KBUILD_CFLAGS += -ffreestanding

- ifeq ($(CONFIG_STACKPROTECTOR),y)
- ifeq ($(CONFIG_SMP),y)
- KBUILD_CFLAGS += -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard
- else
- KBUILD_CFLAGS += -mstack-protector-guard=global
- endif
- endif
+ percpu_seg := fs
else
BITS := 64
UTS_MACHINE := x86_64
@@ -168,6 +162,19 @@ else
KBUILD_CFLAGS += -mcmodel=kernel
KBUILD_RUSTFLAGS += -Cno-redzone=y
KBUILD_RUSTFLAGS += -Ccode-model=kernel
+
+ percpu_seg := gs
+endif
+
+ifeq ($(CONFIG_STACKPROTECTOR),y)
+ ifneq ($(CONFIG_STACKPROTECTOR_OBJTOOL),y)
+ ifeq ($(CONFIG_SMP),y)
+ KBUILD_CFLAGS += -mstack-protector-guard-reg=$(percpu_seg)
+ KBUILD_CFLAGS += -mstack-protector-guard-symbol=__stack_chk_guard
+ else
+ KBUILD_CFLAGS += -mstack-protector-guard=global
+ endif
+ endif
endif

#
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 1a88ad8a7b48..cddcc236aaae 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -190,7 +190,7 @@ SYM_FUNC_START(__switch_to_asm)

#ifdef CONFIG_STACKPROTECTOR
movq TASK_stack_canary(%rsi), %rbx
- movq %rbx, PER_CPU_VAR(fixed_percpu_data + FIXED_stack_canary)
+ movq %rbx, PER_CPU_VAR(__stack_chk_guard)
#endif

/*
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 061aa86b4662..3ee091225904 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -394,16 +394,8 @@ struct irq_stack {

#ifdef CONFIG_X86_64
struct fixed_percpu_data {
- /*
- * GCC hardcodes the stack canary as %gs:40. Since the
- * irq_stack is the object at %gs:0, we reserve the bottom
- * 48 bytes of the irq stack for the canary.
- *
- * Once we are willing to require -mstack-protector-guard-symbol=
- * support for x86_64 stackprotector, we can get rid of this.
- */
char gs_base[40];
- unsigned long stack_canary;
+ unsigned long reserved;
};

DECLARE_PER_CPU_FIRST(struct fixed_percpu_data, fixed_percpu_data) __visible;
@@ -418,11 +410,7 @@ extern asmlinkage void entry_SYSCALL32_ignore(void);

/* Save actual FS/GS selectors and bases to current->thread */
void current_save_fsgs(void);
-#else /* X86_64 */
-#ifdef CONFIG_STACKPROTECTOR
-DECLARE_PER_CPU(unsigned long, __stack_chk_guard);
-#endif
-#endif /* !X86_64 */
+#endif /* X86_64 */

struct perf_event;

diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index 00473a650f51..d43fb589fcf6 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -2,26 +2,10 @@
/*
* GCC stack protector support.
*
- * Stack protector works by putting predefined pattern at the start of
+ * Stack protector works by putting a predefined pattern at the start of
* the stack frame and verifying that it hasn't been overwritten when
- * returning from the function. The pattern is called stack canary
- * and unfortunately gcc historically required it to be at a fixed offset
- * from the percpu segment base. On x86_64, the offset is 40 bytes.
- *
- * The same segment is shared by percpu area and stack canary. On
- * x86_64, percpu symbols are zero based and %gs (64-bit) points to the
- * base of percpu area. The first occupant of the percpu area is always
- * fixed_percpu_data which contains stack_canary at the appropriate
- * offset. On x86_32, the stack canary is just a regular percpu
- * variable.
- *
- * Putting percpu data in %fs on 32-bit is a minor optimization compared to
- * using %gs. Since 32-bit userspace normally has %fs == 0, we are likely
- * to load 0 into %fs on exit to usermode, whereas with percpu data in
- * %gs, we are likely to load a non-null %gs on return to user mode.
- *
- * Once we are willing to require GCC 8.1 or better for 64-bit stackprotector
- * support, we can remove some of this complexity.
+ * returning from the function. The pattern is called the stack canary
+ * and is a unique value for each task.
*/

#ifndef _ASM_STACKPROTECTOR_H
@@ -36,6 +20,8 @@

#include <linux/sched.h>

+DECLARE_PER_CPU(unsigned long, __stack_chk_guard);
+
/*
* Initialize the stackprotector canary value.
*
@@ -51,25 +37,13 @@ static __always_inline void boot_init_stack_canary(void)
{
unsigned long canary = get_random_canary();

-#ifdef CONFIG_X86_64
- BUILD_BUG_ON(offsetof(struct fixed_percpu_data, stack_canary) != 40);
-#endif
-
current->stack_canary = canary;
-#ifdef CONFIG_X86_64
- this_cpu_write(fixed_percpu_data.stack_canary, canary);
-#else
this_cpu_write(__stack_chk_guard, canary);
-#endif
}

static inline void cpu_init_stack_canary(int cpu, struct task_struct *idle)
{
-#ifdef CONFIG_X86_64
- per_cpu(fixed_percpu_data.stack_canary, cpu) = idle->stack_canary;
-#else
per_cpu(__stack_chk_guard, cpu) = idle->stack_canary;
-#endif
}

#else /* STACKPROTECTOR */
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index bb65371ea9df..590b6cd0eac0 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -54,11 +54,5 @@ int main(void)
BLANK();
#undef ENTRY

- BLANK();
-
-#ifdef CONFIG_STACKPROTECTOR
- OFFSET(FIXED_stack_canary, fixed_percpu_data, stack_canary);
- BLANK();
-#endif
return 0;
}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4d4b87c6885d..b5b1d95b1399 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2111,15 +2111,13 @@ void syscall_init(void)
X86_EFLAGS_AC|X86_EFLAGS_ID);
}

-#else /* CONFIG_X86_64 */
+#endif /* CONFIG_X86_64 */

#ifdef CONFIG_STACKPROTECTOR
DEFINE_PER_CPU(unsigned long, __stack_chk_guard);
EXPORT_PER_CPU_SYMBOL(__stack_chk_guard);
#endif

-#endif /* CONFIG_X86_64 */
-
/*
* Clear all 6 debug registers:
*/
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 3dcabbc49149..0d94d2a091fe 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -345,8 +345,7 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)

/* Set up %gs.
*
- * The base of %gs always points to fixed_percpu_data. If the
- * stack protector canary is enabled, it is located at %gs:40.
+ * The base of %gs always points to fixed_percpu_data.
* Note that, on SMP, the boot cpu uses init data section until
* the per cpu areas are set up.
*/
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index a0ea285878db..30f27e757354 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -53,8 +53,7 @@ SYM_CODE_START(startup_xen)

/* Set up %gs.
*
- * The base of %gs always points to fixed_percpu_data. If the
- * stack protector canary is enabled, it is located at %gs:40.
+ * The base of %gs always points to fixed_percpu_data.
* Note that, on SMP, the boot cpu uses init data section until
* the per cpu areas are set up.
*/
--
2.41.0

2023-11-15 17:39:27

by Brian Gerst

Subject: [PATCH v3 12/14] x86/percpu/64: Remove INIT_PER_CPU macros

The load and link addresses of percpu variables are now the same, so
these macros are no longer necessary.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/include/asm/percpu.h | 22 ----------------------
arch/x86/kernel/irq_64.c | 1 -
arch/x86/kernel/vmlinux.lds.S | 7 -------
arch/x86/tools/relocs.c | 1 -
4 files changed, 31 deletions(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index b86b27d15e52..7a176381ee01 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -20,12 +20,6 @@

#define PER_CPU_VAR(var) __percpu(var)__percpu_rel

-#ifdef CONFIG_X86_64_SMP
-#define INIT_PER_CPU_VAR(var) init_per_cpu__##var
-#else
-#define INIT_PER_CPU_VAR(var) var
-#endif
-
#else /* ...!ASSEMBLY */

#include <linux/kernel.h>
@@ -96,22 +90,6 @@
#define __percpu_arg(x) __percpu_prefix "%" #x
#define __force_percpu_arg(x) __force_percpu_prefix "%" #x

-/*
- * Initialized pointers to per-cpu variables needed for the boot
- * processor need to use these macros to get the proper address
- * offset from __per_cpu_load on SMP.
- *
- * There also must be an entry in vmlinux_64.lds.S
- */
-#define DECLARE_INIT_PER_CPU(var) \
- extern typeof(var) init_per_cpu_var(var)
-
-#ifdef CONFIG_X86_64_SMP
-#define init_per_cpu_var(var) init_per_cpu__##var
-#else
-#define init_per_cpu_var(var) var
-#endif
-
/* For arch-specific code, we can use direct single-insn ops (they
* don't give an lvalue though). */

diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c
index fe0c859873d1..30424f9876bc 100644
--- a/arch/x86/kernel/irq_64.c
+++ b/arch/x86/kernel/irq_64.c
@@ -26,7 +26,6 @@
#include <asm/apic.h>

DEFINE_PER_CPU_PAGE_ALIGNED(struct irq_stack, irq_stack_backing_store) __visible;
-DECLARE_INIT_PER_CPU(irq_stack_backing_store);

#ifdef CONFIG_VMAP_STACK
/*
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index efa4885060b5..9aea7b6b02c7 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -482,13 +482,6 @@ SECTIONS
"kernel image bigger than KERNEL_IMAGE_SIZE");

#ifdef CONFIG_X86_64
-/*
- * Per-cpu symbols which need to be offset from __per_cpu_load
- * for the boot processor.
- */
-#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x)
-INIT_PER_CPU(gdt_page);
-INIT_PER_CPU(irq_stack_backing_store);

#ifdef CONFIG_CPU_UNRET_ENTRY
. = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index ae9bbf634826..70b7b0bf33d0 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -88,7 +88,6 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = {
"__initramfs_start|"
"(jiffies|jiffies_64)|"
#if ELF_BITS == 64
- "init_per_cpu__.*|"
"__end_rodata_hpage_align|"
#endif
"__vvar_page|"
--
2.41.0

2023-11-15 17:39:32

by Brian Gerst

Subject: [PATCH v3 14/14] kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU

x86-64 was the only user.
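
With the absolute-percpu case gone, every entry in base-relative mode is a
plain unsigned 32-bit delta from the relative base. A minimal sketch of that
encoding and its inverse, using hypothetical demo_* helpers rather than the
actual kallsyms functions:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode an address as an unsigned 32-bit offset from relative_base.
 * Returns 0 on success, -1 if the address is below the base or the
 * offset does not fit in 32 bits.
 */
static int demo_kallsyms_encode(unsigned long long addr,
				unsigned long long relative_base,
				uint32_t *out)
{
	if (addr < relative_base || addr - relative_base > UINT32_MAX)
		return -1;
	*out = (uint32_t)(addr - relative_base);
	return 0;
}

/* Decode: values are unsigned offsets from the relative base. */
static unsigned long long demo_kallsyms_decode(unsigned long long relative_base,
					       uint32_t offset)
{
	return relative_base + offset;
}
```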

Signed-off-by: Brian Gerst <[email protected]>
---
init/Kconfig | 11 +-----
kernel/kallsyms.c | 12 ++-----
scripts/kallsyms.c | 80 ++++++++---------------------------------
scripts/link-vmlinux.sh | 4 ---
4 files changed, 18 insertions(+), 89 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 5f2c1f4a16aa..b55a8f237b24 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1715,11 +1715,6 @@ config KALLSYMS_ALL

Say N unless you really need all symbols, or kernel live patching.

-config KALLSYMS_ABSOLUTE_PERCPU
- bool
- depends on KALLSYMS
- default n
-
config KALLSYMS_BASE_RELATIVE
bool
depends on KALLSYMS
@@ -1727,11 +1722,7 @@ config KALLSYMS_BASE_RELATIVE
help
Instead of emitting them as absolute values in the native word size,
emit the symbol references in the kallsyms table as 32-bit entries,
- each containing a relative value in the range [base, base + U32_MAX]
- or, when KALLSYMS_ABSOLUTE_PERCPU is in effect, each containing either
- an absolute value in the range [0, S32_MAX] or a relative value in the
- range [base, base + S32_MAX], where base is the lowest relative symbol
- address encountered in the image.
+ each containing a relative value in the range [base, base + U32_MAX].

On 64-bit builds, this reduces the size of the address table by 50%,
but more importantly, it results in entries whose values are build
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 18edd57b5fe8..f4e8e531052a 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -151,16 +151,8 @@ unsigned long kallsyms_sym_address(int idx)
if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
return kallsyms_addresses[idx];

- /* values are unsigned offsets if --absolute-percpu is not in effect */
- if (!IS_ENABLED(CONFIG_KALLSYMS_ABSOLUTE_PERCPU))
- return kallsyms_relative_base + (u32)kallsyms_offsets[idx];
-
- /* ...otherwise, positive offsets are absolute values */
- if (kallsyms_offsets[idx] >= 0)
- return kallsyms_offsets[idx];
-
- /* ...and negative offsets are relative to kallsyms_relative_base - 1 */
- return kallsyms_relative_base - 1 - kallsyms_offsets[idx];
+ /* values are unsigned offsets */
+ return kallsyms_relative_base + (u32)kallsyms_offsets[idx];
}

static void cleanup_symbol_name(char *s)
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 653b92f6d4c8..501f978abf4b 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -5,8 +5,8 @@
* This software may be used and distributed according to the terms
* of the GNU General Public License, incorporated herein by reference.
*
- * Usage: kallsyms [--all-symbols] [--absolute-percpu]
- * [--base-relative] [--lto-clang] in.map > out.S
+ * Usage: kallsyms [--all-symbols] [--base-relative] [--lto-clang]
+ * in.map > out.S
*
* Table compression uses all the unused char codes on the symbols and
* maps these to the most used substrings (tokens). For instance, it might
@@ -37,7 +37,6 @@ struct sym_entry {
unsigned int len;
unsigned int seq;
unsigned int start_pos;
- unsigned int percpu_absolute;
unsigned char sym[];
};

@@ -55,14 +54,9 @@ static struct addr_range text_ranges[] = {
#define text_range_text (&text_ranges[0])
#define text_range_inittext (&text_ranges[1])

-static struct addr_range percpu_range = {
- "__per_cpu_start", "__per_cpu_end", -1ULL, 0
-};
-
static struct sym_entry **table;
static unsigned int table_size, table_cnt;
static int all_symbols;
-static int absolute_percpu;
static int base_relative;
static int lto_clang;

@@ -75,7 +69,7 @@ static unsigned char best_table_len[256];

static void usage(void)
{
- fprintf(stderr, "Usage: kallsyms [--all-symbols] [--absolute-percpu] "
+ fprintf(stderr, "Usage: kallsyms [--all-symbols] "
"[--base-relative] [--lto-clang] in.map > out.S\n");
exit(1);
}
@@ -167,7 +161,6 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len)
return NULL;

check_symbol_range(name, addr, text_ranges, ARRAY_SIZE(text_ranges));
- check_symbol_range(name, addr, &percpu_range, 1);

/* include the type field in the symbol name, so that it gets
* compressed together */
@@ -183,7 +176,6 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len)
sym->len = len;
sym->sym[0] = type;
strcpy(sym_name(sym), name);
- sym->percpu_absolute = 0;

return sym;
}
@@ -334,11 +326,6 @@ static int expand_symbol(const unsigned char *data, int len, char *result)
return total;
}

-static int symbol_absolute(const struct sym_entry *s)
-{
- return s->percpu_absolute;
-}
-
static void cleanup_symbol_name(char *s)
{
char *p;
@@ -499,30 +486,17 @@ static void write_src(void)
*/

long long offset;
- int overflow;
-
- if (!absolute_percpu) {
- offset = table[i]->addr - relative_base;
- overflow = (offset < 0 || offset > UINT_MAX);
- } else if (symbol_absolute(table[i])) {
- offset = table[i]->addr;
- overflow = (offset < 0 || offset > INT_MAX);
- } else {
- offset = relative_base - table[i]->addr - 1;
- overflow = (offset < INT_MIN || offset >= 0);
- }
- if (overflow) {
+
+ offset = table[i]->addr - relative_base;
+ if (offset < 0 || offset > UINT_MAX) {
fprintf(stderr, "kallsyms failure: "
- "%s symbol value %#llx out of range in relative mode\n",
- symbol_absolute(table[i]) ? "absolute" : "relative",
+ "symbol value %#llx out of range in relative mode\n",
table[i]->addr);
exit(EXIT_FAILURE);
}
printf("\t.long\t%#x /* %s */\n", (int)offset, table[i]->sym);
- } else if (!symbol_absolute(table[i])) {
- output_address(table[i]->addr);
} else {
- printf("\tPTR\t%#llx\n", table[i]->addr);
+ output_address(table[i]->addr);
}
}
printf("\n");
@@ -775,36 +749,15 @@ static void sort_symbols(void)
qsort(table, table_cnt, sizeof(table[0]), compare_symbols);
}

-static void make_percpus_absolute(void)
-{
- unsigned int i;
-
- for (i = 0; i < table_cnt; i++)
- if (symbol_in_range(table[i], &percpu_range, 1)) {
- /*
- * Keep the 'A' override for percpu symbols to
- * ensure consistent behavior compared to older
- * versions of this tool.
- */
- table[i]->sym[0] = 'A';
- table[i]->percpu_absolute = 1;
- }
-}
-
-/* find the minimum non-absolute symbol address */
+/* find the minimum symbol address */
static void record_relative_base(void)
{
- unsigned int i;
-
- for (i = 0; i < table_cnt; i++)
- if (!symbol_absolute(table[i])) {
- /*
- * The table is sorted by address.
- * Take the first non-absolute symbol value.
- */
- relative_base = table[i]->addr;
- return;
- }
+ /*
+ * The table is sorted by address.
+ * Take the first symbol value.
+ */
+ if (table_cnt)
+ relative_base = table[0]->addr;
}

int main(int argc, char **argv)
@@ -812,7 +765,6 @@ int main(int argc, char **argv)
while (1) {
static const struct option long_options[] = {
{"all-symbols", no_argument, &all_symbols, 1},
- {"absolute-percpu", no_argument, &absolute_percpu, 1},
{"base-relative", no_argument, &base_relative, 1},
{"lto-clang", no_argument, &lto_clang, 1},
{},
@@ -831,8 +783,6 @@ int main(int argc, char **argv)

read_map(argv[optind]);
shrink_table();
- if (absolute_percpu)
- make_percpus_absolute();
sort_symbols();
if (base_relative)
record_relative_base();
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index a432b171be82..d25b6d5de45e 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -148,10 +148,6 @@ kallsyms()
kallsymopt="${kallsymopt} --all-symbols"
fi

- if is_enabled CONFIG_KALLSYMS_ABSOLUTE_PERCPU; then
- kallsymopt="${kallsymopt} --absolute-percpu"
- fi
-
if is_enabled CONFIG_KALLSYMS_BASE_RELATIVE; then
kallsymopt="${kallsymopt} --base-relative"
fi
--
2.41.0
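
With the absolute-percpu special case gone, the offset check in write_src() reduces to a single range test against the lowest symbol address. The following is a minimal sketch of that test in isolation. relative_offset() is a hypothetical helper written for illustration and is not part of scripts/kallsyms.c; it assumes, as the patch does, that the table is sorted so relative_base is the smallest address.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not in scripts/kallsyms.c): compute the offset of
 * a symbol from relative_base and report whether it fits in the unsigned
 * 32-bit value that kallsyms emits with ".long".
 */
bool relative_offset(unsigned long long addr,
		     unsigned long long relative_base,
		     unsigned int *out)
{
	/* Same arithmetic as the patched write_src(): signed 64-bit difference. */
	long long offset = (long long)(addr - relative_base);

	/* A symbol below the base, or more than 4 GiB above it, cannot be encoded. */
	if (offset < 0 || offset > UINT_MAX)
		return false;

	*out = (unsigned int)offset;
	return true;
}
```

Because percpu symbols are no longer linked at address 0, they fall inside the same range as every other kernel symbol and need no separate encoding.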

2023-11-15 17:39:50

by Brian Gerst

Subject: [PATCH v3 13/14] percpu: Remove PER_CPU_FIRST_SECTION

x86-64 was the only user.

Signed-off-by: Brian Gerst <[email protected]>
---
include/asm-generic/vmlinux.lds.h | 1 -
include/linux/percpu-defs.h | 12 ------------
2 files changed, 13 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index bae0fe4d499b..579bd5ad09b9 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -1026,7 +1026,6 @@
*/
#define PERCPU_INPUT(cacheline) \
__per_cpu_start = .; \
- *(.data..percpu..first) \
. = ALIGN(PAGE_SIZE); \
*(.data..percpu..page_aligned) \
. = ALIGN(cacheline); \
diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index ec3573119923..b9ddee91e6c7 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -26,13 +26,11 @@
#define PER_CPU_SHARED_ALIGNED_SECTION "..shared_aligned"
#define PER_CPU_ALIGNED_SECTION "..shared_aligned"
#endif
-#define PER_CPU_FIRST_SECTION "..first"

#else

#define PER_CPU_SHARED_ALIGNED_SECTION ""
#define PER_CPU_ALIGNED_SECTION "..shared_aligned"
-#define PER_CPU_FIRST_SECTION ""

#endif

@@ -114,16 +112,6 @@
#define DEFINE_PER_CPU(type, name) \
DEFINE_PER_CPU_SECTION(type, name, "")

-/*
- * Declaration/definition used for per-CPU variables that must come first in
- * the set of variables.
- */
-#define DECLARE_PER_CPU_FIRST(type, name) \
- DECLARE_PER_CPU_SECTION(type, name, PER_CPU_FIRST_SECTION)
-
-#define DEFINE_PER_CPU_FIRST(type, name) \
- DEFINE_PER_CPU_SECTION(type, name, PER_CPU_FIRST_SECTION)
-
/*
* Declaration/definition used for per-CPU variables that must be cacheline
* aligned under SMP conditions so that, whilst a particular instance of the
--
2.41.0

2023-11-15 17:39:55

by Brian Gerst

Subject: [PATCH v3 05/14] x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations

Clang may produce R_X86_64_REX_GOTPCRELX relocations when redefining the
stack protector location. Treat them as another type of PC-relative
relocation.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/tools/relocs.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index d30949e25ebd..24ad10c62840 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -31,6 +31,11 @@ static struct relocs relocs32;
static struct relocs relocs32neg;
static struct relocs relocs64;
#define FMT PRIu64
+
+#ifndef R_X86_64_REX_GOTPCRELX
+#define R_X86_64_REX_GOTPCRELX 42
+#endif
+
#else
#define FMT PRIu32
#endif
@@ -224,6 +229,7 @@ static const char *rel_type(unsigned type)
REL_TYPE(R_X86_64_PC16),
REL_TYPE(R_X86_64_8),
REL_TYPE(R_X86_64_PC8),
+ REL_TYPE(R_X86_64_REX_GOTPCRELX),
#else
REL_TYPE(R_386_NONE),
REL_TYPE(R_386_32),
@@ -848,6 +854,7 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,

case R_X86_64_PC32:
case R_X86_64_PLT32:
+ case R_X86_64_REX_GOTPCRELX:
/*
* PC relative relocations don't need to be adjusted unless
* referencing a percpu symbol.
--
2.41.0
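
The change to do_reloc64() above amounts to adding one more relocation type to the set treated as PC-relative. A rough sketch of that classification follows. is_pc_relative() is a hypothetical function for illustration and does not exist in arch/x86/tools/relocs.c; the relocation numbers are the x86-64 psABI values, guarded in the same way the patch guards R_X86_64_REX_GOTPCRELX for older headers.

```c
#include <stdbool.h>

/* x86-64 psABI relocation numbers, defined only if the headers lack them. */
#ifndef R_X86_64_PC32
#define R_X86_64_PC32 2
#endif
#ifndef R_X86_64_PLT32
#define R_X86_64_PLT32 4
#endif
#ifndef R_X86_64_REX_GOTPCRELX
#define R_X86_64_REX_GOTPCRELX 42
#endif

/*
 * Hypothetical helper (not in relocs.c): true for relocation types that
 * the patched do_reloc64() handles in its PC-relative case.
 */
bool is_pc_relative(unsigned int r_type)
{
	switch (r_type) {
	case R_X86_64_PC32:
	case R_X86_64_PLT32:
	case R_X86_64_REX_GOTPCRELX:
		return true;
	default:
		return false;
	}
}
```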

2023-11-15 17:39:56

by Brian Gerst

Subject: [PATCH v3 04/14] x86/pvh: Use fixed_percpu_data for early boot GSBASE

Instead of having a private area for the stack canary, use
fixed_percpu_data for GSBASE like the native kernel.

Signed-off-by: Brian Gerst <[email protected]>
---
arch/x86/platform/pvh/head.S | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index c4365a05ab83..fab90368481f 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -94,10 +94,15 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
/* 64-bit entry point. */
.code64
1:
- /* Set base address in stack canary descriptor. */
+ /*
+ * Set up GSBASE.
+ * Note that, on SMP, the boot cpu uses init data section until
+ * the per cpu areas are set up.
+ */
mov $MSR_GS_BASE,%ecx
- mov $_pa(canary), %eax
- xor %edx, %edx
+ lea INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
+ mov %edx, %eax
+ shr $32, %rdx
wrmsr

call xen_prepare_pvh
@@ -156,8 +161,6 @@ SYM_DATA_START_LOCAL(gdt_start)
SYM_DATA_END_LABEL(gdt_start, SYM_L_LOCAL, gdt_end)

.balign 16
-SYM_DATA_LOCAL(canary, .fill 48, 1, 0)
-
SYM_DATA_START_LOCAL(early_stack)
.fill BOOT_STACK_SIZE, 1, 0
SYM_DATA_END_LABEL(early_stack, SYM_L_LOCAL, early_stack_end)
--
2.41.0
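
The new lea/mov/shr sequence in pvh/head.S exists because wrmsr takes its 64-bit value split across EDX (high half) and EAX (low half), and the address of fixed_percpu_data is a full 64-bit virtual address rather than a physical address below 4 GiB. The split can be sketched in C as below. struct msr_halves and split_msr_value() are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Hypothetical container for the two register halves consumed by wrmsr. */
struct msr_halves {
	uint32_t eax;	/* low 32 bits of the MSR value */
	uint32_t edx;	/* high 32 bits of the MSR value */
};

/*
 * Hypothetical helper: split a 64-bit MSR value the way the patched
 * assembly does after loading the address into %rdx.
 */
struct msr_halves split_msr_value(uint64_t value)
{
	struct msr_halves h;

	h.eax = (uint32_t)value;		/* mov %edx, %eax */
	h.edx = (uint32_t)(value >> 32);	/* shr $32, %rdx  */
	return h;
}
```

The removed code could simply zero %edx because _pa(canary) was a 32-bit physical address; a kernel virtual address has non-zero upper bits, so both halves must be written.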