After multiple attempts, this patchset is now based on the fact that the
64b kernel mapping was moved outside the linear mapping.
The first patch allows building relocatable kernels; the option is not
selected by default. That patch is a requirement for KASLR.
The second and third patches take an already existing powerpc script that
checks relocations at compile time and use it for riscv.
This patchset is rebased on top of:
RISC-V kasan rework (https://lore.kernel.org/lkml/Y6TTvku%2FyuSjm42j@spud/T/)
riscv: Use PUD/P4D/PGD pages for the linear mapping (https://lore.kernel.org/lkml/20230125114229.hrhsyw4aegrnmoau@orel/T/)
riscv: Allow to downgrade paging mode from the command line (https://lore.kernel.org/lkml/CAHVXubjeSMvfTPnvrnYRupOGx6+vUvUGfRS3piTeo=TH2cHKNg@mail.gmail.com/)
base-commit-tag: v6.2-rc7
Changes in v8:
* Fix UEFI boot by moving rela.dyn section into the data so that PE/COFF
loader actually copies the relocations too
* Fix check that used PGDIR instead of PUD which was not correct
for sv48 and sv57
* Fix PE/COFF header data size definition as it led to size of 0
Changes in v7:
* Rebase on top of v5.15
* Fix LDFLAGS_vmlinux which was overridden when CONFIG_DYNAMIC_FTRACE was
set
* Make relocate_kernel static
* Add Ack from Michael
Changes in v6:
* Remove the kernel move to vmalloc zone
* Rebased on top of for-next
* Remove relocatable property from 32b kernel as the kernel is mapped in
the linear mapping and would then need to be copied physically too
* CONFIG_RELOCATABLE depends on !XIP_KERNEL
* Remove Reviewed-by from first patch as it changed a bit
Changes in v5:
* Add "static __init" to create_kernel_page_table function as reported by
Kbuild test robot
* Add reviewed-by from Zong
* Rebase onto v5.7
Changes in v4:
* Fix BPF region that overlapped with kernel's as suggested by Zong
* Fix end of module region that could be larger than 2GB as suggested by Zong
* Fix the size of the vm area reserved for the kernel as we could lose
PMD_SIZE if the size was already aligned on PMD_SIZE
* Split compile time relocations check patch into 2 patches as suggested by Anup
* Applied Reviewed-by from Zong and Anup
Changes in v3:
* Move kernel mapping to vmalloc
Changes in v2:
* Make RELOCATABLE depend on MMU as suggested by Anup
* Rename kernel_load_addr into kernel_virt_addr as suggested by Anup
* Use __pa_symbol instead of __pa, as suggested by Zong
* Rebased on top of v5.6-rc3
* Tested with sv48 patchset
* Add Reviewed/Tested-by from Zong and Anup
Alexandre Ghiti (3):
riscv: Introduce CONFIG_RELOCATABLE
powerpc: Move script to check relocations at compile time in scripts/
riscv: Check relocations at compile time
arch/powerpc/tools/relocs_check.sh | 18 ++--------
arch/riscv/Kconfig | 14 ++++++++
arch/riscv/Makefile | 7 ++--
arch/riscv/Makefile.postlink | 36 ++++++++++++++++++++
arch/riscv/kernel/efi-header.S | 6 ++--
arch/riscv/kernel/vmlinux.lds.S | 10 ++++--
arch/riscv/mm/Makefile | 4 +++
arch/riscv/mm/init.c | 54 +++++++++++++++++++++++++++++-
arch/riscv/tools/relocs_check.sh | 26 ++++++++++++++
scripts/relocs_check.sh | 20 +++++++++++
10 files changed, 171 insertions(+), 24 deletions(-)
create mode 100644 arch/riscv/Makefile.postlink
create mode 100755 arch/riscv/tools/relocs_check.sh
create mode 100755 scripts/relocs_check.sh
--
2.37.2
From: Alexandre Ghiti <[email protected]>
This config allows compiling a 64b kernel as a PIE and relocating it to
any virtual address at runtime: this paves the way to KASLR.
Runtime relocation is possible since relocation metadata are embedded into
the kernel.
Note that relocating at runtime introduces an overhead even if the
kernel is loaded at the same address it was linked at. The compiler
options are those used in arm64, which uses the same RELA relocation
format.
Signed-off-by: Alexandre Ghiti <[email protected]>
---
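For readers who want to experiment with the relocation pass outside the
kernel, the loop in relocate_kernel() can be approximated in user space. This
is only a sketch: struct rela64, RELOC_RELATIVE and apply_relative_relocs()
are hypothetical stand-ins for Elf64_Rela, R_RISCV_RELATIVE and the kernel
function, and the image is a plain byte buffer rather than mapped kernel
memory. Like the patch, it leaves addends below the link address untouched
(the vdso case) and writes every other addend even when the load and link
addresses are equal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for Elf64_Rela. */
struct rela64 {
	uint64_t r_offset;	/* link-time address of the word to patch */
	uint64_t r_info;	/* relocation type */
	int64_t  r_addend;	/* link-time value to store at r_offset */
};

/* Stand-in for R_RISCV_RELATIVE (value 3 in the RISC-V ELF psABI). */
#define RELOC_RELATIVE 3

/*
 * Apply the relative relocations in rela[] to a copy of the image.
 * link_base is the address the image was linked at, load_base the
 * address it is going to run at. Returns the number of entries applied.
 */
static size_t apply_relative_relocs(uint8_t *image, size_t image_size,
				    const struct rela64 *rela, size_t count,
				    uint64_t link_base, uint64_t load_base)
{
	uint64_t reloc_offset = load_base - link_base;
	size_t applied = 0;

	assert(image && rela);

	for (size_t i = 0; i < count; i++) {
		uint64_t where = rela[i].r_offset - link_base;
		uint64_t value = (uint64_t)rela[i].r_addend;

		if (rela[i].r_info != RELOC_RELATIVE)
			continue;

		/* Skip entries that do not fit in the buffer. */
		if (image_size < sizeof(value) ||
		    where > image_size - sizeof(value))
			continue;

		/* Values linked below link_base are offsets, not addresses. */
		if (value >= link_base)
			value += reloc_offset;

		memcpy(image + where, &value, sizeof(value));
		applied++;
	}

	return applied;
}
```

The bounds check has no equivalent in the patch; it is there only so that the
sketch cannot write outside the buffer it is given.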
arch/riscv/Kconfig | 14 +++++++++
arch/riscv/Makefile | 7 +++--
arch/riscv/kernel/efi-header.S | 6 ++--
arch/riscv/kernel/vmlinux.lds.S | 10 ++++--
arch/riscv/mm/Makefile | 4 +++
arch/riscv/mm/init.c | 54 ++++++++++++++++++++++++++++++++-
6 files changed, 87 insertions(+), 8 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index e2b656043abf..e0ee7ce4b2e3 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -544,6 +544,20 @@ config COMPAT
If you want to execute 32-bit userspace applications, say Y.
+config RELOCATABLE
+ bool "Build a relocatable kernel"
+ depends on MMU && 64BIT && !XIP_KERNEL
+ help
+ This builds a kernel as a Position Independent Executable (PIE),
+ which retains all relocation metadata required to relocate the
+ kernel binary at runtime to a different virtual address than the
+ address it was linked at.
+ Since RISCV uses the RELA relocation format, this requires a
+ relocation pass at runtime even if the kernel is loaded at the
+ same address it was linked at.
+
+ If unsure, say N.
+
endmenu # "Kernel features"
menu "Boot options"
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 82153960ac00..97c34136b027 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -7,9 +7,12 @@
#
OBJCOPYFLAGS := -O binary
-LDFLAGS_vmlinux :=
+ifeq ($(CONFIG_RELOCATABLE),y)
+ LDFLAGS_vmlinux += -shared -Bsymbolic -z notext -z norelro
+ KBUILD_CFLAGS += -fPIE
+endif
ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
- LDFLAGS_vmlinux := --no-relax
+ LDFLAGS_vmlinux += --no-relax
KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
CC_FLAGS_FTRACE := -fpatchable-function-entry=8
endif
diff --git a/arch/riscv/kernel/efi-header.S b/arch/riscv/kernel/efi-header.S
index 8e733aa48ba6..f7ee09c4f12d 100644
--- a/arch/riscv/kernel/efi-header.S
+++ b/arch/riscv/kernel/efi-header.S
@@ -33,7 +33,7 @@ optional_header:
.byte 0x02 // MajorLinkerVersion
.byte 0x14 // MinorLinkerVersion
.long __pecoff_text_end - efi_header_end // SizeOfCode
- .long __pecoff_data_virt_size // SizeOfInitializedData
+ .long __pecoff_data_virt_end - __pecoff_text_end // SizeOfInitializedData
.long 0 // SizeOfUninitializedData
.long __efistub_efi_pe_entry - _start // AddressOfEntryPoint
.long efi_header_end - _start // BaseOfCode
@@ -91,9 +91,9 @@ section_table:
IMAGE_SCN_MEM_EXECUTE // Characteristics
.ascii ".data\0\0\0"
- .long __pecoff_data_virt_size // VirtualSize
+ .long __pecoff_data_virt_end - __pecoff_text_end // VirtualSize
.long __pecoff_text_end - _start // VirtualAddress
- .long __pecoff_data_raw_size // SizeOfRawData
+ .long __pecoff_data_raw_end - __pecoff_text_end // SizeOfRawData
.long __pecoff_text_end - _start // PointerToRawData
.long 0 // PointerToRelocations
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index 4e6c88aa4d87..8be2de3be08c 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -122,9 +122,15 @@ SECTIONS
*(.sdata*)
}
+ .rela.dyn : ALIGN(8) {
+ __rela_dyn_start = .;
+ *(.rela .rela*)
+ __rela_dyn_end = .;
+ }
+
#ifdef CONFIG_EFI
.pecoff_edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGNMENT); }
- __pecoff_data_raw_size = ABSOLUTE(. - __pecoff_text_end);
+ __pecoff_data_raw_end = ABSOLUTE(.);
#endif
/* End of data section */
@@ -134,7 +140,7 @@ SECTIONS
#ifdef CONFIG_EFI
. = ALIGN(PECOFF_SECTION_ALIGNMENT);
- __pecoff_data_virt_size = ABSOLUTE(. - __pecoff_text_end);
+ __pecoff_data_virt_end = ABSOLUTE(.);
#endif
_end = .;
diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index 2ac177c05352..b85e9e82f082 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -1,6 +1,10 @@
# SPDX-License-Identifier: GPL-2.0-only
CFLAGS_init.o := -mcmodel=medany
+ifdef CONFIG_RELOCATABLE
+CFLAGS_init.o += -fno-pie
+endif
+
ifdef CONFIG_FTRACE
CFLAGS_REMOVE_init.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_cacheflush.o = $(CC_FLAGS_FTRACE)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 7f01c2e56efe..3862696c2ac9 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -20,6 +20,9 @@
#include <linux/dma-map-ops.h>
#include <linux/crash_dump.h>
#include <linux/hugetlb.h>
+#ifdef CONFIG_RELOCATABLE
+#include <linux/elf.h>
+#endif
#include <asm/fixmap.h>
#include <asm/tlbflush.h>
@@ -146,7 +149,7 @@ static void __init print_vm_layout(void)
print_ml("kasan", KASAN_SHADOW_START, KASAN_SHADOW_END);
#endif
- print_ml("kernel", (unsigned long)KERNEL_LINK_ADDR,
+ print_ml("kernel", (unsigned long)kernel_map.virt_addr,
(unsigned long)ADDRESS_SPACE_END);
}
}
@@ -854,6 +857,44 @@ static __init void set_satp_mode(uintptr_t dtb_pa)
#error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
#endif
+#ifdef CONFIG_RELOCATABLE
+extern unsigned long __rela_dyn_start, __rela_dyn_end;
+
+static void __init relocate_kernel(void)
+{
+ Elf64_Rela *rela = (Elf64_Rela *)&__rela_dyn_start;
+ /*
+ * This holds the offset between the linked virtual address and the
+ * relocated virtual address.
+ */
+ uintptr_t reloc_offset = kernel_map.virt_addr - KERNEL_LINK_ADDR;
+ /*
+ * This holds the offset between kernel linked virtual address and
+ * physical address.
+ */
+ uintptr_t va_kernel_link_pa_offset = KERNEL_LINK_ADDR - kernel_map.phys_addr;
+
+ for ( ; rela < (Elf64_Rela *)&__rela_dyn_end; rela++) {
+ Elf64_Addr addr = (rela->r_offset - va_kernel_link_pa_offset);
+ Elf64_Addr relocated_addr = rela->r_addend;
+
+ if (rela->r_info != R_RISCV_RELATIVE)
+ continue;
+
+ /*
+ * Make sure to not relocate vdso symbols like rt_sigreturn
+ * which are linked from the address 0 in vmlinux since
+ * vdso symbol addresses are actually used as an offset from
+ * mm->context.vdso in VDSO_OFFSET macro.
+ */
+ if (relocated_addr >= KERNEL_LINK_ADDR)
+ relocated_addr += reloc_offset;
+
+ *(Elf64_Addr *)addr = relocated_addr;
+ }
+}
+#endif /* CONFIG_RELOCATABLE */
+
#ifdef CONFIG_XIP_KERNEL
static void __init create_kernel_page_table(pgd_t *pgdir,
__always_unused bool early)
@@ -1039,6 +1080,17 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
BUG_ON((kernel_map.virt_addr + kernel_map.size) > ADDRESS_SPACE_END - SZ_4K);
#endif
+#ifdef CONFIG_RELOCATABLE
+ /*
+ * Early page table uses only one PUD, which makes it possible
+ * to map PUD_SIZE aligned on PUD_SIZE: if the relocation offset
+ * makes the kernel cross over a PUD_SIZE boundary, raise a bug
+ * since a part of the kernel would not get mapped.
+ */
+ BUG_ON(PUD_SIZE - (kernel_map.virt_addr & (PUD_SIZE - 1)) < kernel_map.size);
+ relocate_kernel();
+#endif
+
apply_early_boot_alternatives();
pt_ops_set_early();
--
2.37.2
From: Alexandre Ghiti <[email protected]>
Relocating the kernel at runtime is done very early in the boot process,
so it is not convenient to check for relocations there and react in case
a relocation was not expected.
The powerpc architecture has a script that checks at compile time for
such unexpected relocations: extract the common logic to scripts/ so
that other architectures can take advantage of it.
Signed-off-by: Alexandre Ghiti <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Acked-by: Michael Ellerman <[email protected]> (powerpc)
---
arch/powerpc/tools/relocs_check.sh | 18 ++----------------
scripts/relocs_check.sh | 20 ++++++++++++++++++++
2 files changed, 22 insertions(+), 16 deletions(-)
create mode 100755 scripts/relocs_check.sh
diff --git a/arch/powerpc/tools/relocs_check.sh b/arch/powerpc/tools/relocs_check.sh
index 63792af00417..6b350e75014c 100755
--- a/arch/powerpc/tools/relocs_check.sh
+++ b/arch/powerpc/tools/relocs_check.sh
@@ -15,21 +15,8 @@ if [ $# -lt 3 ]; then
exit 1
fi
-# Have Kbuild supply the path to objdump and nm so we handle cross compilation.
-objdump="$1"
-nm="$2"
-vmlinux="$3"
-
-# Remove from the bad relocations those that match an undefined weak symbol
-# which will result in an absolute relocation to 0.
-# Weak unresolved symbols are of that form in nm output:
-# " w _binary__btf_vmlinux_bin_end"
-undef_weak_symbols=$($nm "$vmlinux" | awk '$1 ~ /w/ { print $2 }')
-
bad_relocs=$(
-$objdump -R "$vmlinux" |
- # Only look at relocation lines.
- grep -E '\<R_' |
+${srctree}/scripts/relocs_check.sh "$@" |
# These relocations are okay
# On PPC64:
# R_PPC64_RELATIVE, R_PPC64_NONE
@@ -44,8 +31,7 @@ R_PPC_ADDR16_LO
R_PPC_ADDR16_HI
R_PPC_ADDR16_HA
R_PPC_RELATIVE
-R_PPC_NONE' |
- ([ "$undef_weak_symbols" ] && grep -F -w -v "$undef_weak_symbols" || cat)
+R_PPC_NONE'
)
if [ -z "$bad_relocs" ]; then
diff --git a/scripts/relocs_check.sh b/scripts/relocs_check.sh
new file mode 100755
index 000000000000..137c660499f3
--- /dev/null
+++ b/scripts/relocs_check.sh
@@ -0,0 +1,20 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+# Get a list of all the relocations, remove from it the relocations
+# that are known to be legitimate and return this list to arch specific
+# script that will look for suspicious relocations.
+
+objdump="$1"
+nm="$2"
+vmlinux="$3"
+
+# Remove from the possible bad relocations those that match an undefined
+# weak symbol which will result in an absolute relocation to 0.
+# Weak unresolved symbols are of that form in nm output:
+# " w _binary__btf_vmlinux_bin_end"
+undef_weak_symbols=$($nm "$vmlinux" | awk '$1 ~ /w/ { print $2 }')
+
+$objdump -R "$vmlinux" |
+ grep -E '\<R_' |
+ ([ "$undef_weak_symbols" ] && grep -F -w -v "$undef_weak_symbols" || cat)
--
2.37.2
From: Alexandre Ghiti <[email protected]>
Relocating the kernel at runtime is done very early in the boot process,
so it is not convenient to check for relocations there and react in case
a relocation was not expected.
The script in scripts/ extracts the relocations from vmlinux; use it at
postlink to check them.
Signed-off-by: Alexandre Ghiti <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
---
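The filtering done by the two shell scripts can be illustrated with a small C
sketch. It is not part of the patch: count_bad_relocs() is a hypothetical
helper that looks only at the second column of `objdump -R` style text,
whereas the scripts match whole lines with grep, and it does not reproduce
the filtering of undefined weak symbols.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the lines of `objdump -R` style text whose second column names a
 * relocation type other than `allowed`. Lines whose second column does not
 * start with "R_" (headers, blank lines) are ignored, and so are lines
 * longer than the local buffer.
 */
static int count_bad_relocs(const char *dump, const char *allowed)
{
	int bad = 0;

	assert(dump && allowed);

	while (*dump) {
		size_t len = strcspn(dump, "\n");
		char line[256];
		char type[64];

		if (len < sizeof(line)) {
			memcpy(line, dump, len);
			line[len] = '\0';

			/* Skip the OFFSET column, read the TYPE column. */
			if (sscanf(line, "%*s %63s", type) == 1 &&
			    strncmp(type, "R_", 2) == 0 &&
			    strcmp(type, allowed) != 0)
				bad++;
		}

		dump += len;
		if (*dump == '\n')
			dump++;
	}

	return bad;
}
```

Calling count_bad_relocs(text, "R_RISCV_RELATIVE") on captured objdump output
gives the number that the riscv script would report in its warning, under
the simplifications listed above.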
arch/riscv/Makefile.postlink | 36 ++++++++++++++++++++++++++++++++
arch/riscv/tools/relocs_check.sh | 26 +++++++++++++++++++++++
2 files changed, 62 insertions(+)
create mode 100644 arch/riscv/Makefile.postlink
create mode 100755 arch/riscv/tools/relocs_check.sh
diff --git a/arch/riscv/Makefile.postlink b/arch/riscv/Makefile.postlink
new file mode 100644
index 000000000000..bf2b2bca1845
--- /dev/null
+++ b/arch/riscv/Makefile.postlink
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0
+# ===========================================================================
+# Post-link riscv pass
+# ===========================================================================
+#
+# Check that vmlinux relocations look sane
+
+PHONY := __archpost
+__archpost:
+
+-include include/config/auto.conf
+include scripts/Kbuild.include
+
+quiet_cmd_relocs_check = CHKREL $@
+cmd_relocs_check = \
+ $(CONFIG_SHELL) $(srctree)/arch/riscv/tools/relocs_check.sh "$(OBJDUMP)" "$(NM)" "$@"
+
+# `@true` prevents complaint when there is nothing to be done
+
+vmlinux: FORCE
+ @true
+ifdef CONFIG_RELOCATABLE
+ $(call if_changed,relocs_check)
+endif
+
+%.ko: FORCE
+ @true
+
+clean:
+ @true
+
+PHONY += FORCE clean
+
+FORCE:
+
+.PHONY: $(PHONY)
diff --git a/arch/riscv/tools/relocs_check.sh b/arch/riscv/tools/relocs_check.sh
new file mode 100755
index 000000000000..baeb2e7b2290
--- /dev/null
+++ b/arch/riscv/tools/relocs_check.sh
@@ -0,0 +1,26 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-or-later
+# Based on powerpc relocs_check.sh
+
+# This script checks the relocations of a vmlinux for "suspicious"
+# relocations.
+
+if [ $# -lt 3 ]; then
+ echo "$0 [path to objdump] [path to nm] [path to vmlinux]" 1>&2
+ exit 1
+fi
+
+bad_relocs=$(
+${srctree}/scripts/relocs_check.sh "$@" |
+ # These relocations are okay
+ # R_RISCV_RELATIVE
+ grep -F -w -v 'R_RISCV_RELATIVE'
+)
+
+if [ -z "$bad_relocs" ]; then
+ exit 0
+fi
+
+num_bad=$(echo "$bad_relocs" | wc -l)
+echo "WARNING: $num_bad bad relocations"
+echo "$bad_relocs"
--
2.37.2
Alexandre Ghiti <[email protected]> writes:
> After multiple attempts, this patchset is now based on the fact that the
> 64b kernel mapping was moved outside the linear mapping.
>
> The first patch allows to build relocatable kernels but is not selected
> by default. That patch is a requirement for KASLR.
> The second and third patches take advantage of an already existing powerpc
> script that checks relocations at compile-time, and uses it for riscv.
>
> This patchset is rebased on top of:
>
> RISC-V kasan rework (https://lore.kernel.org/lkml/Y6TTvku%2FyuSjm42j@spud/T/)
> riscv: Use PUD/P4D/PGD pages for the linear mapping (https://lore.kernel.org/lkml/20230125114229.hrhsyw4aegrnmoau@orel/T/)
> riscv: Allow to downgrade paging mode from the command line (https://lore.kernel.org/lkml/CAHVXubjeSMvfTPnvrnYRupOGx6+vUvUGfRS3piTeo=TH2cHKNg@mail.gmail.com/)
> base-commit-tag: v6.2-rc7
>
> Changes in v8:
> * Fix UEFI boot by moving rela.dyn section into the data so that PE/COFF
> loader actually copies the relocations too
> * Fix check that used PGDIR instead of PUD which was not correct
> for sv48 and sv57
> * Fix PE/COFF header data size definition as it led to size of 0
>
> Changes in v7:
> * Rebase on top of v5.15
> * Fix LDFLAGS_vmlinux which was overriden when CONFIG_DYNAMIC_FTRACE was
> set
> * Make relocate_kernel static
> * Add Ack from Michael
>
> Changes in v6:
> * Remove the kernel move to vmalloc zone
> * Rebased on top of for-next
> * Remove relocatable property from 32b kernel as the kernel is mapped in
> the linear mapping and would then need to be copied physically too
> * CONFIG_RELOCATABLE depends on !XIP_KERNEL
> * Remove Reviewed-by from first patch as it changed a bit
>
> Changes in v5:
> * Add "static __init" to create_kernel_page_table function as reported by
> Kbuild test robot
> * Add reviewed-by from Zong
> * Rebase onto v5.7
>
> Changes in v4:
> * Fix BPF region that overlapped with kernel's as suggested by Zong
> * Fix end of module region that could be larger than 2GB as suggested by Zong
> * Fix the size of the vm area reserved for the kernel as we could lose
> PMD_SIZE if the size was already aligned on PMD_SIZE
> * Split compile time relocations check patch into 2 patches as suggested by Anup
> * Applied Reviewed-by from Zong and Anup
>
> Changes in v3:
> * Move kernel mapping to vmalloc
>
> Changes in v2:
> * Make RELOCATABLE depend on MMU as suggested by Anup
> * Rename kernel_load_addr into kernel_virt_addr as suggested by Anup
> * Use __pa_symbol instead of __pa, as suggested by Zong
> * Rebased on top of v5.6-rc3
> * Tested with sv48 patchset
> * Add Reviewed/Tested-by from Zong and Anup
>
> Alexandre Ghiti (3):
> riscv: Introduce CONFIG_RELOCATABLE
> powerpc: Move script to check relocations at compile time in scripts/
> riscv: Check relocations at compile time
I'm getting issues booting via UEFI/efi-stub with this, because the PE
header is messed up.
from arch/riscv/kernel/efi-header.S:
| ...
| extra_header_fields:
| .quad 0 // ImageBase
| .long PECOFF_SECTION_ALIGNMENT // SectionAlignment
| .long PECOFF_FILE_ALIGNMENT // FileAlignment
| ...
The PECOFF_* values are taken from the linker script and end up in the
relocation section. When U-Boot tries to load the image, the alignment is
zero and the loader breaks.
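To make the failure mode concrete, here is a rough sketch of the kind of
sanity check a PE loader could apply to those two fields. It is an
assumption for illustration only: pe_alignment_ok() is a hypothetical helper,
not U-Boot code, and it encodes just the PE/COFF rules that FileAlignment
should be a power of two and SectionAlignment must not be smaller than it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v is a non-zero power of two. */
static bool is_power_of_two(uint32_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Hypothetical loader-side check of the SectionAlignment and
 * FileAlignment fields read from the PE optional header.
 */
static bool pe_alignment_ok(uint32_t section_alignment,
			    uint32_t file_alignment)
{
	return is_power_of_two(section_alignment) &&
	       is_power_of_two(file_alignment) &&
	       section_alignment >= file_alignment;
}
```

With both fields read back as zero, as in the broken image, such a check
fails immediately.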
Björn
+cc linux-kbuild, llvm, Nathan, Nick
On 2/15/23 15:36, Alexandre Ghiti wrote:
> From: Alexandre Ghiti <[email protected]>
>
> This config allows to compile 64b kernel as PIE and to relocate it at
> any virtual address at runtime: this paves the way to KASLR.
> Runtime relocation is possible since relocation metadata are embedded into
> the kernel.
>
> Note that relocating at runtime introduces an overhead even if the
> kernel is loaded at the same address it was linked at and that the compiler
> options are those used in arm64 which uses the same RELA relocation
> format.
>
> Signed-off-by: Alexandre Ghiti <[email protected]>
> ---
> arch/riscv/Kconfig | 14 +++++++++
> arch/riscv/Makefile | 7 +++--
> arch/riscv/kernel/efi-header.S | 6 ++--
> arch/riscv/kernel/vmlinux.lds.S | 10 ++++--
> arch/riscv/mm/Makefile | 4 +++
> arch/riscv/mm/init.c | 54 ++++++++++++++++++++++++++++++++-
> 6 files changed, 87 insertions(+), 8 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index e2b656043abf..e0ee7ce4b2e3 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -544,6 +544,20 @@ config COMPAT
>
> If you want to execute 32-bit userspace applications, say Y.
>
> +config RELOCATABLE
> + bool "Build a relocatable kernel"
> + depends on MMU && 64BIT && !XIP_KERNEL
> + help
> + This builds a kernel as a Position Independent Executable (PIE),
> + which retains all relocation metadata required to relocate the
> + kernel binary at runtime to a different virtual address than the
> + address it was linked at.
> + Since RISCV uses the RELA relocation format, this requires a
> + relocation pass at runtime even if the kernel is loaded at the
> + same address it was linked at.
> +
> + If unsure, say N.
> +
> endmenu # "Kernel features"
>
> menu "Boot options"
> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> index 82153960ac00..97c34136b027 100644
> --- a/arch/riscv/Makefile
> +++ b/arch/riscv/Makefile
> @@ -7,9 +7,12 @@
> #
>
> OBJCOPYFLAGS := -O binary
> -LDFLAGS_vmlinux :=
> +ifeq ($(CONFIG_RELOCATABLE),y)
> + LDFLAGS_vmlinux += -shared -Bsymbolic -z notext -z norelro
> + KBUILD_CFLAGS += -fPIE
> +endif
> ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
> - LDFLAGS_vmlinux := --no-relax
> + LDFLAGS_vmlinux += --no-relax
> KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> CC_FLAGS_FTRACE := -fpatchable-function-entry=8
> endif
> diff --git a/arch/riscv/kernel/efi-header.S b/arch/riscv/kernel/efi-header.S
> index 8e733aa48ba6..f7ee09c4f12d 100644
> --- a/arch/riscv/kernel/efi-header.S
> +++ b/arch/riscv/kernel/efi-header.S
> @@ -33,7 +33,7 @@ optional_header:
> .byte 0x02 // MajorLinkerVersion
> .byte 0x14 // MinorLinkerVersion
> .long __pecoff_text_end - efi_header_end // SizeOfCode
> - .long __pecoff_data_virt_size // SizeOfInitializedData
> + .long __pecoff_data_virt_end - __pecoff_text_end // SizeOfInitializedData
> .long 0 // SizeOfUninitializedData
> .long __efistub_efi_pe_entry - _start // AddressOfEntryPoint
> .long efi_header_end - _start // BaseOfCode
> @@ -91,9 +91,9 @@ section_table:
> IMAGE_SCN_MEM_EXECUTE // Characteristics
>
> .ascii ".data\0\0\0"
> - .long __pecoff_data_virt_size // VirtualSize
> + .long __pecoff_data_virt_end - __pecoff_text_end // VirtualSize
> .long __pecoff_text_end - _start // VirtualAddress
> - .long __pecoff_data_raw_size // SizeOfRawData
> + .long __pecoff_data_raw_end - __pecoff_text_end // SizeOfRawData
> .long __pecoff_text_end - _start // PointerToRawData
>
> .long 0 // PointerToRelocations
> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> index 4e6c88aa4d87..8be2de3be08c 100644
> --- a/arch/riscv/kernel/vmlinux.lds.S
> +++ b/arch/riscv/kernel/vmlinux.lds.S
> @@ -122,9 +122,15 @@ SECTIONS
> *(.sdata*)
> }
>
> + .rela.dyn : ALIGN(8) {
> + __rela_dyn_start = .;
> + *(.rela .rela*)
> + __rela_dyn_end = .;
> + }
> +
So I realized those relocations would be better in the init section so
we can get rid of them at some point. So I tried the following:
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index 7ac215467fd5..6111023a89ef 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -93,6 +93,12 @@ SECTIONS
*(.rel.dyn*)
}
+ .rela.dyn : ALIGN(8) {
+ __rela_dyn_start = .;
+ *(.rela .rela*)
+ __rela_dyn_end = .;
+ }
+
__init_data_end = .;
. = ALIGN(8);
@@ -119,12 +125,6 @@ SECTIONS
*(.sdata*)
}
- .rela.dyn : ALIGN(8) {
- __rela_dyn_start = .;
- *(.rela .rela*)
- __rela_dyn_end = .;
- }
-
#ifdef CONFIG_EFI
.pecoff_edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGNMENT); }
__pecoff_data_raw_end = ABSOLUTE(.);
But then all the relocations in vmlinux end up being null:
$ riscv64-linux-gnu-objdump -R vmlinux
vmlinux: file format elf64-littleriscv
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
0000000000000000 R_RISCV_NONE *ABS*
0000000000000000 R_RISCV_NONE *ABS*
....
I also noticed that re-linking vmlinux with the same command right
after works (i.e., the relocations are now valid):
$ riscv64-linux-gnu-objdump -R vmlinux
vmlinux: file format elf64-littleriscv
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
ffffffff82600718 R_RISCV_RELATIVE *ABS*-0x000000007d9ff8e8
ffffffff82600720 R_RISCV_RELATIVE *ABS*-0x000000007d9ff8e8
...
Below is the command used to generate this working vmlinux:
riscv64-unknown-linux-gnu-ld -melf64lriscv -z noexecstack
--no-warn-rwx-segments -shared -Bsymbolic -z notext -z norelro
--no-relax --build-id=sha1 --script=./arch/riscv/kernel/vmlinux.lds
-Map=vmlinux.map -o vmlinux --whole-archive vmlinux.a .vmlinux.export.o
init/version-timestamp.o --no-whole-archive --start-group
./drivers/firmware/efi/libstub/lib.a --end-group .tmp_vmlinux.kallsyms3.o
I tried a lot of things but still struggle to understand what is going
on. Does anyone have any idea? FYI, the same problem happens with LLVM.
Thanks,
Alex
> #ifdef CONFIG_EFI
> .pecoff_edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGNMENT); }
> - __pecoff_data_raw_size = ABSOLUTE(. - __pecoff_text_end);
> + __pecoff_data_raw_end = ABSOLUTE(.);
> #endif
>
> /* End of data section */
> @@ -134,7 +140,7 @@ SECTIONS
>
> #ifdef CONFIG_EFI
> . = ALIGN(PECOFF_SECTION_ALIGNMENT);
> - __pecoff_data_virt_size = ABSOLUTE(. - __pecoff_text_end);
> + __pecoff_data_virt_end = ABSOLUTE(.);
> #endif
> _end = .;
>
> diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
> index 2ac177c05352..b85e9e82f082 100644
> --- a/arch/riscv/mm/Makefile
> +++ b/arch/riscv/mm/Makefile
> @@ -1,6 +1,10 @@
> # SPDX-License-Identifier: GPL-2.0-only
>
> CFLAGS_init.o := -mcmodel=medany
> +ifdef CONFIG_RELOCATABLE
> +CFLAGS_init.o += -fno-pie
> +endif
> +
> ifdef CONFIG_FTRACE
> CFLAGS_REMOVE_init.o = $(CC_FLAGS_FTRACE)
> CFLAGS_REMOVE_cacheflush.o = $(CC_FLAGS_FTRACE)
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 7f01c2e56efe..3862696c2ac9 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -20,6 +20,9 @@
> #include <linux/dma-map-ops.h>
> #include <linux/crash_dump.h>
> #include <linux/hugetlb.h>
> +#ifdef CONFIG_RELOCATABLE
> +#include <linux/elf.h>
> +#endif
>
> #include <asm/fixmap.h>
> #include <asm/tlbflush.h>
> @@ -146,7 +149,7 @@ static void __init print_vm_layout(void)
> print_ml("kasan", KASAN_SHADOW_START, KASAN_SHADOW_END);
> #endif
>
> - print_ml("kernel", (unsigned long)KERNEL_LINK_ADDR,
> + print_ml("kernel", (unsigned long)kernel_map.virt_addr,
> (unsigned long)ADDRESS_SPACE_END);
> }
> }
> @@ -854,6 +857,44 @@ static __init void set_satp_mode(uintptr_t dtb_pa)
> #error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
> #endif
>
> +#ifdef CONFIG_RELOCATABLE
> +extern unsigned long __rela_dyn_start, __rela_dyn_end;
> +
> +static void __init relocate_kernel(void)
> +{
> + Elf64_Rela *rela = (Elf64_Rela *)&__rela_dyn_start;
> + /*
> + * This holds the offset between the linked virtual address and the
> + * relocated virtual address.
> + */
> + uintptr_t reloc_offset = kernel_map.virt_addr - KERNEL_LINK_ADDR;
> + /*
> + * This holds the offset between kernel linked virtual address and
> + * physical address.
> + */
> + uintptr_t va_kernel_link_pa_offset = KERNEL_LINK_ADDR - kernel_map.phys_addr;
> +
> + for ( ; rela < (Elf64_Rela *)&__rela_dyn_end; rela++) {
> + Elf64_Addr addr = (rela->r_offset - va_kernel_link_pa_offset);
> + Elf64_Addr relocated_addr = rela->r_addend;
> +
> + if (rela->r_info != R_RISCV_RELATIVE)
> + continue;
> +
> + /*
> + * Make sure to not relocate vdso symbols like rt_sigreturn
> + * which are linked from the address 0 in vmlinux since
> + * vdso symbol addresses are actually used as an offset from
> + * mm->context.vdso in VDSO_OFFSET macro.
> + */
> + if (relocated_addr >= KERNEL_LINK_ADDR)
> + relocated_addr += reloc_offset;
> +
> + *(Elf64_Addr *)addr = relocated_addr;
> + }
> +}
> +#endif /* CONFIG_RELOCATABLE */
> +
> #ifdef CONFIG_XIP_KERNEL
> static void __init create_kernel_page_table(pgd_t *pgdir,
> __always_unused bool early)
> @@ -1039,6 +1080,17 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> BUG_ON((kernel_map.virt_addr + kernel_map.size) > ADDRESS_SPACE_END - SZ_4K);
> #endif
>
> +#ifdef CONFIG_RELOCATABLE
> + /*
> + * Early page table uses only one PUD, which makes it possible
> + * to map PUD_SIZE aligned on PUD_SIZE: if the relocation offset
> + * makes the kernel cross over a PUD_SIZE boundary, raise a bug
> + * since a part of the kernel would not get mapped.
> + */
> + BUG_ON(PUD_SIZE - (kernel_map.virt_addr & (PUD_SIZE - 1)) < kernel_map.size);
> + relocate_kernel();
> +#endif
> +
> apply_early_boot_alternatives();
> pt_ops_set_early();
>
Alexandre Ghiti <[email protected]> writes:
> +cc linux-kbuild, llvm, Nathan, Nick
>
> On 2/15/23 15:36, Alexandre Ghiti wrote:
>> From: Alexandre Ghiti <[email protected]>
>>
>> This config allows to compile 64b kernel as PIE and to relocate it at
>> any virtual address at runtime: this paves the way to KASLR.
>> Runtime relocation is possible since relocation metadata are embedded into
>> the kernel.
>>
>> Note that relocating at runtime introduces an overhead even if the
>> kernel is loaded at the same address it was linked at and that the compiler
>> options are those used in arm64 which uses the same RELA relocation
>> format.
>>
>> Signed-off-by: Alexandre Ghiti <[email protected]>
>> ---
>> arch/riscv/Kconfig | 14 +++++++++
>> arch/riscv/Makefile | 7 +++--
>> arch/riscv/kernel/efi-header.S | 6 ++--
>> arch/riscv/kernel/vmlinux.lds.S | 10 ++++--
>> arch/riscv/mm/Makefile | 4 +++
>> arch/riscv/mm/init.c | 54 ++++++++++++++++++++++++++++++++-
>> 6 files changed, 87 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>> index e2b656043abf..e0ee7ce4b2e3 100644
>> --- a/arch/riscv/Kconfig
>> +++ b/arch/riscv/Kconfig
>> @@ -544,6 +544,20 @@ config COMPAT
>>
>> If you want to execute 32-bit userspace applications, say Y.
>>
>> +config RELOCATABLE
>> + bool "Build a relocatable kernel"
>> + depends on MMU && 64BIT && !XIP_KERNEL
>> + help
>> + This builds a kernel as a Position Independent Executable (PIE),
>> + which retains all relocation metadata required to relocate the
>> + kernel binary at runtime to a different virtual address than the
>> + address it was linked at.
>> + Since RISCV uses the RELA relocation format, this requires a
>> + relocation pass at runtime even if the kernel is loaded at the
>> + same address it was linked at.
>> +
>> + If unsure, say N.
>> +
>> endmenu # "Kernel features"
>>
>> menu "Boot options"
>> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
>> index 82153960ac00..97c34136b027 100644
>> --- a/arch/riscv/Makefile
>> +++ b/arch/riscv/Makefile
>> @@ -7,9 +7,12 @@
>> #
>>
>> OBJCOPYFLAGS := -O binary
>> -LDFLAGS_vmlinux :=
>> +ifeq ($(CONFIG_RELOCATABLE),y)
>> + LDFLAGS_vmlinux += -shared -Bsymbolic -z notext -z norelro
>> + KBUILD_CFLAGS += -fPIE
>> +endif
>> ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
>> - LDFLAGS_vmlinux := --no-relax
>> + LDFLAGS_vmlinux += --no-relax
>> KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
>> CC_FLAGS_FTRACE := -fpatchable-function-entry=8
>> endif
>> diff --git a/arch/riscv/kernel/efi-header.S b/arch/riscv/kernel/efi-header.S
>> index 8e733aa48ba6..f7ee09c4f12d 100644
>> --- a/arch/riscv/kernel/efi-header.S
>> +++ b/arch/riscv/kernel/efi-header.S
>> @@ -33,7 +33,7 @@ optional_header:
>> .byte 0x02 // MajorLinkerVersion
>> .byte 0x14 // MinorLinkerVersion
>> .long __pecoff_text_end - efi_header_end // SizeOfCode
>> - .long __pecoff_data_virt_size // SizeOfInitializedData
>> + .long __pecoff_data_virt_end - __pecoff_text_end // SizeOfInitializedData
>> .long 0 // SizeOfUninitializedData
>> .long __efistub_efi_pe_entry - _start // AddressOfEntryPoint
>> .long efi_header_end - _start // BaseOfCode
>> @@ -91,9 +91,9 @@ section_table:
>> IMAGE_SCN_MEM_EXECUTE // Characteristics
>>
>> .ascii ".data\0\0\0"
>> - .long __pecoff_data_virt_size // VirtualSize
>> + .long __pecoff_data_virt_end - __pecoff_text_end // VirtualSize
>> .long __pecoff_text_end - _start // VirtualAddress
>> - .long __pecoff_data_raw_size // SizeOfRawData
>> + .long __pecoff_data_raw_end - __pecoff_text_end // SizeOfRawData
>> .long __pecoff_text_end - _start // PointerToRawData
>>
>> .long 0 // PointerToRelocations
>> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
>> index 4e6c88aa4d87..8be2de3be08c 100644
>> --- a/arch/riscv/kernel/vmlinux.lds.S
>> +++ b/arch/riscv/kernel/vmlinux.lds.S
>> @@ -122,9 +122,15 @@ SECTIONS
>> *(.sdata*)
>> }
>>
>> + .rela.dyn : ALIGN(8) {
>> + __rela_dyn_start = .;
>> + *(.rela .rela*)
>> + __rela_dyn_end = .;
>> + }
>> +
>
>
> So I realized those relocations would be better in the init section so
> we can get rid of them at some point. So I tried the following:
>
> diff --git a/arch/riscv/kernel/vmlinux.lds.S
> b/arch/riscv/kernel/vmlinux.lds.S
> index 7ac215467fd5..6111023a89ef 100644
> --- a/arch/riscv/kernel/vmlinux.lds.S
> +++ b/arch/riscv/kernel/vmlinux.lds.S
> @@ -93,6 +93,12 @@ SECTIONS
> *(.rel.dyn*)
> }
>
> + .rela.dyn : ALIGN(8) {
> + __rela_dyn_start = .;
> + *(.rela .rela*)
> + __rela_dyn_end = .;
> + }
> +
> __init_data_end = .;
>
> . = ALIGN(8);
> @@ -119,12 +125,6 @@ SECTIONS
> *(.sdata*)
> }
>
> - .rela.dyn : ALIGN(8) {
> - __rela_dyn_start = .;
> - *(.rela .rela*)
> - __rela_dyn_end = .;
> - }
> -
> #ifdef CONFIG_EFI
> .pecoff_edata_padding : { BYTE(0); . =
> ALIGN(PECOFF_FILE_ALIGNMENT); }
> __pecoff_data_raw_end = ABSOLUTE(.);
>
>
> But then all the relocations in vmlinux end up being null:
>
> vmlinux: file format elf64-littleriscv
>
> $ riscv64-linux-gnu-objdump -R vmlinux
>
> DYNAMIC RELOCATION RECORDS
> OFFSET TYPE VALUE
> 0000000000000000 R_RISCV_NONE *ABS*
> 0000000000000000 R_RISCV_NONE *ABS*
> ....
>
> I also noticed that re-linking vmlinux with the same command right
> after works (ie, the relocations are now valid):
>
> $ riscv64-linux-gnu-objdump -R vmlinux
>
> vmlinux: file format elf64-littleriscv
>
> DYNAMIC RELOCATION RECORDS
> OFFSET TYPE VALUE
> ffffffff82600718 R_RISCV_RELATIVE *ABS*-0x000000007d9ff8e8
> ffffffff82600720 R_RISCV_RELATIVE *ABS*-0x000000007d9ff8e8
> ...
>
> Below is the command used to generate this working vmlinux:
>
> riscv64-unknown-linux-gnu-ld -melf64lriscv -z noexecstack
> --no-warn-rwx-segments -shared -Bsymbolic -z notext -z norelro
> --no-relax --build-id=sha1 --script=./arch/riscv/kernel/vmlinux.lds
> -Map=vmlinux.map -o vmlinux --whole-archive vmlinux.a .vmlinux.export.o
> init/version-timestamp.o --no-whole-archive --start-group
> ./drivers/firmware/efi/libstub/lib.a --end-group .tmp_vmlinux.kallsyms3.o
>
> I tried a lot of things, but I struggle to understand, does anyone have
> any idea? FYI, the same problem happens with LLVM.
Don't ask me *why*, but adding --emit-relocs to your linker flags solves
"the NULL .rela.dyn" both for GCC and LLVM.
The downside is that you end up with a bunch of .rela cruft in your
vmlinux.
Björn
@linux-kbuild: Does anyone have an idea how to solve this?
Thanks!
On 2/22/23 13:29, Alexandre Ghiti wrote:
> +cc linux-kbuild, llvm, Nathan, Nick
>
> On 2/15/23 15:36, Alexandre Ghiti wrote:
>> From: Alexandre Ghiti <[email protected]>
>>
>> This config allows to compile 64b kernel as PIE and to relocate it at
>> any virtual address at runtime: this paves the way to KASLR.
>> Runtime relocation is possible since relocation metadata are embedded
>> into
>> the kernel.
>>
>> Note that relocating at runtime introduces an overhead even if the
>> kernel is loaded at the same address it was linked at and that the
>> compiler
>> options are those used in arm64 which uses the same RELA relocation
>> format.
>>
>> Signed-off-by: Alexandre Ghiti <[email protected]>
>> ---
>> arch/riscv/Kconfig | 14 +++++++++
>> arch/riscv/Makefile | 7 +++--
>> arch/riscv/kernel/efi-header.S | 6 ++--
>> arch/riscv/kernel/vmlinux.lds.S | 10 ++++--
>> arch/riscv/mm/Makefile | 4 +++
>> arch/riscv/mm/init.c | 54 ++++++++++++++++++++++++++++++++-
>> 6 files changed, 87 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>> index e2b656043abf..e0ee7ce4b2e3 100644
>> --- a/arch/riscv/Kconfig
>> +++ b/arch/riscv/Kconfig
>> @@ -544,6 +544,20 @@ config COMPAT
>> If you want to execute 32-bit userspace applications, say Y.
>> +config RELOCATABLE
>> + bool "Build a relocatable kernel"
>> + depends on MMU && 64BIT && !XIP_KERNEL
>> + help
>> + This builds a kernel as a Position Independent Executable
>> (PIE),
>> + which retains all relocation metadata required to relocate
>> the
>> + kernel binary at runtime to a different virtual address
>> than the
>> + address it was linked at.
>> + Since RISCV uses the RELA relocation format, this requires a
>> + relocation pass at runtime even if the kernel is loaded at
>> the
>> + same address it was linked at.
>> +
>> + If unsure, say N.
>> +
>> endmenu # "Kernel features"
>> menu "Boot options"
>> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
>> index 82153960ac00..97c34136b027 100644
>> --- a/arch/riscv/Makefile
>> +++ b/arch/riscv/Makefile
>> @@ -7,9 +7,12 @@
>> #
>> OBJCOPYFLAGS := -O binary
>> -LDFLAGS_vmlinux :=
>> +ifeq ($(CONFIG_RELOCATABLE),y)
>> + LDFLAGS_vmlinux += -shared -Bsymbolic -z notext -z norelro
>> + KBUILD_CFLAGS += -fPIE
>> +endif
>> ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
>> - LDFLAGS_vmlinux := --no-relax
>> + LDFLAGS_vmlinux += --no-relax
>> KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
>> CC_FLAGS_FTRACE := -fpatchable-function-entry=8
>> endif
>> diff --git a/arch/riscv/kernel/efi-header.S
>> b/arch/riscv/kernel/efi-header.S
>> index 8e733aa48ba6..f7ee09c4f12d 100644
>> --- a/arch/riscv/kernel/efi-header.S
>> +++ b/arch/riscv/kernel/efi-header.S
>> @@ -33,7 +33,7 @@ optional_header:
>> .byte 0x02 // MajorLinkerVersion
>> .byte 0x14 // MinorLinkerVersion
>> .long __pecoff_text_end - efi_header_end // SizeOfCode
>> - .long __pecoff_data_virt_size //
>> SizeOfInitializedData
>> + .long __pecoff_data_virt_end - __pecoff_text_end //
>> SizeOfInitializedData
>> .long 0 // SizeOfUninitializedData
>> .long __efistub_efi_pe_entry - _start //
>> AddressOfEntryPoint
>> .long efi_header_end - _start // BaseOfCode
>> @@ -91,9 +91,9 @@ section_table:
>> IMAGE_SCN_MEM_EXECUTE // Characteristics
>> .ascii ".data\0\0\0"
>> - .long __pecoff_data_virt_size // VirtualSize
>> + .long __pecoff_data_virt_end - __pecoff_text_end //
>> VirtualSize
>> .long __pecoff_text_end - _start // VirtualAddress
>> - .long __pecoff_data_raw_size // SizeOfRawData
>> + .long __pecoff_data_raw_end - __pecoff_text_end //
>> SizeOfRawData
>> .long __pecoff_text_end - _start // PointerToRawData
>> .long 0 // PointerToRelocations
>> diff --git a/arch/riscv/kernel/vmlinux.lds.S
>> b/arch/riscv/kernel/vmlinux.lds.S
>> index 4e6c88aa4d87..8be2de3be08c 100644
>> --- a/arch/riscv/kernel/vmlinux.lds.S
>> +++ b/arch/riscv/kernel/vmlinux.lds.S
>> @@ -122,9 +122,15 @@ SECTIONS
>> *(.sdata*)
>> }
>> + .rela.dyn : ALIGN(8) {
>> + __rela_dyn_start = .;
>> + *(.rela .rela*)
>> + __rela_dyn_end = .;
>> + }
>> +
>
>
> So I realized those relocations would be better in the init section so
> we can get rid of them at some point. So I tried the following:
>
> diff --git a/arch/riscv/kernel/vmlinux.lds.S
> b/arch/riscv/kernel/vmlinux.lds.S
> index 7ac215467fd5..6111023a89ef 100644
> --- a/arch/riscv/kernel/vmlinux.lds.S
> +++ b/arch/riscv/kernel/vmlinux.lds.S
> @@ -93,6 +93,12 @@ SECTIONS
> *(.rel.dyn*)
> }
>
> + .rela.dyn : ALIGN(8) {
> + __rela_dyn_start = .;
> + *(.rela .rela*)
> + __rela_dyn_end = .;
> + }
> +
> __init_data_end = .;
>
> . = ALIGN(8);
> @@ -119,12 +125,6 @@ SECTIONS
> *(.sdata*)
> }
>
> - .rela.dyn : ALIGN(8) {
> - __rela_dyn_start = .;
> - *(.rela .rela*)
> - __rela_dyn_end = .;
> - }
> -
> #ifdef CONFIG_EFI
> .pecoff_edata_padding : { BYTE(0); . =
> ALIGN(PECOFF_FILE_ALIGNMENT); }
> __pecoff_data_raw_end = ABSOLUTE(.);
>
>
> But then all the relocations in vmlinux end up being null:
>
> vmlinux: file format elf64-littleriscv
>
> $ riscv64-linux-gnu-objdump -R vmlinux
>
> DYNAMIC RELOCATION RECORDS
> OFFSET TYPE VALUE
> 0000000000000000 R_RISCV_NONE *ABS*
> 0000000000000000 R_RISCV_NONE *ABS*
> ....
>
> I also noticed that re-linking vmlinux with the same command right
> after works (ie, the relocations are now valid):
>
> $ riscv64-linux-gnu-objdump -R vmlinux
>
> vmlinux: file format elf64-littleriscv
>
> DYNAMIC RELOCATION RECORDS
> OFFSET TYPE VALUE
> ffffffff82600718 R_RISCV_RELATIVE *ABS*-0x000000007d9ff8e8
> ffffffff82600720 R_RISCV_RELATIVE *ABS*-0x000000007d9ff8e8
> ...
>
> Below is the command used to generate this working vmlinux:
>
> riscv64-unknown-linux-gnu-ld -melf64lriscv -z noexecstack
> --no-warn-rwx-segments -shared -Bsymbolic -z notext -z norelro
> --no-relax --build-id=sha1 --script=./arch/riscv/kernel/vmlinux.lds
> -Map=vmlinux.map -o vmlinux --whole-archive vmlinux.a
> .vmlinux.export.o init/version-timestamp.o --no-whole-archive
> --start-group ./drivers/firmware/efi/libstub/lib.a --end-group
> .tmp_vmlinux.kallsyms3.o
>
> I tried a lot of things, but I struggle to understand, does anyone
> have any idea? FYI, the same problem happens with LLVM.
>
> Thanks,
>
> Alex
>
>
>> #ifdef CONFIG_EFI
>> .pecoff_edata_padding : { BYTE(0); . =
>> ALIGN(PECOFF_FILE_ALIGNMENT); }
>> - __pecoff_data_raw_size = ABSOLUTE(. - __pecoff_text_end);
>> + __pecoff_data_raw_end = ABSOLUTE(.);
>> #endif
>> /* End of data section */
>> @@ -134,7 +140,7 @@ SECTIONS
>> #ifdef CONFIG_EFI
>> . = ALIGN(PECOFF_SECTION_ALIGNMENT);
>> - __pecoff_data_virt_size = ABSOLUTE(. - __pecoff_text_end);
>> + __pecoff_data_virt_end = ABSOLUTE(.);
>> #endif
>> _end = .;
>> diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
>> index 2ac177c05352..b85e9e82f082 100644
>> --- a/arch/riscv/mm/Makefile
>> +++ b/arch/riscv/mm/Makefile
>> @@ -1,6 +1,10 @@
>> # SPDX-License-Identifier: GPL-2.0-only
>> CFLAGS_init.o := -mcmodel=medany
>> +ifdef CONFIG_RELOCATABLE
>> +CFLAGS_init.o += -fno-pie
>> +endif
>> +
>> ifdef CONFIG_FTRACE
>> CFLAGS_REMOVE_init.o = $(CC_FLAGS_FTRACE)
>> CFLAGS_REMOVE_cacheflush.o = $(CC_FLAGS_FTRACE)
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index 7f01c2e56efe..3862696c2ac9 100644
>> --- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -20,6 +20,9 @@
>> #include <linux/dma-map-ops.h>
>> #include <linux/crash_dump.h>
>> #include <linux/hugetlb.h>
>> +#ifdef CONFIG_RELOCATABLE
>> +#include <linux/elf.h>
>> +#endif
>> #include <asm/fixmap.h>
>> #include <asm/tlbflush.h>
>> @@ -146,7 +149,7 @@ static void __init print_vm_layout(void)
>> print_ml("kasan", KASAN_SHADOW_START, KASAN_SHADOW_END);
>> #endif
>> - print_ml("kernel", (unsigned long)KERNEL_LINK_ADDR,
>> + print_ml("kernel", (unsigned long)kernel_map.virt_addr,
>> (unsigned long)ADDRESS_SPACE_END);
>> }
>> }
>> @@ -854,6 +857,44 @@ static __init void set_satp_mode(uintptr_t dtb_pa)
>> #error "setup_vm() is called from head.S before relocate so it
>> should not use absolute addressing."
>> #endif
>> +#ifdef CONFIG_RELOCATABLE
>> +extern unsigned long __rela_dyn_start, __rela_dyn_end;
>> +
>> +static void __init relocate_kernel(void)
>> +{
>> + Elf64_Rela *rela = (Elf64_Rela *)&__rela_dyn_start;
>> + /*
>> + * This holds the offset between the linked virtual address and the
>> + * relocated virtual address.
>> + */
>> + uintptr_t reloc_offset = kernel_map.virt_addr - KERNEL_LINK_ADDR;
>> + /*
>> + * This holds the offset between kernel linked virtual address and
>> + * physical address.
>> + */
>> + uintptr_t va_kernel_link_pa_offset = KERNEL_LINK_ADDR -
>> kernel_map.phys_addr;
>> +
>> + for ( ; rela < (Elf64_Rela *)&__rela_dyn_end; rela++) {
>> + Elf64_Addr addr = (rela->r_offset - va_kernel_link_pa_offset);
>> + Elf64_Addr relocated_addr = rela->r_addend;
>> +
>> + if (rela->r_info != R_RISCV_RELATIVE)
>> + continue;
>> +
>> + /*
>> + * Make sure to not relocate vdso symbols like rt_sigreturn
>> + * which are linked from the address 0 in vmlinux since
>> + * vdso symbol addresses are actually used as an offset from
>> + * mm->context.vdso in VDSO_OFFSET macro.
>> + */
>> + if (relocated_addr >= KERNEL_LINK_ADDR)
>> + relocated_addr += reloc_offset;
>> +
>> + *(Elf64_Addr *)addr = relocated_addr;
>> + }
>> +}
>> +#endif /* CONFIG_RELOCATABLE */
>> +
>> #ifdef CONFIG_XIP_KERNEL
>> static void __init create_kernel_page_table(pgd_t *pgdir,
>> __always_unused bool early)
>> @@ -1039,6 +1080,17 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>> BUG_ON((kernel_map.virt_addr + kernel_map.size) >
>> ADDRESS_SPACE_END - SZ_4K);
>> #endif
>> +#ifdef CONFIG_RELOCATABLE
>> + /*
>> + * Early page table uses only one PUD, which makes it possible
>> + * to map PUD_SIZE aligned on PUD_SIZE: if the relocation offset
>> + * makes the kernel cross over a PUD_SIZE boundary, raise a bug
>> + * since a part of the kernel would not get mapped.
>> + */
>> + BUG_ON(PUD_SIZE - (kernel_map.virt_addr & (PUD_SIZE - 1)) <
>> kernel_map.size);
>> + relocate_kernel();
>> +#endif
>> +
>> apply_early_boot_alternatives();
>> pt_ops_set_early();
>
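To make the PUD_SIZE check in the setup_vm() hunk above concrete, here is a
small Python model of the same arithmetic (not kernel code). It assumes
PUD_SIZE is 1 GiB, as on riscv64 with 4 KiB pages; the function name and the
example addresses are made up for illustration.

```python
PUD_SIZE = 1 << 30  # assumption: 1 GiB covered by a single PUD entry


def crosses_pud_boundary(virt_addr, size, pud_size=PUD_SIZE):
    """Mirror of the BUG_ON() condition: True when the kernel would spill
    past the PUD_SIZE-aligned window that contains virt_addr."""
    # Bytes left between virt_addr and the end of its PUD_SIZE window.
    room = pud_size - (virt_addr & (pud_size - 1))
    return room < size


# Illustrative values only: a 32 MiB kernel image.
KERNEL_SIZE = 32 << 20

# Start of a window: the whole 1 GiB is available, so the check passes.
aligned = crosses_pud_boundary(0xFFFFFFFF80000000, KERNEL_SIZE)

# Only 16 MiB left before the next boundary: the BUG_ON() would fire.
near_end = crosses_pud_boundary(0xFFFFFFFFBF000000, KERNEL_SIZE)
```

With a single PUD in the early page table, any relocated address for which
this returns True would leave the tail of the kernel unmapped.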
> _______________________________________________
> linux-riscv mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-riscv
On Fri, Feb 24, 2023 at 7:58 AM Björn Töpel <[email protected]> wrote:
>
> Alexandre Ghiti <[email protected]> writes:
>
> > +cc linux-kbuild, llvm, Nathan, Nick
> >
> > On 2/15/23 15:36, Alexandre Ghiti wrote:
> >> From: Alexandre Ghiti <[email protected]>
> >>
> > I tried a lot of things, but I struggle to understand, does anyone have
> > any idea? FYI, the same problem happens with LLVM.
Off the top of my head, no idea.
(Maybe as a follow up to this series, I wonder if pursuing
ARCH_HAS_RELR for ARCH=riscv is worthwhile?)
>
> Don't ask me *why*, but adding --emit-relocs to your linker flags solves
> "the NULL .rela.dyn" both for GCC and LLVM.
>
> The downside is that you end up with a bunch of .rela cruft in your
> vmlinux.
There was a patch just this week to use $(OBJCOPY) to strip these from
vmlinux (for x86). Looks like x86 uses --emit-relocs for KASLR:
https://lore.kernel.org/lkml/[email protected]/
--
Thanks,
~Nick Desaulniers
On Wed, Mar 22, 2023 at 11:26 AM Nick Desaulniers
<[email protected]> wrote:
>
> On Fri, Feb 24, 2023 at 7:58 AM Björn Töpel <[email protected]> wrote:
> >
> > Alexandre Ghiti <[email protected]> writes:
> >
> > > +cc linux-kbuild, llvm, Nathan, Nick
> > >
> > > On 2/15/23 15:36, Alexandre Ghiti wrote:
> > >> From: Alexandre Ghiti <[email protected]>
> > >>
> > > I tried a lot of things, but I struggle to understand, does anyone have
> > > any idea? FYI, the same problem happens with LLVM.
>
> Off the top of my head, no idea.
>
> (Maybe as a follow up to this series, I wonder if pursuing
> ARCH_HAS_RELR for ARCH=riscv is worthwhile?)
(I had thought about this for my own fun, but the only current
implementation, arch/arm64/kernel/head.S, uses assembly.
Every port needs to write some assembly for the same task, which is a pity.
In FreeBSD rtld, glibc, and musl, DT_RELR code is target-independent.)
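For readers unfamiliar with the format, here is a rough sketch of what such a
target-independent DT_RELR decode loop does, written as a Python model rather
than the C used by the projects named above. It assumes 64-bit entries;
decode_relr is a hypothetical name, and a real loader would add the load
offset to the word stored at each returned address.

```python
WORD = 8  # bytes per entry, assuming a 64-bit target


def decode_relr(entries):
    """Expand packed RELR entries into the addresses that need relocating."""
    addrs = []
    where = 0  # cursor: address of the next word a bitmap entry describes
    for entry in entries:
        if entry & 1 == 0:
            # Even entry: an explicit address. Relocate it and move the
            # cursor to the following word.
            addrs.append(entry)
            where = entry + WORD
        else:
            # Odd entry: a bitmap. Bit i+1 set means the word at
            # where + i * WORD needs relocating; bit 0 is only the tag.
            bits = entry >> 1
            i = 0
            while bits:
                if bits & 1:
                    addrs.append(where + i * WORD)
                bits >>= 1
                i += 1
            # Each bitmap covers 63 words, whether or not they are set.
            where += 63 * WORD
    return addrs


# One address entry followed by one bitmap entry.
example = decode_relr([0x1000, 0b1011])
```

Nothing in the loop depends on the architecture, which is why the userspace
loaders can share it across targets.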
> >
> > Don't ask me *why*, but adding --emit-relocs to your linker flags solves
> > "the NULL .rela.dyn" both for GCC and LLVM.
> >
> > The downside is that you end up with a bunch of .rela cruft in your
> > vmlinux.
>
> There was a patch just this week to use $(OBJCOPY) to strip these from
> vmlinux (for x86). Looks like x86 uses --emit-relocs for KASLR:
> https://lore.kernel.org/lkml/[email protected]/
> --
> Thanks,
> ~Nick Desaulniers
>
--
宋方睿
Hi Nick,
On 3/22/23 19:25, Nick Desaulniers wrote:
> On Fri, Feb 24, 2023 at 7:58 AM Björn Töpel <[email protected]> wrote:
>> Alexandre Ghiti <[email protected]> writes:
>>
>>> +cc linux-kbuild, llvm, Nathan, Nick
>>>
>>> On 2/15/23 15:36, Alexandre Ghiti wrote:
>>>> From: Alexandre Ghiti <[email protected]>
>>>>
>>> I tried a lot of things, but I struggle to understand, does anyone have
>>> any idea? FYI, the same problem happens with LLVM.
> Off the top of my head, no idea.
>
> (Maybe as a follow up to this series, I wonder if pursuing
> ARCH_HAS_RELR for ARCH=riscv is worthwhile?)
IIUC, the goal for using RELR is to reduce the size of a kernel image:
right now, this is not my priority, but I'll add that to my todo list
because that may be useful to distros.
>
>> Don't ask me *why*, but adding --emit-relocs to your linker flags solves
>> "the NULL .rela.dyn" both for GCC and LLVM.
>>
>> The downside is that you end up with a bunch of .rela cruft in your
>> vmlinux.
> There was a patch just this week to use $(OBJCOPY) to strip these from
> vmlinux (for x86). Looks like x86 uses --emit-relocs for KASLR:
> https://lore.kernel.org/lkml/[email protected]/
That's nice, that would be an interesting intermediate step until we
find the issue here as I believe it is important to have the relocations
in the init section to save memory.
Thanks for your answer Nick, really appreciated,
Alex
That does not work with UEFI booting:
Loading Linux 6.4.0-rc1-1.g668187d-default ...
Loading initial ramdisk ...
Unhandled exception: Instruction access fault
EPC: ffffffff80016d56 RA: 000000008020334e TVAL: 0000007f80016d56
EPC: ffffffff002d1d56 RA: 00000000004be34e reloc adjusted
Unhandled exception: Load access fault
EPC: 00000000fff462d4 RA: 00000000fff462d0 TVAL: ffffffff80016d56
EPC: 00000000802012d4 RA: 00000000802012d0 reloc adjusted
Code: c825 8e0d 05b3 40b4 d0ef 0636 7493 ffe4 (d783 0004)
UEFI image [0x00000000fe65e000:0x00000000fe6e3fff] '/efi\boot\bootriscv64.efi'
UEFI image [0x00000000daa82000:0x00000000dcc2afff]
--
Andreas Schwab, [email protected]
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
On 5/9/23 21:07, Andreas Schwab wrote:
> That does not work with UEFI booting:
>
> Loading Linux 6.4.0-rc1-1.g668187d-default ...
> Loading initial ramdisk ...
> Unhandled exception: Instruction access fault
> EPC: ffffffff80016d56 RA: 000000008020334e TVAL: 0000007f80016d56
> EPC: ffffffff002d1d56 RA: 00000000004be34e reloc adjusted
> Unhandled exception: Load access fault
> EPC: 00000000fff462d4 RA: 00000000fff462d0 TVAL: ffffffff80016d56
> EPC: 00000000802012d4 RA: 00000000802012d0 reloc adjusted
>
> Code: c825 8e0d 05b3 40b4 d0ef 0636 7493 ffe4 (d783 0004)
> UEFI image [0x00000000fe65e000:0x00000000fe6e3fff] '/efi\boot\bootriscv64.efi'
> UEFI image [0x00000000daa82000:0x00000000dcc2afff]
>
I need more details please, as I have a UEFI bootflow and it works great
(KASLR is based on a relocatable kernel and works fine in UEFI too).
Thanks,
Alex
On May 09 2023, Alexandre Ghiti wrote:
> On 5/9/23 21:07, Andreas Schwab wrote:
>> That does not work with UEFI booting:
>>
>> Loading Linux 6.4.0-rc1-1.g668187d-default ...
>> Loading initial ramdisk ...
>> Unhandled exception: Instruction access fault
>> EPC: ffffffff80016d56 RA: 000000008020334e TVAL: 0000007f80016d56
>> EPC: ffffffff002d1d56 RA: 00000000004be34e reloc adjusted
>> Unhandled exception: Load access fault
>> EPC: 00000000fff462d4 RA: 00000000fff462d0 TVAL: ffffffff80016d56
>> EPC: 00000000802012d4 RA: 00000000802012d0 reloc adjusted
>>
>> Code: c825 8e0d 05b3 40b4 d0ef 0636 7493 ffe4 (d783 0004)
>> UEFI image [0x00000000fe65e000:0x00000000fe6e3fff] '/efi\boot\bootriscv64.efi'
>> UEFI image [0x00000000daa82000:0x00000000dcc2afff]
>>
>
> I need more details please, as I have a UEFI bootflow and it works great
> (KASLR is based on a relocatable kernel and works fine in UEFI too).
It also crashes without UEFI. Disabling CONFIG_RELOCATABLE fixes that.
This was tested on the HiFive Unmatched board.
The kernel image I tested is available from
<https://download.opensuse.org/repositories/Kernel:/HEAD/RISCV/>. The
same kernel with CONFIG_RELOCATABLE disabled is available from
<https://download.opensuse.org/repositories/home:/Andreas_Schwab:/riscv:/kernel/standard/>.
On Thu, 11 May 2023 11:18:23 PDT (-0700), [email protected] wrote:
> On Mai 09 2023, Alexandre Ghiti wrote:
>
>> On 5/9/23 21:07, Andreas Schwab wrote:
>>> That does not work with UEFI booting:
>>>
>>> Loading Linux 6.4.0-rc1-1.g668187d-default ...
>>> Loading initial ramdisk ...
>>> Unhandled exception: Instruction access fault
>>> EPC: ffffffff80016d56 RA: 000000008020334e TVAL: 0000007f80016d56
>>> EPC: ffffffff002d1d56 RA: 00000000004be34e reloc adjusted
>>> Unhandled exception: Load access fault
>>> EPC: 00000000fff462d4 RA: 00000000fff462d0 TVAL: ffffffff80016d56
>>> EPC: 00000000802012d4 RA: 00000000802012d0 reloc adjusted
>>>
>>> Code: c825 8e0d 05b3 40b4 d0ef 0636 7493 ffe4 (d783 0004)
>>> UEFI image [0x00000000fe65e000:0x00000000fe6e3fff] '/efi\boot\bootriscv64.efi'
>>> UEFI image [0x00000000daa82000:0x00000000dcc2afff]
>>>
>>
>> I need more details please, as I have a UEFI bootflow and it works great
>> (KASLR is based on a relocatable kernel and works fine in UEFI too).
>
> It also crashes without UEFI. Disabling CONFIG_RELOCATABLE fixes that.
> This was tested on the HiFive Unmatched board.
> The kernel image I tested is available from
> <https://download.opensuse.org/repositories/Kernel:/HEAD/RISCV/>. The
> same kernel with CONFIG_RELOCATABLE disabled is available from
> <https://download.opensuse.org/repositories/home:/Andreas_Schwab:/riscv:/kernel/standard/>.
Sorry I missed this earlier, there have been some other reports of boot
failures on rc1 showing up but those were all a lot more vague. Just
setting CONFIG_RELOCATABLE=y doesn't manifest a boot failure on QEMU on
my end and I don't have an Unmatched floating around.
Alex says he's going to look into it (and IIRC he has my Unmatched...).
On 5/11/23 20:18, Andreas Schwab wrote:
> On Mai 09 2023, Alexandre Ghiti wrote:
>
>> On 5/9/23 21:07, Andreas Schwab wrote:
>>> That does not work with UEFI booting:
>>>
>>> Loading Linux 6.4.0-rc1-1.g668187d-default ...
>>> Loading initial ramdisk ...
>>> Unhandled exception: Instruction access fault
>>> EPC: ffffffff80016d56 RA: 000000008020334e TVAL: 0000007f80016d56
>>> EPC: ffffffff002d1d56 RA: 00000000004be34e reloc adjusted
>>> Unhandled exception: Load access fault
>>> EPC: 00000000fff462d4 RA: 00000000fff462d0 TVAL: ffffffff80016d56
>>> EPC: 00000000802012d4 RA: 00000000802012d0 reloc adjusted
>>>
>>> Code: c825 8e0d 05b3 40b4 d0ef 0636 7493 ffe4 (d783 0004)
>>> UEFI image [0x00000000fe65e000:0x00000000fe6e3fff] '/efi\boot\bootriscv64.efi'
>>> UEFI image [0x00000000daa82000:0x00000000dcc2afff]
>>>
>> I need more details please, as I have a UEFI bootflow and it works great
>> (KASLR is based on a relocatable kernel and works fine in UEFI too).
> It also crashes without UEFI. Disabling CONFIG_RELOCATABLE fixes that.
> This was tested on the HiFive Unmatched board.
> The kernel image I tested is available from
> <https://download.opensuse.org/repositories/Kernel:/HEAD/RISCV/>. The
> same kernel with CONFIG_RELOCATABLE disabled is available from
> <https://download.opensuse.org/repositories/home:/Andreas_Schwab:/riscv:/kernel/standard/>.
>
I have tested the following patch successfully, can you give it a try
while I make sure this is the only place I forgot to add the -fno-pie flag?
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index fbdccc21418a..153864e4f399 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -23,6 +23,10 @@ ifdef CONFIG_FTRACE
CFLAGS_REMOVE_alternative.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_cpufeature.o = $(CC_FLAGS_FTRACE)
endif
+ifdef CONFIG_RELOCATABLE
+CFLAGS_alternative.o += -fno-pie
+CFLAGS_cpufeature.o += -fno-pie
+endif
ifdef CONFIG_KASAN
KASAN_SANITIZE_alternative.o := n
KASAN_SANITIZE_cpufeature.o := n
Thanks
Alex
On May 19 2023, Alexandre Ghiti wrote:
> I have tested the following patch successfully, can you give it a try
> while I make sure this is the only place I forgot to add the -fno-pie
> flag?
>
> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
> index fbdccc21418a..153864e4f399 100644
> --- a/arch/riscv/kernel/Makefile
> +++ b/arch/riscv/kernel/Makefile
> @@ -23,6 +23,10 @@ ifdef CONFIG_FTRACE
> CFLAGS_REMOVE_alternative.o = $(CC_FLAGS_FTRACE)
> CFLAGS_REMOVE_cpufeature.o = $(CC_FLAGS_FTRACE)
> endif
> +ifdef CONFIG_RELOCATABLE
> +CFLAGS_alternative.o += -fno-pie
> +CFLAGS_cpufeature.o += -fno-pie
> +endif
> ifdef CONFIG_KASAN
> KASAN_SANITIZE_alternative.o := n
> KASAN_SANITIZE_cpufeature.o := n
I can confirm that this fixes the crash.
On Fri, 19 May 2023 14:48:59 PDT (-0700), [email protected] wrote:
> On Mai 19 2023, Alexandre Ghiti wrote:
>
>> I have tested the following patch successfully, can you give it a try
>> while I make sure this is the only place I forgot to add the -fno-pie
>> flag?
>>
>> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
>> index fbdccc21418a..153864e4f399 100644
>> --- a/arch/riscv/kernel/Makefile
>> +++ b/arch/riscv/kernel/Makefile
>> @@ -23,6 +23,10 @@ ifdef CONFIG_FTRACE
>> CFLAGS_REMOVE_alternative.o = $(CC_FLAGS_FTRACE)
>> CFLAGS_REMOVE_cpufeature.o = $(CC_FLAGS_FTRACE)
>> endif
>> +ifdef CONFIG_RELOCATABLE
>> +CFLAGS_alternative.o += -fno-pie
>> +CFLAGS_cpufeature.o += -fno-pie
>> +endif
>> ifdef CONFIG_KASAN
>> KASAN_SANITIZE_alternative.o := n
>> KASAN_SANITIZE_cpufeature.o := n
>
> I can confirm that this fixes the crash.
Thanks. Alex: can you send a patch?
On 19/05/2023 23:55, Palmer Dabbelt wrote:
> On Fri, 19 May 2023 14:48:59 PDT (-0700), [email protected] wrote:
>> On Mai 19 2023, Alexandre Ghiti wrote:
>>
>>> I have tested the following patch successfully, can you give it a try
>>> while I make sure this is the only place I forgot to add the -fno-pie
>>> flag?
>>>
>>> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
>>> index fbdccc21418a..153864e4f399 100644
>>> --- a/arch/riscv/kernel/Makefile
>>> +++ b/arch/riscv/kernel/Makefile
>>> @@ -23,6 +23,10 @@ ifdef CONFIG_FTRACE
>>> CFLAGS_REMOVE_alternative.o = $(CC_FLAGS_FTRACE)
>>> CFLAGS_REMOVE_cpufeature.o = $(CC_FLAGS_FTRACE)
>>> endif
>>> +ifdef CONFIG_RELOCATABLE
>>> +CFLAGS_alternative.o += -fno-pie
>>> +CFLAGS_cpufeature.o += -fno-pie
>>> +endif
>>> ifdef CONFIG_KASAN
>>> KASAN_SANITIZE_alternative.o := n
>>> KASAN_SANITIZE_cpufeature.o := n
>>
>> I can confirm that this fixes the crash.
>
> Thanks. Alex: can you send a patch?
I don't think this patch alone will work: all the code called from early
alternatives must be compiled with -fno-pie, and I'm a bit scared that's
a "big" constraint. For now, I see 2 solutions:
- Document somewhere the fact that anything called from early
alternatives must be compiled with -fno-pie
- Or relocate once with the physical address, call early alternatives, and
then do the final virtual relocation
Both options can be cumbersome in their own way; if anyone has an
opinion, I'd be happy to discuss that :)
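
As a rough illustration of the second option, here is a Python model of the
relocate_kernel() loop from the patch, run twice. It is only a sketch under
assumptions: memory is a dict, the addresses are invented, and "relocate with
the physical address" is interpreted as using phys_addr - KERNEL_LINK_ADDR as
the relocation offset. It says nothing about whether early alternatives can
actually run in that state.

```python
R_RISCV_RELATIVE = 3
KERNEL_LINK_ADDR = 0xFFFFFFFF80000000  # assumed riscv64 link address
MASK64 = (1 << 64) - 1


def apply_relative(relocs, memory, reloc_offset, store_offset):
    """relocs: iterable of (r_offset, r_info, r_addend) tuples.
    store_offset plays the role of va_kernel_link_pa_offset."""
    for r_offset, r_info, r_addend in relocs:
        if r_info != R_RISCV_RELATIVE:
            continue
        value = r_addend
        # Same guard as the patch: values below the link address
        # (vdso offsets) are stored unchanged.
        if value >= KERNEL_LINK_ADDR:
            value = (value + reloc_offset) & MASK64
        memory[r_offset - store_offset] = value


# Invented load addresses and relocation records.
phys_addr = 0x80200000
virt_addr = 0xFFFFFFFF90000000
relocs = [
    (KERNEL_LINK_ADDR + 0x100, R_RISCV_RELATIVE, KERNEL_LINK_ADDR + 0x2000),
    (KERNEL_LINK_ADDR + 0x108, R_RISCV_RELATIVE, 0x800),  # vdso-like
    (KERNEL_LINK_ADDR + 0x110, 0, 0),                     # R_RISCV_NONE
]
memory = {}
store_offset = KERNEL_LINK_ADDR - phys_addr

# Pass 1: pointers become physical addresses.
apply_relative(relocs, memory, phys_addr - KERNEL_LINK_ADDR, store_offset)
after_pass1 = dict(memory)

# Pass 2: the final virtual relocation overwrites the same slots.
apply_relative(relocs, memory, virt_addr - KERNEL_LINK_ADDR, store_offset)
```

The second pass is safe in this model because each value is recomputed from
r_addend in the RELA record, not from whatever the first pass left in memory.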