2023-08-12 02:23:37

by Sami Tolvanen

[permalink] [raw]
Subject: [PATCH 0/5] riscv: SCS support

Hi folks,

This series adds Shadow Call Stack (SCS) support for RISC-V. SCS
uses compiler instrumentation to store return addresses in a
separate shadow stack to protect them against accidental or
malicious overwrites. More information about SCS can be found
here:

https://clang.llvm.org/docs/ShadowCallStack.html

Patch 1 is from Deepak, and it simplifies VMAP_STACK overflow
handling by adding support for accessing per-CPU variables
directly in assembly. The patch is included in this series to
make IRQ stack switching cleaner with SCS, and I've simply
rebased it. Patch 2 uses this functionality to clean up the stack
switching by moving duplicate code into a single function. On
RISC-V, the compiler uses the gp register for storing the current
shadow call stack pointer, which is incompatible with global
pointer relaxation. Patch 3 moves global pointer loading into a
macro that can be easily disabled with SCS. Patch 4 implements
SCS register loading and switching, and allows the feature to be
enabled, and patch 5 adds separate per-CPU IRQ shadow call stacks
when CONFIG_IRQ_STACKS is enabled.

Note that this series requires Clang 17. Earlier Clang versions
support SCS on RISC-V, but use the x18 register instead of gp,
which isn't ideal. gcc has SCS support for arm64, but I'm not
aware of plans to support RISC-V. Once the Zicfiss extension is
ratified, it's probably preferable to use hardware-backed shadow
stacks instead of SCS on hardware that supports the extension,
and we may want to consider implementing CONFIG_DYNAMIC_SCS to
patch between the implementation at runtime (similarly to the
arm64 implementation, which switches to SCS when hardware PAC
support isn't available).

Sami


Deepak Gupta (1):
riscv: VMAP_STACK overflow detection thread-safe

Sami Tolvanen (4):
riscv: Deduplicate IRQ stack switching
riscv: Move global pointer loading to a macro
riscv: Implement Shadow Call Stack
riscv: Use separate IRQ shadow call stacks

arch/riscv/Kconfig | 6 ++
arch/riscv/Makefile | 4 +
arch/riscv/include/asm/asm.h | 35 ++++++++
arch/riscv/include/asm/irq_stack.h | 3 +
arch/riscv/include/asm/scs.h | 54 ++++++++++++
arch/riscv/include/asm/thread_info.h | 16 +++-
arch/riscv/kernel/asm-offsets.c | 4 +
arch/riscv/kernel/entry.S | 126 +++++++++++++--------------
arch/riscv/kernel/head.S | 19 ++--
arch/riscv/kernel/irq.c | 53 ++++++-----
arch/riscv/kernel/suspend_entry.S | 5 +-
arch/riscv/kernel/traps.c | 65 ++------------
arch/riscv/kernel/vdso/Makefile | 2 +-
arch/riscv/purgatory/Makefile | 4 +
14 files changed, 228 insertions(+), 168 deletions(-)
create mode 100644 arch/riscv/include/asm/scs.h


base-commit: 52a93d39b17dc7eb98b6aa3edb93943248e03b2f
--
2.41.0.640.ga95def55d0-goog



2023-08-12 02:56:11

by Sami Tolvanen

[permalink] [raw]
Subject: [PATCH 4/5] riscv: Implement Shadow Call Stack

Implement CONFIG_SHADOW_CALL_STACK for RISC-V. When enabled, the
compiler injects instructions to all non-leaf C functions to
store the return address to the shadow stack and unconditionally
load it again before returning, which makes it harder to corrupt
the return address through a stack overflow, for example.

The active shadow call stack pointer is stored in the gp
register, which makes SCS incompatible with gp relaxation. Use
--no-relax-gp to ensure gp relaxation is disabled and disable
global pointer loading. Add SCS pointers to struct thread_info,
implement SCS initialization, and task switching

Signed-off-by: Sami Tolvanen <[email protected]>
---
arch/riscv/Kconfig | 6 ++++
arch/riscv/Makefile | 4 +++
arch/riscv/include/asm/asm.h | 6 ++++
arch/riscv/include/asm/scs.h | 47 ++++++++++++++++++++++++++++
arch/riscv/include/asm/thread_info.h | 13 ++++++++
arch/riscv/kernel/asm-offsets.c | 3 ++
arch/riscv/kernel/entry.S | 11 +++++++
arch/riscv/kernel/head.S | 4 +++
arch/riscv/kernel/vdso/Makefile | 2 +-
arch/riscv/purgatory/Makefile | 4 +++
10 files changed, 99 insertions(+), 1 deletion(-)
create mode 100644 arch/riscv/include/asm/scs.h

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 4c07b9189c86..8fe31ec59da4 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -46,6 +46,7 @@ config RISCV
select ARCH_SUPPORTS_HUGETLBFS if MMU
select ARCH_SUPPORTS_PAGE_TABLE_CHECK if MMU
select ARCH_SUPPORTS_PER_VMA_LOCK if MMU
+ select ARCH_SUPPORTS_SHADOW_CALL_STACK if HAVE_SHADOW_CALL_STACK
select ARCH_USE_MEMTEST
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
@@ -169,6 +170,11 @@ config GCC_SUPPORTS_DYNAMIC_FTRACE
def_bool CC_IS_GCC
depends on $(cc-option,-fpatchable-function-entry=8)

+config HAVE_SHADOW_CALL_STACK
+ def_bool $(cc-option,-fsanitize=shadow-call-stack)
+ # https://github.com/riscv-non-isa/riscv-elf-psabi-doc/commit/a484e843e6eeb51f0cb7b8819e50da6d2444d769
+ depends on $(ld-option,--no-relax-gp)
+
config ARCH_MMAP_RND_BITS_MIN
default 18 if 64BIT
default 8
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 6ec6d52a4180..e518a74640fb 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -55,6 +55,10 @@ endif
endif
endif

+ifeq ($(CONFIG_SHADOW_CALL_STACK),y)
+ KBUILD_LDFLAGS += --no-relax-gp
+endif
+
# ISA string setting
riscv-march-$(CONFIG_ARCH_RV32I) := rv32ima
riscv-march-$(CONFIG_ARCH_RV64I) := rv64ima
diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
index c53388f784a2..ac77738835ae 100644
--- a/arch/riscv/include/asm/asm.h
+++ b/arch/riscv/include/asm/asm.h
@@ -103,6 +103,11 @@
REG_L \dst, 0(\dst)
.endm

+#ifdef CONFIG_SHADOW_CALL_STACK
+/* gp is used as the shadow call stack pointer instead */
+.macro load_global_pointer
+.endm
+#else
/* load __global_pointer to gp */
.macro load_global_pointer
.option push
@@ -110,6 +115,7 @@
la gp, __global_pointer$
.option pop
.endm
+#endif /* CONFIG_SHADOW_CALL_STACK */

/* save all GPs except x1 ~ x5 */
.macro save_from_x6_to_x31
diff --git a/arch/riscv/include/asm/scs.h b/arch/riscv/include/asm/scs.h
new file mode 100644
index 000000000000..94726ea773e3
--- /dev/null
+++ b/arch/riscv/include/asm/scs.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SCS_H
+#define _ASM_SCS_H
+
+#ifdef __ASSEMBLY__
+#include <asm/asm-offsets.h>
+
+#ifdef CONFIG_SHADOW_CALL_STACK
+
+/* Load init_shadow_call_stack to gp. */
+.macro scs_load_init_stack
+ la gp, init_shadow_call_stack
+ XIP_FIXUP_OFFSET gp
+.endm
+
+/* Load task_scs_sp(current) to gp. */
+.macro scs_load_current
+ REG_L gp, TASK_TI_SCS_SP(tp)
+.endm
+
+/* Load task_scs_sp(current) to gp, but only if tp has changed. */
+.macro scs_load_current_if_task_changed prev
+ beq \prev, tp, _skip_scs
+ scs_load_current
+_skip_scs:
+.endm
+
+/* Save gp to task_scs_sp(current). */
+.macro scs_save_current
+ REG_S gp, TASK_TI_SCS_SP(tp)
+.endm
+
+#else /* CONFIG_SHADOW_CALL_STACK */
+
+.macro scs_load_init_stack
+.endm
+.macro scs_load_current
+.endm
+.macro scs_load_current_if_task_changed prev
+.endm
+.macro scs_save_current
+.endm
+
+#endif /* CONFIG_SHADOW_CALL_STACK */
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_SCS_H */
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
index d18ce0113ca1..574779900bfb 100644
--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -57,8 +57,20 @@ struct thread_info {
long user_sp; /* User stack pointer */
int cpu;
unsigned long syscall_work; /* SYSCALL_WORK_ flags */
+#ifdef CONFIG_SHADOW_CALL_STACK
+ void *scs_base;
+ void *scs_sp;
+#endif
};

+#ifdef CONFIG_SHADOW_CALL_STACK
+#define INIT_SCS \
+ .scs_base = init_shadow_call_stack, \
+ .scs_sp = init_shadow_call_stack,
+#else
+#define INIT_SCS
+#endif
+
/*
* macros/functions for gaining access to the thread information structure
*
@@ -68,6 +80,7 @@ struct thread_info {
{ \
.flags = 0, \
.preempt_count = INIT_PREEMPT_COUNT, \
+ INIT_SCS \
}

void arch_release_task_struct(struct task_struct *tsk);
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index 9f535d5de33f..177cef43a2ee 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -38,6 +38,9 @@ void asm_offsets(void)
OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp);
+#ifdef CONFIG_SHADOW_CALL_STACK
+ OFFSET(TASK_TI_SCS_SP, task_struct, thread_info.scs_sp);
+#endif

OFFSET(TASK_TI_CPU_NUM, task_struct, thread_info.cpu);
OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]);
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 2b4248c6b0a9..ad34507d3c96 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -9,6 +9,7 @@

#include <asm/asm.h>
#include <asm/csr.h>
+#include <asm/scs.h>
#include <asm/unistd.h>
#include <asm/page.h>
#include <asm/thread_info.h>
@@ -77,6 +78,9 @@ _save_context:
/* Load the global pointer */
load_global_pointer

+ /* Load the kernel shadow call stack pointer if coming from userspace */
+ scs_load_current_if_task_changed s5
+
move a0, sp /* pt_regs */
la ra, ret_from_exception

@@ -123,6 +127,9 @@ SYM_CODE_START_NOALIGN(ret_from_exception)
addi s0, sp, PT_SIZE_ON_STACK
REG_S s0, TASK_TI_KERNEL_SP(tp)

+ /* Save the kernel shadow call stack pointer */
+ scs_save_current
+
/*
* Save TP into the scratch register , so we can find the kernel data
* structures again.
@@ -277,6 +284,8 @@ SYM_FUNC_START(__switch_to)
REG_S s9, TASK_THREAD_S9_RA(a3)
REG_S s10, TASK_THREAD_S10_RA(a3)
REG_S s11, TASK_THREAD_S11_RA(a3)
+ /* Save the kernel shadow call stack pointer */
+ scs_save_current
/* Restore context from next->thread */
REG_L ra, TASK_THREAD_RA_RA(a4)
REG_L sp, TASK_THREAD_SP_RA(a4)
@@ -294,6 +303,8 @@ SYM_FUNC_START(__switch_to)
REG_L s11, TASK_THREAD_S11_RA(a4)
/* The offset of thread_info in task_struct is zero. */
move tp, a1
+ /* Switch to the next shadow call stack */
+ scs_load_current
ret
SYM_FUNC_END(__switch_to)

diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 79b5a863c782..c3d0ee77483b 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -14,6 +14,7 @@
#include <asm/cpu_ops_sbi.h>
#include <asm/hwcap.h>
#include <asm/image.h>
+#include <asm/scs.h>
#include <asm/xip_fixup.h>
#include "efi-header.S"

@@ -153,6 +154,7 @@ secondary_start_sbi:
XIP_FIXUP_OFFSET a3
add a3, a3, a1
REG_L sp, (a3)
+ scs_load_current

.Lsecondary_start_common:

@@ -293,6 +295,7 @@ clear_bss_done:
la sp, init_thread_union + THREAD_SIZE
XIP_FIXUP_OFFSET sp
addi sp, sp, -PT_SIZE_ON_STACK
+ scs_load_init_stack
#ifdef CONFIG_BUILTIN_DTB
la a0, __dtb_start
XIP_FIXUP_OFFSET a0
@@ -311,6 +314,7 @@ clear_bss_done:
la tp, init_task
la sp, init_thread_union + THREAD_SIZE
addi sp, sp, -PT_SIZE_ON_STACK
+ scs_load_init_stack

#ifdef CONFIG_KASAN
call kasan_early_init
diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile
index 6b1dba11bf6d..48c362c0cb3d 100644
--- a/arch/riscv/kernel/vdso/Makefile
+++ b/arch/riscv/kernel/vdso/Makefile
@@ -36,7 +36,7 @@ CPPFLAGS_vdso.lds += -DHAS_VGETTIMEOFDAY
endif

# Disable -pg to prevent insert call site
-CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS)

# Disable profiling and instrumentation for VDSO code
GCOV_PROFILE := n
diff --git a/arch/riscv/purgatory/Makefile b/arch/riscv/purgatory/Makefile
index dc20e166983e..d5d60c040560 100644
--- a/arch/riscv/purgatory/Makefile
+++ b/arch/riscv/purgatory/Makefile
@@ -77,6 +77,10 @@ ifdef CONFIG_STACKPROTECTOR_STRONG
PURGATORY_CFLAGS_REMOVE += -fstack-protector-strong
endif

+ifdef CONFIG_SHADOW_CALL_STACK
+PURGATORY_CFLAGS_REMOVE += $(CC_FLAGS_SCS)
+endif
+
CFLAGS_REMOVE_purgatory.o += $(PURGATORY_CFLAGS_REMOVE)
CFLAGS_purgatory.o += $(PURGATORY_CFLAGS)

--
2.41.0.640.ga95def55d0-goog


2023-08-12 02:58:47

by Sami Tolvanen

[permalink] [raw]
Subject: [PATCH 2/5] riscv: Deduplicate IRQ stack switching

With CONFIG_IRQ_STACKS, we switch to a separate per-CPU IRQ stack
before calling handle_riscv_irq or __do_softirq. We currently
have duplicate inline assembly snippets for stack switching in
both code paths. Now that we can access per-CPU variables in
assembly, implement call_on_irq_stack in assembly, and use that
instead of redudant inline assembly.

Signed-off-by: Sami Tolvanen <[email protected]>
---
arch/riscv/include/asm/asm.h | 5 +++++
arch/riscv/include/asm/irq_stack.h | 3 +++
arch/riscv/kernel/entry.S | 32 ++++++++++++++++++++++++++++++
arch/riscv/kernel/irq.c | 32 ++++++++----------------------
arch/riscv/kernel/traps.c | 29 ++++-----------------------
5 files changed, 52 insertions(+), 49 deletions(-)

diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
index f403e46e04f2..13815a7f907a 100644
--- a/arch/riscv/include/asm/asm.h
+++ b/arch/riscv/include/asm/asm.h
@@ -98,6 +98,11 @@
add \dst, \dst, \tmp
.endm

+.macro load_per_cpu dst ptr tmp
+ asm_per_cpu \dst \ptr \tmp
+ REG_L \dst, 0(\dst)
+.endm
+
/* save all GPs except x1 ~ x5 */
.macro save_from_x6_to_x31
REG_S x6, PT_T1(sp)
diff --git a/arch/riscv/include/asm/irq_stack.h b/arch/riscv/include/asm/irq_stack.h
index e4042d297580..6441ded3b0cf 100644
--- a/arch/riscv/include/asm/irq_stack.h
+++ b/arch/riscv/include/asm/irq_stack.h
@@ -12,6 +12,9 @@

DECLARE_PER_CPU(ulong *, irq_stack_ptr);

+asmlinkage void call_on_irq_stack(struct pt_regs *regs,
+ void (*func)(struct pt_regs *));
+
#ifdef CONFIG_VMAP_STACK
/*
* To ensure that VMAP'd stack overflow detection works correctly, all VMAP'd
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 3d11aa3af105..39875f5e08a6 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -218,6 +218,38 @@ SYM_CODE_START(ret_from_fork)
tail syscall_exit_to_user_mode
SYM_CODE_END(ret_from_fork)

+#ifdef CONFIG_IRQ_STACKS
+/*
+ * void call_on_irq_stack(struct pt_regs *regs,
+ * void (*func)(struct pt_regs *));
+ *
+ * Calls func(regs) using the per-CPU IRQ stack.
+ */
+SYM_FUNC_START(call_on_irq_stack)
+ /* Create a frame record to save ra and s0 (fp) */
+ addi sp, sp, -RISCV_SZPTR
+ REG_S ra, (sp)
+ addi sp, sp, -RISCV_SZPTR
+ REG_S s0, (sp)
+ addi s0, sp, 2*RISCV_SZPTR
+
+ /* Switch to the per-CPU IRQ stack and call the handler */
+ load_per_cpu t0, irq_stack_ptr, t1
+ li t1, IRQ_STACK_SIZE
+ add sp, t0, t1
+ jalr a1
+
+ /* Switch back to the thread stack and restore ra and s0 */
+ addi sp, s0, -2*RISCV_SZPTR
+ REG_L s0, (sp)
+ addi sp, sp, RISCV_SZPTR
+ REG_L ra, (sp)
+ addi sp, sp, RISCV_SZPTR
+
+ ret
+SYM_FUNC_END(call_on_irq_stack)
+#endif /* CONFIG_IRQ_STACKS */
+
/*
* Integer register context switch
* The callee-saved registers must be saved and restored.
diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
index d0577cc6a081..95dafdcbd135 100644
--- a/arch/riscv/kernel/irq.c
+++ b/arch/riscv/kernel/irq.c
@@ -61,32 +61,16 @@ static void init_irq_stacks(void)
#endif /* CONFIG_VMAP_STACK */

#ifdef CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK
+static void ___do_softirq(struct pt_regs *regs)
+{
+ __do_softirq();
+}
+
void do_softirq_own_stack(void)
{
-#ifdef CONFIG_IRQ_STACKS
- if (on_thread_stack()) {
- ulong *sp = per_cpu(irq_stack_ptr, smp_processor_id())
- + IRQ_STACK_SIZE/sizeof(ulong);
- __asm__ __volatile(
- "addi sp, sp, -"RISCV_SZPTR "\n"
- REG_S" ra, (sp) \n"
- "addi sp, sp, -"RISCV_SZPTR "\n"
- REG_S" s0, (sp) \n"
- "addi s0, sp, 2*"RISCV_SZPTR "\n"
- "move sp, %[sp] \n"
- "call __do_softirq \n"
- "addi sp, s0, -2*"RISCV_SZPTR"\n"
- REG_L" s0, (sp) \n"
- "addi sp, sp, "RISCV_SZPTR "\n"
- REG_L" ra, (sp) \n"
- "addi sp, sp, "RISCV_SZPTR "\n"
- :
- : [sp] "r" (sp)
- : "a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7",
- "t0", "t1", "t2", "t3", "t4", "t5", "t6",
- "memory");
- } else
-#endif
+ if (on_thread_stack())
+ call_on_irq_stack(NULL, ___do_softirq);
+ else
__do_softirq();
}
#endif /* CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK */
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index deb2144d9143..83319b6816da 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -350,31 +350,10 @@ static void noinstr handle_riscv_irq(struct pt_regs *regs)
asmlinkage void noinstr do_irq(struct pt_regs *regs)
{
irqentry_state_t state = irqentry_enter(regs);
-#ifdef CONFIG_IRQ_STACKS
- if (on_thread_stack()) {
- ulong *sp = per_cpu(irq_stack_ptr, smp_processor_id())
- + IRQ_STACK_SIZE/sizeof(ulong);
- __asm__ __volatile(
- "addi sp, sp, -"RISCV_SZPTR "\n"
- REG_S" ra, (sp) \n"
- "addi sp, sp, -"RISCV_SZPTR "\n"
- REG_S" s0, (sp) \n"
- "addi s0, sp, 2*"RISCV_SZPTR "\n"
- "move sp, %[sp] \n"
- "move a0, %[regs] \n"
- "call handle_riscv_irq \n"
- "addi sp, s0, -2*"RISCV_SZPTR"\n"
- REG_L" s0, (sp) \n"
- "addi sp, sp, "RISCV_SZPTR "\n"
- REG_L" ra, (sp) \n"
- "addi sp, sp, "RISCV_SZPTR "\n"
- :
- : [sp] "r" (sp), [regs] "r" (regs)
- : "a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7",
- "t0", "t1", "t2", "t3", "t4", "t5", "t6",
- "memory");
- } else
-#endif
+
+ if (IS_ENABLED(CONFIG_IRQ_STACKS) && on_thread_stack())
+ call_on_irq_stack(regs, handle_riscv_irq);
+ else
handle_riscv_irq(regs);

irqentry_exit(regs, state);
--
2.41.0.640.ga95def55d0-goog


2023-08-16 05:41:26

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH 0/5] riscv: SCS support

Hi Sami,

On Fri, Aug 11, 2023 at 11:35:57PM +0000, Sami Tolvanen wrote:
> Hi folks,
>
> This series adds Shadow Call Stack (SCS) support for RISC-V. SCS
> uses compiler instrumentation to store return addresses in a
> separate shadow stack to protect them against accidental or
> malicious overwrites. More information about SCS can be found
> here:
>
> https://clang.llvm.org/docs/ShadowCallStack.html
>
> Patch 1 is from Deepak, and it simplifies VMAP_STACK overflow
> handling by adding support for accessing per-CPU variables
> directly in assembly. The patch is included in this series to
> make IRQ stack switching cleaner with SCS, and I've simply
> rebased it. Patch 2 uses this functionality to clean up the stack
> switching by moving duplicate code into a single function. On
> RISC-V, the compiler uses the gp register for storing the current
> shadow call stack pointer, which is incompatible with global
> pointer relaxation. Patch 3 moves global pointer loading into a
> macro that can be easily disabled with SCS. Patch 4 implements
> SCS register loading and switching, and allows the feature to be
> enabled, and patch 5 adds separate per-CPU IRQ shadow call stacks
> when CONFIG_IRQ_STACKS is enabled.
>
> Note that this series requires Clang 17. Earlier Clang versions
> support SCS on RISC-V, but use the x18 register instead of gp,
> which isn't ideal. gcc has SCS support for arm64, but I'm not
> aware of plans to support RISC-V. Once the Zicfiss extension is
> ratified, it's probably preferable to use hardware-backed shadow
> stacks instead of SCS on hardware that supports the extension,
> and we may want to consider implementing CONFIG_DYNAMIC_SCS to
> patch between the implementation at runtime (similarly to the
> arm64 implementation, which switches to SCS when hardware PAC
> support isn't available).

I took this series for a spin on top of 6.5-rc6 with both LLVM 18 (built
within the past couple of days) and LLVM 17.0.0-rc2 but it seems that
the CFI_BACKWARDS LKDTM test does not pass with
CONFIG_SHADOW_CALL_STACK=y.

[ 73.324652] lkdtm: Performing direct entry CFI_BACKWARD
[ 73.324900] lkdtm: Attempting unchecked stack return address redirection ...
[ 73.325178] lkdtm: Eek: return address mismatch! 0000000000000002 != ffffffff80614982
[ 73.325478] lkdtm: FAIL: stack return address manipulation failed!

Does the test need to be adjusted or is there some other issue?

Cheers,
Nathan

2023-08-20 10:52:40

by Sami Tolvanen

[permalink] [raw]
Subject: Re: [PATCH 0/5] riscv: SCS support

On Mon, Aug 14, 2023 at 11:33 AM Kees Cook <[email protected]> wrote:
>
> On Mon, Aug 14, 2023 at 10:59:28AM -0700, Nathan Chancellor wrote:
> > Hi Sami,
> >
> > On Fri, Aug 11, 2023 at 11:35:57PM +0000, Sami Tolvanen wrote:
> > > Hi folks,
> > >
> > > This series adds Shadow Call Stack (SCS) support for RISC-V. SCS
> > > uses compiler instrumentation to store return addresses in a
> > > separate shadow stack to protect them against accidental or
> > > malicious overwrites. More information about SCS can be found
> > > here:
> > >
> > > https://clang.llvm.org/docs/ShadowCallStack.html
> > >
> > > Patch 1 is from Deepak, and it simplifies VMAP_STACK overflow
> > > handling by adding support for accessing per-CPU variables
> > > directly in assembly. The patch is included in this series to
> > > make IRQ stack switching cleaner with SCS, and I've simply
> > > rebased it. Patch 2 uses this functionality to clean up the stack
> > > switching by moving duplicate code into a single function. On
> > > RISC-V, the compiler uses the gp register for storing the current
> > > shadow call stack pointer, which is incompatible with global
> > > pointer relaxation. Patch 3 moves global pointer loading into a
> > > macro that can be easily disabled with SCS. Patch 4 implements
> > > SCS register loading and switching, and allows the feature to be
> > > enabled, and patch 5 adds separate per-CPU IRQ shadow call stacks
> > > when CONFIG_IRQ_STACKS is enabled.
> > >
> > > Note that this series requires Clang 17. Earlier Clang versions
> > > support SCS on RISC-V, but use the x18 register instead of gp,
> > > which isn't ideal. gcc has SCS support for arm64, but I'm not
> > > aware of plans to support RISC-V. Once the Zicfiss extension is
> > > ratified, it's probably preferable to use hardware-backed shadow
> > > stacks instead of SCS on hardware that supports the extension,
> > > and we may want to consider implementing CONFIG_DYNAMIC_SCS to
> > > patch between the implementation at runtime (similarly to the
> > > arm64 implementation, which switches to SCS when hardware PAC
> > > support isn't available).
> >
> > I took this series for a spin on top of 6.5-rc6 with both LLVM 18 (built
> > within the past couple of days) and LLVM 17.0.0-rc2 but it seems that
> > the CFI_BACKWARDS LKDTM test does not pass with
> > CONFIG_SHADOW_CALL_STACK=y.
> >
> > [ 73.324652] lkdtm: Performing direct entry CFI_BACKWARD
> > [ 73.324900] lkdtm: Attempting unchecked stack return address redirection ...
> > [ 73.325178] lkdtm: Eek: return address mismatch! 0000000000000002 != ffffffff80614982
> > [ 73.325478] lkdtm: FAIL: stack return address manipulation failed!
> >
> > Does the test need to be adjusted or is there some other issue?
>
> Does it pass without the series? I tried to write it to be
> arch-agnostic, but I never tested it on RISC-V. It's very possible that
> test needs adjusting for the architecture. Besides the label horrors,
> the use of __builtin_frame_address may not work there either...

Looks like __builtin_frame_address behaves differently on RISC-V.
After staring at the disassembly a bit, using
__builtin_frame_address(0) - 1 instead of + 1 seems to yield correct
results. WIth that change, here's the test without SCS:

# echo CFI_BACKWARD > /sys/kernel/debug/provoke-crash/DIRECT
[ 16.272028] lkdtm: Performing direct entry CFI_BACKWARD
[ 16.272368] lkdtm: Attempting unchecked stack return address redirection ...
[ 16.272671] lkdtm: ok: redirected stack return address.
[ 16.272885] lkdtm: Attempting checked stack return address redirection ...
[ 16.273384] lkdtm: FAIL: stack return address was redirected!
[ 16.273755] lkdtm: This is probably expected, since this kernel
(6.5.0-rc5-00005-g5a1201f89265-dirty riscv64) was built *without*
CONFIG_ARM64_PTR_AUTH_KERNEL=y nor CONFIG_SHADOW_CALL_STACK=y

And with SCS:

# echo CFI_BACKWARD > /sys/kernel/debug/provoke-crash/DIRECT
[ 17.429551] lkdtm: Performing direct entry CFI_BACKWARD
[ 17.430184] lkdtm: Attempting unchecked stack return address redirection ...
[ 17.431402] lkdtm: ok: redirected stack return address.
[ 17.432031] lkdtm: Attempting checked stack return address redirection ...
[ 17.432861] lkdtm: ok: control flow unchanged.

Sami