2023-09-26 19:49:06

by Clément Léger

[permalink] [raw]
Subject: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

Since commit 61cadb9 ("Provide new description of misaligned load/store
behavior compatible with privileged architecture.") in the RISC-V ISA
manual, it is stated that misaligned load/store might not be supported.
However, the RISC-V kernel uABI describes misaligned accesses as
supported. In order to support that, this series adds support for S-mode
handling of misaligned accesses as well as support for prctl(PR_UNALIGN).

Handling misaligned accesses in the kernel allows for finer-grained
control of the misaligned access behavior and, thanks to the prctl()
call, makes it possible to disable misaligned access emulation so that a
SIGBUS is generated instead. User space can then optimize its software
by removing such accesses based on the SIGBUS reports.
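
For example, a process could opt in to SIGBUS delivery roughly as
follows (an illustrative user space sketch, not part of this series; it
assumes a kernel with this series applied and a compiler that emits a
word-sized store for the misaligned pointer):

#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <unistd.h>

static void sigbus_handler(int sig)
{
	static const char msg[] = "SIGBUS: misaligned access not emulated\n";

	(void)sig;
	write(STDERR_FILENO, msg, sizeof(msg) - 1);
	_exit(1);
}

int main(void)
{
	char buf[16] __attribute__((aligned(8)));
	volatile uint32_t *p = (volatile uint32_t *)(buf + 1); /* misaligned */

	signal(SIGBUS, sigbus_handler);

	/* Ask the kernel to send SIGBUS instead of emulating the access. */
	if (prctl(PR_SET_UNALIGN, PR_UNALIGN_SIGBUS, 0, 0, 0))
		perror("prctl(PR_SET_UNALIGN)");

	*p = 42; /* misaligned store: emulated, or SIGBUS with the above policy */
	printf("misaligned store was emulated, read back %u\n", *p);
	return 0;
}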

Currently, this series is useful for people who use an SBI
implementation that does not handle misaligned traps. In the near
future, this series will make use of an SBI extension [1] that allows
requesting delegation of the misaligned load/store traps to the S-mode
software. This extension has been submitted for review to the riscv
tech-prs group. An OpenSBI implementation of this spec is available
at [2].

This series can be tested using the spike simulator [3] and an OpenSBI
version [4] which always delegates misaligned load/store traps to
S-mode.

[1] https://lists.riscv.org/g/tech-prs/message/540
[2] https://github.com/rivosinc/opensbi/tree/dev/cleger/fw_feature_upstream
[3] https://github.com/riscv-software-src/riscv-isa-sim
[4] https://github.com/rivosinc/opensbi/tree/dev/cleger/no_misaligned

Clément Léger (7):
riscv: remove unused functions in traps_misaligned.c
riscv: add support for misaligned handling in S-mode
riscv: report perf event for misaligned fault
riscv: add floating point insn support to misaligned access emulation
riscv: add support for sysctl unaligned_enabled control
riscv: report misaligned accesses emulation to hwprobe
riscv: add support for PR_SET_UNALIGN and PR_GET_UNALIGN

arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/cpufeature.h | 6 +
arch/riscv/include/asm/entry-common.h | 3 +
arch/riscv/include/asm/processor.h | 9 +
arch/riscv/kernel/Makefile | 2 +-
arch/riscv/kernel/cpufeature.c | 6 +-
arch/riscv/kernel/fpu.S | 117 ++++++++
arch/riscv/kernel/process.c | 18 ++
arch/riscv/kernel/setup.c | 1 +
arch/riscv/kernel/traps.c | 9 -
arch/riscv/kernel/traps_misaligned.c | 374 ++++++++++++++++++++++----
11 files changed, 488 insertions(+), 58 deletions(-)

--
2.40.1


2023-09-26 21:10:57

by Clément Léger

[permalink] [raw]
Subject: [PATCH 5/7] riscv: add support for sysctl unaligned_enabled control

This sysctl tuning option allows the user to disable misaligned access
handling globally on the system. It will also be used by the misaligned
access detection code to temporarily disable misaligned access handling.
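
Assuming the generic "unaligned-trap" sysctl entry that
SYSCTL_ARCH_UNALIGN_ALLOW wires to unaligned_enabled on other
architectures (to be confirmed for riscv), the control could be flipped
from user space with a sketch like this:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Sketch: write "0" (disable) or "1" (enable) to the sysctl file backing
 * unaligned_enabled. The path assumes the generic "unaligned-trap" entry
 * selected through SYSCTL_ARCH_UNALIGN_ALLOW.
 */
static int set_unaligned_handling(int enable)
{
	int fd = open("/proc/sys/kernel/unaligned-trap", O_WRONLY);

	if (fd < 0) {
		perror("open /proc/sys/kernel/unaligned-trap");
		return -1;
	}
	if (write(fd, enable ? "1" : "0", 1) != 1) {
		perror("write");
		close(fd);
		return -1;
	}
	close(fd);
	return 0;
}

int main(void)
{
	/* Disable in-kernel emulation: misaligned accesses will now fail. */
	return set_unaligned_handling(0) ? 1 : 0;
}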

Signed-off-by: Clément Léger <[email protected]>
---
arch/riscv/Kconfig | 1 +
arch/riscv/kernel/traps_misaligned.c | 9 +++++++++
2 files changed, 10 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d607ab0f7c6d..3515510fe418 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -157,6 +157,7 @@ config RISCV
select RISCV_TIMER if RISCV_SBI
select SIFIVE_PLIC
select SPARSE_IRQ
+ select SYSCTL_ARCH_UNALIGN_ALLOW
select SYSCTL_EXCEPTION_TRACE
select THREAD_INFO_IN_TASK
select TRACE_IRQFLAGS_SUPPORT
diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index 041fd2dbd955..b5fb1ff078e3 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -396,6 +396,9 @@ union reg_data {
u64 data_u64;
};

+/* sysctl hooks */
+int unaligned_enabled __read_mostly = 1; /* Enabled by default */
+
int handle_misaligned_load(struct pt_regs *regs)
{
union reg_data val;
@@ -406,6 +409,9 @@ int handle_misaligned_load(struct pt_regs *regs)

perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, addr);

+ if (!unaligned_enabled)
+ return -1;
+
if (get_insn(regs, epc, &insn))
return -1;

@@ -502,6 +508,9 @@ int handle_misaligned_store(struct pt_regs *regs)

perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, addr);

+ if (!unaligned_enabled)
+ return -1;
+
if (get_insn(regs, epc, &insn))
return -1;

--
2.40.1

2023-09-26 21:36:47

by Clément Léger

[permalink] [raw]
Subject: [PATCH 4/7] riscv: add floating point insn support to misaligned access emulation

This support is partially based on the OpenSBI misaligned emulation
floating-point instruction support. It provides support for the existing
floating-point load/store instructions (32-bit and 64-bit as well as the
compressed ones). Since floating-point registers are not part of the
pt_regs struct, we need to access them directly using a bit of assembly.
We also mark the FP state as dirty in the pt_regs status when modifying
them, to make sure the context switch code saves the FP state. With this
support, Linux is on par with the OpenSBI emulation.

Signed-off-by: Clément Léger <[email protected]>
---
arch/riscv/kernel/fpu.S | 117 +++++++++++++++++++++
arch/riscv/kernel/traps_misaligned.c | 152 ++++++++++++++++++++++++++-
2 files changed, 265 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/kernel/fpu.S b/arch/riscv/kernel/fpu.S
index dd2205473de7..2785badb247c 100644
--- a/arch/riscv/kernel/fpu.S
+++ b/arch/riscv/kernel/fpu.S
@@ -104,3 +104,120 @@ ENTRY(__fstate_restore)
csrc CSR_STATUS, t1
ret
ENDPROC(__fstate_restore)
+
+#define get_f32(which) fmv.x.s a0, which; j 2f
+#define put_f32(which) fmv.s.x which, a1; j 2f
+#if __riscv_xlen == 64
+# define get_f64(which) fmv.x.d a0, which; j 2f
+# define put_f64(which) fmv.d.x which, a1; j 2f
+#else
+# define get_f64(which) fsd which, 0(a1); j 2f
+# define put_f64(which) fld which, 0(a1); j 2f
+#endif
+
+.macro fp_access_prologue
+ /*
+ * Compute jump offset to store the correct FP register since we don't
+ * have indirect FP register access
+ */
+ sll t0, a0, 3
+ la t2, 1f
+ add t0, t0, t2
+ li t1, SR_FS
+ csrs CSR_STATUS, t1
+ jr t0
+1:
+.endm
+
+.macro fp_access_epilogue
+2:
+ csrc CSR_STATUS, t1
+ ret
+.endm
+
+#define fp_access_body(__access_func) \
+ __access_func(f0); \
+ __access_func(f1); \
+ __access_func(f2); \
+ __access_func(f3); \
+ __access_func(f4); \
+ __access_func(f5); \
+ __access_func(f6); \
+ __access_func(f7); \
+ __access_func(f8); \
+ __access_func(f9); \
+ __access_func(f10); \
+ __access_func(f11); \
+ __access_func(f12); \
+ __access_func(f13); \
+ __access_func(f14); \
+ __access_func(f15); \
+ __access_func(f16); \
+ __access_func(f17); \
+ __access_func(f18); \
+ __access_func(f19); \
+ __access_func(f20); \
+ __access_func(f21); \
+ __access_func(f22); \
+ __access_func(f23); \
+ __access_func(f24); \
+ __access_func(f25); \
+ __access_func(f26); \
+ __access_func(f27); \
+ __access_func(f28); \
+ __access_func(f29); \
+ __access_func(f30); \
+ __access_func(f31)
+
+
+/*
+ * Disable compressed instructions set to keep a constant offset between FP
+ * load/store/move instructions
+ */
+.option norvc
+/*
+ * put_f32_reg - Set a FP register from a register containing the value
+ * a0 = FP register index to be set
+ * a1 = value to be loaded in the FP register
+ */
+SYM_FUNC_START(put_f32_reg)
+ fp_access_prologue
+ fp_access_body(put_f32)
+ fp_access_epilogue
+SYM_FUNC_END(put_f32_reg)
+
+/*
+ * get_f32_reg - Get a FP register value and return it
+ * a0 = FP register index to be retrieved
+ */
+SYM_FUNC_START(get_f32_reg)
+ fp_access_prologue
+ fp_access_body(get_f32)
+ fp_access_epilogue
+SYM_FUNC_END(get_f32_reg)
+
+/*
+ * put_f64_reg - Set a 64-bit FP register from a value or a pointer.
+ * a0 = FP register index to be set
+ * a1 = value to be loaded in the FP register, or, when xlen == 32, a
+ *      pointer to the value to be loaded
+ */
+SYM_FUNC_START(put_f64_reg)
+ fp_access_prologue
+ fp_access_body(put_f64)
+ fp_access_epilogue
+SYM_FUNC_END(put_f64_reg)
+
+/*
+ * get_f64_reg - Get a 64-bit FP register value and return it, or store it
+ * through a pointer.
+ * a0 = FP register index to be retrieved
+ * a1 = if xlen == 32, pointer which is loaded with the FP register value;
+ *      unused if xlen == 64, in which case the FP register value is
+ *      returned through a0
+ */
+SYM_FUNC_START(get_f64_reg)
+ fp_access_prologue
+ fp_access_body(get_f64)
+ fp_access_epilogue
+SYM_FUNC_END(get_f64_reg)
diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index 804f6c5e0e44..041fd2dbd955 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -153,6 +153,115 @@
#define PRECISION_S 0
#define PRECISION_D 1

+#ifdef CONFIG_FPU
+
+#define FP_GET_RD(insn) (insn >> 7 & 0x1F)
+
+extern void put_f32_reg(unsigned long fp_reg, unsigned long value);
+
+static int set_f32_rd(unsigned long insn, struct pt_regs *regs,
+ unsigned long val)
+{
+ unsigned long fp_reg = FP_GET_RD(insn);
+
+ put_f32_reg(fp_reg, val);
+ regs->status |= SR_FS_DIRTY;
+
+ return 0;
+}
+
+extern void put_f64_reg(unsigned long fp_reg, unsigned long value);
+
+static int set_f64_rd(unsigned long insn, struct pt_regs *regs, u64 val)
+{
+ unsigned long fp_reg = FP_GET_RD(insn);
+ unsigned long value;
+
+#if __riscv_xlen == 32
+ value = (unsigned long) &val;
+#else
+ value = val;
+#endif
+ put_f64_reg(fp_reg, value);
+ regs->status |= SR_FS_DIRTY;
+
+ return 0;
+}
+
+#if __riscv_xlen == 32
+extern void get_f64_reg(unsigned long fp_reg, u64 *value);
+
+static u64 get_f64_rs(unsigned long insn, u8 fp_reg_offset,
+ struct pt_regs *regs)
+{
+ unsigned long fp_reg = (insn >> fp_reg_offset) & 0x1F;
+ u64 val;
+
+ get_f64_reg(fp_reg, &val);
+ regs->status |= SR_FS_DIRTY;
+
+ return val;
+}
+#else
+
+extern unsigned long get_f64_reg(unsigned long fp_reg);
+
+static unsigned long get_f64_rs(unsigned long insn, u8 fp_reg_offset,
+ struct pt_regs *regs)
+{
+ unsigned long fp_reg = (insn >> fp_reg_offset) & 0x1F;
+ unsigned long val;
+
+ val = get_f64_reg(fp_reg);
+ regs->status |= SR_FS_DIRTY;
+
+ return val;
+}
+
+#endif
+
+extern unsigned long get_f32_reg(unsigned long fp_reg);
+
+static unsigned long get_f32_rs(unsigned long insn, u8 fp_reg_offset,
+ struct pt_regs *regs)
+{
+ unsigned long fp_reg = (insn >> fp_reg_offset) & 0x1F;
+ unsigned long val;
+
+ val = get_f32_reg(fp_reg);
+ regs->status |= SR_FS_DIRTY;
+
+ return val;
+}
+
+#else /* CONFIG_FPU */
+static void set_f32_rd(unsigned long insn, struct pt_regs *regs,
+ unsigned long val) {}
+
+static void set_f64_rd(unsigned long insn, struct pt_regs *regs, u64 val) {}
+
+static unsigned long get_f64_rs(unsigned long insn, u8 fp_reg_offset,
+ struct pt_regs *regs)
+{
+ return 0;
+}
+
+static unsigned long get_f32_rs(unsigned long insn, u8 fp_reg_offset,
+ struct pt_regs *regs)
+{
+ return 0;
+}
+
+#endif
+
+#define GET_F64_RS2(insn, regs) (get_f64_rs(insn, 20, regs))
+#define GET_F64_RS2C(insn, regs) (get_f64_rs(insn, 2, regs))
+#define GET_F64_RS2S(insn, regs) (get_f64_rs(RVC_RS2S(insn), 0, regs))
+
+#define GET_F32_RS2(insn, regs) (get_f32_rs(insn, 20, regs))
+#define GET_F32_RS2C(insn, regs) (get_f32_rs(insn, 2, regs))
+#define GET_F32_RS2S(insn, regs) (get_f32_rs(RVC_RS2S(insn), 0, regs))
+
#ifdef CONFIG_RISCV_M_MODE
static inline int load_u8(struct pt_regs *regs, const u8 *addr, u8 *r_val)
{
@@ -362,15 +471,21 @@ int handle_misaligned_load(struct pt_regs *regs)
return -1;
}

+ if (!IS_ENABLED(CONFIG_FPU) && fp)
+ return -EOPNOTSUPP;
+
val.data_u64 = 0;
for (i = 0; i < len; i++) {
if (load_u8(regs, (void *)(addr + i), &val.data_bytes[i]))
return -1;
}

- if (fp)
- return -1;
- SET_RD(insn, regs, val.data_ulong << shift >> shift);
+ if (!fp)
+ SET_RD(insn, regs, val.data_ulong << shift >> shift);
+ else if (len == 8)
+ set_f64_rd(insn, regs, val.data_u64);
+ else
+ set_f32_rd(insn, regs, val.data_ulong);

regs->epc = epc + INSN_LEN(insn);

@@ -383,7 +498,7 @@ int handle_misaligned_store(struct pt_regs *regs)
unsigned long epc = regs->epc;
unsigned long insn;
unsigned long addr = regs->badaddr;
- int i, len = 0;
+ int i, len = 0, fp = 0;

perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, addr);

@@ -400,6 +515,14 @@ int handle_misaligned_store(struct pt_regs *regs)
} else if ((insn & INSN_MASK_SD) == INSN_MATCH_SD) {
len = 8;
#endif
+ } else if ((insn & INSN_MASK_FSD) == INSN_MATCH_FSD) {
+ fp = 1;
+ len = 8;
+ val.data_u64 = GET_F64_RS2(insn, regs);
+ } else if ((insn & INSN_MASK_FSW) == INSN_MATCH_FSW) {
+ fp = 1;
+ len = 4;
+ val.data_ulong = GET_F32_RS2(insn, regs);
} else if ((insn & INSN_MASK_SH) == INSN_MATCH_SH) {
len = 2;
#if defined(CONFIG_64BIT)
@@ -418,11 +541,32 @@ int handle_misaligned_store(struct pt_regs *regs)
((insn >> SH_RD) & 0x1f)) {
len = 4;
val.data_ulong = GET_RS2C(insn, regs);
+ } else if ((insn & INSN_MASK_C_FSD) == INSN_MATCH_C_FSD) {
+ fp = 1;
+ len = 8;
+ val.data_u64 = GET_F64_RS2S(insn, regs);
+ } else if ((insn & INSN_MASK_C_FSDSP) == INSN_MATCH_C_FSDSP) {
+ fp = 1;
+ len = 8;
+ val.data_u64 = GET_F64_RS2C(insn, regs);
+#if !defined(CONFIG_64BIT)
+ } else if ((insn & INSN_MASK_C_FSW) == INSN_MATCH_C_FSW) {
+ fp = 1;
+ len = 4;
+ val.data_ulong = GET_F32_RS2S(insn, regs);
+ } else if ((insn & INSN_MASK_C_FSWSP) == INSN_MATCH_C_FSWSP) {
+ fp = 1;
+ len = 4;
+ val.data_ulong = GET_F32_RS2C(insn, regs);
+#endif
} else {
regs->epc = epc;
return -1;
}

+ if (!IS_ENABLED(CONFIG_FPU) && fp)
+ return -EOPNOTSUPP;
+
for (i = 0; i < len; i++) {
if (store_u8(regs, (void *)(addr + i), val.data_bytes[i]))
return -1;
--
2.40.1

2023-09-27 00:03:22

by Clément Léger

[permalink] [raw]
Subject: [PATCH 6/7] riscv: report misaligned accesses emulation to hwprobe

hwprobe provides a way to report whether misaligned accesses are
emulated. In order to correctly populate that feature, check whether a
misaligned access actually traps by performing one from kernel mode and
catching the trap with an exception table entry.
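
For reference, once populated, user space can query the reported value
through the hwprobe syscall along the lines of this sketch (it relies on
the existing RISCV_HWPROBE_KEY_CPUPERF_0 uapi; error handling kept
minimal):

#include <asm/hwprobe.h>
#include <asm/unistd.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	struct riscv_hwprobe pair = {
		.key = RISCV_HWPROBE_KEY_CPUPERF_0,
	};

	/* cpusetsize == 0 and cpus == NULL means "all online CPUs". */
	if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0)) {
		perror("riscv_hwprobe");
		return 1;
	}

	if ((pair.value & RISCV_HWPROBE_MISALIGNED_MASK) ==
	    RISCV_HWPROBE_MISALIGNED_EMULATED)
		printf("misaligned accesses are emulated\n");
	else
		printf("misaligned accesses are not reported as emulated\n");

	return 0;
}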

Signed-off-by: Clément Léger <[email protected]>
---
arch/riscv/include/asm/cpufeature.h | 6 +++
arch/riscv/kernel/cpufeature.c | 6 ++-
arch/riscv/kernel/setup.c | 1 +
arch/riscv/kernel/traps_misaligned.c | 63 +++++++++++++++++++++++++++-
4 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index d0345bd659c9..c1f0ef02cd7d 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -8,6 +8,7 @@

#include <linux/bitmap.h>
#include <asm/hwcap.h>
+#include <asm/hwprobe.h>

/*
* These are probed via a device_initcall(), via either the SBI or directly
@@ -32,4 +33,9 @@ extern struct riscv_isainfo hart_isa[NR_CPUS];

void check_unaligned_access(int cpu);

+bool unaligned_ctl_available(void);
+
+bool check_unaligned_access_emulated(int cpu);
+void unaligned_emulation_finish(void);
+
#endif
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 1cfbba65d11a..fbbde800bc21 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -568,6 +568,9 @@ void check_unaligned_access(int cpu)
void *src;
long speed = RISCV_HWPROBE_MISALIGNED_SLOW;

+ if (check_unaligned_access_emulated(cpu))
+ return;
+
page = alloc_pages(GFP_NOWAIT, get_order(MISALIGNED_BUFFER_SIZE));
if (!page) {
pr_warn("Can't alloc pages to measure memcpy performance");
@@ -645,9 +648,10 @@ void check_unaligned_access(int cpu)
__free_pages(page, get_order(MISALIGNED_BUFFER_SIZE));
}

-static int check_unaligned_access_boot_cpu(void)
+static int __init check_unaligned_access_boot_cpu(void)
{
check_unaligned_access(0);
+ unaligned_emulation_finish();
return 0;
}

diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index e600aab116a4..3af6ad4df7cf 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -26,6 +26,7 @@
#include <asm/acpi.h>
#include <asm/alternative.h>
#include <asm/cacheflush.h>
+#include <asm/cpufeature.h>
#include <asm/cpu_ops.h>
#include <asm/early_ioremap.h>
#include <asm/pgtable.h>
diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index b5fb1ff078e3..fa81f6952fa4 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -9,11 +9,14 @@
#include <linux/perf_event.h>
#include <linux/irq.h>
#include <linux/stringify.h>
+#include <linux/prctl.h>

#include <asm/processor.h>
#include <asm/ptrace.h>
#include <asm/csr.h>
#include <asm/entry-common.h>
+#include <asm/hwprobe.h>
+#include <asm/cpufeature.h>

#define INSN_MATCH_LB 0x3
#define INSN_MASK_LB 0x707f
@@ -396,8 +399,10 @@ union reg_data {
u64 data_u64;
};

+static bool unaligned_ctl __read_mostly;
+
/* sysctl hooks */
-int unaligned_enabled __read_mostly = 1; /* Enabled by default */
+int unaligned_enabled __read_mostly;

int handle_misaligned_load(struct pt_regs *regs)
{
@@ -412,6 +417,9 @@ int handle_misaligned_load(struct pt_regs *regs)
if (!unaligned_enabled)
return -1;

+ if (user_mode(regs) && (current->thread.align_ctl & PR_UNALIGN_SIGBUS))
+ return -1;
+
if (get_insn(regs, epc, &insn))
return -1;

@@ -511,6 +519,9 @@ int handle_misaligned_store(struct pt_regs *regs)
if (!unaligned_enabled)
return -1;

+ if (user_mode(regs) && (current->thread.align_ctl & PR_UNALIGN_SIGBUS))
+ return -1;
+
if (get_insn(regs, epc, &insn))
return -1;

@@ -585,3 +596,53 @@ int handle_misaligned_store(struct pt_regs *regs)

return 0;
}
+
+bool check_unaligned_access_emulated(int cpu)
+{
+ unsigned long emulated = 1, tmp_var;
+
+ /* Use a fixup to detect if misaligned access triggered an exception */
+ __asm__ __volatile__ (
+ "1:\n"
+ " "REG_L" %[tmp], 1(%[ptr])\n"
+ " li %[emulated], 0\n"
+ "2:\n"
+ _ASM_EXTABLE(1b, 2b)
+ : [emulated] "+r" (emulated), [tmp] "=r" (tmp_var)
+ : [ptr] "r" (&tmp_var)
+ : "memory");
+
+ if (!emulated)
+ return false;
+
+ per_cpu(misaligned_access_speed, cpu) =
+ RISCV_HWPROBE_MISALIGNED_EMULATED;
+
+ return true;
+}
+
+void __init unaligned_emulation_finish(void)
+{
+ int cpu;
+
+ /*
+ * We can only support PR_UNALIGN controls if all CPUs have misaligned
+ * accesses emulated since tasks requesting such control can run on any
+ * CPU.
+ */
+ for_each_possible_cpu(cpu) {
+ if (per_cpu(misaligned_access_speed, cpu) !=
+ RISCV_HWPROBE_MISALIGNED_EMULATED) {
+ goto out;
+ }
+ }
+ unaligned_ctl = true;
+
+out:
+ unaligned_enabled = 1;
+}
+
+bool unaligned_ctl_available(void)
+{
+ return unaligned_ctl;
+}
--
2.40.1

2023-09-28 02:18:30

by Evan Green

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

On Tue, Sep 26, 2023 at 8:03 AM Clément Léger <[email protected]> wrote:
>
> Since commit 61cadb9 ("Provide new description of misaligned load/store
> behavior compatible with privileged architecture.") in the RISC-V ISA
> manual, it is stated that misaligned load/store might not be supported.
> However, the RISC-V kernel uABI describes that misaligned accesses are
> supported. In order to support that, this series adds support for S-mode
> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>
> Handling misaligned access in kernel allows for a finer grain control
> of the misaligned accesses behavior, and thanks to the prctl call, can
> allow disabling misaligned access emulation to generate SIGBUS. User
> space can then optimize its software by removing such access based on
> SIGBUS generation.
>
> Currently, this series is useful for people that uses a SBI that does
> not handled misaligned traps. In a near future, this series will make
> use a SBI extension [1] allowing to request delegation of the
> misaligned load/store traps to the S-mode software. This extension has
> been submitted for review to the riscv tech-prs group. An OpenSBI
> implementation for this spec is available at [2].

For my own education, how does the new SBI call behave with respect to
multiple harts? Does a call to change a feature perform that change
across all harts, or just the hart the SBI call was made on? If the
answer is "all harts", what if not all harts are exactly the same, and
some can enable the feature switch while others cannot? Also if the
answer is "all harts", does it also apply to hotplugged cpus, which
may not have even existed at boot time?

What happens if a hart goes through a context loss event, like
suspend/resume? Is the setting expected to be sticky, or is the kernel
expected to replay these calls?

-Evan

2023-09-28 12:03:56

by Clément Léger

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode



On 26/09/2023 23:43, Evan Green wrote:
> On Tue, Sep 26, 2023 at 8:03 AM Clément Léger <[email protected]> wrote:
>>
>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>> behavior compatible with privileged architecture.") in the RISC-V ISA
>> manual, it is stated that misaligned load/store might not be supported.
>> However, the RISC-V kernel uABI describes that misaligned accesses are
>> supported. In order to support that, this series adds support for S-mode
>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>>
>> Handling misaligned access in kernel allows for a finer grain control
>> of the misaligned accesses behavior, and thanks to the prctl call, can
>> allow disabling misaligned access emulation to generate SIGBUS. User
>> space can then optimize its software by removing such access based on
>> SIGBUS generation.
>>
>> Currently, this series is useful for people that uses a SBI that does
>> not handled misaligned traps. In a near future, this series will make
>> use a SBI extension [1] allowing to request delegation of the
>> misaligned load/store traps to the S-mode software. This extension has
>> been submitted for review to the riscv tech-prs group. An OpenSBI
>> implementation for this spec is available at [2].
>
> For my own education, how does the new SBI call behave with respect to
> multiple harts? Does a call to change a feature perform that change
> across all harts, or just the hart the SBI call was made on? If the
> answer is "all harts", what if not all harts are exactly the same, and
> some can enable the feature switch while others cannot? Also if the
> answer is "all harts", does it also apply to hotplugged cpus, which
> may not have even existed at boot time?

Depending on the feature, they can be either global (all harts) or
local (calling hart). The medeleg register is per hart and thus
misaligned load/store delegation for S-mode is also per hart.


>
> What happens if a hart goes through a context loss event, like
> suspend/resume? Is the setting expected to be sticky, or is the kernel
> expected to replay these calls?

That is a good question that we have not actually clarified yet. Thanks
for raising it!

Clément

>
> -Evan

2023-09-28 22:27:10

by Evan Green

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

On Thu, Sep 28, 2023 at 12:49 AM Clément Léger <[email protected]> wrote:
>
>
>
> On 26/09/2023 23:43, Evan Green wrote:
> > On Tue, Sep 26, 2023 at 8:03 AM Clément Léger <[email protected]> wrote:
> >>
> >> Since commit 61cadb9 ("Provide new description of misaligned load/store
> >> behavior compatible with privileged architecture.") in the RISC-V ISA
> >> manual, it is stated that misaligned load/store might not be supported.
> >> However, the RISC-V kernel uABI describes that misaligned accesses are
> >> supported. In order to support that, this series adds support for S-mode
> >> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> >>
> >> Handling misaligned access in kernel allows for a finer grain control
> >> of the misaligned accesses behavior, and thanks to the prctl call, can
> >> allow disabling misaligned access emulation to generate SIGBUS. User
> >> space can then optimize its software by removing such access based on
> >> SIGBUS generation.
> >>
> >> Currently, this series is useful for people that uses a SBI that does
> >> not handled misaligned traps. In a near future, this series will make
> >> use a SBI extension [1] allowing to request delegation of the
> >> misaligned load/store traps to the S-mode software. This extension has
> >> been submitted for review to the riscv tech-prs group. An OpenSBI
> >> implementation for this spec is available at [2].
> >
> > For my own education, how does the new SBI call behave with respect to
> > multiple harts? Does a call to change a feature perform that change
> > across all harts, or just the hart the SBI call was made on? If the
> > answer is "all harts", what if not all harts are exactly the same, and
> > some can enable the feature switch while others cannot? Also if the
> > answer is "all harts", does it also apply to hotplugged cpus, which
> > may not have even existed at boot time?
>
> Depending on the feature, they can be either global (all harts) or
> local (calling hart). The medeleg register is per hart and thus
> misaligned load/store delegation for S-mode is also per hart.

We should probably state this in the spec update then, both generally
and for each specific feature added. Otherwise firmware writers are
left not knowing whether they're supposed to spread a feature across
all cores or not.

>
>
> >
> > What happens if a hart goes through a context loss event, like
> > suspend/resume? Is the setting expected to be sticky, or is the kernel
> > expected to replay these calls?
>
> That is a good question that we did not actually clarified yet. Thanks
> for raising it !

No problem! This may also need to be specified per-feature in the
spec. I have a vague hunch that it's better to ask the kernel to do it
on resume, though ideally we'd have the terminology (and I don't think
we do?) to specify exactly which points constitute a context loss.
Mostly I'm remembering the x86 and ARM transition from S3, where lots
of firmware code ran at resume, to S0ix-like power states, where
things resumed directly into the OS and they had to figure out how to
do it without firmware. The vague hunch is that keeping the laundry
list of things firmware must do on resume low might keep us from
getting in S0ix's way, but it's all so speculative it's hard to know
if it's really a useful hunch or not.

-Evan

2023-09-29 06:12:54

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 6/7] riscv: report misaligned accesses emulation to hwprobe

Hi Clément,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.6-rc3 next-20230928]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Cl-ment-L-ger/riscv-remove-unused-functions-in-traps_misaligned-c/20230926-230654
base: linus/master
patch link: https://lore.kernel.org/r/20230926150316.1129648-7-cleger%40rivosinc.com
patch subject: [PATCH 6/7] riscv: report misaligned accesses emulation to hwprobe
config: riscv-randconfig-002-20230929 (https://download.01.org/0day-ci/archive/20230929/[email protected]/config)
compiler: riscv64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230929/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All error/warnings (new ones prefixed by >>):

arch/riscv/kernel/traps_misaligned.c: In function 'handle_misaligned_load':
arch/riscv/kernel/traps_misaligned.c:420:48: error: 'struct thread_struct' has no member named 'align_ctl'
420 | if (user_mode(regs) && (current->thread.align_ctl & PR_UNALIGN_SIGBUS))
| ^
arch/riscv/kernel/traps_misaligned.c: In function 'handle_misaligned_store':
arch/riscv/kernel/traps_misaligned.c:522:48: error: 'struct thread_struct' has no member named 'align_ctl'
522 | if (user_mode(regs) && (current->thread.align_ctl & PR_UNALIGN_SIGBUS))
| ^
arch/riscv/kernel/traps_misaligned.c: In function 'check_unaligned_access_emulated':
>> arch/riscv/kernel/traps_misaligned.c:610:17: error: expected ':' or ')' before '_ASM_EXTABLE'
610 | _ASM_EXTABLE(1b, 2b)
| ^~~~~~~~~~~~
>> arch/riscv/kernel/traps_misaligned.c:610:30: error: invalid suffix "b" on integer constant
610 | _ASM_EXTABLE(1b, 2b)
| ^~
arch/riscv/kernel/traps_misaligned.c:610:34: error: invalid suffix "b" on integer constant
610 | _ASM_EXTABLE(1b, 2b)
| ^~
>> arch/riscv/kernel/traps_misaligned.c:602:37: warning: unused variable 'tmp_var' [-Wunused-variable]
602 | unsigned long emulated = 1, tmp_var;
| ^~~~~~~


vim +610 arch/riscv/kernel/traps_misaligned.c

599
600 bool check_unaligned_access_emulated(int cpu)
601 {
> 602 unsigned long emulated = 1, tmp_var;
603
604 /* Use a fixup to detect if misaligned access triggered an exception */
605 __asm__ __volatile__ (
606 "1:\n"
607 " "REG_L" %[tmp], 1(%[ptr])\n"
608 " li %[emulated], 0\n"
609 "2:\n"
> 610 _ASM_EXTABLE(1b, 2b)
611 : [emulated] "+r" (emulated), [tmp] "=r" (tmp_var)
612 : [ptr] "r" (&tmp_var)
613 : "memory");
614
615 if (!emulated)
616 return false;
617
618 per_cpu(misaligned_access_speed, cpu) =
619 RISCV_HWPROBE_MISALIGNED_EMULATED;
620
621 return true;
622 }
623

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2023-09-30 09:24:10

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
> Since commit 61cadb9 ("Provide new description of misaligned load/store
> behavior compatible with privileged architecture.") in the RISC-V ISA
> manual, it is stated that misaligned load/store might not be supported.
> However, the RISC-V kernel uABI describes that misaligned accesses are
> supported. In order to support that, this series adds support for S-mode
> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>
> Handling misaligned access in kernel allows for a finer grain control
> of the misaligned accesses behavior, and thanks to the prctl call, can
> allow disabling misaligned access emulation to generate SIGBUS. User
> space can then optimize its software by removing such access based on
> SIGBUS generation.
>
> Currently, this series is useful for people that uses a SBI that does
> not handled misaligned traps. In a near future, this series will make
> use a SBI extension [1] allowing to request delegation of the
> misaligned load/store traps to the S-mode software. This extension has
> been submitted for review to the riscv tech-prs group. An OpenSBI
> implementation for this spec is available at [2].
>
> This series can be tested using the spike simulator [3] and an openSBI
> version [4] which allows to always delegate misaligned load/store to
> S-mode.

Some patches in this series do not build for any configs, some are
broken for clang builds and others are broken for nommu. Please try to
build test this more thoroughly before you submit the next version.

Also, AIUI, this series should be marked RFC since the SBI extension
this relies on has not been frozen.

Cheers,
Conor.



2023-10-02 10:57:28

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
>
>
> On 30/09/2023 11:23, Conor Dooley wrote:
> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
> >> Since commit 61cadb9 ("Provide new description of misaligned load/store
> >> behavior compatible with privileged architecture.") in the RISC-V ISA
> >> manual, it is stated that misaligned load/store might not be supported.
> >> However, the RISC-V kernel uABI describes that misaligned accesses are
> >> supported. In order to support that, this series adds support for S-mode
> >> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> >>
> >> Handling misaligned access in kernel allows for a finer grain control
> >> of the misaligned accesses behavior, and thanks to the prctl call, can
> >> allow disabling misaligned access emulation to generate SIGBUS. User
> >> space can then optimize its software by removing such access based on
> >> SIGBUS generation.
> >>
> >> Currently, this series is useful for people that uses a SBI that does
> >> not handled misaligned traps. In a near future, this series will make
> >> use a SBI extension [1] allowing to request delegation of the
> >> misaligned load/store traps to the S-mode software. This extension has
> >> been submitted for review to the riscv tech-prs group. An OpenSBI
> >> implementation for this spec is available at [2].
> >>
> >> This series can be tested using the spike simulator [3] and an openSBI
> >> version [4] which allows to always delegate misaligned load/store to
> >> S-mode.
> >
> > Some patches in this series do not build for any configs, some are
> > broken for clang builds and others are broken for nommu. Please try to
> > build test this more thoroughly before you submit the next version.
>
> Hi Conor,
>
> Thanks for the feedback, I'll check that.
>
> >
> > Also, AIUI, this series should be marked RFC since the SBI extension
> > this relies on has not been frozen.
>
> This series does not actually uses the SBI extension but provides a way
> to detect if misaligned accesses are not handled by hardware nor by the
> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
> implementation that does not handle misaligned accesses and that they
> would like to make use of the PR_SET_UNALIGN feature. This is what this
> series addresses (and thus does not depend on the mentioned SBI extension).

Ah, I must have misread then. Apologies.



2023-10-02 11:31:57

by Clément Léger

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode



On 30/09/2023 11:23, Conor Dooley wrote:
> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>> behavior compatible with privileged architecture.") in the RISC-V ISA
>> manual, it is stated that misaligned load/store might not be supported.
>> However, the RISC-V kernel uABI describes that misaligned accesses are
>> supported. In order to support that, this series adds support for S-mode
>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>>
>> Handling misaligned access in kernel allows for a finer grain control
>> of the misaligned accesses behavior, and thanks to the prctl call, can
>> allow disabling misaligned access emulation to generate SIGBUS. User
>> space can then optimize its software by removing such access based on
>> SIGBUS generation.
>>
>> Currently, this series is useful for people that uses a SBI that does
>> not handled misaligned traps. In a near future, this series will make
>> use a SBI extension [1] allowing to request delegation of the
>> misaligned load/store traps to the S-mode software. This extension has
>> been submitted for review to the riscv tech-prs group. An OpenSBI
>> implementation for this spec is available at [2].
>>
>> This series can be tested using the spike simulator [3] and an openSBI
>> version [4] which allows to always delegate misaligned load/store to
>> S-mode.
>
> Some patches in this series do not build for any configs, some are
> broken for clang builds and others are broken for nommu. Please try to
> build test this more thoroughly before you submit the next version.

Hi Conor,

Thanks for the feedback, I'll check that.

>
> Also, AIUI, this series should be marked RFC since the SBI extension
> this relies on has not been frozen.

This series does not actually use the SBI extension but provides a way
to detect when misaligned accesses are handled neither by the hardware
nor by the SBI. Ron & Daniel reported that they have a minimal SBI
implementation that does not handle misaligned accesses and that they
would like to make use of the PR_SET_UNALIGN feature. This is what this
series addresses (and thus it does not depend on the mentioned SBI
extension).

Thanks,

Clément

>
> Cheers,
> Conor.

2023-10-02 20:14:35

by Clément Léger

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode



On 02/10/2023 12:49, Conor Dooley wrote:
> On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
>>
>>
>> On 30/09/2023 11:23, Conor Dooley wrote:
>>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
>>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>>>> behavior compatible with privileged architecture.") in the RISC-V ISA
>>>> manual, it is stated that misaligned load/store might not be supported.
>>>> However, the RISC-V kernel uABI describes that misaligned accesses are
>>>> supported. In order to support that, this series adds support for S-mode
>>>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>>>>
>>>> Handling misaligned access in kernel allows for a finer grain control
>>>> of the misaligned accesses behavior, and thanks to the prctl call, can
>>>> allow disabling misaligned access emulation to generate SIGBUS. User
>>>> space can then optimize its software by removing such access based on
>>>> SIGBUS generation.
>>>>
>>>> Currently, this series is useful for people that uses a SBI that does
>>>> not handled misaligned traps. In a near future, this series will make
>>>> use a SBI extension [1] allowing to request delegation of the
>>>> misaligned load/store traps to the S-mode software. This extension has
>>>> been submitted for review to the riscv tech-prs group. An OpenSBI
>>>> implementation for this spec is available at [2].
>>>>
>>>> This series can be tested using the spike simulator [3] and an openSBI
>>>> version [4] which allows to always delegate misaligned load/store to
>>>> S-mode.
>>>
>>> Some patches in this series do not build for any configs, some are
>>> broken for clang builds and others are broken for nommu. Please try to
>>> build test this more thoroughly before you submit the next version.
>>
>> Hi Conor,
>>
>> Thanks for the feedback, I'll check that.
>>
>>>
>>> Also, AIUI, this series should be marked RFC since the SBI extension
>>> this relies on has not been frozen.
>>
>> This series does not actually uses the SBI extension but provides a way
>> to detect if misaligned accesses are not handled by hardware nor by the
>> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
>> implementation that does not handle misaligned accesses and that they
>> would like to make use of the PR_SET_UNALIGN feature. This is what this
>> series addresses (and thus does not depend on the mentioned SBI extension).
>
> Ah, I must have misread then. Apologies.

No worries, maybe I should actually remove this from the cover letter to
avoid any confusion!

Clément

2023-10-02 22:23:37

by Jessica Clarke

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

On 2 Oct 2023, at 16:32, ron minnich <[email protected]> wrote:
>
> This was a very interesting read. One other thought crossed my mind,
> which is that a RISC-V implementation might make the alignment
> delegation hard-wired to always delegate to S mode. I.e, the bit might
> be WARL and always 1. For what I'm doing, this would actually be
> pretty convenient. Just want to make sure this code can accommodate
> that -- wdyt?

Such an implementation would violate the spec:

An implementation shall not have any bits of medeleg be read-only
one, i.e., any synchronous trap that can be delegated must support not
being delegated.

Supporting that is thus out of scope.

Jess

> We have found lots of value in our experiments with delegating
> alignment traps to Linux -- not least because they tend to locate
> problems in the kernel :-) -- we've found issues in module loading,
> early startup (there's a needed .align2 directive for sbi secondary
> startup, AFAICT) and the timing code for misaligned load/store
> handling.
>
> I don't know how you test this unaligned trap handling, but it might
> be worthwhile to work that out. You can test via oreboot and the
> visionfive2, save we have not figured out why SMP startup is going
> wrong, yet :-), so we're not as feature-complete as needed. But soon.
>
> Thanks!
>
> On Mon, Oct 2, 2023 at 5:19 AM Clément Léger <[email protected]> wrote:
>>
>>
>>
>> On 02/10/2023 12:49, Conor Dooley wrote:
>>> On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
>>>>
>>>>
>>>> On 30/09/2023 11:23, Conor Dooley wrote:
>>>>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
>>>>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>>>>>> behavior compatible with privileged architecture.") in the RISC-V ISA
>>>>>> manual, it is stated that misaligned load/store might not be supported.
>>>>>> However, the RISC-V kernel uABI describes that misaligned accesses are
>>>>>> supported. In order to support that, this series adds support for S-mode
>>>>>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>>>>>>
>>>>>> Handling misaligned access in kernel allows for a finer grain control
>>>>>> of the misaligned accesses behavior, and thanks to the prctl call, can
>>>>>> allow disabling misaligned access emulation to generate SIGBUS. User
>>>>>> space can then optimize its software by removing such access based on
>>>>>> SIGBUS generation.
>>>>>>
>>>>>> Currently, this series is useful for people that uses a SBI that does
>>>>>> not handled misaligned traps. In a near future, this series will make
>>>>>> use a SBI extension [1] allowing to request delegation of the
>>>>>> misaligned load/store traps to the S-mode software. This extension has
>>>>>> been submitted for review to the riscv tech-prs group. An OpenSBI
>>>>>> implementation for this spec is available at [2].
>>>>>>
>>>>>> This series can be tested using the spike simulator [3] and an openSBI
>>>>>> version [4] which allows to always delegate misaligned load/store to
>>>>>> S-mode.
>>>>>
>>>>> Some patches in this series do not build for any configs, some are
> >>>>> broken for clang builds and others are broken for nommu. Please try to
> >>>>> build test this more thoroughly before you submit the next version.
>>>>
>>>> Hi Conor,
>>>>
>>>> Thanks for the feedback, I'll check that.
>>>>
>>>>>
>>>>> Also, AIUI, this series should be marked RFC since the SBI extension
>>>>> this relies on has not been frozen.
>>>>
>>>> This series does not actually uses the SBI extension but provides a way
>>>> to detect if misaligned accesses are not handled by hardware nor by the
>>>> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
>>>> implementation that does not handle misaligned accesses and that they
>>>> would like to make use of the PR_SET_UNALIGN feature. This is what this
>>>> series addresses (and thus does not depend on the mentioned SBI extension).
>>>
>>> Ah, I must have misread then. Apologies.
>>
>> No worries, maybe I should actually remove this from the cover letter to
>> avoid any confusion !
>>
>> Clément
>
> _______________________________________________
> linux-riscv mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-riscv

2023-10-02 22:54:56

by ron minnich

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

This was a very interesting read. One other thought crossed my mind,
which is that a RISC-V implementation might make the alignment
delegation hard-wired to always delegate to S mode. I.e., the bit might
be WARL and always 1. For what I'm doing, this would actually be
pretty convenient. Just want to make sure this code can accommodate
that -- wdyt?

We have found lots of value in our experiments with delegating
alignment traps to Linux -- not least because they tend to locate
problems in the kernel :-) -- we've found issues in module loading,
early startup (there's a needed .align2 directive for sbi secondary
startup, AFAICT) and the timing code for misaligned load/store
handling.

I don't know how you test this unaligned trap handling, but it might
be worthwhile to work that out. You can test via oreboot and the
visionfive2, save we have not figured out why SMP startup is going
wrong, yet :-), so we're not as feature-complete as needed. But soon.

Thanks!

On Mon, Oct 2, 2023 at 5:19 AM Clément Léger <[email protected]> wrote:
>
>
>
> On 02/10/2023 12:49, Conor Dooley wrote:
> > On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
> >>
> >>
> >> On 30/09/2023 11:23, Conor Dooley wrote:
> >>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
> >>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
> >>>> behavior compatible with privileged architecture.") in the RISC-V ISA
> >>>> manual, it is stated that misaligned load/store might not be supported.
> >>>> However, the RISC-V kernel uABI describes that misaligned accesses are
> >>>> supported. In order to support that, this series adds support for S-mode
> >>>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> >>>>
> >>>> Handling misaligned access in kernel allows for a finer grain control
> >>>> of the misaligned accesses behavior, and thanks to the prctl call, can
> >>>> allow disabling misaligned access emulation to generate SIGBUS. User
> >>>> space can then optimize its software by removing such access based on
> >>>> SIGBUS generation.
> >>>>
> >>>> Currently, this series is useful for people that uses a SBI that does
> >>>> not handled misaligned traps. In a near future, this series will make
> >>>> use a SBI extension [1] allowing to request delegation of the
> >>>> misaligned load/store traps to the S-mode software. This extension has
> >>>> been submitted for review to the riscv tech-prs group. An OpenSBI
> >>>> implementation for this spec is available at [2].
> >>>>
> >>>> This series can be tested using the spike simulator [3] and an openSBI
> >>>> version [4] which allows to always delegate misaligned load/store to
> >>>> S-mode.
> >>>
> >>> Some patches in this series do not build for any configs, some are
> >>> broken for clang builds and others are broken for nommu. Please try to
> >>> build test this more thoroughly before you submit the next version.
> >>
> >> Hi Conor,
> >>
> >> Thanks for the feedback, I'll check that.
> >>
> >>>
> >>> Also, AIUI, this series should be marked RFC since the SBI extension
> >>> this relies on has not been frozen.
> >>
> >> This series does not actually uses the SBI extension but provides a way
> >> to detect if misaligned accesses are not handled by hardware nor by the
> >> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
> >> implementation that does not handle misaligned accesses and that they
> >> would like to make use of the PR_SET_UNALIGN feature. This is what this
> >> series addresses (and thus does not depend on the mentioned SBI extension).
> >
> > Ah, I must have misread then. Apologies.
>
> No worries, maybe I should actually remove this from the cover letter to
> avoid any confusion !
>
> Clément

2023-10-03 08:23:35

by Clément Léger

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode



On 02/10/2023 17:32, ron minnich wrote:
> This was a very interesting read. One other thought crossed my mind,
> which is that a RISC-V implementation might make the alignment
> delegation hard-wired to always delegate to S mode. I.e, the bit might
> be WARL and always 1. For what I'm doing, this would actually be
> pretty convenient. Just want to make sure this code can accommodate
> that -- wdyt?

Hi Ron,

This series does not really care about "how" misaligned load/store are
delegated; it only tries to check whether misaligned load/store accesses
are handled by the kernel. So whatever you decide to do to delegate that
is a bit out of the scope of this series.

>
> We have found lots of value in our experiments with delegating
> alignment traps to Linux -- not least because they tend to locate
> problems in the kernel :-) -- we've found issues in module loading,
> early startup (there's a needed .align2 directive for sbi secondary
> startup, AFAICT) and the timing code for misaligned load/store
> handling.
>
> I don't know how you test this unaligned trap handling, but it might
> be worthwhile to work that out. You can test via oreboot and the
> visionfive2, save we have not figured out why SMP startup is going
> wrong, yet :-), so we're not as feature-complete as needed. But soon.

I test that on spike (which, unlike QEMU, does not handle misaligned
accesses) using a userspace program that exercises all kinds of standard
load/store instructions as well as FPU ones with different registers.
Regarding the kernel, you are right that I might be lacking a few tests
though. I'll also consider using a visionfive2 board to validate that on
real hardware.
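
For what it's worth, the core of that exercise boils down to something
like this simplified sketch (it assumes the compiler emits word-sized
accesses for the cast pointers, e.g. when built without -mstrict-align):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	char buf[32] __attribute__((aligned(16))) = { 0 };
	/* Deliberately misaligned, non-overlapping views into buf. */
	volatile uint64_t *p64 = (volatile uint64_t *)(buf + 1);
	volatile uint32_t *p32 = (volatile uint32_t *)(buf + 11);
	volatile double *pd = (volatile double *)(buf + 17);

	*p64 = 0x1122334455667788ULL;	/* misaligned 64-bit access */
	*p32 = 0xdeadbeef;		/* misaligned 32-bit access */
	*pd = 3.5;			/* misaligned FP access */

	printf("u64=%llx u32=%x d=%f\n",
	       (unsigned long long)*p64, *p32, *pd);
	return 0;
}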

Thanks,

Clément

>
> Thanks!
>
> On Mon, Oct 2, 2023 at 5:19 AM Clément Léger <[email protected]> wrote:
>>
>>
>>
>> On 02/10/2023 12:49, Conor Dooley wrote:
>>> On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
>>>>
>>>>
>>>> On 30/09/2023 11:23, Conor Dooley wrote:
>>>>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
>>>>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>>>>>> behavior compatible with privileged architecture.") in the RISC-V ISA
>>>>>> manual, it is stated that misaligned load/store might not be supported.
>>>>>> However, the RISC-V kernel uABI describes that misaligned accesses are
>>>>>> supported. In order to support that, this series adds support for S-mode
>>>>>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>>>>>>
>>>>>> Handling misaligned access in kernel allows for a finer grain control
>>>>>> of the misaligned accesses behavior, and thanks to the prctl call, can
>>>>>> allow disabling misaligned access emulation to generate SIGBUS. User
>>>>>> space can then optimize its software by removing such access based on
>>>>>> SIGBUS generation.
>>>>>>
>>>>>> Currently, this series is useful for people that uses a SBI that does
>>>>>> not handled misaligned traps. In a near future, this series will make
>>>>>> use a SBI extension [1] allowing to request delegation of the
>>>>>> misaligned load/store traps to the S-mode software. This extension has
>>>>>> been submitted for review to the riscv tech-prs group. An OpenSBI
>>>>>> implementation for this spec is available at [2].
>>>>>>
>>>>>> This series can be tested using the spike simulator [3] and an openSBI
>>>>>> version [4] which allows to always delegate misaligned load/store to
>>>>>> S-mode.
>>>>>
>>>>> Some patches in this series do not build for any configs, some are
>>>>> broken for clang builds and others are broken for nommu. Please try to
>>>>> build test this more thoroughly before you submit the next version.
>>>>
>>>> Hi Conor,
>>>>
>>>> Thanks for the feedback, I'll check that.
>>>>
>>>>>
>>>>> Also, AIUI, this series should be marked RFC since the SBI extension
>>>>> this relies on has not been frozen.
>>>>
>>>> This series does not actually uses the SBI extension but provides a way
>>>> to detect if misaligned accesses are not handled by hardware nor by the
>>>> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
>>>> implementation that does not handle misaligned accesses and that they
>>>> would like to make use of the PR_SET_UNALIGN feature. This is what this
>>>> series addresses (and thus does not depend on the mentioned SBI extension).
>>>
>>> Ah, I must have misread then. Apologies.
>>
>> No worries, maybe I should actually remove this from the cover letter to
>> avoid any confusion !
>>
>> Clément

2023-10-03 08:41:24

by Atish Patra

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

On Thu, Sep 28, 2023 at 9:48 AM Evan Green <[email protected]> wrote:
>
> On Thu, Sep 28, 2023 at 12:49 AM Clément Léger <[email protected]> wrote:
> >
> >
> >
> > On 26/09/2023 23:43, Evan Green wrote:
> > > On Tue, Sep 26, 2023 at 8:03 AM Clément Léger <[email protected]> wrote:
> > >>
> > >> Since commit 61cadb9 ("Provide new description of misaligned load/store
> > >> behavior compatible with privileged architecture.") in the RISC-V ISA
> > >> manual, it is stated that misaligned load/store might not be supported.
> > >> However, the RISC-V kernel uABI describes that misaligned accesses are
> > >> supported. In order to support that, this series adds support for S-mode
> > >> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> > >>
> > >> Handling misaligned access in kernel allows for a finer grain control
> > >> of the misaligned accesses behavior, and thanks to the prctl call, can
> > >> allow disabling misaligned access emulation to generate SIGBUS. User
> > >> space can then optimize its software by removing such access based on
> > >> SIGBUS generation.
> > >>
> > >> Currently, this series is useful for people that uses a SBI that does
> > >> not handled misaligned traps. In a near future, this series will make
> > >> use a SBI extension [1] allowing to request delegation of the
> > >> misaligned load/store traps to the S-mode software. This extension has
> > >> been submitted for review to the riscv tech-prs group. An OpenSBI
> > >> implementation for this spec is available at [2].
> > >
> > > For my own education, how does the new SBI call behave with respect to
> > > multiple harts? Does a call to change a feature perform that change
> > > across all harts, or just the hart the SBI call was made on? If the
> > > answer is "all harts", what if not all harts are exactly the same, and
> > > some can enable the feature switch while others cannot? Also if the
> > > answer is "all harts", does it also apply to hotplugged cpus, which
> > > may not have even existed at boot time?
> >
> > Depending on the feature, they can be either global (all harts) or
> > local (calling hart). The medeleg register is per hart and thus
> > misaligned load/store delegation for S-mode is also per hart.
>
> We should probably state this in the spec update then, both generally
> and for each specific feature added. Otherwise firmware writers are
> left not knowing if they're supposed to spread a feature across to all
> cores or not.
>

If a feature is required to update any CSR, it must be per hart only.
The supervisor software is aware of the state of each hart and it should
invoke the call on all the present harts.

Doing it in M-mode will result in M-mode IPIs and seems racy with what
the kernel might be doing at that time.

> >
> >
> > >
> > > What happens if a hart goes through a context loss event, like
> > > suspend/resume? Is the setting expected to be sticky, or is the kernel
> > > expected to replay these calls?
> >
> > That is a good question that we did not actually clarified yet. Thanks
> > for raising it !
>

IMO, it should be sticky until a reset. I would be interested to hear
other thoughts though if there is a non-sticky
use-case.


> No problem! This may also need to be specified per-feature in the
> spec. I have a vague hunch that it's better to ask the kernel to do it
> on resume, though ideally we'd have the terminology (and I don't think
> we do?) to specify exactly which points constitute a context loss.
> Mostly I'm remembering the x86 and ARM transition from S3, where lots
> of firmware code ran at resume, to S0ix-like power states, where
> things resumed directly into the OS and they had to figure out how to
> do it without firmware. The vague hunch is that keeping the laundry
> list of things firmware must do on resume low might keep us from
> getting in S0ix's way, but it's all so speculative it's hard to know
> if it's really a useful hunch or not.
>
> -Evan

2023-10-03 15:38:32

by ron minnich

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

While it is true that it violates the spec today, given the fluidity
of the spec of the last 10 years, I'm not sure that matters :-)

Anyway, that's out of scope for this discussion, though I appreciate
your clarification. I'll bring it up elsewhere.

Clement points out that this series would work fine if that bit were
hardwired to 1, which is all I care about.

thanks

On Mon, Oct 2, 2023 at 4:23 PM Jessica Clarke <[email protected]> wrote:
>
> On 2 Oct 2023, at 16:32, ron minnich <[email protected]> wrote:
> >
> > This was a very interesting read. One other thought crossed my mind,
> > which is that a RISC-V implementation might make the alignment
> > delegation hard-wired to always delegate to S mode. I.e, the bit might
> > be WARL and always 1. For what I'm doing, this would actually be
> > pretty convenient. Just want to make sure this code can accommodate
> > that -- wdyt?
>
> Such an implementation would violate the spec:
>
> An implementation shall not have any bits of medeleg be read-only
> one, i.e., any synchronous trap that can be delegated must support not
> being delegated.
>
> Supporting that is thus out of scope.
>
> Jess
>
> > We have found lots of value in our experiments with delegating
> > alignment traps to Linux -- not least because they tend to locate
> > problems in the kernel :-) -- we've found issues in module loading,
> > early startup (there's a needed .align 2 directive for sbi secondary
> > startup, AFAICT) and the timing code for misaligned load/store
> > handling.
> >
> > I don't know how you test this unaligned trap handling, but it might
> > be worthwhile to work that out. You can test via oreboot and the
> > visionfive2, save we have not figured out why SMP startup is going
> > wrong, yet :-), so we're not as feature-complete as needed. But soon.
> >
> > Thanks!
> >
> > On Mon, Oct 2, 2023 at 5:19 AM Clément Léger <[email protected]> wrote:
> >>
> >>
> >>
> >> On 02/10/2023 12:49, Conor Dooley wrote:
> >>> On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
> >>>>
> >>>>
> >>>> On 30/09/2023 11:23, Conor Dooley wrote:
> >>>>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
> >>>>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
> >>>>>> behavior compatible with privileged architecture.") in the RISC-V ISA
> >>>>>> manual, it is stated that misaligned load/store might not be supported.
> >>>>>> However, the RISC-V kernel uABI describes that misaligned accesses are
> >>>>>> supported. In order to support that, this series adds support for S-mode
> >>>>>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> >>>>>>
> >>>>>> Handling misaligned access in kernel allows for a finer grain control
> >>>>>> of the misaligned accesses behavior, and thanks to the prctl call, can
> >>>>>> allow disabling misaligned access emulation to generate SIGBUS. User
> >>>>>> space can then optimize its software by removing such access based on
> >>>>>> SIGBUS generation.
> >>>>>>
> >>>>>> Currently, this series is useful for people that uses a SBI that does
> >>>>>> not handled misaligned traps. In a near future, this series will make
> >>>>>> use a SBI extension [1] allowing to request delegation of the
> >>>>>> misaligned load/store traps to the S-mode software. This extension has
> >>>>>> been submitted for review to the riscv tech-prs group. An OpenSBI
> >>>>>> implementation for this spec is available at [2].
> >>>>>>
> >>>>>> This series can be tested using the spike simulator [3] and an openSBI
> >>>>>> version [4] which allows to always delegate misaligned load/store to
> >>>>>> S-mode.
> >>>>>
> >>>>> Some patches in this series do not build for any configs, some are
> >>>>> broken for clang builds and others are broken for nommu. Please try to
> >>>>> build test this more thoroughly before you submit the next version.
> >>>>
> >>>> Hi Conor,
> >>>>
> >>>> Thanks for the feedback, I'll check that.
> >>>>
> >>>>>
> >>>>> Also, AIUI, this series should be marked RFC since the SBI extension
> >>>>> this relies on has not been frozen.
> >>>>
> >>>> This series does not actually use the SBI extension but provides a way
> >>>> to detect if misaligned accesses are handled neither by the hardware nor
> >>>> by the SBI. It has been reported by Ron & Daniel that they have a minimal
> >>>> SBI implementation that does not handle misaligned accesses and that they
> >>>> would like to make use of the PR_SET_UNALIGN feature. This is what this
> >>>> series addresses (and thus does not depend on the mentioned SBI extension).
> >>>
> >>> Ah, I must have misread then. Apologies.
> >>
> >> No worries, maybe I should actually remove this from the cover letter to
> >> avoid any confusion!
> >>
> >> Clément
> >
>

2023-10-03 15:39:47

by ron minnich

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

On Tue, Oct 3, 2023 at 2:40 AM Atish Kumar Patra <[email protected]> wrote:

> IMO, it should be sticky until a reset.

That's a wonderful idea; perhaps this discussion needs to be held
elsewhere, especially as it regards a possible violation of the spec. Thanks!

2023-10-04 08:27:13

by David Laight

[permalink] [raw]
Subject: RE: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

From: Clément Léger
> Sent: 02 October 2023 08:40
>
> On 30/09/2023 11:23, Conor Dooley wrote:
> > On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
> >> Since commit 61cadb9 ("Provide new description of misaligned load/store
> >> behavior compatible with privileged architecture.") in the RISC-V ISA
> >> manual, it is stated that misaligned load/store might not be supported.
> >> However, the RISC-V kernel uABI describes that misaligned accesses are
> >> supported.
...

That is just really horrid.
If the cpu is going to trap misaligned accesses then you want
the compiler-generated code (ie packed data) not to generate
misaligned accesses.
So you have to change the kernel uABI.

OTOH if you know that such accesses won't fault and will not be much
slower than aligned accesses, then optimised versions of some functions
(like memcpy and checksums) can use misaligned accesses.
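
For illustration (nothing from this series), the kind of helper such
optimised routines are built on; in the kernel, get_unaligned() plays
the same role:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit word from a possibly misaligned address.  Where
 * misaligned loads are known to be fast this can compile to a single
 * load; otherwise the compiler falls back to byte accesses. */
static inline uint32_t load32(const void *p)
{
        uint32_t v;

        memcpy(&v, p, sizeof(v));
        return v;
}

/* Simple word-at-a-time sum over an arbitrarily aligned buffer. */
uint32_t sum32(const void *buf, size_t len)
{
        const uint8_t *p = buf;
        uint32_t sum = 0;

        for (; len >= 4; len -= 4, p += 4)
                sum += load32(p);
        while (len--)
                sum += *p++;
        return sum;
}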

David


2023-10-04 10:04:00

by Clément Léger

[permalink] [raw]
Subject: Re: [PATCH 0/7] Add support to handle misaligned accesses in S-mode



On 04/10/2023 10:26, David Laight wrote:
> From: Clément Léger
>> Sent: 02 October 2023 08:40
>>
>> On 30/09/2023 11:23, Conor Dooley wrote:
>>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
>>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>>>> behavior compatible with privileged architecture.") in the RISC-V ISA
>>>> manual, it is stated that misaligned load/store might not be supported.
>>>> However, the RISC-V kernel uABI describes that misaligned accesses are
>>>> supported.
> ...
>
> That is just really horrid.
> If the cpu is going to trap misaligned accesses then you want
> the compiler-generated code (ie packed data) not to generate
> misaligned accesses.

Hi David,

Saying that you support misaligned accesses does not mean that they are
going to be efficient, just that they are supported (in fact, the uABI
states that they may perform poorly). The compiler is actually not so
stupid and will try to generate as many aligned accesses as possible
(unless forced by some assembly, cast or whatever else that can break
access alignment). This is already the case and it will most probably
not change.
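
For illustration (not code from this series), the kind of construct that
does force a misaligned access:

#include <stdint.h>

struct __attribute__((packed)) frame {
        uint8_t  tag;
        uint32_t len;   /* offset 1, so naturally misaligned */
};

uint32_t frame_len(const struct frame *f)
{
        /* The compiler knows 'len' may be misaligned and emits safe
         * (byte-wise) code on strict-alignment targets... */
        return f->len;
}

uint32_t frame_len_cast(const uint8_t *buf)
{
        /* ...whereas casting a raw byte pointer hides the misalignment
         * and can produce a trapping word load. */
        return *(const uint32_t *)(buf + 1);
}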

> So you have to change the kernel uABI.

I don't think so. Rule N°1 of kernel development is "Don't break
userspace". So changing the RISC-V uABI to say "misaligned accesses are
not supported" is unlikely to happen. We stated that misaligned accesses
are supported and thus they will continue to be supported.

>
> OTOH if you know that such accesses won't fault and will not be much
> slower than aligned accesses, then optimised versions of some functions
> (like memcpy and checksums) can use misaligned accesses.

Yes, this is selected by HAVE_EFFICIENT_UNALIGNED_ACCESS. On RISC-V,
since the specification says nothing about the efficiency of such
accesses, we can't select it unconditionally. Some RISC-V based SoC/CPUs
might want to select it manually in their config. In order to support
that dynamically and in a generic way, some future work could involve
using static keys for such alternatives and enabling them based on the
detected access speed.
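
A rough sketch of what that could look like (the key and helper names
below are made up for the example, they do not exist today):

#include <linux/jump_label.h>
#include <linux/string.h>
#include <linux/types.h>
#include <linux/init.h>

/* Hypothetical optimised/conservative variants, declared only for the
 * sake of the example. */
void memcpy_fast_unaligned(void *dst, const void *src, size_t n);
void memcpy_aligned_only(void *dst, const void *src, size_t n);

/* Flipped once at boot if probing finds that misaligned accesses are
 * handled quickly by the hardware. */
DEFINE_STATIC_KEY_FALSE(fast_misaligned_access);

void __init misaligned_speed_probe_done(bool fast)
{
        if (fast)
                static_branch_enable(&fast_misaligned_access);
}

/* Callers then pick the variant with no branch cost on the hot path. */
void do_copy(void *dst, const void *src, size_t n)
{
        if (static_branch_likely(&fast_misaligned_access))
                memcpy_fast_unaligned(dst, src, n);
        else
                memcpy_aligned_only(dst, src, n);
}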

Thanks,

Clément

>
> David

2023-10-04 14:10:59

by David Laight

[permalink] [raw]
Subject: RE: [PATCH 0/7] Add support to handle misaligned accesses in S-mode

...
> Saying that you support misaligned accesses does not mean that they are
> going to be efficient, just that they are supported (in fact, the uABI
> states that they may perform poorly). The compiler is actually not so
> stupid and will try to generate as many aligned accesses as possible
> (unless forced by some assembly, cast or whatever else that can break
> access alignment). This is already the case and it will most probably
> not change.

I did a quick check.

https://godbolt.org/z/j3e9drv4e

The code generated by both clang and gcc for misaligned reads is horrid.
Gcc does a better job if the alignment is known but generates much
the same as the clang code when it isn't.

The C code is much shorter, even though both gcc and clang add a
(different) instruction to it.
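
The exact source behind the link isn't reproduced here, but the
comparison is of this general shape (illustrative only, not the linked
code):

#include <stdint.h>
#include <string.h>

/* Alignment unknown: the compiler must assume the pointer can be
 * misaligned and, on strict-alignment targets, emits byte loads plus
 * shifts. */
uint32_t read_unaligned(const void *p)
{
        uint32_t v;

        memcpy(&v, p, sizeof(v));
        return v;
}

/* Alignment known: a single 32-bit load suffices. */
uint32_t read_aligned(const uint32_t *p)
{
        return *p;
}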

David
