2024-05-08 01:36:49

by Charlie Jenkins

[permalink] [raw]
Subject: [PATCH v2 0/8] riscv: Support compiling the kernel with more extensions

The kernel currently has the restriction that it can only be compiled
with the extensions that are hardcoded in arch/risc/Makefile.

Any extension that is not listed in the Makefile can still be used by
explicitly writing the assembly and using alternative patching.

This series introduces Kconfig options that allow the kernel to be
compiled with additional extensions.

The motivation for this patch is the performance improvements that come
along with compiling the kernel with these extra instructions. Allowing
the compiler to emit arbitrary Zb* instructions achieves a 4.9%
reduction of dynamic instruction count for a test ran in Spike that
boots the kernel and runs a user space program that prints to the
console.

Additionally, alternatives that check if an extension is supported can
be eliminated when the Kconfig options to assume hardware support is
enabled.

This series is based on the wording changes from Conor in:

https://lore.kernel.org/lkml/20240424-tabby-plural-5f1d9fe44f47@spud/T/

Signed-off-by: Charlie Jenkins <[email protected]>
---
Changes in v2:
- Eliminate references to incorrect config in Svpbmt patch (Jess)
- Add motivation to cover letter (Conor)
- Remove "v" from march
- Correct the ifdef for vector
- Correct the ifdef for Svnapot
- Link to v1: https://lore.kernel.org/r/20240506-compile_kernel_with_extensions-v1-0-5c25c134c097@rivosinc.com

---
Charlie Jenkins (8):
riscv: Add PLATFORM_MAY_SUPPORT_RISCV_ISA_C Kconfig option
riscv: Add PLATFORM_MAY_SUPPORT_RISCV_ISA_V Kconfig option
riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT Kconfig option
riscv: Move RISCV_ISA_SVPBMT to Kconfig.isa
riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_ZBB Kconfig option
riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_ZBA Kconfig option
riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_ZBC Kconfig option
riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_ZBS Kconfig option

arch/riscv/Kconfig | 133 +-----------
arch/riscv/Kconfig.isa | 393 ++++++++++++++++++++++++++++++++++
arch/riscv/Makefile | 14 +-
arch/riscv/crypto/Kconfig | 14 +-
arch/riscv/include/asm/arch_hweight.h | 33 +--
arch/riscv/include/asm/checksum.h | 18 +-
arch/riscv/include/asm/pgtable.h | 3 +-
arch/riscv/include/asm/simd.h | 3 +
arch/riscv/include/asm/vector.h | 3 +-
arch/riscv/kernel/cpufeature.c | 3 +-
arch/riscv/kernel/head.S | 8 +-
arch/riscv/kernel/probes/uprobes.c | 2 +-
arch/riscv/kernel/process.c | 12 +-
arch/riscv/kernel/ptrace.c | 6 +
arch/riscv/lib/csum.c | 48 ++---
arch/riscv/lib/riscv_v_helpers.c | 1 -
arch/riscv/lib/strcmp.S | 4 +-
arch/riscv/lib/strlen.S | 4 +-
arch/riscv/lib/strncmp.S | 4 +-
arch/riscv/lib/uaccess_vector.S | 2 +
arch/riscv/lib/xor.S | 2 +
arch/riscv/net/bpf_jit.h | 8 +-
22 files changed, 509 insertions(+), 209 deletions(-)
---
base-commit: 2f47357557b7aa98d9d9002688aae480864ca3f6
change-id: 20240429-compile_kernel_with_extensions-92dd2403d325
--
- Charlie



2024-05-08 01:37:14

by Charlie Jenkins

[permalink] [raw]
Subject: [PATCH v2 2/8] riscv: Add PLATFORM_MAY_SUPPORT_RISCV_ISA_V Kconfig option

Current versions of the kernel add "v" to the march and then immeidately
filter it out such that "v" is not passed to CFLAGS. Instead of doing
this filtering, code blocks in the kernel that want to use vector
assembly have been changed to locally enable vector (using ".option
arch, +v").

To support kernels that can run on hardware that may support vector, the
config option PLATFORM_MAY_SUPPORT_RISCV_ISA_V is added, and the
previous behavior of RISCV_ISA_V is retained with the option
CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_V. When the hardware is assumed to
support vector, has_vector() unconditionally returns true. "v" is
not added to the toolchain march even when the hardware is assumed to
support vector because kernel vector code must be guarded by
kernel_vector_begin/end.

Signed-off-by: Charlie Jenkins <[email protected]>
---
arch/riscv/Kconfig | 54 -------------------------
arch/riscv/Kconfig.isa | 85 ++++++++++++++++++++++++++++++++++++++++
arch/riscv/Makefile | 6 +--
arch/riscv/crypto/Kconfig | 14 +++----
arch/riscv/include/asm/simd.h | 3 ++
arch/riscv/include/asm/vector.h | 3 +-
arch/riscv/kernel/cpufeature.c | 3 +-
arch/riscv/kernel/head.S | 8 +++-
arch/riscv/kernel/process.c | 12 +++---
arch/riscv/kernel/ptrace.c | 6 +++
arch/riscv/lib/riscv_v_helpers.c | 1 -
arch/riscv/lib/uaccess_vector.S | 2 +
arch/riscv/lib/xor.S | 2 +
13 files changed, 123 insertions(+), 76 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index c2a4f5364707..c2e9eded0a7d 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -520,60 +520,6 @@ config RISCV_ISA_SVPBMT

If you don't know what to do here, say Y.

-config TOOLCHAIN_HAS_V
- bool
- default y
- depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64iv)
- depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32iv)
- depends on LLD_VERSION >= 140000 || LD_VERSION >= 23800
- depends on AS_HAS_OPTION_ARCH
-
-config RISCV_ISA_V
- bool "Vector extension support"
- depends on TOOLCHAIN_HAS_V
- depends on FPU
- select DYNAMIC_SIGFRAME
- default y
- help
- Add support for the Vector extension when it is detected at boot.
- When this option is disabled, neither the kernel nor userspace may
- use vector procedures.
-
- If you don't know what to do here, say Y.
-
-config RISCV_ISA_V_DEFAULT_ENABLE
- bool "Enable userspace Vector by default"
- depends on RISCV_ISA_V
- default y
- help
- Say Y here if you want to enable Vector in userspace by default.
- Otherwise, userspace has to make explicit prctl() call to enable
- Vector, or enable it via the sysctl interface.
-
- If you don't know what to do here, say Y.
-
-config RISCV_ISA_V_UCOPY_THRESHOLD
- int "Threshold size for vectorized user copies"
- depends on RISCV_ISA_V
- default 768
- help
- Prefer using vectorized copy_to_user()/copy_from_user() when the
- workload size exceeds this value.
-
-config RISCV_ISA_V_PREEMPTIVE
- bool "Run kernel-mode Vector with kernel preemption"
- depends on PREEMPTION
- depends on RISCV_ISA_V
- default y
- help
- Usually, in-kernel SIMD routines are run with preemption disabled.
- Functions which envoke long running SIMD thus must yield core's
- vector unit to prevent blocking other tasks for too long.
-
- This config allows kernel to run SIMD without explicitly disable
- preemption. Enabling this config will result in higher memory
- consumption due to the allocation of per-task's kernel Vector context.
-
config TOOLCHAIN_HAS_ZBB
bool
default y
diff --git a/arch/riscv/Kconfig.isa b/arch/riscv/Kconfig.isa
index 08b7af5aabb0..0663c98b5b17 100644
--- a/arch/riscv/Kconfig.isa
+++ b/arch/riscv/Kconfig.isa
@@ -39,3 +39,88 @@ config PLATFORM_SUPPORTS_RISCV_ISA_C
If you don't know what to do here, say Y.

endchoice
+
+config TOOLCHAIN_HAS_V
+ bool
+ default y
+ depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64iv)
+ depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32iv)
+ depends on LLD_VERSION >= 140000 || LD_VERSION >= 23800
+ depends on AS_HAS_OPTION_ARCH
+
+config RISCV_ISA_V
+ bool
+
+choice
+ prompt "Vector extension support"
+ default PLATFORM_MAY_SUPPORT_RISCV_ISA_V
+ help
+ This selects the level of support for vector instructions to be
+ built into the Linux Kernel. This does not impact whether vector
+ instructions are allowed to be emitted by user-space code.
+
+config PROHIBIT_RISCV_ISA_V
+ bool "Prohibit vector instructions"
+ depends on NONPORTABLE
+ help
+ Regardless of if the platform supports vector instructions,
+ prohibit the kernel from emitting vector instructions.
+
+config PLATFORM_MAY_SUPPORT_RISCV_ISA_V
+ bool "Allow vector instruction sequences if supported"
+ depends on TOOLCHAIN_HAS_V
+ depends on FPU
+ select DYNAMIC_SIGFRAME
+ select RISCV_ISA_V
+ help
+ Only allow vector instructions to be emitted if "V" is present in
+ the device tree or ACPI table. No vector instructions will be
+ emitted if the platform does not support them.
+
+config PLATFORM_SUPPORTS_RISCV_ISA_V
+ bool "Emit vector instructions when building Linux"
+ depends on TOOLCHAIN_HAS_V
+ depends on FPU
+ depends on NONPORTABLE
+ select DYNAMIC_SIGFRAME
+ select RISCV_ISA_V
+ help
+ Adds "V" to the ISA subsets that the toolchain is allowed to emit
+ when building Linux, which results in vector instructions in the
+ Linux binary. This option produces a kernel that will not run on
+ systems that do not support vector instructions.
+
+endchoice
+
+config RISCV_ISA_V_DEFAULT_ENABLE
+ bool "Enable userspace Vector by default"
+ depends on RISCV_ISA_V
+ default y
+ help
+ Say Y here if you want to enable Vector in userspace by default.
+ Otherwise, userspace has to make explicit prctl() call to enable
+ Vector, or enable it via the sysctl interface.
+
+ If you don't know what to do here, say Y.
+
+config RISCV_ISA_V_UCOPY_THRESHOLD
+ int "Threshold size for vectorized user copies"
+ depends on RISCV_ISA_V
+ default 768
+ help
+ Prefer using vectorized copy_to_user()/copy_from_user() when the
+ workload size exceeds this value.
+
+config RISCV_ISA_V_PREEMPTIVE
+ bool "Run kernel-mode Vector with kernel preemption"
+ depends on PREEMPTION
+ depends on RISCV_ISA_V
+ default y
+ help
+ Usually, in-kernel SIMD routines are run with preemption disabled.
+ Functions which envoke long running SIMD thus must yield core's
+ vector unit to prevent blocking other tasks for too long.
+
+ This config allows kernel to run SIMD without explicitly disable
+ preemption. Enabling this config will result in higher memory
+ consumption due to the allocation of per-task's kernel Vector context.
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index e1be36004097..e1111e62ca20 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -66,7 +66,6 @@ riscv-march-$(CONFIG_ARCH_RV32I) := rv32ima
riscv-march-$(CONFIG_ARCH_RV64I) := rv64ima
riscv-march-$(CONFIG_FPU) := $(riscv-march-y)fd
riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_C) := $(riscv-march-y)c
-riscv-march-$(CONFIG_RISCV_ISA_V) := $(riscv-march-y)v

ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC
KBUILD_CFLAGS += -Wa,-misa-spec=2.2
@@ -78,10 +77,7 @@ endif
# Check if the toolchain supports Zihintpause extension
riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZIHINTPAUSE) := $(riscv-march-y)_zihintpause

-# Remove F,D,V from isa string for all. Keep extensions between "fd" and "v" by
-# matching non-v and non-multi-letter extensions out with the filter ([^v_]*)
-KBUILD_CFLAGS += -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)fd([^v_]*)v?/\1\2/')
-
+KBUILD_CFLAGS += -march=$(subst fd,,$(riscv-march-y))
KBUILD_AFLAGS += -march=$(riscv-march-y)

KBUILD_CFLAGS += -mno-save-restore
diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig
index ad58dad9a580..5f7ea675a8cf 100644
--- a/arch/riscv/crypto/Kconfig
+++ b/arch/riscv/crypto/Kconfig
@@ -4,7 +4,7 @@ menu "Accelerated Cryptographic Algorithms for CPU (riscv)"

config CRYPTO_AES_RISCV64
tristate "Ciphers: AES, modes: ECB, CBC, CTS, CTR, XTS"
- depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
+ depends on 64BIT && PLATFORM_SUPPORTS_RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
select CRYPTO_ALGAPI
select CRYPTO_LIB_AES
select CRYPTO_SKCIPHER
@@ -20,7 +20,7 @@ config CRYPTO_AES_RISCV64

config CRYPTO_CHACHA_RISCV64
tristate "Ciphers: ChaCha"
- depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
+ depends on 64BIT && PLATFORM_SUPPORTS_RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
select CRYPTO_SKCIPHER
select CRYPTO_LIB_CHACHA_GENERIC
help
@@ -31,7 +31,7 @@ config CRYPTO_CHACHA_RISCV64

config CRYPTO_GHASH_RISCV64
tristate "Hash functions: GHASH"
- depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
+ depends on 64BIT && PLATFORM_SUPPORTS_RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
select CRYPTO_GCM
help
GCM GHASH function (NIST SP 800-38D)
@@ -41,7 +41,7 @@ config CRYPTO_GHASH_RISCV64

config CRYPTO_SHA256_RISCV64
tristate "Hash functions: SHA-224 and SHA-256"
- depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
+ depends on 64BIT && PLATFORM_SUPPORTS_RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
select CRYPTO_SHA256
help
SHA-224 and SHA-256 secure hash algorithm (FIPS 180)
@@ -52,7 +52,7 @@ config CRYPTO_SHA256_RISCV64

config CRYPTO_SHA512_RISCV64
tristate "Hash functions: SHA-384 and SHA-512"
- depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
+ depends on 64BIT && PLATFORM_SUPPORTS_RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
select CRYPTO_SHA512
help
SHA-384 and SHA-512 secure hash algorithm (FIPS 180)
@@ -63,7 +63,7 @@ config CRYPTO_SHA512_RISCV64

config CRYPTO_SM3_RISCV64
tristate "Hash functions: SM3 (ShangMi 3)"
- depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
+ depends on 64BIT && PLATFORM_SUPPORTS_RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
select CRYPTO_HASH
select CRYPTO_SM3
help
@@ -75,7 +75,7 @@ config CRYPTO_SM3_RISCV64

config CRYPTO_SM4_RISCV64
tristate "Ciphers: SM4 (ShangMi 4)"
- depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
+ depends on 64BIT && PLATFORM_SUPPORTS_RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO
select CRYPTO_ALGAPI
select CRYPTO_SM4
help
diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h
index adb50f3ec205..81508325fd51 100644
--- a/arch/riscv/include/asm/simd.h
+++ b/arch/riscv/include/asm/simd.h
@@ -26,6 +26,9 @@
*/
static __must_check inline bool may_use_simd(void)
{
+ if (!has_vector())
+ return false;
+
/*
* RISCV_KERNEL_MODE_V is only set while preemption is disabled,
* and is clear whenever preemption is enabled.
diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index 731dcd0ed4de..a08c4bf92ecd 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -37,7 +37,8 @@ static inline u32 riscv_v_flags(void)

static __always_inline bool has_vector(void)
{
- return riscv_has_extension_unlikely(RISCV_ISA_EXT_v);
+ return IS_ENABLED(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_V) ||
+ riscv_has_extension_likely(RISCV_ISA_EXT_v);
}

static inline void __riscv_v_vstate_clean(struct pt_regs *regs)
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 3ed2359eae35..7cb365714855 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -683,7 +683,6 @@ void __init riscv_fill_hwcap(void)
}

if (elf_hwcap & COMPAT_HWCAP_ISA_V) {
- riscv_v_setup_vsize();
/*
* ISA string in device tree might have 'v' flag, but
* CONFIG_RISCV_ISA_V is disabled in kernel.
@@ -691,6 +690,8 @@ void __init riscv_fill_hwcap(void)
*/
if (!IS_ENABLED(CONFIG_RISCV_ISA_V))
elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
+ else
+ riscv_v_setup_vsize();
}

memset(print_str, 0, sizeof(print_str));
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 4236a69c35cb..b027be82bbb3 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -428,17 +428,20 @@ SYM_CODE_START_LOCAL(reset_regs)
.Lreset_regs_done_fpu:
#endif /* CONFIG_FPU */

-#ifdef CONFIG_RISCV_ISA_V
+#if defined(CONFIG_PLATFORM_MAY_SUPPORT_RISCV_ISA_V)
csrr t0, CSR_MISA
li t1, COMPAT_HWCAP_ISA_V
and t0, t0, t1
beqz t0, .Lreset_regs_done_vector
-
+#endif
+#ifdef CONFIG_RISCV_ISA_V
/*
* Clear vector registers and reset vcsr
* VLMAX has a defined value, VLEN is a constant,
* and this form of vsetvli is defined to set vl to VLMAX.
*/
+ .option push
+ .option arch, +v
li t1, SR_VS
csrs CSR_STATUS, t1
csrs CSR_VCSR, x0
@@ -447,6 +450,7 @@ SYM_CODE_START_LOCAL(reset_regs)
vmv.v.i v8, 0
vmv.v.i v16, 0
vmv.v.i v24, 0
+ .option pop
/* note that the caller must clear SR_VS */
.Lreset_regs_done_vector:
#endif /* CONFIG_RISCV_ISA_V */
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index e4bc61c4e58a..3ba7bf63ccb2 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -165,11 +165,13 @@ void flush_thread(void)
#endif
#ifdef CONFIG_RISCV_ISA_V
/* Reset vector state */
- riscv_v_vstate_ctrl_init(current);
- riscv_v_vstate_off(task_pt_regs(current));
- kfree(current->thread.vstate.datap);
- memset(&current->thread.vstate, 0, sizeof(struct __riscv_v_ext_state));
- clear_tsk_thread_flag(current, TIF_RISCV_V_DEFER_RESTORE);
+ if (has_vector()) {
+ riscv_v_vstate_ctrl_init(current);
+ riscv_v_vstate_off(task_pt_regs(current));
+ kfree(current->thread.vstate.datap);
+ memset(&current->thread.vstate, 0, sizeof(struct __riscv_v_ext_state));
+ clear_tsk_thread_flag(current, TIF_RISCV_V_DEFER_RESTORE);
+ }
#endif
}

diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
index 92731ff8c79a..bdfaed2a4023 100644
--- a/arch/riscv/kernel/ptrace.c
+++ b/arch/riscv/kernel/ptrace.c
@@ -92,6 +92,9 @@ static int riscv_vr_get(struct task_struct *target,
struct __riscv_v_ext_state *vstate = &target->thread.vstate;
struct __riscv_v_regset_state ptrace_vstate;

+ if (!has_vector())
+ return 0;
+
if (!riscv_v_vstate_query(task_pt_regs(target)))
return -EINVAL;

@@ -127,6 +130,9 @@ static int riscv_vr_set(struct task_struct *target,
struct __riscv_v_ext_state *vstate = &target->thread.vstate;
struct __riscv_v_regset_state ptrace_vstate;

+ if (!has_vector())
+ return 0;
+
if (!riscv_v_vstate_query(task_pt_regs(target)))
return -EINVAL;

diff --git a/arch/riscv/lib/riscv_v_helpers.c b/arch/riscv/lib/riscv_v_helpers.c
index be38a93cedae..661c77fdd7f7 100644
--- a/arch/riscv/lib/riscv_v_helpers.c
+++ b/arch/riscv/lib/riscv_v_helpers.c
@@ -21,7 +21,6 @@ asmlinkage int enter_vector_usercopy(void *dst, void *src, size_t n)
{
size_t remain, copied;

- /* skip has_vector() check because it has been done by the asm */
if (!may_use_simd())
goto fallback;

diff --git a/arch/riscv/lib/uaccess_vector.S b/arch/riscv/lib/uaccess_vector.S
index 7c45f26de4f7..4de37a3a2163 100644
--- a/arch/riscv/lib/uaccess_vector.S
+++ b/arch/riscv/lib/uaccess_vector.S
@@ -5,6 +5,8 @@
#include <asm/asm-extable.h>
#include <asm/csr.h>

+.option arch, +v
+
#define pDst a0
#define pSrc a1
#define iNum a2
diff --git a/arch/riscv/lib/xor.S b/arch/riscv/lib/xor.S
index b28f2430e52f..9a3e2c19efc9 100644
--- a/arch/riscv/lib/xor.S
+++ b/arch/riscv/lib/xor.S
@@ -6,6 +6,8 @@
#include <linux/export.h>
#include <asm/asm.h>

+.option arch, +v
+
SYM_FUNC_START(xor_regs_2_)
vsetvli a3, a0, e8, m8, ta, ma
vle8.v v0, (a1)

--
2.44.0


2024-05-08 01:37:29

by Charlie Jenkins

[permalink] [raw]
Subject: [PATCH v2 4/8] riscv: Move RISCV_ISA_SVPBMT to Kconfig.isa

Svpbmt would not benefit from having PLATFORM_SUPPORTS_RISCV_ISA_SVPBMT
so just move the definition of RISCV_ISA_SVPBMT to Kconfig.isa.

Signed-off-by: Charlie Jenkins <[email protected]>
---
arch/riscv/Kconfig | 17 -----------------
arch/riscv/Kconfig.isa | 17 +++++++++++++++++
2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 3c1960e8cd7c..47a1d28bbb64 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -484,23 +484,6 @@ config RISCV_ALTERNATIVE_EARLY
help
Allows early patching of the kernel for special errata

-config RISCV_ISA_SVPBMT
- bool "Svpbmt extension support for supervisor mode page-based memory types"
- depends on 64BIT && MMU
- depends on RISCV_ALTERNATIVE
- default y
- help
- Add support for the Svpbmt ISA-extension (Supervisor-mode:
- page-based memory types) in the kernel when it is detected at boot.
-
- The memory type for a page contains a combination of attributes
- that indicate the cacheability, idempotency, and ordering
- properties for access to that page.
-
- The Svpbmt extension is only available on 64-bit cpus.
-
- If you don't know what to do here, say Y.
-
config TOOLCHAIN_HAS_ZBB
bool
default y
diff --git a/arch/riscv/Kconfig.isa b/arch/riscv/Kconfig.isa
index 37585bcd763e..50e217dc5719 100644
--- a/arch/riscv/Kconfig.isa
+++ b/arch/riscv/Kconfig.isa
@@ -168,3 +168,20 @@ config PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT
not support Svnapot.

endchoice
+
+config RISCV_ISA_SVPBMT
+ bool "Svpbmt extension support for supervisor mode page-based memory types"
+ depends on 64BIT && MMU
+ depends on RISCV_ALTERNATIVE
+ default y
+ help
+ Add support for the Svpbmt ISA-extension (Supervisor-mode:
+ page-based memory types) in the kernel when it is detected at boot.
+
+ The memory type for a page contains a combination of attributes
+ that indicate the cacheability, idempotency, and ordering
+ properties for access to that page.
+
+ The Svpbmt extension is only available on 64-bit cpus.
+
+ If you don't know what to do here, say Y.

--
2.44.0


2024-05-08 01:37:29

by Charlie Jenkins

[permalink] [raw]
Subject: [PATCH v2 1/8] riscv: Add PLATFORM_MAY_SUPPORT_RISCV_ISA_C Kconfig option

Introduce a "Kernel ISA" menu and migrate the compressed instruction
support options into a new file Kconfig.isa. Add a new option
"PLATFORM_MAY_SUPPORT_RISCV_ISA_C" that can be used to conditionally
emit C extensions if the hardware supports it.

The existing "RISCV_ISA_C" option is repurposed to be used to by kernel
code to determine if either PLATFORM_MAY_SUPPORT_RISCV_ISA_C or
PLATFORM_SUPPORTS_RISCV_ISA_C has been set.

Signed-off-by: Charlie Jenkins <[email protected]>
---
arch/riscv/Kconfig | 19 +++++++-----------
arch/riscv/Kconfig.isa | 41 ++++++++++++++++++++++++++++++++++++++
arch/riscv/Makefile | 4 ++--
arch/riscv/kernel/probes/uprobes.c | 2 +-
arch/riscv/net/bpf_jit.h | 4 +++-
5 files changed, 54 insertions(+), 16 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index c8bdfd33abf4..c2a4f5364707 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -369,6 +369,12 @@ config ARCH_RV64I

endchoice

+menu "Kernel ISA"
+
+source "arch/riscv/Kconfig.isa"
+
+endmenu
+
# We must be able to map all physical memory into the kernel, but the compiler
# is still a bit more efficient when generating code if it's setup in a manner
# such that it can only map 2GiB of memory.
@@ -478,17 +484,6 @@ config RISCV_ALTERNATIVE_EARLY
help
Allows early patching of the kernel for special errata

-config RISCV_ISA_C
- bool "Emit compressed instructions when building Linux"
- default y
- help
- Adds "C" to the ISA subsets that the toolchain is allowed to emit
- when building Linux, which results in compressed instructions in the
- Linux binary. This option produces a kernel that will not run on
- systems that do not support compressed instructions.
-
- If you don't know what to do here, say Y.
-
config RISCV_ISA_SVNAPOT
bool "Svnapot extension support for supervisor mode NAPOT pages"
depends on 64BIT && MMU
@@ -937,6 +932,7 @@ config EFI
bool "UEFI runtime support"
depends on OF && !XIP_KERNEL
depends on MMU
+ depends on PLATFORM_SUPPORTS_RISCV_ISA_C
default y
select ARCH_SUPPORTS_ACPI if 64BIT
select EFI_GENERIC_STUB
@@ -944,7 +940,6 @@ config EFI
select EFI_RUNTIME_WRAPPERS
select EFI_STUB
select LIBFDT
- select RISCV_ISA_C
select UCS2_STRING
help
This option provides support for runtime services provided
diff --git a/arch/riscv/Kconfig.isa b/arch/riscv/Kconfig.isa
new file mode 100644
index 000000000000..08b7af5aabb0
--- /dev/null
+++ b/arch/riscv/Kconfig.isa
@@ -0,0 +1,41 @@
+config RISCV_ISA_C
+ bool
+
+choice
+ prompt "Compressed instruction support"
+ default PLATFORM_SUPPORTS_RISCV_ISA_C
+ help
+ This selects the level of support for compressed instructions to be
+ built into the Linux Kernel. This does not impact whether compressed
+ instructions are allowed to be emitted by user-space code.
+
+config PROHIBIT_RISCV_ISA_C
+ bool "Prohibit compressed instructions"
+ depends on NONPORTABLE
+ help
+ Regardless of if the platform supports compressed instructions,
+ prohibit the kernel from emitting compressed instructions.
+
+config PLATFORM_MAY_SUPPORT_RISCV_ISA_C
+ bool "Allow compressed instructions sequences if supported"
+ depends on FPU
+ depends on NONPORTABLE
+ select RISCV_ISA_C
+ help
+ Only allow compressed instructions to be emitted if "C" is present in
+ the device tree or ACPI table. No compressed instructions will be
+ emitted if the platform does not support them.
+
+config PLATFORM_SUPPORTS_RISCV_ISA_C
+ bool "Emit compressed instructions when building Linux"
+ depends on FPU
+ select RISCV_ISA_C
+ help
+ Adds "C" to the ISA subsets that the toolchain is allowed to emit
+ when building Linux, which results in compressed instructions in the
+ Linux binary. This option produces a kernel that will not run on
+ systems that do not support compressed instructions.
+
+ If you don't know what to do here, say Y.
+
+endchoice
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 5b3115a19852..e1be36004097 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -14,7 +14,7 @@ endif
ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
LDFLAGS_vmlinux += --no-relax
KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
-ifeq ($(CONFIG_RISCV_ISA_C),y)
+ifeq ($(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_C),y)
CC_FLAGS_FTRACE := -fpatchable-function-entry=4
else
CC_FLAGS_FTRACE := -fpatchable-function-entry=2
@@ -65,7 +65,7 @@ endif
riscv-march-$(CONFIG_ARCH_RV32I) := rv32ima
riscv-march-$(CONFIG_ARCH_RV64I) := rv64ima
riscv-march-$(CONFIG_FPU) := $(riscv-march-y)fd
-riscv-march-$(CONFIG_RISCV_ISA_C) := $(riscv-march-y)c
+riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_C) := $(riscv-march-y)c
riscv-march-$(CONFIG_RISCV_ISA_V) := $(riscv-march-y)v

ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC
diff --git a/arch/riscv/kernel/probes/uprobes.c b/arch/riscv/kernel/probes/uprobes.c
index 4b3dc8beaf77..a468689a6f6d 100644
--- a/arch/riscv/kernel/probes/uprobes.c
+++ b/arch/riscv/kernel/probes/uprobes.c
@@ -11,7 +11,7 @@

bool is_swbp_insn(uprobe_opcode_t *insn)
{
-#ifdef CONFIG_RISCV_ISA_C
+#ifdef CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_C
return (*insn & 0xffff) == UPROBE_SWBP_INSN;
#else
return *insn == UPROBE_SWBP_INSN;
diff --git a/arch/riscv/net/bpf_jit.h b/arch/riscv/net/bpf_jit.h
index f4b6b3b9edda..259294bdbc3a 100644
--- a/arch/riscv/net/bpf_jit.h
+++ b/arch/riscv/net/bpf_jit.h
@@ -15,7 +15,9 @@

static inline bool rvc_enabled(void)
{
- return IS_ENABLED(CONFIG_RISCV_ISA_C);
+ return IS_ENABLED(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_C) ||
+ (IS_ENABLED(CONFIG_PLATFORM_MAY_SUPPORT_RISCV_ISA_C) &&
+ riscv_has_extension_likely(RISCV_ISA_EXT_c));
}

static inline bool rvzbb_enabled(void)

--
2.44.0


2024-05-08 01:37:53

by Charlie Jenkins

[permalink] [raw]
Subject: [PATCH v2 3/8] riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT Kconfig option

The existing "RISCV_ISA_SVNAPOT" option is repurposed to be used to by
kernel code to determine if either
PLATFORM_MAY_SUPPORT_RISCV_ISA_SVNAPOT or
PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT has been set.

PLATFORM_MAY_SUPPORT_RISCV_ISA_SVNAPOT will check if the hardware
supports Svnapot before using it, while
PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT will assume that the hardware
supports Svnapot.

Signed-off-by: Charlie Jenkins <[email protected]>
---
arch/riscv/Kconfig | 19 -----------------
arch/riscv/Kconfig.isa | 44 ++++++++++++++++++++++++++++++++++++++++
arch/riscv/include/asm/pgtable.h | 3 ++-
3 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index c2e9eded0a7d..3c1960e8cd7c 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -484,25 +484,6 @@ config RISCV_ALTERNATIVE_EARLY
help
Allows early patching of the kernel for special errata

-config RISCV_ISA_SVNAPOT
- bool "Svnapot extension support for supervisor mode NAPOT pages"
- depends on 64BIT && MMU
- depends on RISCV_ALTERNATIVE
- default y
- help
- Add support for the Svnapot ISA-extension in the kernel when it
- is detected at boot.
-
- The Svnapot extension is used to mark contiguous PTEs as a range
- of contiguous virtual-to-physical translations for a naturally
- aligned power-of-2 (NAPOT) granularity larger than the base 4KB page
- size. When HUGETLBFS is also selected this option unconditionally
- allocates some memory for each NAPOT page size supported by the kernel.
- When optimizing for low memory consumption and for platforms without
- the Svnapot extension, it may be better to say N here.
-
- If you don't know what to do here, say Y.
-
config RISCV_ISA_SVPBMT
bool "Svpbmt extension support for supervisor mode page-based memory types"
depends on 64BIT && MMU
diff --git a/arch/riscv/Kconfig.isa b/arch/riscv/Kconfig.isa
index 0663c98b5b17..37585bcd763e 100644
--- a/arch/riscv/Kconfig.isa
+++ b/arch/riscv/Kconfig.isa
@@ -124,3 +124,47 @@ config RISCV_ISA_V_PREEMPTIVE
This config allows kernel to run SIMD without explicitly disable
preemption. Enabling this config will result in higher memory
consumption due to the allocation of per-task's kernel Vector context.
+
+config RISCV_ISA_SVNAPOT
+ bool
+
+choice
+ prompt "Svnapot extension support for supervisor mode NAPOT pages"
+ default PLATFORM_MAY_SUPPORT_RISCV_ISA_SVNAPOT
+ help
+ This selects the level of support for Svnapot in the Linux Kernel.
+
+ The Svnapot extension is used to mark contiguous PTEs as a range
+ of contiguous virtual-to-physical translations for a naturally
+ aligned power-of-2 (NAPOT) granularity larger than the base 4KB page
+ size. When HUGETLBFS is also selected this option unconditionally
+ allocates some memory for each NAPOT page size supported by the kernel.
+ When optimizing for low memory consumption and for platforms without
+ the Svnapot extension, it may be better to prohibit Svnapot.
+
+config PROHIBIT_RISCV_ISA_SVNAPOT
+ bool "Prohibit Svnapot extension"
+ help
+ Regardless of if the platform supports Svnapot, prohibit the kernel
+ from using Svnapot.
+
+config PLATFORM_MAY_SUPPORT_RISCV_ISA_SVNAPOT
+ bool "Allow Svnapot extension if supported"
+ depends on 64BIT && MMU
+ depends on RISCV_ALTERNATIVE
+ select RISCV_ISA_SVNAPOT
+ help
+ Add support for the Svnapot ISA-extension in the kernel when it
+ is detected at boot.
+
+config PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT
+ bool "Emit Svnapot mappings when building Linux"
+ depends on 64BIT && MMU
+ depends on NONPORTABLE
+ select RISCV_ISA_SVNAPOT
+ help
+ Compile a kernel that assumes that the platform supports Svnapot.
+ This option produces a kernel that will not run on systems that do
+ not support Svnapot.
+
+endchoice
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 6afd6bb4882e..432be9691b78 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -289,7 +289,8 @@ static inline pte_t pud_pte(pud_t pud)

static __always_inline bool has_svnapot(void)
{
- return riscv_has_extension_likely(RISCV_ISA_EXT_SVNAPOT);
+ return IS_ENABLED(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT) ||
+ riscv_has_extension_likely(RISCV_ISA_EXT_SVNAPOT);
}

static inline unsigned long pte_napot(pte_t pte)

--
2.44.0


2024-05-08 01:37:54

by Charlie Jenkins

[permalink] [raw]
Subject: [PATCH v2 5/8] riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_ZBB Kconfig option

Zbb can optimize kernel instruction sequences. Add a config option
PLATFORM_SUPPORTS_RISCV_ISA_ZBB that allows arbitrary Zbb instruction
sequences to be emitted by the compiler. This assumption also allows the
alternatives to become evaluated at compile time for Zbb.

The existing "RISCV_ISA_ZBB" option is repurposed to be used to by kernel
code to determine if either PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBB or
PLATFORM_SUPPORTS_RISCV_ISA_ZBB has been set.

Signed-off-by: Charlie Jenkins <[email protected]>
---
arch/riscv/Kconfig | 24 ----------------
arch/riscv/Kconfig.isa | 54 ++++++++++++++++++++++++++++++++++-
arch/riscv/Makefile | 1 +
arch/riscv/include/asm/arch_hweight.h | 33 ++++++++++-----------
arch/riscv/include/asm/checksum.h | 18 ++++++------
arch/riscv/lib/csum.c | 48 +++++++++++++++----------------
arch/riscv/lib/strcmp.S | 4 +--
arch/riscv/lib/strlen.S | 4 +--
arch/riscv/lib/strncmp.S | 4 +--
arch/riscv/net/bpf_jit.h | 4 ++-
10 files changed, 113 insertions(+), 81 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 47a1d28bbb64..df620e534b3f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -484,14 +484,6 @@ config RISCV_ALTERNATIVE_EARLY
help
Allows early patching of the kernel for special errata

-config TOOLCHAIN_HAS_ZBB
- bool
- default y
- depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbb)
- depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbb)
- depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
- depends on AS_HAS_OPTION_ARCH
-
# This symbol indicates that the toolchain supports all v1.0 vector crypto
# extensions, including Zvk*, Zvbb, and Zvbc. LLVM added all of these at once.
# binutils added all except Zvkb, then added Zvkb. So we just check for Zvkb.
@@ -499,22 +491,6 @@ config TOOLCHAIN_HAS_VECTOR_CRYPTO
def_bool $(as-instr, .option arch$(comma) +v$(comma) +zvkb)
depends on AS_HAS_OPTION_ARCH

-config RISCV_ISA_ZBB
- bool "Zbb extension support for bit manipulation instructions"
- depends on TOOLCHAIN_HAS_ZBB
- depends on MMU
- depends on RISCV_ALTERNATIVE
- default y
- help
- Add support for enabling optimisations in the kernel when the
- Zbb extension is detected at boot.
-
- The Zbb extension provides instructions to accelerate a number
- of bit-specific operations (count bit population, sign extending,
- bitrotation, etc).
-
- If you don't know what to do here, say Y.
-
config RISCV_ISA_ZICBOM
bool "Zicbom extension support for non-coherent DMA operation"
depends on MMU
diff --git a/arch/riscv/Kconfig.isa b/arch/riscv/Kconfig.isa
index 50e217dc5719..49bed8c75263 100644
--- a/arch/riscv/Kconfig.isa
+++ b/arch/riscv/Kconfig.isa
@@ -169,7 +169,7 @@ config PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT

endchoice

-config RISCV_ISA_SVPBMT
+config PLATFORM_MAY_SUPPORT_RISCV_ISA_SVPBMT
bool "Svpbmt extension support for supervisor mode page-based memory types"
depends on 64BIT && MMU
depends on RISCV_ALTERNATIVE
@@ -185,3 +185,55 @@ config RISCV_ISA_SVPBMT
The Svpbmt extension is only available on 64-bit cpus.

If you don't know what to do here, say Y.
+
+config TOOLCHAIN_HAS_ZBB
+ bool
+ default y
+ depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbb)
+ depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbb)
+ depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
+ depends on AS_HAS_OPTION_ARCH
+
+config RISCV_ISA_ZBB
+ bool
+
+choice
+ prompt "Zbb extension for bit manipulation instructions support"
+ default PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBB
+ help
+ This selects the level of support for Zbb instructions to be
+ built into the Linux Kernel. This does not impact whether Zbb
+ instructions are allowed to be emitted by user-space code.
+
+ The Zbb extension provides instructions to accelerate a number
+ of bit-specific operations (count bit population, sign extending,
+ bitrotation, etc).
+
+config PROHIBIT_RISCV_ISA_ZBB
+ bool "Prohibit Zbb instruction sequences"
+ depends on NONPORTABLE
+ help
+ Regardless of if the platform supports Zbb instructions,
+ prohibit the kernel from emitting Zbb instructions.
+
+config PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBB
+ bool "Allow Zbb instruction sequences if supported"
+ depends on TOOLCHAIN_HAS_ZBB
+ depends on RISCV_ALTERNATIVE
+ select RISCV_ISA_ZBB
+ help
+ Add support for enabling optimisations in the kernel when the
+ Zbb extension is detected at boot.
+
+config PLATFORM_SUPPORTS_RISCV_ISA_ZBB
+ bool "Emit Zbb instructions when building Linux"
+ depends on TOOLCHAIN_HAS_ZBB
+ depends on NONPORTABLE
+ select RISCV_ISA_ZBB
+ help
+ Adds "zbb" to the ISA subsets that the toolchain is allowed to emit
+ when building Linux, which results in Zbb instructions in the
+ Linux binary. This option produces a kernel that will not run on
+ systems that do not support the Zbb extension.
+
+endchoice
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index e1111e62ca20..6b0c3a782f99 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -66,6 +66,7 @@ riscv-march-$(CONFIG_ARCH_RV32I) := rv32ima
riscv-march-$(CONFIG_ARCH_RV64I) := rv64ima
riscv-march-$(CONFIG_FPU) := $(riscv-march-y)fd
riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_C) := $(riscv-march-y)c
+riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB) := $(riscv-march-y)_zbb

ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC
KBUILD_CFLAGS += -Wa,-misa-spec=2.2
diff --git a/arch/riscv/include/asm/arch_hweight.h b/arch/riscv/include/asm/arch_hweight.h
index 85b2c443823e..d89a18d5104f 100644
--- a/arch/riscv/include/asm/arch_hweight.h
+++ b/arch/riscv/include/asm/arch_hweight.h
@@ -19,21 +19,20 @@

static __always_inline unsigned int __arch_hweight32(unsigned int w)
{
-#ifdef CONFIG_RISCV_ISA_ZBB
- asm goto(ALTERNATIVE("j %l[legacy]", "nop", 0,
- RISCV_ISA_EXT_ZBB, 1)
- : : : : legacy);
-
- asm (".option push\n"
- ".option arch,+zbb\n"
- CPOPW "%0, %0\n"
- ".option pop\n"
- : "+r" (w) : :);
-
- return w;
+ if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB)) {
+ if (!IS_ENABLED(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB))
+ asm goto(ALTERNATIVE("j %l[legacy]", "nop", 0,
+ RISCV_ISA_EXT_ZBB, 1)
+ : : : : legacy);
+ asm (".option push\n"
+ ".option arch,+zbb\n"
+ CPOPW "%0, %0\n"
+ ".option pop\n"
+ : "+r" (w) : :);

+ return w;
+ }
legacy:
-#endif
return __sw_hweight32(w);
}

@@ -50,11 +49,12 @@ static inline unsigned int __arch_hweight8(unsigned int w)
#if BITS_PER_LONG == 64
static __always_inline unsigned long __arch_hweight64(__u64 w)
{
-# ifdef CONFIG_RISCV_ISA_ZBB
+#ifdef CONFIG_PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBB
asm goto(ALTERNATIVE("j %l[legacy]", "nop", 0,
RISCV_ISA_EXT_ZBB, 1)
: : : : legacy);
-
+#endif
+#ifdef CONFIG_RISCV_ISA_ZBB
asm (".option push\n"
".option arch,+zbb\n"
"cpop %0, %0\n"
@@ -62,7 +62,8 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
: "+r" (w) : :);

return w;
-
+#endif
+#ifdef CONFIG_PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBB
legacy:
# endif
return __sw_hweight64(w);
diff --git a/arch/riscv/include/asm/checksum.h b/arch/riscv/include/asm/checksum.h
index 88e6f1499e88..2fe92abf5525 100644
--- a/arch/riscv/include/asm/checksum.h
+++ b/arch/riscv/include/asm/checksum.h
@@ -2,7 +2,7 @@
/*
* Checksum routines
*
- * Copyright (C) 2023 Rivos Inc.
+ * Copyright (C) 2023-2024 Rivos Inc.
*/
#ifndef __ASM_RISCV_CHECKSUM_H
#define __ASM_RISCV_CHECKSUM_H
@@ -49,16 +49,16 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
* ZBB only saves three instructions on 32-bit and five on 64-bit so not
* worth checking if supported without Alternatives.
*/
- if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) &&
- IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
+ if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB)) {
unsigned long fold_temp;

- asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
- RISCV_ISA_EXT_ZBB, 1)
- :
- :
- :
- : no_zbb);
+ if (!IS_ENABLED(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB))
+ asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
+ RISCV_ISA_EXT_ZBB, 1)
+ :
+ :
+ :
+ : no_zbb);

if (IS_ENABLED(CONFIG_32BIT)) {
asm(".option push \n\
diff --git a/arch/riscv/lib/csum.c b/arch/riscv/lib/csum.c
index 7fb12c59e571..5ea2bf71c963 100644
--- a/arch/riscv/lib/csum.c
+++ b/arch/riscv/lib/csum.c
@@ -44,8 +44,7 @@ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
* Zbb support saves 4 instructions, so not worth checking without
* alternatives if supported
*/
- if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) &&
- IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
+ if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB)) {
unsigned long fold_temp;

/*
@@ -53,12 +52,13 @@ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
* support, so nop when Zbb is available and jump when Zbb is
* not available.
*/
- asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
- RISCV_ISA_EXT_ZBB, 1)
- :
- :
- :
- : no_zbb);
+ if (!IS_ENABLED(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB))
+ asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
+ RISCV_ISA_EXT_ZBB, 1)
+ :
+ :
+ :
+ : no_zbb);
asm(".option push \n\
.option arch,+zbb \n\
rori %[fold_temp], %[sum], 32 \n\
@@ -161,8 +161,7 @@ do_csum_with_alignment(const unsigned char *buff, int len)
* Zbb support saves 6 instructions, so not worth checking without
* alternatives if supported
*/
- if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) &&
- IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
+ if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB)) {
unsigned long fold_temp;

/*
@@ -170,12 +169,13 @@ do_csum_with_alignment(const unsigned char *buff, int len)
* support, so nop when Zbb is available and jump when Zbb is
* not available.
*/
- asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
- RISCV_ISA_EXT_ZBB, 1)
- :
- :
- :
- : no_zbb);
+ if (!IS_ENABLED(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB))
+ asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
+ RISCV_ISA_EXT_ZBB, 1)
+ :
+ :
+ :
+ : no_zbb);

#ifdef CONFIG_32BIT
asm_goto_output(".option push \n\
@@ -248,8 +248,7 @@ do_csum_no_alignment(const unsigned char *buff, int len)
* Zbb support saves 6 instructions, so not worth checking without
* alternatives if supported
*/
- if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) &&
- IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
+ if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB)) {
unsigned long fold_temp;

/*
@@ -257,12 +256,13 @@ do_csum_no_alignment(const unsigned char *buff, int len)
* support, so nop when Zbb is available and jump when Zbb is
* not available.
*/
- asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
- RISCV_ISA_EXT_ZBB, 1)
- :
- :
- :
- : no_zbb);
+ if (!IS_ENABLED(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB))
+ asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
+ RISCV_ISA_EXT_ZBB, 1)
+ :
+ :
+ :
+ : no_zbb);

#ifdef CONFIG_32BIT
asm (".option push \n\
diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
index 687b2bea5c43..5798ef7e73fc 100644
--- a/arch/riscv/lib/strcmp.S
+++ b/arch/riscv/lib/strcmp.S
@@ -7,7 +7,7 @@

/* int strcmp(const char *cs, const char *ct) */
SYM_FUNC_START(strcmp)
-
+#ifndef CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB
ALTERNATIVE("nop", "j strcmp_zbb", 0, RISCV_ISA_EXT_ZBB, CONFIG_RISCV_ISA_ZBB)

/*
@@ -37,7 +37,7 @@ SYM_FUNC_START(strcmp)
*/
sub a0, t0, t1
ret
-
+#endif /* !CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB */
/*
* Variant of strcmp using the ZBB extension if available.
* The code was published as part of the bitmanip manual
diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
index 8ae3064e45ff..b63b91f74084 100644
--- a/arch/riscv/lib/strlen.S
+++ b/arch/riscv/lib/strlen.S
@@ -7,7 +7,7 @@

/* int strlen(const char *s) */
SYM_FUNC_START(strlen)
-
+#ifndef CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB
ALTERNATIVE("nop", "j strlen_zbb", 0, RISCV_ISA_EXT_ZBB, CONFIG_RISCV_ISA_ZBB)

/*
@@ -29,7 +29,7 @@ SYM_FUNC_START(strlen)
2:
sub a0, t1, a0
ret
-
+#endif /* !CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB */
/*
* Variant of strlen using the ZBB extension if available
*/
diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
index aba5b3148621..3a1330d7d4a2 100644
--- a/arch/riscv/lib/strncmp.S
+++ b/arch/riscv/lib/strncmp.S
@@ -7,7 +7,7 @@

/* int strncmp(const char *cs, const char *ct, size_t count) */
SYM_FUNC_START(strncmp)
-
+#ifndef CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB
ALTERNATIVE("nop", "j strncmp_zbb", 0, RISCV_ISA_EXT_ZBB, CONFIG_RISCV_ISA_ZBB)

/*
@@ -42,7 +42,7 @@ SYM_FUNC_START(strncmp)
*/
sub a0, t0, t1
ret
-
+#endif /* !CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB */
/*
* Variant of strncmp using the ZBB extension if available
*/
diff --git a/arch/riscv/net/bpf_jit.h b/arch/riscv/net/bpf_jit.h
index 259294bdbc3a..61892044124e 100644
--- a/arch/riscv/net/bpf_jit.h
+++ b/arch/riscv/net/bpf_jit.h
@@ -22,7 +22,9 @@ static inline bool rvc_enabled(void)

static inline bool rvzbb_enabled(void)
{
- return IS_ENABLED(CONFIG_RISCV_ISA_ZBB) && riscv_has_extension_likely(RISCV_ISA_EXT_ZBB);
+ return IS_ENABLED(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB) ||
+ (IS_ENABLED(CONFIG_PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBB) &&
+ riscv_has_extension_likely(RISCV_ISA_EXT_ZBB));
}

enum {

--
2.44.0


2024-05-08 01:38:06

by Charlie Jenkins

[permalink] [raw]
Subject: [PATCH v2 7/8] riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_ZBC Kconfig option

Zbc can optimize kernel instruction sequences. Add a config option
PLATFORM_SUPPORTS_RISCV_ISA_ZBC that allows arbitrary Zbc instruction
sequences to be emitted by the compiler.

The existing "RISCV_ISA_ZBC" option is repurposed to be used to by kernel
code to determine if either PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBC or
PLATFORM_SUPPORTS_RISCV_ISA_ZBC has been set.

Signed-off-by: Charlie Jenkins <[email protected]>
---
arch/riscv/Kconfig.isa | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++
arch/riscv/Makefile | 1 +
2 files changed, 52 insertions(+)

diff --git a/arch/riscv/Kconfig.isa b/arch/riscv/Kconfig.isa
index e7f28dc44137..b7399f236bba 100644
--- a/arch/riscv/Kconfig.isa
+++ b/arch/riscv/Kconfig.isa
@@ -289,3 +289,54 @@ config PLATFORM_SUPPORTS_RISCV_ISA_ZBB
systems that do not support the Zbb extension.

endchoice
+
+config TOOLCHAIN_HAS_ZBC
+ bool
+ default y
+ depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbc)
+ depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbc)
+ depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
+ depends on AS_HAS_OPTION_ARCH
+
+config RISCV_ISA_ZBC
+ bool
+
+choice
+ prompt "Zbc extension for bit manipulation instructions support"
+ default PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBC
+ help
+ This selects the level of support for Zbc instructions to be
+ built into the Linux Kernel. This does not impact whether Zbc
+ instructions are allowed to be emitted by user-space code.
+
+ The Zbc extension provides instructions to accelerate carry-less
+ multiplication.
+
+config PROHIBIT_RISCV_ISA_ZBC
+ bool "Prohibit Zbc instruction sequences"
+ depends on NONPORTABLE
+ help
+ Regardless of if the platform supports Zbc instructions,
+ prohibit the kernel from emitting Zbc instructions.
+
+config PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBC
+ bool "Allow Zbc instruction sequences if supported"
+ depends on TOOLCHAIN_HAS_ZBC
+ depends on RISCV_ALTERNATIVE
+ select RISCV_ISA_ZBC
+ help
+ Add support for enabling optimisations in the kernel when the
+ Zbc extension is detected at boot.
+
+config PLATFORM_SUPPORTS_RISCV_ISA_ZBC
+ bool "Emit Zbc instructions when building Linux"
+ depends on TOOLCHAIN_HAS_ZBC
+ depends on NONPORTABLE
+ select RISCV_ISA_ZBC
+ help
+ Adds "zbc" to the ISA subsets that the toolchain is allowed to emit
+ when building Linux, which results in Zbc instructions in the
+ Linux binary. This option produces a kernel that will not run on
+ systems that do not support the Zbc extension.
+
+endchoice
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index c8ec38b9880a..57457d15e9a4 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -68,6 +68,7 @@ riscv-march-$(CONFIG_FPU) := $(riscv-march-y)fd
riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_C) := $(riscv-march-y)c
riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBA) := $(riscv-march-y)_zba
riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB) := $(riscv-march-y)_zbb
+riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBC) := $(riscv-march-y)_zbc

ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC
KBUILD_CFLAGS += -Wa,-misa-spec=2.2

--
2.44.0


2024-05-08 01:38:22

by Charlie Jenkins

[permalink] [raw]
Subject: [PATCH v2 8/8] riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_ZBS Kconfig option

Zbs can optimize kernel instruction sequences. Add a config option
PLATFORM_SUPPORTS_RISCV_ISA_ZBS that allows arbitrary Zbs instruction
sequences to be emitted by the compiler.

The existing "RISCV_ISA_ZBS" option is repurposed to be used to by kernel
code to determine if either PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBS or
PLATFORM_SUPPORTS_RISCV_ISA_ZBS has been set.

Signed-off-by: Charlie Jenkins <[email protected]>
---
arch/riscv/Kconfig.isa | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++
arch/riscv/Makefile | 1 +
2 files changed, 52 insertions(+)

diff --git a/arch/riscv/Kconfig.isa b/arch/riscv/Kconfig.isa
index b7399f236bba..60ae1bf71c70 100644
--- a/arch/riscv/Kconfig.isa
+++ b/arch/riscv/Kconfig.isa
@@ -340,3 +340,54 @@ config PLATFORM_SUPPORTS_RISCV_ISA_ZBC
systems that do not support the Zbc extension.

endchoice
+
+config TOOLCHAIN_HAS_ZBS
+ bool
+ default y
+ depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbs)
+ depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbs)
+ depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
+ depends on AS_HAS_OPTION_ARCH
+
+config RISCV_ISA_ZBS
+ bool
+
+choice
+ prompt "Zbs extension for bit manipulation instructions support"
+ default PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBS
+ help
+ This selects the level of support for Zbs instructions to be
+ built into the Linux Kernel. This does not impact whether Zbs
+ instructions are allowed to be emitted by user-space code.
+
+ The Zbs extension provides instructions to accelerate carry-less
+ multiplication.
+
+config PROHIBIT_RISCV_ISA_ZBS
+ bool "Prohibit Zbs instruction sequences"
+ depends on NONPORTABLE
+ help
+ Regardless of if the platform supports Zbs instructions,
+ prohibit the kernel from emitting Zbs instructions.
+
+config PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBS
+ bool "Allow Zbs instruction sequences if supported"
+ depends on TOOLCHAIN_HAS_ZBS
+ depends on RISCV_ALTERNATIVE
+ select RISCV_ISA_ZBS
+ help
+ Add support for enabling optimisations in the kernel when the
+ Zbs extension is detected at boot.
+
+config PLATFORM_SUPPORTS_RISCV_ISA_ZBS
+ bool "Emit Zbs instructions when building Linux"
+ depends on TOOLCHAIN_HAS_ZBS
+ depends on NONPORTABLE
+ select RISCV_ISA_ZBS
+ help
+ Adds "zbs" to the ISA subsets that the toolchain is allowed to emit
+ when building Linux, which results in Zbs instructions in the
+ Linux binary. This option produces a kernel that will not run on
+ systems that do not support the Zbs extension.
+
+endchoice
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 57457d15e9a4..80ff8503196a 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -69,6 +69,7 @@ riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_C) := $(riscv-march-y)c
riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBA) := $(riscv-march-y)_zba
riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB) := $(riscv-march-y)_zbb
riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBC) := $(riscv-march-y)_zbc
+riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBS) := $(riscv-march-y)_zbs

ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC
KBUILD_CFLAGS += -Wa,-misa-spec=2.2

--
2.44.0


2024-05-08 01:38:47

by Charlie Jenkins

[permalink] [raw]
Subject: [PATCH v2 6/8] riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_ZBA Kconfig option

Zba can optimize kernel instruction sequences. Add a config option
PLATFORM_SUPPORTS_RISCV_ISA_ZBA that allows arbitrary Zba instruction
sequences to be emitted by the compiler.

The existing "RISCV_ISA_ZBA" option is repurposed to be used to by kernel
code to determine if either PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBA or
PLATFORM_SUPPORTS_RISCV_ISA_ZBA has been set.

Signed-off-by: Charlie Jenkins <[email protected]>
---
arch/riscv/Kconfig.isa | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++
arch/riscv/Makefile | 1 +
2 files changed, 53 insertions(+)

diff --git a/arch/riscv/Kconfig.isa b/arch/riscv/Kconfig.isa
index 49bed8c75263..e7f28dc44137 100644
--- a/arch/riscv/Kconfig.isa
+++ b/arch/riscv/Kconfig.isa
@@ -186,6 +186,58 @@ config PLATFORM_MAY_SUPPORT_RISCV_ISA_SVPBMT

If you don't know what to do here, say Y.

+
+config TOOLCHAIN_HAS_ZBA
+ bool
+ default y
+ depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zba)
+ depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zba)
+ depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
+ depends on AS_HAS_OPTION_ARCH
+
+config RISCV_ISA_ZBA
+ bool
+
+choice
+ prompt "Zba extension for address generation instructions support"
+ default PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBA
+ help
+ This selects the level of support for Zba instructions to be
+ built into the Linux Kernel. This does not impact whether Zba
+ instructions are allowed to be emitted by user-space code.
+
+ The Zba extension provides instructions to accelerate a number
+ of address generation instruction sequences.
+
+config PROHIBIT_RISCV_ISA_ZBA
+ bool "Prohibit Zba instruction sequences"
+ depends on NONPORTABLE
+ help
+ Regardless of if the platform supports Zba instructions,
+ prohibit the kernel from emitting Zba instructions.
+
+config PLATFORM_MAY_SUPPORT_RISCV_ISA_ZBA
+ bool "Allow Zba instruction sequences if supported"
+ depends on TOOLCHAIN_HAS_ZBB
+ depends on RISCV_ALTERNATIVE
+ select RISCV_ISA_ZBA
+ help
+ Add support for enabling optimisations in the kernel when the
+ Zba extension is detected at boot.
+
+config PLATFORM_SUPPORTS_RISCV_ISA_ZBA
+ bool "Emit Zba instructions when building Linux"
+ depends on TOOLCHAIN_HAS_ZBB
+ depends on NONPORTABLE
+ select RISCV_ISA_ZBA
+ help
+ Adds "zba" to the ISA subsets that the toolchain is allowed to emit
+ when building Linux, which results in Zba instructions in the
+ Linux binary. This option produces a kernel that will not run on
+ systems that do not support the Zba extension.
+
+endchoice
+
config TOOLCHAIN_HAS_ZBB
bool
default y
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 6b0c3a782f99..c8ec38b9880a 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -66,6 +66,7 @@ riscv-march-$(CONFIG_ARCH_RV32I) := rv32ima
riscv-march-$(CONFIG_ARCH_RV64I) := rv64ima
riscv-march-$(CONFIG_FPU) := $(riscv-march-y)fd
riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_C) := $(riscv-march-y)c
+riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBA) := $(riscv-march-y)_zba
riscv-march-$(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_ZBB) := $(riscv-march-y)_zbb

ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC

--
2.44.0


2024-05-08 17:06:25

by Charlie Jenkins

[permalink] [raw]
Subject: Re: [PATCH v2 3/8] riscv: Add PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT Kconfig option

On Wed, May 08, 2024 at 10:00:48AM +0100, Ben Dooks wrote:
> On 08/05/2024 02:36, Charlie Jenkins wrote:
> > The existing "RISCV_ISA_SVNAPOT" option is repurposed to be used to by
> > kernel code to determine if either
> > PLATFORM_MAY_SUPPORT_RISCV_ISA_SVNAPOT or
> > PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT has been set.
> >
> > PLATFORM_MAY_SUPPORT_RISCV_ISA_SVNAPOT will check if the hardware
> > supports Svnapot before using it, while
> > PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT will assume that the hardware
> > supports Svnapot.
> >
> > Signed-off-by: Charlie Jenkins <[email protected]>
> > ---
> > arch/riscv/Kconfig | 19 -----------------
> > arch/riscv/Kconfig.isa | 44 ++++++++++++++++++++++++++++++++++++++++
> > arch/riscv/include/asm/pgtable.h | 3 ++-
> > 3 files changed, 46 insertions(+), 20 deletions(-)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index c2e9eded0a7d..3c1960e8cd7c 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -484,25 +484,6 @@ config RISCV_ALTERNATIVE_EARLY
> > help
> > Allows early patching of the kernel for special errata
> > -config RISCV_ISA_SVNAPOT
> > - bool "Svnapot extension support for supervisor mode NAPOT pages"
> > - depends on 64BIT && MMU
> > - depends on RISCV_ALTERNATIVE
> > - default y
> > - help
> > - Add support for the Svnapot ISA-extension in the kernel when it
> > - is detected at boot.
> > -
> > - The Svnapot extension is used to mark contiguous PTEs as a range
> > - of contiguous virtual-to-physical translations for a naturally
> > - aligned power-of-2 (NAPOT) granularity larger than the base 4KB page
> > - size. When HUGETLBFS is also selected this option unconditionally
> > - allocates some memory for each NAPOT page size supported by the kernel.
> > - When optimizing for low memory consumption and for platforms without
> > - the Svnapot extension, it may be better to say N here.
> > -
> > - If you don't know what to do here, say Y.
> > -
> > config RISCV_ISA_SVPBMT
> > bool "Svpbmt extension support for supervisor mode page-based memory types"
> > depends on 64BIT && MMU
> > diff --git a/arch/riscv/Kconfig.isa b/arch/riscv/Kconfig.isa
> > index 0663c98b5b17..37585bcd763e 100644
> > --- a/arch/riscv/Kconfig.isa
> > +++ b/arch/riscv/Kconfig.isa
> > @@ -124,3 +124,47 @@ config RISCV_ISA_V_PREEMPTIVE
> > This config allows kernel to run SIMD without explicitly disable
> > preemption. Enabling this config will result in higher memory
> > consumption due to the allocation of per-task's kernel Vector context.
> > +
> > +config RISCV_ISA_SVNAPOT
> > + bool
> > +
> > +choice
> > + prompt "Svnapot extension support for supervisor mode NAPOT pages"
> > + default PLATFORM_MAY_SUPPORT_RISCV_ISA_SVNAPOT
> > + help
> > + This selects the level of support for Svnapot in the Linux Kernel.
> > +
> > + The Svnapot extension is used to mark contiguous PTEs as a range
> > + of contiguous virtual-to-physical translations for a naturally
> > + aligned power-of-2 (NAPOT) granularity larger than the base 4KB page
> > + size. When HUGETLBFS is also selected this option unconditionally
> > + allocates some memory for each NAPOT page size supported by the kernel.
> > + When optimizing for low memory consumption and for platforms without
> > + the Svnapot extension, it may be better to prohibit Svnapot.
> > +
> > +config PROHIBIT_RISCV_ISA_SVNAPOT
> > + bool "Prohibit Svnapot extension"
> > + help
> > + Regardless of if the platform supports Svnapot, prohibit the kernel
> > + from using Svnapot.
> > +
> > +config PLATFORM_MAY_SUPPORT_RISCV_ISA_SVNAPOT
> > + bool "Allow Svnapot extension if supported"
> > + depends on 64BIT && MMU
> > + depends on RISCV_ALTERNATIVE
> > + select RISCV_ISA_SVNAPOT
> > + help
> > + Add support for the Svnapot ISA-extension in the kernel when it
> > + is detected at boot.
> > +
> > +config PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT
> > + bool "Emit Svnapot mappings when building Linux"
> > + depends on 64BIT && MMU
> > + depends on NONPORTABLE
> > + select RISCV_ISA_SVNAPOT
> > + help
> > + Compile a kernel that assumes that the platform supports Svnapot.
> > + This option produces a kernel that will not run on systems that do
> > + not support Svnapot.
> > +
> > +endchoice
> > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> > index 6afd6bb4882e..432be9691b78 100644
> > --- a/arch/riscv/include/asm/pgtable.h
> > +++ b/arch/riscv/include/asm/pgtable.h
> > @@ -289,7 +289,8 @@ static inline pte_t pud_pte(pud_t pud)
> > static __always_inline bool has_svnapot(void)
> > {
> > - return riscv_has_extension_likely(RISCV_ISA_EXT_SVNAPOT);
> > + return IS_ENABLED(CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT) ||
> > + riscv_has_extension_likely(RISCV_ISA_EXT_SVNAPOT);
>
> could you add the IS_ENABLED(*) check into riscv_has_extension_likely
> and other such functions?

I wasn't sure how to support that. An option I was debating about this
was fixing up riscv_has_extension_likely() so that it's a macro and
SVNAPOT could be expanded to both
CONFIG_PLATFORM_SUPPORTS_RISCV_ISA_SVNAPOT and RISCV_ISA_EXT_SVNAPOT.

- Charlie

>
>
> --
> Ben Dooks http://www.codethink.co.uk/
> Senior Engineer Codethink - Providing Genius
>
> https://www.codethink.co.uk/privacy.html
>

2024-05-09 20:46:07

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] riscv: Support compiling the kernel with more extensions

Hey Charlie,

Don't mean to subject you to a rant here, but that's kinda what it seems
to have become. I wish the Zbb example I use below was something someone
else had written, so that it doesn't feel like I am tryna kick you while
you're down, but that was the thing I happened across this evening :/

On Tue, May 07, 2024 at 06:36:26PM -0700, Charlie Jenkins wrote:
> The kernel currently has the restriction that it can only be compiled
> with the extensions that are hardcoded in arch/risc/Makefile.
>
> Any extension that is not listed in the Makefile can still be used by
> explicitly writing the assembly and using alternative patching.
>
> This series introduces Kconfig options that allow the kernel to be
> compiled with additional extensions.
>
> The motivation for this patch is the performance improvements that come
> along with compiling the kernel with these extra instructions. Allowing
> the compiler to emit arbitrary Zb* instructions achieves a 4.9%
> reduction of dynamic instruction count for a test ran in Spike that
> boots the kernel and runs a user space program that prints to the
> console.
>
> Additionally, alternatives that check if an extension is supported can
> be eliminated when the Kconfig options to assume hardware support is
> enabled.

I brought this up yesterday at the weekly patchwork call and meant to
reply here yesterday, but I didn't get a chance to. I'll start off with
my thoughts on the idea and the implementation and then mention some of
what was said at the call.

Firstly, I don't like an implementation of this behaviour that requires
doing ifdeffery around alternative sites. I think that iff this is done,
the alternative itself should be evaluated at compile time, rather than
having to add more decoration to callsites. That becomes particular
important in the cases where the alternative may not be a simple a or b
case, although I don't think there are any of those in the extensions
you've looked at so far - or at least, you've not tackled those cases.

I am curious about the Svpbmt patch, as you say
> Svpbmt would not benefit from having PLATFORM_SUPPORTS_RISCV_ISA_SVPBMT
> so just move the definition of RISCV_ISA_SVPBMT to Kconfig.isa.
without any any justification for why it would not benefit. There's
alternatives in the codebase right now for Svpbmt, why wouldn't those
get evaluated at build time also? Or why not Zicbom? Was your rationale
for the extensions chosen just the ones that the compiler can actually
generate code for?
That aside, the series seems to address the easiest parts of doing
compile-time extension configuration rather than the messier cases like
Zicbom. To me it seems like the messier cases is actually where we should
be starting, so that we have a scheme that works well.

Ben mentioned something along the same lines for the
has_extension_likely() stuff, which should be far simpler, and can be
implemented via a macro, as you already pointed out.

I did notice that you left the riscv_isa_extension_available() stuff
alone. I think that's reasonable as that code serves more than one
purpose and is intended for use in either in probe functions where
there's no perf (or even code-size impact really, just disable the
driver if you don't want it) or in cases where the user provides its own
bitmap, like KVM.

I haven't actually reviewed the content line by line yet, so I don't
have any detailed comment on any patches, but I think the two things
being done here deserve to be split apart - the first element is
evaluating things that are using alternatives at build time and the
other is adding extensions to the toolchain's march.

Moving onto the objection to the series that I have though, at least at
the moment. Adding more and more optimisations to the kernel already has
potential to balloon to silly levels, and that's before we even consider
the permutations of different build-time options. Both of those things
feel like "where does it stop?" situation, with every single extension
that could have code-gen impact becoming another build-time option for
the kernel. As a result, I'm not convinced that we should do this at all,
and I am starting to wonder about some of stuff that we have already
merged..

I don't think the configurability this series adds is worth the burden of
maintaining support for all the various configurations you're proposing
here (and the others that someone will come along with the week after
this would be merged. After all, with extant hardware that distros are
supporting, albeit in developer or bring-up type builds, one of these
options could even be enabled. Which I suppose could be translated to
a NAK from me on doing something like this at the moment...

Palmer suggested in the weekly call that what would make more sense is
having established bases, that align with what distros are likely to
ship, which probably means something approximating the mandatory set for
profiles, although he also said that the rva23 profiles had already been
given the kibosh by folks. He'll have to provide more information on
that one though.
I think that that seems like a sane approach, as it would produce a far
more limited set of combinations to maintain, but it also means not doing
something like this until the point that distros commit to some specific
set of extensions that is not rv64gc... As well as reducing the
combinations that we need to reason about as developers, I think that the
"user story" for people deciding what is worth enabling in their kernel
config before simpler too.

* Something else that came up during that call, and I think it was
Palmer's suggestion was having a hard think about what we are
currently accepting optimisations for in the kernel. I think we need to
up the "burden of proof" for what we will merge optimisations for to
things that are demonstrated to have significant benefits. I don't mean
to single you out here, cos I did ack the patch after all and it was
just the random example I stumbled on this evening while looking at some
alternative users in the course of writing a reply here. Take this code
for example:

/*
* ZBB only saves three instructions on 32-bit and five on 64-bit so not
* worth checking if supported without Alternatives.
*/
if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) &&
IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
unsigned long fold_temp;

asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
RISCV_ISA_EXT_ZBB, 1)
:
:
:
: no_zbb);

if (IS_ENABLED(CONFIG_32BIT)) {
asm(".option push \n\
.option arch,+zbb \n\
not %[fold_temp], %[csum] \n\
rori %[csum], %[csum], 16 \n\
sub %[csum], %[fold_temp], %[csum] \n\
.option pop"
: [csum] "+r" (csum), [fold_temp] "=&r" (fold_temp));
} else {
asm(".option push \n\
.option arch,+zbb \n\
rori %[fold_temp], %[csum], 32 \n\
add %[csum], %[fold_temp], %[csum] \n\
srli %[csum], %[csum], 32 \n\
not %[fold_temp], %[csum] \n\
roriw %[csum], %[csum], 16 \n\
subw %[csum], %[fold_temp], %[csum] \n\
.option pop"
: [csum] "+r" (csum), [fold_temp] "=&r" (fold_temp));
}
return (__force __sum16)(csum >> 16);
}

The comment there made me think as to why we even have this optimisation
for Zbb at all - is the saving of 3 - 1 or 5 - 1 instructions actually
worth having 3 code paths? The commit message for this contains no
information on the performance benefit of the code at, and while the cover
letter has some information, it was not actually tested in hardware and
does not look to be a real-word benchmark. This one is already merged,
but something like this in the future would really need to be subjected to
significantly more scrutiny! At the very least, "optimisations" need to be
proved to be beneficial in hardware.

Anyways, that's my thoughts on this. IIRC it was mainly Palmer and I
doing the talking about this on the call, with Paul I think having some
comments. Hopefully Palmer can chime in :)

Cheers,
Conor.


Attachments:
(No filename) (7.98 kB)
signature.asc (235.00 B)
Download all attachments

2024-05-09 21:16:32

by Charlie Jenkins

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] riscv: Support compiling the kernel with more extensions

On Thu, May 09, 2024 at 09:25:10PM +0100, Conor Dooley wrote:
> Hey Charlie,
>
> Don't mean to subject you to a rant here, but that's kinda what it seems
> to have become. I wish the Zbb example I use below was something someone
> else had written, so that it doesn't feel like I am tryna kick you while
> you're down, but that was the thing I happened across this evening :/

Don't feel bad! I wrote it so I can take it :)

>
> On Tue, May 07, 2024 at 06:36:26PM -0700, Charlie Jenkins wrote:
> > The kernel currently has the restriction that it can only be compiled
> > with the extensions that are hardcoded in arch/risc/Makefile.
> >
> > Any extension that is not listed in the Makefile can still be used by
> > explicitly writing the assembly and using alternative patching.
> >
> > This series introduces Kconfig options that allow the kernel to be
> > compiled with additional extensions.
> >
> > The motivation for this patch is the performance improvements that come
> > along with compiling the kernel with these extra instructions. Allowing
> > the compiler to emit arbitrary Zb* instructions achieves a 4.9%
> > reduction of dynamic instruction count for a test ran in Spike that
> > boots the kernel and runs a user space program that prints to the
> > console.
> >
> > Additionally, alternatives that check if an extension is supported can
> > be eliminated when the Kconfig options to assume hardware support is
> > enabled.
>
> I brought this up yesterday at the weekly patchwork call and meant to
> reply here yesterday, but I didn't get a chance to. I'll start off with
> my thoughts on the idea and the implementation and then mention some of
> what was said at the call.
>
> Firstly, I don't like an implementation of this behaviour that requires
> doing ifdeffery around alternative sites. I think that iff this is done,
> the alternative itself should be evaluated at compile time, rather than
> having to add more decoration to callsites. That becomes particular
> important in the cases where the alternative may not be a simple a or b
> case, although I don't think there are any of those in the extensions
> you've looked at so far - or at least, you've not tackled those cases.
>
> I am curious about the Svpbmt patch, as you say
> > Svpbmt would not benefit from having PLATFORM_SUPPORTS_RISCV_ISA_SVPBMT
> > so just move the definition of RISCV_ISA_SVPBMT to Kconfig.isa.
> without any any justification for why it would not benefit. There's
> alternatives in the codebase right now for Svpbmt, why wouldn't those
> get evaluated at build time also? Or why not Zicbom? Was your rationale
> for the extensions chosen just the ones that the compiler can actually
> generate code for?

It's only used in a place that has errata so I wasn't sure how to
handle that.

> That aside, the series seems to address the easiest parts of doing
> compile-time extension configuration rather than the messier cases like
> Zicbom. To me it seems like the messier cases is actually where we should
> be starting, so that we have a scheme that works well.

That's good advice. I wanted to send out something to start the
conversation on what people were interested in optimizing here. I can
look more into Zicbom.

>
> Ben mentioned something along the same lines for the
> has_extension_likely() stuff, which should be far simpler, and can be
> implemented via a macro, as you already pointed out.

I was hesistant to change "too much" as I was expecting push back and
didn't want to have to re-write everything ;)

>
> I did notice that you left the riscv_isa_extension_available() stuff
> alone. I think that's reasonable as that code serves more than one
> purpose and is intended for use in either in probe functions where
> there's no perf (or even code-size impact really, just disable the
> driver if you don't want it) or in cases where the user provides its own
> bitmap, like KVM.
>
> I haven't actually reviewed the content line by line yet, so I don't
> have any detailed comment on any patches, but I think the two things
> being done here deserve to be split apart - the first element is
> evaluating things that are using alternatives at build time and the
> other is adding extensions to the toolchain's march.

That will double the size of the series but if you think that's better
than I can do that.

>
> Moving onto the objection to the series that I have though, at least at
> the moment. Adding more and more optimisations to the kernel already has
> potential to balloon to silly levels, and that's before we even consider
> the permutations of different build-time options. Both of those things
> feel like "where does it stop?" situation, with every single extension
> that could have code-gen impact becoming another build-time option for
> the kernel. As a result, I'm not convinced that we should do this at all,
> and I am starting to wonder about some of stuff that we have already
> merged..
>

Vendors that expect a high level of performance need a way to be able to
compile the kernel with more extensions than the base extensions. We are
leaving 5% that can easily be gained by not allowing this.

> I don't think the configurability this series adds is worth the burden of
> maintaining support for all the various configurations you're proposing
> here (and the others that someone will come along with the week after
> this would be merged. After all, with extant hardware that distros are
> supporting, albeit in developer or bring-up type builds, one of these
> options could even be enabled. Which I suppose could be translated to
> a NAK from me on doing something like this at the moment...

By migrating everything into more refined macros I think I can ease this
burden. I don't see this as a burden, these options are all so closly
tied to each other and only matter when a kernel developer explicitly
wants to use an extension. If this is all wrapped up into the macros
that check if an extension is available it won't even be an extra step
than what it currently is.

>
> Palmer suggested in the weekly call that what would make more sense is
> having established bases, that align with what distros are likely to
> ship, which probably means something approximating the mandatory set for
> profiles, although he also said that the rva23 profiles had already been
> given the kibosh by folks. He'll have to provide more information on
> that one though.
> I think that that seems like a sane approach, as it would produce a far
> more limited set of combinations to maintain, but it also means not doing
> something like this until the point that distros commit to some specific
> set of extensions that is not rv64gc... As well as reducing the
> combinations that we need to reason about as developers, I think that the
> "user story" for people deciding what is worth enabling in their kernel
> config before simpler too.

There is a chicken and the egg problem here. The
hardware/software/distros all want to support the same thing. Somebody
needs to step up and make a decision. With a patch like this, a distro
can see all of the functionality and select what they want. This can
then be rolled up into a config that selects something like all of the
bitmanip options.

>
> * Something else that came up during that call, and I think it was
> Palmer's suggestion was having a hard think about what we are
> currently accepting optimisations for in the kernel. I think we need to
> up the "burden of proof" for what we will merge optimisations for to
> things that are demonstrated to have significant benefits. I don't mean
> to single you out here, cos I did ack the patch after all and it was
> just the random example I stumbled on this evening while looking at some
> alternative users in the course of writing a reply here. Take this code
> for example:
>
> /*
> * ZBB only saves three instructions on 32-bit and five on 64-bit so not
> * worth checking if supported without Alternatives.
> */
> if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) &&
> IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
> unsigned long fold_temp;
>
> asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
> RISCV_ISA_EXT_ZBB, 1)
> :
> :
> :
> : no_zbb);
>
> if (IS_ENABLED(CONFIG_32BIT)) {
> asm(".option push \n\
> .option arch,+zbb \n\
> not %[fold_temp], %[csum] \n\
> rori %[csum], %[csum], 16 \n\
> sub %[csum], %[fold_temp], %[csum] \n\
> .option pop"
> : [csum] "+r" (csum), [fold_temp] "=&r" (fold_temp));
> } else {
> asm(".option push \n\
> .option arch,+zbb \n\
> rori %[fold_temp], %[csum], 32 \n\
> add %[csum], %[fold_temp], %[csum] \n\
> srli %[csum], %[csum], 32 \n\
> not %[fold_temp], %[csum] \n\
> roriw %[csum], %[csum], 16 \n\
> subw %[csum], %[fold_temp], %[csum] \n\
> .option pop"
> : [csum] "+r" (csum), [fold_temp] "=&r" (fold_temp));
> }
> return (__force __sum16)(csum >> 16);
> }
>
> The comment there made me think as to why we even have this optimisation
> for Zbb at all - is the saving of 3 - 1 or 5 - 1 instructions actually
> worth having 3 code paths? The commit message for this contains no
> information on the performance benefit of the code at, and while the cover
> letter has some information, it was not actually tested in hardware and
> does not look to be a real-word benchmark. This one is already merged,
> but something like this in the future would really need to be subjected to
> significantly more scrutiny! At the very least, "optimisations" need to be
> proved to be beneficial in hardware.

I put the justification in the cover letter of the series:

"Tested on QEMU, this series allows the CHECKSUM_KUNIT tests to complete
an average of 50.9% faster."

I did a lot of testing locally to ensure that every combination was as
performant as it possibly could be. I did not provide numbers for every
case simply because the combination with 64-bit and Zbb was the
primary target of the series and nobody asked about the other cases.

There is pretty much only this code and the bitops optimization in the
kernel that try to do anything extreme for the sake of performance.
These checksum functions are very critical to performance as these
checksums are computed on every network packet that is received by the
kernel. Networking drivers rely on these functions and they need to be
as fast as possible. 50% improvement is very good even if it's only
qemu.

We could just say we don't care about performance if you are running
32-bit linux or don't have Zbb, but we would be making that decision
because we don't feel like maintaining the code. The code was written,
tested, reviewed, and it provided large performance gains. I fail to
understand why this is a burden to maintain.

- Charlie

>
> Anyways, that's my thoughts on this. IIRC it was mainly Palmer and I
> doing the talking about this on the call, with Paul I think having some
> comments. Hopefully Palmer can chime in :)
>
> Cheers,
> Conor.



2024-05-09 22:08:46

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] riscv: Support compiling the kernel with more extensions

On Thu, May 09, 2024 at 02:16:17PM -0700, Charlie Jenkins wrote:
> On Thu, May 09, 2024 at 09:25:10PM +0100, Conor Dooley wrote:
> > On Tue, May 07, 2024 at 06:36:26PM -0700, Charlie Jenkins wrote:
> > > The kernel currently has the restriction that it can only be compiled
> > > with the extensions that are hardcoded in arch/risc/Makefile.
> > >
> > > Any extension that is not listed in the Makefile can still be used by
> > > explicitly writing the assembly and using alternative patching.
> > >
> > > This series introduces Kconfig options that allow the kernel to be
> > > compiled with additional extensions.
> > >
> > > The motivation for this patch is the performance improvements that come
> > > along with compiling the kernel with these extra instructions. Allowing
> > > the compiler to emit arbitrary Zb* instructions achieves a 4.9%
> > > reduction of dynamic instruction count for a test ran in Spike that
> > > boots the kernel and runs a user space program that prints to the
> > > console.
> > >
> > > Additionally, alternatives that check if an extension is supported can
> > > be eliminated when the Kconfig options to assume hardware support is
> > > enabled.
> >
> > I brought this up yesterday at the weekly patchwork call and meant to
> > reply here yesterday, but I didn't get a chance to. I'll start off with
> > my thoughts on the idea and the implementation and then mention some of
> > what was said at the call.
> >
> > Firstly, I don't like an implementation of this behaviour that requires
> > doing ifdeffery around alternative sites. I think that iff this is done,
> > the alternative itself should be evaluated at compile time, rather than
> > having to add more decoration to callsites. That becomes particular
> > important in the cases where the alternative may not be a simple a or b
> > case, although I don't think there are any of those in the extensions
> > you've looked at so far - or at least, you've not tackled those cases.
> >
> > I am curious about the Svpbmt patch, as you say
> > > Svpbmt would not benefit from having PLATFORM_SUPPORTS_RISCV_ISA_SVPBMT
> > > so just move the definition of RISCV_ISA_SVPBMT to Kconfig.isa.
> > without any any justification for why it would not benefit. There's
> > alternatives in the codebase right now for Svpbmt, why wouldn't those
> > get evaluated at build time also? Or why not Zicbom? Was your rationale
> > for the extensions chosen just the ones that the compiler can actually
> > generate code for?
>
> It's only used in a place that has errata so I wasn't sure how to
> handle that.
>
> > That aside, the series seems to address the easiest parts of doing
> > compile-time extension configuration rather than the messier cases like
> > Zicbom. To me it seems like the messier cases is actually where we should
> > be starting, so that we have a scheme that works well.
>
> That's good advice. I wanted to send out something to start the
> conversation on what people were interested in optimizing here. I can
> look more into Zicbom.
>
> >
> > Ben mentioned something along the same lines for the
> > has_extension_likely() stuff, which should be far simpler, and can be
> > implemented via a macro, as you already pointed out.
>
> I was hesistant to change "too much" as I was expecting push back and
> didn't want to have to re-write everything ;)
>
> >
> > I did notice that you left the riscv_isa_extension_available() stuff
> > alone. I think that's reasonable as that code serves more than one
> > purpose and is intended for use in either in probe functions where
> > there's no perf (or even code-size impact really, just disable the
> > driver if you don't want it) or in cases where the user provides its own
> > bitmap, like KVM.
> >
> > I haven't actually reviewed the content line by line yet, so I don't
> > have any detailed comment on any patches, but I think the two things
> > being done here deserve to be split apart - the first element is
> > evaluating things that are using alternatives at build time and the
> > other is adding extensions to the toolchain's march.
>
> That will double the size of the series but if you think that's better
> than I can do that.
>
> >
> > Moving onto the objection to the series that I have though, at least at
> > the moment. Adding more and more optimisations to the kernel already has
> > potential to balloon to silly levels, and that's before we even consider
> > the permutations of different build-time options. Both of those things
> > feel like "where does it stop?" situation, with every single extension
> > that could have code-gen impact becoming another build-time option for
> > the kernel. As a result, I'm not convinced that we should do this at all,
> > and I am starting to wonder about some of stuff that we have already
> > merged..
> >
>
> Vendors that expect a high level of performance need a way to be able to
> compile the kernel with more extensions than the base extensions. We are
> leaving 5% that can easily be gained by not allowing this.

Maybe we are, but if people want their 5% they need to show up with
evidence that there is actually 5% to be gained. Also, if you read on, I
am not saying we should never do this, and leave that 5% permanently on
the table, only that we should significantly constrain the permutations
that we are allowing. And honestly, if some vendor is really desperate to
compile the kernel with Zxy in march but not whatever other extensions
that may be in a profile's mandatory set, they can always do it out of
tree. Carrying a single out of tree patch is nothing to most vendors...

> > I don't think the configurability this series adds is worth the burden of
> > maintaining support for all the various configurations you're proposing
> > here (and the others that someone will come along with the week after
> > this would be merged. After all, with extant hardware that distros are
> > supporting, albeit in developer or bring-up type builds, one of these
> > options could even be enabled. Which I suppose could be translated to
> > a NAK from me on doing something like this at the moment...
>
> By migrating everything into more refined macros I think I can ease this
> burden. I don't see this as a burden, these options are all so closly
> tied to each other

What does "closely tied to each other" actually mean?

> and only matter when a kernel developer explicitly
> wants to use an extension.

Unless your definition of "kernel developer" extends to "people that
compile their own kernel based on menuconfig", then I don't think you
and I are on the same page about what the series actually does.
Remember, there's the making alternatives and other optimisations
unconditional /and/ the addition of stuff to march going on in this
series.

> If this is all wrapped up into the macros
> that check if an extension is available it won't even be an extra step
> than what it currently is.
>
> >
> > Palmer suggested in the weekly call that what would make more sense is
> > having established bases, that align with what distros are likely to
> > ship, which probably means something approximating the mandatory set for
> > profiles, although he also said that the rva23 profiles had already been
> > given the kibosh by folks. He'll have to provide more information on
> > that one though.
> > I think that that seems like a sane approach, as it would produce a far
> > more limited set of combinations to maintain, but it also means not doing
> > something like this until the point that distros commit to some specific
> > set of extensions that is not rv64gc... As well as reducing the
> > combinations that we need to reason about as developers, I think that the
> > "user story" for people deciding what is worth enabling in their kernel
> > config before simpler too.
>
> There is a chicken and the egg problem here. The
> hardware/software/distros all want to support the same thing. Somebody
> needs to step up and make a decision. With a patch like this, a distro
> can see all of the functionality and select what they want. This can
> then be rolled up into a config that selects something like all of the
> bitmanip options.

I don't think there's a chicken and egg problem, or at least not one
that kernel config options for every extension solves. I expect distros
to work with RVI to define something, which may well be the platform
spec (hopefully it's the platform spec...) and then we can make that a
config option.

> > * Something else that came up during that call, and I think it was
> > Palmer's suggestion was having a hard think about what we are
> > currently accepting optimisations for in the kernel. I think we need to
> > up the "burden of proof" for what we will merge optimisations for to
> > things that are demonstrated to have significant benefits. I don't mean
> > to single you out here, cos I did ack the patch after all and it was
> > just the random example I stumbled on this evening while looking at some
> > alternative users in the course of writing a reply here. Take this code
> > for example:
> >
> > /*
> > * ZBB only saves three instructions on 32-bit and five on 64-bit so not
> > * worth checking if supported without Alternatives.
> > */
> > if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) &&
> > IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
> > unsigned long fold_temp;
> >
> > asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
> > RISCV_ISA_EXT_ZBB, 1)
> > :
> > :
> > :
> > : no_zbb);
> >
> > if (IS_ENABLED(CONFIG_32BIT)) {
> > asm(".option push \n\
> > .option arch,+zbb \n\
> > not %[fold_temp], %[csum] \n\
> > rori %[csum], %[csum], 16 \n\
> > sub %[csum], %[fold_temp], %[csum] \n\
> > .option pop"
> > : [csum] "+r" (csum), [fold_temp] "=&r" (fold_temp));
> > } else {
> > asm(".option push \n\
> > .option arch,+zbb \n\
> > rori %[fold_temp], %[csum], 32 \n\
> > add %[csum], %[fold_temp], %[csum] \n\
> > srli %[csum], %[csum], 32 \n\
> > not %[fold_temp], %[csum] \n\
> > roriw %[csum], %[csum], 16 \n\
> > subw %[csum], %[fold_temp], %[csum] \n\
> > .option pop"
> > : [csum] "+r" (csum), [fold_temp] "=&r" (fold_temp));
> > }
> > return (__force __sum16)(csum >> 16);
> > }
> >
> > The comment there made me think as to why we even have this optimisation
> > for Zbb at all - is the saving of 3 - 1 or 5 - 1 instructions actually
> > worth having 3 code paths? The commit message for this contains no
> > information on the performance benefit of the code at, and while the cover
> > letter has some information, it was not actually tested in hardware and
> > does not look to be a real-word benchmark. This one is already merged,
> > but something like this in the future would really need to be subjected to
> > significantly more scrutiny! At the very least, "optimisations" need to be
> > proved to be beneficial in hardware.
>
> I put the justification in the cover letter of the series:

If you read what I wrote I acknowledge that there's info in the cover,
but if you continue reading you'll note I said that "it was not tested
in hardware and does not look to be a real-word [sic] benchmark".

> "Tested on QEMU, this series allows the CHECKSUM_KUNIT tests to complete
> an average of 50.9% faster."
>
> I did a lot of testing locally to ensure that every combination was as
> performant as it possibly could be. I did not provide numbers for every
> case simply because the combination with 64-bit and Zbb was the
> primary target of the series and nobody asked about the other cases.
>
> There is pretty much only this code and the bitops optimization in the
> kernel that try to do anything extreme for the sake of performance.
> These checksum functions are very critical to performance as these
> checksums are computed on every network packet that is received by the
> kernel. Networking drivers rely on these functions and they need to be
> as fast as possible. 50% improvement is very good even if it's only
> qemu.
>
> We could just say we don't care about performance if you are running
> 32-bit linux or don't have Zbb, but we would be making that decision
> because we don't feel like maintaining the code. The code was written,
> tested, reviewed, and it provided large performance gains. I fail to
> understand why this is a burden to maintain.

Maybe if you read what I wrote you'd see what I was getting at, or maybe
not as I might not have been sufficiently clear. I'm not saying that this
particular optimisation is not worth having, but rather than I wanted to
see why this particular optimisation was worth maintaining 3 code paths
for but the commit message does not detail any of the benefits, and
looking at the cover I discovered that it was not tested in hardware nor
seemingly with a real test case.
I am saying that the future standard should require both of those things,
not that I think your optimisation is not worthwhile and should therefore
be thrown out.

Hope that helps,
Conor.


Attachments:
(No filename) (13.02 kB)
signature.asc (235.00 B)
Download all attachments

2024-05-09 22:55:51

by Charlie Jenkins

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] riscv: Support compiling the kernel with more extensions

On Thu, May 09, 2024 at 11:08:34PM +0100, Conor Dooley wrote:
> On Thu, May 09, 2024 at 02:16:17PM -0700, Charlie Jenkins wrote:
> > On Thu, May 09, 2024 at 09:25:10PM +0100, Conor Dooley wrote:
> > > On Tue, May 07, 2024 at 06:36:26PM -0700, Charlie Jenkins wrote:
> > > > The kernel currently has the restriction that it can only be compiled
> > > > with the extensions that are hardcoded in arch/risc/Makefile.
> > > >
> > > > Any extension that is not listed in the Makefile can still be used by
> > > > explicitly writing the assembly and using alternative patching.
> > > >
> > > > This series introduces Kconfig options that allow the kernel to be
> > > > compiled with additional extensions.
> > > >
> > > > The motivation for this patch is the performance improvements that come
> > > > along with compiling the kernel with these extra instructions. Allowing
> > > > the compiler to emit arbitrary Zb* instructions achieves a 4.9%
> > > > reduction of dynamic instruction count for a test ran in Spike that
> > > > boots the kernel and runs a user space program that prints to the
> > > > console.
> > > >
> > > > Additionally, alternatives that check if an extension is supported can
> > > > be eliminated when the Kconfig options to assume hardware support is
> > > > enabled.
> > >
> > > I brought this up yesterday at the weekly patchwork call and meant to
> > > reply here yesterday, but I didn't get a chance to. I'll start off with
> > > my thoughts on the idea and the implementation and then mention some of
> > > what was said at the call.
> > >
> > > Firstly, I don't like an implementation of this behaviour that requires
> > > doing ifdeffery around alternative sites. I think that iff this is done,
> > > the alternative itself should be evaluated at compile time, rather than
> > > having to add more decoration to callsites. That becomes particular
> > > important in the cases where the alternative may not be a simple a or b
> > > case, although I don't think there are any of those in the extensions
> > > you've looked at so far - or at least, you've not tackled those cases.
> > >
> > > I am curious about the Svpbmt patch, as you say
> > > > Svpbmt would not benefit from having PLATFORM_SUPPORTS_RISCV_ISA_SVPBMT
> > > > so just move the definition of RISCV_ISA_SVPBMT to Kconfig.isa.
> > > without any any justification for why it would not benefit. There's
> > > alternatives in the codebase right now for Svpbmt, why wouldn't those
> > > get evaluated at build time also? Or why not Zicbom? Was your rationale
> > > for the extensions chosen just the ones that the compiler can actually
> > > generate code for?
> >
> > It's only used in a place that has errata so I wasn't sure how to
> > handle that.
> >
> > > That aside, the series seems to address the easiest parts of doing
> > > compile-time extension configuration rather than the messier cases like
> > > Zicbom. To me it seems like the messier cases is actually where we should
> > > be starting, so that we have a scheme that works well.
> >
> > That's good advice. I wanted to send out something to start the
> > conversation on what people were interested in optimizing here. I can
> > look more into Zicbom.
> >
> > >
> > > Ben mentioned something along the same lines for the
> > > has_extension_likely() stuff, which should be far simpler, and can be
> > > implemented via a macro, as you already pointed out.
> >
> > I was hesistant to change "too much" as I was expecting push back and
> > didn't want to have to re-write everything ;)
> >
> > >
> > > I did notice that you left the riscv_isa_extension_available() stuff
> > > alone. I think that's reasonable as that code serves more than one
> > > purpose and is intended for use in either in probe functions where
> > > there's no perf (or even code-size impact really, just disable the
> > > driver if you don't want it) or in cases where the user provides its own
> > > bitmap, like KVM.
> > >
> > > I haven't actually reviewed the content line by line yet, so I don't
> > > have any detailed comment on any patches, but I think the two things
> > > being done here deserve to be split apart - the first element is
> > > evaluating things that are using alternatives at build time and the
> > > other is adding extensions to the toolchain's march.
> >
> > That will double the size of the series but if you think that's better
> > than I can do that.
> >
> > >
> > > Moving onto the objection to the series that I have though, at least at
> > > the moment. Adding more and more optimisations to the kernel already has
> > > potential to balloon to silly levels, and that's before we even consider
> > > the permutations of different build-time options. Both of those things
> > > feel like "where does it stop?" situation, with every single extension
> > > that could have code-gen impact becoming another build-time option for
> > > the kernel. As a result, I'm not convinced that we should do this at all,
> > > and I am starting to wonder about some of stuff that we have already
> > > merged..
> > >
> >
> > Vendors that expect a high level of performance need a way to be able to
> > compile the kernel with more extensions than the base extensions. We are
> > leaving 5% that can easily be gained by not allowing this.
>
> Maybe we are, but if people want their 5% they need to show up with
> evidence that there is actually 5% to be gained. Also, if you read on, I
> am not saying we should never do this, and leave that 5% permanently on
> the table, only that we should significantly constrain the permutations
> that we are allowing. And honestly, if some vendor is really desperate to
> compile the kernel with Zxy in march but not whatever other extensions
> that may be in a profile's mandatory set, they can always do it out of
> tree. Carrying a single out of tree patch is nothing to most vendors...

My impression is that every vendor will want this, so it makes sense to
have this be a standard option. Internally to Rivos we test with
additional extensions hard-coded to append to march. That is obviously not
something that should be upstreamed. We can continue to carry this hack,
but I want something that is standardized that all vendors can use.

>
> > > I don't think the configurability this series adds is worth the burden of
> > > maintaining support for all the various configurations you're proposing
> > > here (and the others that someone will come along with the week after
> > > this would be merged. After all, with extant hardware that distros are
> > > supporting, albeit in developer or bring-up type builds, one of these
> > > options could even be enabled. Which I suppose could be translated to
> > > a NAK from me on doing something like this at the moment...
> >
> > By migrating everything into more refined macros I think I can ease this
> > burden. I don't see this as a burden, these options are all so closly
> > tied to each other
>
> What does "closely tied to each other" actually mean?

Assuming that the hardware has an extension vs the hardware may support
an extension vs the hardware doesn't support an extension. It adds 2
additional options since the existing code only checks if the hardware
may support the extension. However, these two additional options do not
add significant more complexity. That's what I meant by closely tied to
each other.

>
> > and only matter when a kernel developer explicitly
> > wants to use an extension.
>
> Unless your definition of "kernel developer" extends to "people that
> compile their own kernel based on menuconfig", then I don't think you
> and I are on the same page about what the series actually does.
> Remember, there's the making alternatives and other optimisations
> unconditional /and/ the addition of stuff to march going on in this
> series.

True, I was referring to the alternatives when I said that.

>
> > If this is all wrapped up into the macros
> > that check if an extension is available it won't even be an extra step
> > than what it currently is.
> >
> > > Palmer suggested in the weekly call that what would make more sense is
> > > having established bases, that align with what distros are likely to
> > > ship, which probably means something approximating the mandatory set for
> > > profiles, although he also said that the rva23 profiles had already been
> > > given the kibosh by folks. He'll have to provide more information on
> > > that one though.
> > > I think that that seems like a sane approach, as it would produce a far
> > > more limited set of combinations to maintain, but it also means not doing
> > > something like this until the point that distros commit to some specific
> > > set of extensions that is not rv64gc... As well as reducing the
> > > combinations that we need to reason about as developers, I think that the
> > > "user story" for people deciding what is worth enabling in their kernel
> > > config before simpler too.
> >
> > There is a chicken and the egg problem here. The
> > hardware/software/distros all want to support the same thing. Somebody
> > needs to step up and make a decision. With a patch like this, a distro
> > can see all of the functionality and select what they want. This can
> > then be rolled up into a config that selects something like all of the
> > bitmanip options.
>
> I don't think there's a chicken and egg problem, or at least not one
> that kernel config options for every extension solves. I expect distros
> to work with RVI to define something, which may well be the platform
> spec (hopefully it's the platform spec...) and then we can make that a
> config option.

I'll wait for Palmer to weigh in but I see no reason to wait for that
since right now we can get the performance improvement. This won't be
merged into 6.10, and 6.11 won't be released for a couple of months so
we are looking at a timeline in the future anyway but I wanted to start
the conversation now.

>
> > > * Something else that came up during that call, and I think it was
> > > Palmer's suggestion was having a hard think about what we are
> > > currently accepting optimisations for in the kernel. I think we need to
> > > up the "burden of proof" for what we will merge optimisations for to
> > > things that are demonstrated to have significant benefits. I don't mean
> > > to single you out here, cos I did ack the patch after all and it was
> > > just the random example I stumbled on this evening while looking at some
> > > alternative users in the course of writing a reply here. Take this code
> > > for example:
> > >
> > > /*
> > > * ZBB only saves three instructions on 32-bit and five on 64-bit so not
> > > * worth checking if supported without Alternatives.
> > > */
> > > if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) &&
> > > IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
> > > unsigned long fold_temp;
> > >
> > > asm goto(ALTERNATIVE("j %l[no_zbb]", "nop", 0,
> > > RISCV_ISA_EXT_ZBB, 1)
> > > :
> > > :
> > > :
> > > : no_zbb);
> > >
> > > if (IS_ENABLED(CONFIG_32BIT)) {
> > > asm(".option push \n\
> > > .option arch,+zbb \n\
> > > not %[fold_temp], %[csum] \n\
> > > rori %[csum], %[csum], 16 \n\
> > > sub %[csum], %[fold_temp], %[csum] \n\
> > > .option pop"
> > > : [csum] "+r" (csum), [fold_temp] "=&r" (fold_temp));
> > > } else {
> > > asm(".option push \n\
> > > .option arch,+zbb \n\
> > > rori %[fold_temp], %[csum], 32 \n\
> > > add %[csum], %[fold_temp], %[csum] \n\
> > > srli %[csum], %[csum], 32 \n\
> > > not %[fold_temp], %[csum] \n\
> > > roriw %[csum], %[csum], 16 \n\
> > > subw %[csum], %[fold_temp], %[csum] \n\
> > > .option pop"
> > > : [csum] "+r" (csum), [fold_temp] "=&r" (fold_temp));
> > > }
> > > return (__force __sum16)(csum >> 16);
> > > }
> > >
> > > The comment there made me think as to why we even have this optimisation
> > > for Zbb at all - is the saving of 3 - 1 or 5 - 1 instructions actually
> > > worth having 3 code paths? The commit message for this contains no
> > > information on the performance benefit of the code at, and while the cover
> > > letter has some information, it was not actually tested in hardware and
> > > does not look to be a real-word benchmark. This one is already merged,
> > > but something like this in the future would really need to be subjected to
> > > significantly more scrutiny! At the very least, "optimisations" need to be
> > > proved to be beneficial in hardware.
> >
> > I put the justification in the cover letter of the series:
>
> If you read what I wrote I acknowledge that there's info in the cover,
> but if you continue reading you'll note I said that "it was not tested
> in hardware and does not look to be a real-word [sic] benchmark".

I wanted to copy what I said in the cover letter here for other readers
to see. It's not "real world" but it tests the functions which use the
optimizations. The functions that are being optimized are 50% faster and
that's the benefit.

>
> > "Tested on QEMU, this series allows the CHECKSUM_KUNIT tests to complete
> > an average of 50.9% faster."
> >
> > I did a lot of testing locally to ensure that every combination was as
> > performant as it possibly could be. I did not provide numbers for every
> > case simply because the combination with 64-bit and Zbb was the
> > primary target of the series and nobody asked about the other cases.
> >
> > There is pretty much only this code and the bitops optimization in the
> > kernel that try to do anything extreme for the sake of performance.
> > These checksum functions are very critical to performance as these
> > checksums are computed on every network packet that is received by the
> > kernel. Networking drivers rely on these functions and they need to be
> > as fast as possible. 50% improvement is very good even if it's only
> > qemu.
> >
> > We could just say we don't care about performance if you are running
> > 32-bit linux or don't have Zbb, but we would be making that decision
> > because we don't feel like maintaining the code. The code was written,
> > tested, reviewed, and it provided large performance gains. I fail to
> > understand why this is a burden to maintain.
>
> Maybe if you read what I wrote you'd see what I was getting at, or maybe
> not as I might not have been sufficiently clear. I'm not saying that this
> particular optimisation is not worth having, but rather than I wanted to

I seem to frequently give you the impression that I don't read what you
say before responding. What we each view as "important" in the kernel is
different so we come from different places when approaching a problem. I
respond in the way that I do not because I am not listening to you, but
simply because I have a different opinion and I am not explaining that
properly.

> see why this particular optimisation was worth maintaining 3 code paths

I interpreted the "3 code paths" as with Zbb + 64 bit, with Zbb + 32
bit, and without Zbb. I directly responded to that by saying that we
could eliminate all of the code paths that are not Zbb + 64 bit could be
eliminated. I should have given performance numbers for these alternate
cases too, and somebody should have asked. I agree that performance
needs justification, and I understand that I did not provide ample
justification in this patch. All other architectures optimized these
code paths so I figured that was sufficient justification for riscv to
do the same, but I understand that it is not.

> for but the commit message does not detail any of the benefits, and
> looking at the cover I discovered that it was not tested in hardware nor
> seemingly with a real test case.

It's hard when riscv currently is very focused on microcontrollers.
These changes are in order to help future hardware that is more focused
about performance. That's why I contribute this upstream with the hope
that other people, like me, care about performance. Rivos could carry
all of these performance patches internally, but if they already exist
why not let other vendors use them too?

- Charlie

> I am saying that the future standard should require both of those things,
> not that I think your optimisation is not worthwhile and should therefore
> be thrown out.
>
> Hope that helps,
> Conor.



2024-05-10 08:25:48

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] riscv: Support compiling the kernel with more extensions

On Thu, May 09, 2024 at 03:55:12PM -0700, Charlie Jenkins wrote:
> On Thu, May 09, 2024 at 11:08:34PM +0100, Conor Dooley wrote:

> > Maybe if you read what I wrote you'd see what I was getting at, or maybe
> > not as I might not have been sufficiently clear. I'm not saying that this
> > particular optimisation is not worth having, but rather than I wanted to
>
> I seem to frequently give you the impression that I don't read what you
> say before responding.

Does it happen frequently? I don't really recall it annoying me before.

> What we each view as "important" in the kernel is
> different so we come from different places when approaching a problem. I
> respond in the way that I do not because I am not listening to you, but
> simply because I have a different opinion and I am not explaining that
> properly.

If you're trying to describe a different opinion, responding to the bit
I was talking about, as you do below in your latest response is ideal.
Responding inline but not actually addressing the points I was making
did make me think you were [un]intentionally ignoring what I was trying
to say.

> > see why this particular optimisation was worth maintaining 3 code paths
>
> I interpreted the "3 code paths" as with Zbb + 64 bit, with Zbb + 32
> bit, and without Zbb. I directly responded to that by saying that we
> could eliminate all of the code paths that are not Zbb + 64 bit could be
> eliminated. I should have given performance numbers for these alternate
> cases too, and somebody should have asked. I agree that performance
> needs justification, and I understand that I did not provide ample
> justification in this patch. All other architectures optimized these
> code paths so I figured that was sufficient justification for riscv to
> do the same, but I understand that it is not.

And hey, if you look at the commit in question, who acked it? I'm just
saying that I think we should have a higher standard going forwards.

> > for but the commit message does not detail any of the benefits, and
> > looking at the cover I discovered that it was not tested in hardware nor
> > seemingly with a real test case.
>
> It's hard when riscv currently is very focused on microcontrollers.

Zbb is actually implemented in hardware, so testing it in the real world
is not a barrier. Palmer probably has a JH7110 board that someone gave
to him is not using...

> These changes are in order to help future hardware that is more focused
> about performance.

I'm not replying to most of your response rn, got other stuff to do, but
what I was trying to say is that I think the point at which optimisations
for future hardware/extensions should be merged is the point at which
those extensions are no longer future.

> That's why I contribute this upstream with the hope
> that other people, like me, care about performance.

Heh, that could be read to mean that I do not care about performance,
which wouldn't be true.

Cheers,
Conor.


Attachments:
(No filename) (2.96 kB)
signature.asc (235.00 B)
Download all attachments

2024-05-10 08:36:02

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] riscv: Support compiling the kernel with more extensions

On Fri, May 10, 2024 at 09:25:37AM +0100, Conor Dooley wrote:

> > > see why this particular optimisation was worth maintaining 3 code paths
> >
> > I interpreted the "3 code paths" as with Zbb + 64 bit, with Zbb + 32
> > bit, and without Zbb. I directly responded to that by saying that we
> > could eliminate all of the code paths that are not Zbb + 64 bit could be
> > eliminated.

Argh, forgot to say that that was what I meant by the 3 paths, but I
didn't take
| We could just say we don't care about performance if you are running
| 32-bit linux or don't have Zbb, but we would be making that decision
| because we don't feel like maintaining the code. The code was written,
| tested, reviewed, and it provided large performance gains. I fail to
| understand why this is a burden to maintain.
as seriously suggesting that we should remove anything, it read like a
defence of the current code!


Attachments:
(No filename) (918.00 B)
signature.asc (235.00 B)
Download all attachments

2024-05-10 16:48:25

by Charlie Jenkins

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] riscv: Support compiling the kernel with more extensions

On Fri, May 10, 2024 at 09:25:37AM +0100, Conor Dooley wrote:
> On Thu, May 09, 2024 at 03:55:12PM -0700, Charlie Jenkins wrote:
> > On Thu, May 09, 2024 at 11:08:34PM +0100, Conor Dooley wrote:
>
> > > Maybe if you read what I wrote you'd see what I was getting at, or maybe
> > > not as I might not have been sufficiently clear. I'm not saying that this
> > > particular optimisation is not worth having, but rather than I wanted to
> >
> > I seem to frequently give you the impression that I don't read what you
> > say before responding.
>
> Does it happen frequently? I don't really recall it annoying me before.
>
> > What we each view as "important" in the kernel is
> > different so we come from different places when approaching a problem. I
> > respond in the way that I do not because I am not listening to you, but
> > simply because I have a different opinion and I am not explaining that
> > properly.
>
> If you're trying to describe a different opinion, responding to the bit
> I was talking about, as you do below in your latest response is ideal.
> Responding inline but not actually addressing the points I was making
> did make me think you were [un]intentionally ignoring what I was trying
> to say.
>
> > > see why this particular optimisation was worth maintaining 3 code paths
> >
> > I interpreted the "3 code paths" as with Zbb + 64 bit, with Zbb + 32
> > bit, and without Zbb. I directly responded to that by saying that we
> > could eliminate all of the code paths that are not Zbb + 64 bit could be
> > eliminated. I should have given performance numbers for these alternate
> > cases too, and somebody should have asked. I agree that performance
> > needs justification, and I understand that I did not provide ample
> > justification in this patch. All other architectures optimized these
> > code paths so I figured that was sufficient justification for riscv to
> > do the same, but I understand that it is not.
>
> And hey, if you look at the commit in question, who acked it? I'm just
> saying that I think we should have a higher standard going forwards.
>
> > > for but the commit message does not detail any of the benefits, and
> > > looking at the cover I discovered that it was not tested in hardware nor
> > > seemingly with a real test case.
> >
> > It's hard when riscv currently is very focused on microcontrollers.
>
> Zbb is actually implemented in hardware, so testing it in the real world
> is not a barrier. Palmer probably has a JH7110 board that someone gave
> to him is not using...
>
> > These changes are in order to help future hardware that is more focused
> > about performance.
>
> I'm not replying to most of your response rn, got other stuff to do, but
> what I was trying to say is that I think the point at which optimisations
> for future hardware/extensions should be merged is the point at which
> those extensions are no longer future.
>
> > That's why I contribute this upstream with the hope
> > that other people, like me, care about performance.
>
> Heh, that could be read to mean that I do not care about performance,
> which wouldn't be true.
>
> Cheers,
> Conor.


This is all a good perspective for me to keep in mind when I look into
performance improvements. I will try to get my hands on some hardware
that I can use to test so I can have some real numbers!

- Charlie



2024-05-10 20:43:45

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v2 2/8] riscv: Add PLATFORM_MAY_SUPPORT_RISCV_ISA_V Kconfig option

Hey Charlie,

On Tue, May 07, 2024 at 06:36:28PM -0700, Charlie Jenkins wrote:
> Current versions of the kernel add "v" to the march and then immeidately
> filter it out such that "v" is not passed to CFLAGS. Instead of doing
> this filtering, code blocks in the kernel that want to use vector
> assembly have been changed to locally enable vector (using ".option
> arch, +v").

Other content in the series aside, since this is a change that could be
made independently of the main series objectives, I figured it was worth
pointing out that this is not a change without downsides: I think that
it would drop support for vector with most versions of LLVM as
option arch support there is much more recent thing than it is for gcc.
Off the top of my head I don't know exactly the versions involved, but
it is something like LLVM-14 supports vector but only LLVM-17 and later
supports .option arch.

Thanks,
Conor.


Attachments:
(No filename) (933.00 B)
signature.asc (235.00 B)
Download all attachments

2024-05-10 21:43:20

by Charlie Jenkins

[permalink] [raw]
Subject: Re: [PATCH v2 2/8] riscv: Add PLATFORM_MAY_SUPPORT_RISCV_ISA_V Kconfig option

On Fri, May 10, 2024 at 09:43:33PM +0100, Conor Dooley wrote:
> Hey Charlie,
>
> On Tue, May 07, 2024 at 06:36:28PM -0700, Charlie Jenkins wrote:
> > Current versions of the kernel add "v" to the march and then immeidately
> > filter it out such that "v" is not passed to CFLAGS. Instead of doing
> > this filtering, code blocks in the kernel that want to use vector
> > assembly have been changed to locally enable vector (using ".option
> > arch, +v").
>
> Other content in the series aside, since this is a change that could be
> made independently of the main series objectives, I figured it was worth
> pointing out that this is not a change without downsides: I think that
> it would drop support for vector with most versions of LLVM as
> .option arch support there is much more recent thing than it is for gcc.
> Off the top of my head I don't know exactly the versions involved, but
> it is something like LLVM-14 supports vector but only LLVM-17 and later
> supports .option arch.

Toolchain incompatibilities are always fun. It does look like .option
arch was introduced in LLVM-17. That would be a regression. We do use
option arch for every other extension, but vector was treated special
when it was introduced unfortunately so maybe we have to live with the
weird march parsing hack.

- Charlie

>
> Thanks,
> Conor.



2024-05-10 22:28:03

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v2 2/8] riscv: Add PLATFORM_MAY_SUPPORT_RISCV_ISA_V Kconfig option

On Fri, May 10, 2024 at 02:43:10PM -0700, Charlie Jenkins wrote:
> On Fri, May 10, 2024 at 09:43:33PM +0100, Conor Dooley wrote:
> > Hey Charlie,
> >
> > On Tue, May 07, 2024 at 06:36:28PM -0700, Charlie Jenkins wrote:
> > > Current versions of the kernel add "v" to the march and then immeidately
> > > filter it out such that "v" is not passed to CFLAGS. Instead of doing
> > > this filtering, code blocks in the kernel that want to use vector
> > > assembly have been changed to locally enable vector (using ".option
> > > arch, +v").
> >
> > Other content in the series aside, since this is a change that could be
> > made independently of the main series objectives, I figured it was worth
> > pointing out that this is not a change without downsides: I think that
> > it would drop support for vector with most versions of LLVM as
> > .option arch support there is much more recent thing than it is for gcc.
> > Off the top of my head I don't know exactly the versions involved, but
> > it is something like LLVM-14 supports vector but only LLVM-17 and later
> > supports .option arch.
>
> Toolchain incompatibilities are always fun. It does look like .option
> arch was introduced in LLVM-17. That would be a regression. We do use
> .option arch for every other extension, but vector was treated special
> when it was introduced unfortunately so maybe we have to live with the
> weird march parsing hack.

I wrote out a long message about the history of some of this, but right
at the end I was scrolling through my chat logs with Andy and realised
we actually did make it depend on AS_HAS_OPTION_ARCH, so it should be
safe to do without regressing anything. I didn't notice in the diff that
the AS_HAS_OPTION_ARCH was a movement, not an addition. Maybe Andy knows
why, despite the dependency on the assembler having it, we didn't use it
everywhere.


Attachments:
(No filename) (1.85 kB)
signature.asc (235.00 B)
Download all attachments

2024-05-15 14:34:49

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v2 2/8] riscv: Add PLATFORM_MAY_SUPPORT_RISCV_ISA_V Kconfig option

On Fri, May 10, 2024 at 11:26:22PM +0100, Conor Dooley wrote:
> On Fri, May 10, 2024 at 02:43:10PM -0700, Charlie Jenkins wrote:
> > On Fri, May 10, 2024 at 09:43:33PM +0100, Conor Dooley wrote:
> > > Hey Charlie,
> > >
> > > On Tue, May 07, 2024 at 06:36:28PM -0700, Charlie Jenkins wrote:
> > > > Current versions of the kernel add "v" to the march and then immeidately
> > > > filter it out such that "v" is not passed to CFLAGS. Instead of doing
> > > > this filtering, code blocks in the kernel that want to use vector
> > > > assembly have been changed to locally enable vector (using ".option
> > > > arch, +v").
> > >
> > > Other content in the series aside, since this is a change that could be
> > > made independently of the main series objectives, I figured it was worth
> > > pointing out that this is not a change without downsides: I think that
> > > it would drop support for vector with most versions of LLVM as
> > > .option arch support there is much more recent thing than it is for gcc.
> > > Off the top of my head I don't know exactly the versions involved, but
> > > it is something like LLVM-14 supports vector but only LLVM-17 and later
> > > supports .option arch.
> >
> > Toolchain incompatibilities are always fun. It does look like .option
> > arch was introduced in LLVM-17. That would be a regression. We do use
> > .option arch for every other extension, but vector was treated special
> > when it was introduced unfortunately so maybe we have to live with the
> > weird march parsing hack.
>
> I wrote out a long message about the history of some of this, but right
> at the end I was scrolling through my chat logs with Andy and realised
> we actually did make it depend on AS_HAS_OPTION_ARCH, so it should be
> safe to do without regressing anything. I didn't notice in the diff that
> the AS_HAS_OPTION_ARCH was a movement, not an addition. Maybe Andy knows
> why, despite the dependency on the assembler having it, we didn't use it
> everywhere.

And while I remember, the bits of this doing the .option arch conversion
should be (IMO) split into a different patch to anything dealing with
Kconfig options etc.

Cheers,
Conor.


Attachments:
(No filename) (2.16 kB)
signature.asc (235.00 B)
Download all attachments