2022-05-09 08:42:53

by Jisheng Zhang

[permalink] [raw]
Subject: [PATCH v2 0/4] unified way to use static key and optimize pgtable_l4_enabled

Currently, riscv has several features which may not be supported on all
riscv platforms, for example, FPU, SV48, SV57 and so on. To support
unified kernel Image style, we need to check whether the feature is
suportted or not. If the check sits at hot code path, then performance
will be impacted a lot. static key can be used to solve the issue. In
the past, FPU support has been converted to use static key mechanism.
I believe we will have similar cases in the future. For example, the
SV48 support can take advantage of static key[1].

patch1 is a simple W=1 warning fix.
patch2 introduces an unified mechanism to use static key for riscv cpu
features.
patch3 converts has_cpu() to use the mechanism.
patch4 uses the mechanism to optimize pgtable_l4|[l5]_enabled.

[1] http://lists.infradead.org/pipermail/linux-riscv/2021-December/011164.html

Since v1:
- Add a W=1 warning fix
- Fix W=1 error
- Based on v5.18-rcN, since SV57 support is added, so convert
pgtable_l5_enabled as well.

Jisheng Zhang (4):
riscv: mm: init: make pt_ops_set_[early|late|fixmap] static
riscv: introduce unified static key mechanism for CPU features
riscv: replace has_fpu() with system_supports_fpu()
riscv: convert pgtable_l4|[l5]_enabled to static key

arch/riscv/Makefile | 3 +
arch/riscv/include/asm/cpufeature.h | 110 ++++++++++++++++++++++++++++
arch/riscv/include/asm/pgalloc.h | 16 ++--
arch/riscv/include/asm/pgtable-64.h | 40 +++++-----
arch/riscv/include/asm/pgtable.h | 5 +-
arch/riscv/include/asm/switch_to.h | 9 +--
arch/riscv/kernel/cpu.c | 4 +-
arch/riscv/kernel/cpufeature.c | 29 ++++++--
arch/riscv/kernel/process.c | 2 +-
arch/riscv/kernel/signal.c | 4 +-
arch/riscv/mm/init.c | 52 ++++++-------
arch/riscv/mm/kasan_init.c | 16 ++--
arch/riscv/tools/Makefile | 22 ++++++
arch/riscv/tools/cpucaps | 7 ++
arch/riscv/tools/gen-cpucaps.awk | 40 ++++++++++
15 files changed, 274 insertions(+), 85 deletions(-)
create mode 100644 arch/riscv/include/asm/cpufeature.h
create mode 100644 arch/riscv/tools/Makefile
create mode 100644 arch/riscv/tools/cpucaps
create mode 100755 arch/riscv/tools/gen-cpucaps.awk

--
2.34.1



2022-05-09 10:10:26

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] unified way to use static key and optimize pgtable_l4_enabled

On Sun, May 8, 2022 at 9:46 PM Jisheng Zhang <[email protected]> wrote:
>
> Currently, riscv has several features which may not be supported on all
> riscv platforms, for example, FPU, SV48, SV57 and so on. To support
> unified kernel Image style, we need to check whether the feature is
> suportted or not. If the check sits at hot code path, then performance
> will be impacted a lot. static key can be used to solve the issue. In
> the past, FPU support has been converted to use static key mechanism.
> I believe we will have similar cases in the future. For example, the
> SV48 support can take advantage of static key[1].
>
> patch1 is a simple W=1 warning fix.
> patch2 introduces an unified mechanism to use static key for riscv cpu
> features.
> patch3 converts has_cpu() to use the mechanism.
> patch4 uses the mechanism to optimize pgtable_l4|[l5]_enabled.
>
> [1] http://lists.infradead.org/pipermail/linux-riscv/2021-December/011164.html

Overall, using a script to generate CPU capabilities seems a bit
over-engineered to me. We already have RISC-V ISA extension
parsing infrastructure which can be easily extended to support
static key arrays.

Regards,
Anup

>
> Since v1:
> - Add a W=1 warning fix
> - Fix W=1 error
> - Based on v5.18-rcN, since SV57 support is added, so convert
> pgtable_l5_enabled as well.
>
> Jisheng Zhang (4):
> riscv: mm: init: make pt_ops_set_[early|late|fixmap] static
> riscv: introduce unified static key mechanism for CPU features
> riscv: replace has_fpu() with system_supports_fpu()
> riscv: convert pgtable_l4|[l5]_enabled to static key
>
> arch/riscv/Makefile | 3 +
> arch/riscv/include/asm/cpufeature.h | 110 ++++++++++++++++++++++++++++
> arch/riscv/include/asm/pgalloc.h | 16 ++--
> arch/riscv/include/asm/pgtable-64.h | 40 +++++-----
> arch/riscv/include/asm/pgtable.h | 5 +-
> arch/riscv/include/asm/switch_to.h | 9 +--
> arch/riscv/kernel/cpu.c | 4 +-
> arch/riscv/kernel/cpufeature.c | 29 ++++++--
> arch/riscv/kernel/process.c | 2 +-
> arch/riscv/kernel/signal.c | 4 +-
> arch/riscv/mm/init.c | 52 ++++++-------
> arch/riscv/mm/kasan_init.c | 16 ++--
> arch/riscv/tools/Makefile | 22 ++++++
> arch/riscv/tools/cpucaps | 7 ++
> arch/riscv/tools/gen-cpucaps.awk | 40 ++++++++++
> 15 files changed, 274 insertions(+), 85 deletions(-)
> create mode 100644 arch/riscv/include/asm/cpufeature.h
> create mode 100644 arch/riscv/tools/Makefile
> create mode 100644 arch/riscv/tools/cpucaps
> create mode 100755 arch/riscv/tools/gen-cpucaps.awk
>
> --
> 2.34.1
>

2022-05-09 11:22:04

by Jisheng Zhang

[permalink] [raw]
Subject: [PATCH v2 3/4] riscv: replace has_fpu() with system_supports_fpu()

This is to use the unified cpus_have_{final|const}_cap() instead of
putting static key related here and there.

Signed-off-by: Jisheng Zhang <[email protected]>
---
arch/riscv/include/asm/cpufeature.h | 5 +++++
arch/riscv/include/asm/switch_to.h | 9 ++-------
arch/riscv/kernel/cpufeature.c | 8 ++------
arch/riscv/kernel/process.c | 2 +-
arch/riscv/kernel/signal.c | 4 ++--
5 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index d80ddd2f3b49..634a653c7fa2 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -91,4 +91,9 @@ static inline void cpus_set_cap(unsigned int num)
}
}

+static inline bool system_supports_fpu(void)
+{
+ return IS_ENABLED(CONFIG_FPU) && !cpus_have_final_cap(RISCV_HAS_NO_FPU);
+}
+
#endif
diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
index 0a3f4f95c555..362cb18d12d5 100644
--- a/arch/riscv/include/asm/switch_to.h
+++ b/arch/riscv/include/asm/switch_to.h
@@ -8,6 +8,7 @@

#include <linux/jump_label.h>
#include <linux/sched/task_stack.h>
+#include <asm/cpufeature.h>
#include <asm/processor.h>
#include <asm/ptrace.h>
#include <asm/csr.h>
@@ -56,13 +57,7 @@ static inline void __switch_to_aux(struct task_struct *prev,
fstate_restore(next, task_pt_regs(next));
}

-extern struct static_key_false cpu_hwcap_fpu;
-static __always_inline bool has_fpu(void)
-{
- return static_branch_likely(&cpu_hwcap_fpu);
-}
#else
-static __always_inline bool has_fpu(void) { return false; }
#define fstate_save(task, regs) do { } while (0)
#define fstate_restore(task, regs) do { } while (0)
#define __switch_to_aux(__prev, __next) do { } while (0)
@@ -75,7 +70,7 @@ extern struct task_struct *__switch_to(struct task_struct *,
do { \
struct task_struct *__prev = (prev); \
struct task_struct *__next = (next); \
- if (has_fpu()) \
+ if (system_supports_fpu()) \
__switch_to_aux(__prev, __next); \
((last) = __switch_to(__prev, __next)); \
} while (0)
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index e6c72cad0c1c..1edf3c3f8f62 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -22,10 +22,6 @@ unsigned long elf_hwcap __read_mostly;
/* Host ISA bitmap */
static DECLARE_BITMAP(riscv_isa, RISCV_ISA_EXT_MAX) __read_mostly;

-#ifdef CONFIG_FPU
-__ro_after_init DEFINE_STATIC_KEY_FALSE(cpu_hwcap_fpu);
-#endif
-
DECLARE_BITMAP(cpu_hwcaps, RISCV_NCAPS);
EXPORT_SYMBOL(cpu_hwcaps);

@@ -254,8 +250,8 @@ void __init riscv_fill_hwcap(void)
pr_info("riscv: ELF capabilities %s\n", print_str);

#ifdef CONFIG_FPU
- if (elf_hwcap & (COMPAT_HWCAP_ISA_F | COMPAT_HWCAP_ISA_D))
- static_branch_enable(&cpu_hwcap_fpu);
+ if (!(elf_hwcap & (COMPAT_HWCAP_ISA_F | COMPAT_HWCAP_ISA_D)))
+ cpus_set_cap(RISCV_HAS_NO_FPU);
#endif
enable_cpu_capabilities();
static_branch_enable(&riscv_const_caps_ready);
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 504b496787aa..c9cd0b42299e 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -88,7 +88,7 @@ void start_thread(struct pt_regs *regs, unsigned long pc,
unsigned long sp)
{
regs->status = SR_PIE;
- if (has_fpu()) {
+ if (system_supports_fpu()) {
regs->status |= SR_FS_INITIAL;
/*
* Restore the initial value to the FP register
diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
index 9f4e59f80551..96aa593a989e 100644
--- a/arch/riscv/kernel/signal.c
+++ b/arch/riscv/kernel/signal.c
@@ -90,7 +90,7 @@ static long restore_sigcontext(struct pt_regs *regs,
/* sc_regs is structured the same as the start of pt_regs */
err = __copy_from_user(regs, &sc->sc_regs, sizeof(sc->sc_regs));
/* Restore the floating-point state. */
- if (has_fpu())
+ if (system_supports_fpu())
err |= restore_fp_state(regs, &sc->sc_fpregs);
return err;
}
@@ -143,7 +143,7 @@ static long setup_sigcontext(struct rt_sigframe __user *frame,
/* sc_regs is structured the same as the start of pt_regs */
err = __copy_to_user(&sc->sc_regs, regs, sizeof(sc->sc_regs));
/* Save the floating-point state. */
- if (has_fpu())
+ if (system_supports_fpu())
err |= save_fp_state(regs, &sc->sc_fpregs);
return err;
}
--
2.34.1


2022-05-09 14:35:19

by Jisheng Zhang

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] unified way to use static key and optimize pgtable_l4_enabled

On Mon, May 09, 2022 at 10:07:16AM +0530, Anup Patel wrote:
> On Sun, May 8, 2022 at 9:46 PM Jisheng Zhang <[email protected]> wrote:
> >
> > Currently, riscv has several features which may not be supported on all
> > riscv platforms, for example, FPU, SV48, SV57 and so on. To support
> > unified kernel Image style, we need to check whether the feature is
> > suportted or not. If the check sits at hot code path, then performance
> > will be impacted a lot. static key can be used to solve the issue. In
> > the past, FPU support has been converted to use static key mechanism.
> > I believe we will have similar cases in the future. For example, the
> > SV48 support can take advantage of static key[1].
> >
> > patch1 is a simple W=1 warning fix.
> > patch2 introduces an unified mechanism to use static key for riscv cpu
> > features.
> > patch3 converts has_cpu() to use the mechanism.
> > patch4 uses the mechanism to optimize pgtable_l4|[l5]_enabled.
> >
> > [1] http://lists.infradead.org/pipermail/linux-riscv/2021-December/011164.html
>
> Overall, using a script to generate CPU capabilities seems a bit
> over-engineered to me. We already have RISC-V ISA extension

Not all riscv features are *ISA* extensions. For example, SV48 and SV57
are not ISA extensions. IIRC, I asked this question before, here are
Atish's comments:

https://lore.kernel.org/linux-riscv/CAHBxVyF65jC_wvxcD6bueqpCY8-Kbahu1yxsSoBmO1s15dGkSQ@mail.gmail.com/

> parsing infrastructure which can be easily extended to support
> static key arrays.
>