LinuxLists.cc - [PATCH 0/3] riscv: Add kernel-mode FPU support for amdgpu

2023-11-22 03:07:00

Subject: [PATCH 0/3] riscv: Add kernel-mode FPU support for amdgpu

This series allows using newer AMD GPUs (e.g. Navi) on RISC-V boards
such as SiFive's HiFive Unmatched. Those GPUs need CONFIG_DRM_AMD_DC_FP
to initialize, which requires kernel-mode FPU support.

I'm sending these patches as one series so there is a user along with
the infrastructure being added. I assume patch 3 would be merged
separately, after patches 1-2 are merged.

Samuel Holland (3):
riscv: Add support for kernel-mode FPU
riscv: Factor out riscv-march-y to a separate Makefile
drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

arch/riscv/Makefile | 12 +-----------
arch/riscv/Makefile.isa | 15 +++++++++++++++
arch/riscv/include/asm/switch_to.h | 14 ++++++++++++++
arch/riscv/kernel/process.c | 3 +++
drivers/gpu/drm/amd/display/Kconfig | 5 ++++-
drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c | 6 ++++--
drivers/gpu/drm/amd/display/dc/dml/Makefile | 6 ++++++
drivers/gpu/drm/amd/display/dc/dml2/Makefile | 6 ++++++
8 files changed, 53 insertions(+), 14 deletions(-)
create mode 100644 arch/riscv/Makefile.isa

--
2.42.0

2023-11-22 03:07:50

by Samuel Holland

[permalink] [raw]

Subject: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

RISC-V uses kernel_fpu_begin()/kernel_fpu_end() like several other
architectures. Enabling hardware FP requires overriding the ISA string
for the relevant compilation units.

Signed-off-by: Samuel Holland <[email protected]>
---

drivers/gpu/drm/amd/display/Kconfig | 5 ++++-
drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c | 6 ++++--
drivers/gpu/drm/amd/display/dc/dml/Makefile | 6 ++++++
drivers/gpu/drm/amd/display/dc/dml2/Makefile | 6 ++++++
4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/Kconfig b/drivers/gpu/drm/amd/display/Kconfig
index 901d1961b739..49b33b2f6701 100644
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@ -8,7 +8,10 @@ config DRM_AMD_DC
depends on BROKEN || !CC_IS_CLANG || ARM64 || RISCV || SPARC64 || X86_64
select SND_HDA_COMPONENT if SND_HDA_CORE
# !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
- select DRM_AMD_DC_FP if (X86 || LOONGARCH || (PPC64 && ALTIVEC) || (ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG))
+ select DRM_AMD_DC_FP if ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG
+ select DRM_AMD_DC_FP if PPC64 && ALTIVEC
+ select DRM_AMD_DC_FP if RISCV && FPU
+ select DRM_AMD_DC_FP if LOONGARCH || X86
help
Choose this option if you want to use the new display engine
support for AMDGPU. This adds required support for Vega and
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
index 4ae4720535a5..834dca0396f1 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -35,6 +35,8 @@
#include <asm/neon.h>
#elif defined(CONFIG_LOONGARCH)
#include <asm/fpu.h>
+#elif defined(CONFIG_RISCV)
+#include <asm/switch_to.h>
#endif

/**
@@ -89,7 +91,7 @@ void dc_fpu_begin(const char *function_name, const int line)
depth = __this_cpu_inc_return(fpu_recursion_depth);

if (depth == 1) {
-#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
+#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH) || defined(CONFIG_RISCV)
kernel_fpu_begin();
#elif defined(CONFIG_PPC64)
if (cpu_has_feature(CPU_FTR_VSX_COMP))
@@ -122,7 +124,7 @@ void dc_fpu_end(const char *function_name, const int line)

depth = __this_cpu_dec_return(fpu_recursion_depth);
if (depth == 0) {
-#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
+#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH) || defined(CONFIG_RISCV)
kernel_fpu_end();
#elif defined(CONFIG_PPC64)
if (cpu_has_feature(CPU_FTR_VSX_COMP))
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index ea7d60f9a9b4..5c8f840ef323 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -43,6 +43,12 @@ dml_ccflags := -mfpu=64
dml_rcflags := -msoft-float
endif

+ifdef CONFIG_RISCV
+include $(srctree)/arch/riscv/Makefile.isa
+# Remove V from the ISA string, like in arch/riscv/Makefile, but keep F and D.
+dml_ccflags := -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)([^v_]*)v?/\1\2/')
+endif
+
ifdef CONFIG_CC_IS_GCC
ifneq ($(call gcc-min-version, 70100),y)
IS_OLD_GCC = 1
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
index acff3449b8d7..15ad6e3a2173 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
@@ -42,6 +42,12 @@ dml2_ccflags := -mfpu=64
dml2_rcflags := -msoft-float
endif

+ifdef CONFIG_RISCV
+include $(srctree)/arch/riscv/Makefile.isa
+# Remove V from the ISA string, like in arch/riscv/Makefile, but keep F and D.
+dml2_ccflags := -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)([^v_]*)v?/\1\2/')
+endif
+
ifdef CONFIG_CC_IS_GCC
ifeq ($(call cc-ifversion, -lt, 0701, y), y)
IS_OLD_GCC = 1
--
2.42.0

2023-11-22 03:07:56

by Samuel Holland

[permalink] [raw]

Subject: [PATCH 1/3] riscv: Add support for kernel-mode FPU

This is needed to support recent hardware in the amdgpu DRM driver. The
FPU code in that driver is not performance-critical, so only provide the
minimal support.

Signed-off-by: Samuel Holland <[email protected]>
---

arch/riscv/include/asm/switch_to.h | 14 ++++++++++++++
arch/riscv/kernel/process.c | 3 +++
2 files changed, 17 insertions(+)

diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
index f90d8e42f3c7..4b15f1292fc4 100644
--- a/arch/riscv/include/asm/switch_to.h
+++ b/arch/riscv/include/asm/switch_to.h
@@ -63,6 +63,20 @@ static __always_inline bool has_fpu(void)
return riscv_has_extension_likely(RISCV_ISA_EXT_f) ||
riscv_has_extension_likely(RISCV_ISA_EXT_d);
}
+
+static inline void kernel_fpu_begin(void)
+{
+ preempt_disable();
+ fstate_save(current, task_pt_regs(current));
+ csr_set(CSR_SSTATUS, SR_FS);
+}
+
+static inline void kernel_fpu_end(void)
+{
+ csr_clear(CSR_SSTATUS, SR_FS);
+ fstate_restore(current, task_pt_regs(current));
+ preempt_enable();
+}
#else
static __always_inline bool has_fpu(void) { return false; }
#define fstate_save(task, regs) do { } while (0)
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 4f21d970a129..6a18bc709d1c 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -225,3 +225,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
p->thread.sp = (unsigned long)childregs; /* kernel sp */
return 0;
}
+
+EXPORT_SYMBOL_GPL(__fstate_save);
+EXPORT_SYMBOL_GPL(__fstate_restore);
--
2.42.0

2023-11-22 03:07:59

by Samuel Holland

[permalink] [raw]

Subject: [PATCH 2/3] riscv: Factor out riscv-march-y to a separate Makefile

Since it is not possible to incrementally add/remove extensions from the
compiler's ISA string by appending arguments, any code that wants to
modify the ISA string must recreate the whole thing. To support this,
factor out the logic for generating the -march argument so it can be
reused where needed.

Signed-off-by: Samuel Holland <[email protected]>
---

arch/riscv/Makefile | 12 +-----------
arch/riscv/Makefile.isa | 15 +++++++++++++++
2 files changed, 16 insertions(+), 11 deletions(-)
create mode 100644 arch/riscv/Makefile.isa

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index a74be78678eb..c738eafe67a0 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -58,22 +58,12 @@ ifeq ($(CONFIG_SHADOW_CALL_STACK),y)
KBUILD_LDFLAGS += --no-relax-gp
endif

-# ISA string setting
-riscv-march-$(CONFIG_ARCH_RV32I) := rv32ima
-riscv-march-$(CONFIG_ARCH_RV64I) := rv64ima
-riscv-march-$(CONFIG_FPU) := $(riscv-march-y)fd
-riscv-march-$(CONFIG_RISCV_ISA_C) := $(riscv-march-y)c
-riscv-march-$(CONFIG_RISCV_ISA_V) := $(riscv-march-y)v
-
ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC
KBUILD_CFLAGS += -Wa,-misa-spec=2.2
KBUILD_AFLAGS += -Wa,-misa-spec=2.2
-else
-riscv-march-$(CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI) := $(riscv-march-y)_zicsr_zifencei
endif

-# Check if the toolchain supports Zihintpause extension
-riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZIHINTPAUSE) := $(riscv-march-y)_zihintpause
+include $(srctree)/arch/riscv/Makefile.isa

# Remove F,D,V from isa string for all. Keep extensions between "fd" and "v" by
# matching non-v and non-multi-letter extensions out with the filter ([^v_]*)
diff --git a/arch/riscv/Makefile.isa b/arch/riscv/Makefile.isa
new file mode 100644
index 000000000000..e10c77e26fe6
--- /dev/null
+++ b/arch/riscv/Makefile.isa
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+# ISA string setting
+riscv-march-$(CONFIG_ARCH_RV32I) := rv32ima
+riscv-march-$(CONFIG_ARCH_RV64I) := rv64ima
+riscv-march-$(CONFIG_FPU) := $(riscv-march-y)fd
+riscv-march-$(CONFIG_RISCV_ISA_C) := $(riscv-march-y)c
+riscv-march-$(CONFIG_RISCV_ISA_V) := $(riscv-march-y)v
+
+ifndef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC
+riscv-march-$(CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI) := $(riscv-march-y)_zicsr_zifencei
+endif
+
+# Check if the toolchain supports Zihintpause extension
+riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZIHINTPAUSE) := $(riscv-march-y)_zihintpause
--
2.42.0

2023-11-22 08:33:42

by Christoph Hellwig

[permalink] [raw]

Subject: Re: [PATCH 1/3] riscv: Add support for kernel-mode FPU

On Tue, Nov 21, 2023 at 07:05:13PM -0800, Samuel Holland wrote:
> +static inline void kernel_fpu_begin(void)
> +{
> + preempt_disable();
> + fstate_save(current, task_pt_regs(current));
> + csr_set(CSR_SSTATUS, SR_FS);
> +}
> +
> +static inline void kernel_fpu_end(void)
> +{
> + csr_clear(CSR_SSTATUS, SR_FS);
> + fstate_restore(current, task_pt_regs(current));
> + preempt_enable();
> +}

Is there any critical reason to inline these two? I'd much rather see
them out of line and exported instead of the low-level helpers.

2023-11-22 08:40:44

by Christoph Hellwig

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

> - select DRM_AMD_DC_FP if (X86 || LOONGARCH || (PPC64 && ALTIVEC) || (ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG))
> + select DRM_AMD_DC_FP if ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG
> + select DRM_AMD_DC_FP if PPC64 && ALTIVEC
> + select DRM_AMD_DC_FP if RISCV && FPU
> + select DRM_AMD_DC_FP if LOONGARCH || X86

This really is a mess. Can you add a ARCH_HAS_KERNEL_FPU_SUPPORT
symbol that all architetures that have it select instead, and them
make DRM_AMD_DC_FP depend on it?

> -#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
> +#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH) || defined(CONFIG_RISCV)
> kernel_fpu_begin();
> #elif defined(CONFIG_PPC64)
> if (cpu_has_feature(CPU_FTR_VSX_COMP))
> @@ -122,7 +124,7 @@ void dc_fpu_end(const char *function_name, const int line)
>
> depth = __this_cpu_dec_return(fpu_recursion_depth);
> if (depth == 0) {
> -#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
> +#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH) || defined(CONFIG_RISCV)
> kernel_fpu_end();
> #elif defined(CONFIG_PPC64)
> if (cpu_has_feature(CPU_FTR_VSX_COMP))

And then this mess can go away. We'll need to decide if we want to
cover all the in-kernel vector support as part of it, which would
seem reasonable to me, or have a separate generic kernel_vector_begin
with it's own option.

> diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> index ea7d60f9a9b4..5c8f840ef323 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> @@ -43,6 +43,12 @@ dml_ccflags := -mfpu=64
> dml_rcflags := -msoft-float
> endif
>
> +ifdef CONFIG_RISCV
> +include $(srctree)/arch/riscv/Makefile.isa
> +# Remove V from the ISA string, like in arch/riscv/Makefile, but keep F and D.
> +dml_ccflags := -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)([^v_]*)v?/\1\2/')
> +endif
> +
> ifdef CONFIG_CC_IS_GCC
> ifneq ($(call gcc-min-version, 70100),y)
> IS_OLD_GCC = 1

And this is again not really something we should be doing.
Instead we need a generic way in Kconfig to enable FPU support
for an object file or set of, that the arch support can hook
into.

Btw, I'm also really worried about folks using the FPU instructions
outside the kernel_fpu_begin/end windows in general (not directly
related to the RISC-V support). Can we have objecttool checks
for that similar to only allowing the unsafe uaccess in the
uaccess begin/end pairs?

2023-11-22 20:25:25

by kernel test robot

[permalink] [raw]

Subject: Re: [PATCH 1/3] riscv: Add support for kernel-mode FPU

2023-11-23 01:56:51

by kernel test robot

[permalink] [raw]

Subject: Re: [PATCH 1/3] riscv: Add support for kernel-mode FPU

Hi Samuel,

kernel test robot noticed the following build errors:

[auto build test ERROR on drm-misc/drm-misc-next]
[also build test ERROR on linus/master v6.7-rc2 next-20231122]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Samuel-Holland/riscv-Add-support-for-kernel-mode-FPU/20231122-111015
base: git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link: https://lore.kernel.org/r/20231122030621.3759313-2-samuel.holland%40sifive.com
patch subject: [PATCH 1/3] riscv: Add support for kernel-mode FPU
config: riscv-randconfig-r111-20231123 (https://download.01.org/0day-ci/archive/20231123/[email protected]/config)
compiler: riscv64-linux-gcc (GCC) 13.2.0
reproduce: (https://download.01.org/0day-ci/archive/20231123/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

In file included from include/linux/linkage.h:7,
from include/linux/printk.h:8,
from include/asm-generic/bug.h:22,
from arch/riscv/include/asm/bug.h:83,
from include/linux/bug.h:5,
from arch/riscv/include/asm/current.h:13,
from include/linux/sched.h:12,
from include/linux/ratelimit.h:6,
from include/linux/dev_printk.h:16,
from include/linux/device.h:15,
from include/linux/node.h:18,
from include/linux/cpu.h:17,
from arch/riscv/kernel/process.c:10:
>> arch/riscv/kernel/process.c:229:19: error: '__fstate_save' undeclared here (not in a function); did you mean 'fstate_save'?
229 | EXPORT_SYMBOL_GPL(__fstate_save);
| ^~~~~~~~~~~~~
include/linux/export.h:74:23: note: in definition of macro '__EXPORT_SYMBOL'
74 | extern typeof(sym) sym; \
| ^~~
include/linux/export.h:87:41: note: in expansion of macro '_EXPORT_SYMBOL'
87 | #define EXPORT_SYMBOL_GPL(sym) _EXPORT_SYMBOL(sym, "GPL")
| ^~~~~~~~~~~~~~
arch/riscv/kernel/process.c:229:1: note: in expansion of macro 'EXPORT_SYMBOL_GPL'
229 | EXPORT_SYMBOL_GPL(__fstate_save);
| ^~~~~~~~~~~~~~~~~
>> arch/riscv/kernel/process.c:230:19: error: '__fstate_restore' undeclared here (not in a function); did you mean 'fstate_restore'?
230 | EXPORT_SYMBOL_GPL(__fstate_restore);
| ^~~~~~~~~~~~~~~~
include/linux/export.h:74:23: note: in definition of macro '__EXPORT_SYMBOL'
74 | extern typeof(sym) sym; \
| ^~~
include/linux/export.h:87:41: note: in expansion of macro '_EXPORT_SYMBOL'
87 | #define EXPORT_SYMBOL_GPL(sym) _EXPORT_SYMBOL(sym, "GPL")
| ^~~~~~~~~~~~~~
arch/riscv/kernel/process.c:230:1: note: in expansion of macro 'EXPORT_SYMBOL_GPL'
230 | EXPORT_SYMBOL_GPL(__fstate_restore);
| ^~~~~~~~~~~~~~~~~

vim +229 arch/riscv/kernel/process.c

228
> 229 EXPORT_SYMBOL_GPL(__fstate_save);
> 230 EXPORT_SYMBOL_GPL(__fstate_restore);

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2023-11-23 14:24:03

by Conor Dooley

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

On Tue, Nov 21, 2023 at 07:05:15PM -0800, Samuel Holland wrote:
> RISC-V uses kernel_fpu_begin()/kernel_fpu_end() like several other
> architectures. Enabling hardware FP requires overriding the ISA string
> for the relevant compilation units.

Ah yes, bringing the joy of frame-larger-than warnings to RISC-V:
../drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: warning: stack frame size (2416) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Wframe-larger-than]

Nathan, have you given up on these being sorted out?

Also, what on earth is that function name, it exceeds 80 characters
before even considering anything else? Actually, I don't think I want
to know.

Attachments:

(No filename) (772.00 B)
signature.asc (235.00 B)
Download all attachments

2023-11-30 00:42:42

by Nathan Chancellor

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

On Thu, Nov 23, 2023 at 02:23:01PM +0000, Conor Dooley wrote:
> On Tue, Nov 21, 2023 at 07:05:15PM -0800, Samuel Holland wrote:
> > RISC-V uses kernel_fpu_begin()/kernel_fpu_end() like several other
> > architectures. Enabling hardware FP requires overriding the ISA string
> > for the relevant compilation units.
>
> Ah yes, bringing the joy of frame-larger-than warnings to RISC-V:
> ../drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: warning: stack frame size (2416) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Wframe-larger-than]

:(

> Nathan, have you given up on these being sorted out?

Does your configuration have KASAN (I don't think RISC-V supports
KCSAN)? It is possible that dml/dcn32 needs something similar to commit
6740ec97bcdb ("drm/amd/display: Increase frame warning limit with KASAN
or KCSAN in dml2")?

I am not really interested in playing whack-a-mole with these warnings
like I have done in the past for the reasons I outlined here:

https://lore.kernel.org/[email protected]/

> Also, what on earth is that function name, it exceeds 80 characters
> before even considering anything else? Actually, I don't think I want
> to know.

Welcome to "gcc-parsable HW gospel, coming straight from HW engineers" :)

Cheers,
Nathan

2023-11-30 08:27:29

by Conor Dooley

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

On Wed, Nov 29, 2023 at 05:42:24PM -0700, Nathan Chancellor wrote:
> On Thu, Nov 23, 2023 at 02:23:01PM +0000, Conor Dooley wrote:
> > On Tue, Nov 21, 2023 at 07:05:15PM -0800, Samuel Holland wrote:
> > > RISC-V uses kernel_fpu_begin()/kernel_fpu_end() like several other
> > > architectures. Enabling hardware FP requires overriding the ISA string
> > > for the relevant compilation units.
> >
> > Ah yes, bringing the joy of frame-larger-than warnings to RISC-V:
> > ../drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: warning: stack frame size (2416) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Wframe-larger-than]
>
> :(
>
> > Nathan, have you given up on these being sorted out?
>
> Does your configuration have KASAN (I don't think RISC-V supports
> KCSAN)? It is possible that dml/dcn32 needs something similar to commit
> 6740ec97bcdb ("drm/amd/display: Increase frame warning limit with KASAN
> or KCSAN in dml2")?

It's from allmodconfig, so yes, I think so.

Attachments:

(No filename) (1.07 kB)
signature.asc (235.00 B)
Download all attachments

2023-12-08 04:17:22

by Samuel Holland

[permalink] [raw]

Subject: Re: [PATCH 1/3] riscv: Add support for kernel-mode FPU

Hi Christoph,

On 2023-11-22 2:33 AM, Christoph Hellwig wrote:
> On Tue, Nov 21, 2023 at 07:05:13PM -0800, Samuel Holland wrote:
>> +static inline void kernel_fpu_begin(void)
>> +{
>> + preempt_disable();
>> + fstate_save(current, task_pt_regs(current));
>> + csr_set(CSR_SSTATUS, SR_FS);
>> +}
>> +
>> +static inline void kernel_fpu_end(void)
>> +{
>> + csr_clear(CSR_SSTATUS, SR_FS);
>> + fstate_restore(current, task_pt_regs(current));
>> + preempt_enable();
>> +}
>
> Is there any critical reason to inline these two? I'd much rather see
> them out of line and exported instead of the low-level helpers.

No, I will define them out of line in v2.

Regards,
Samuel

2023-12-08 04:51:52

by Samuel Holland

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

Hi Christoph,

On 2023-11-22 2:40 AM, Christoph Hellwig wrote:
>> - select DRM_AMD_DC_FP if (X86 || LOONGARCH || (PPC64 && ALTIVEC) || (ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG))
>> + select DRM_AMD_DC_FP if ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG
>> + select DRM_AMD_DC_FP if PPC64 && ALTIVEC
>> + select DRM_AMD_DC_FP if RISCV && FPU
>> + select DRM_AMD_DC_FP if LOONGARCH || X86
>
> This really is a mess. Can you add a ARCH_HAS_KERNEL_FPU_SUPPORT
> symbol that all architetures that have it select instead, and them
> make DRM_AMD_DC_FP depend on it?

Yes, I have done this for v2, which I will send shortly.

>> -#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
>> +#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH) || defined(CONFIG_RISCV)
>> kernel_fpu_begin();
>> #elif defined(CONFIG_PPC64)
>> if (cpu_has_feature(CPU_FTR_VSX_COMP))
>> @@ -122,7 +124,7 @@ void dc_fpu_end(const char *function_name, const int line)
>>
>> depth = __this_cpu_dec_return(fpu_recursion_depth);
>> if (depth == 0) {
>> -#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
>> +#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH) || defined(CONFIG_RISCV)
>> kernel_fpu_end();
>> #elif defined(CONFIG_PPC64)
>> if (cpu_has_feature(CPU_FTR_VSX_COMP))
>
> And then this mess can go away. We'll need to decide if we want to
> cover all the in-kernel vector support as part of it, which would
> seem reasonable to me, or have a separate generic kernel_vector_begin
> with it's own option.

I think we may want to keep vector separate for performance on architectures
with separate FP and vector register files. For now, I have limited my changes
to FPU support only, which means I have removed VSX/Altivec from here; the
AMDGPU code doesn't need Altivec anyway.

>> diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile
>> index ea7d60f9a9b4..5c8f840ef323 100644
>> --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
>> +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
>> @@ -43,6 +43,12 @@ dml_ccflags := -mfpu=64
>> dml_rcflags := -msoft-float
>> endif
>>
>> +ifdef CONFIG_RISCV
>> +include $(srctree)/arch/riscv/Makefile.isa
>> +# Remove V from the ISA string, like in arch/riscv/Makefile, but keep F and D.
>> +dml_ccflags := -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)([^v_]*)v?/\1\2/')
>> +endif
>> +
>> ifdef CONFIG_CC_IS_GCC
>> ifneq ($(call gcc-min-version, 70100),y)
>> IS_OLD_GCC = 1
>
> And this is again not really something we should be doing.
> Instead we need a generic way in Kconfig to enable FPU support
> for an object file or set of, that the arch support can hook
> into.

I've included this in v2 as well.

> Btw, I'm also really worried about folks using the FPU instructions
> outside the kernel_fpu_begin/end windows in general (not directly
> related to the RISC-V support). Can we have objecttool checks
> for that similar to only allowing the unsafe uaccess in the
> uaccess begin/end pairs?

ARM partially enforces this at compile time: it disallows calling
kernel_neon_begin() inside a translation unit that has NEON enabled. That
doesn't prevent the programmer from calling a FPU-enabled function from outside
a begin/end section, but it does prevent the compiler from generating unexpected
FPU usage behind your back. I implemented this same functionality for RISC-V.

Actually tracking all possibly-FPU-tainted functions and their call sites is
probably possible, but a much larger task.

Regards,
Samuel

2023-12-08 05:05:27

by Samuel Holland

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

Hi Nathan,

On 2023-11-29 6:42 PM, Nathan Chancellor wrote:
> On Thu, Nov 23, 2023 at 02:23:01PM +0000, Conor Dooley wrote:
>> On Tue, Nov 21, 2023 at 07:05:15PM -0800, Samuel Holland wrote:
>>> RISC-V uses kernel_fpu_begin()/kernel_fpu_end() like several other
>>> architectures. Enabling hardware FP requires overriding the ISA string
>>> for the relevant compilation units.
>>
>> Ah yes, bringing the joy of frame-larger-than warnings to RISC-V:
>> ../drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: warning: stack frame size (2416) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Wframe-larger-than]
>
> :(
>
>> Nathan, have you given up on these being sorted out?
>
> Does your configuration have KASAN (I don't think RISC-V supports
> KCSAN)? It is possible that dml/dcn32 needs something similar to commit
> 6740ec97bcdb ("drm/amd/display: Increase frame warning limit with KASAN
> or KCSAN in dml2")?
>
> I am not really interested in playing whack-a-mole with these warnings
> like I have done in the past for the reasons I outlined here:
>
> https://lore.kernel.org/[email protected]/

I also see one of these with clang 17 even with KASAN disabled:

drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:37:6:
warning: stack frame size (2208) exceeds limit (2048) in 'dml32_recalculate'
[-Wframe-larger-than]
void dml32_recalculate(struct display_mode_lib *mode_lib)

^
1532/2208 (69.38%) spills, 676/2208 (30.62%) variables

So I'm in favor of just raising the limit for these files for clang, like you
suggested in the linked thread.

Regards,
Samuel

2023-12-09 20:39:27

by Arnd Bergmann

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

On Fri, Dec 8, 2023, at 06:04, Samuel Holland wrote:
> On 2023-11-29 6:42 PM, Nathan Chancellor wrote:
>> On Thu, Nov 23, 2023 at 02:23:01PM +0000, Conor Dooley wrote:
>>> On Tue, Nov 21, 2023 at 07:05:15PM -0800, Samuel Holland wrote:
>>>> RISC-V uses kernel_fpu_begin()/kernel_fpu_end() like several other
>>>> architectures. Enabling hardware FP requires overriding the ISA string
>>>> for the relevant compilation units.
>>>
>>> Ah yes, bringing the joy of frame-larger-than warnings to RISC-V:
>>> ../drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: warning: stack frame size (2416) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Wframe-larger-than]
>>
>> :(
>>
>>> Nathan, have you given up on these being sorted out?
>>
>> Does your configuration have KASAN (I don't think RISC-V supports
>> KCSAN)? It is possible that dml/dcn32 needs something similar to commit
>> 6740ec97bcdb ("drm/amd/display: Increase frame warning limit with KASAN
>> or KCSAN in dml2")?
>>
>> I am not really interested in playing whack-a-mole with these warnings
>> like I have done in the past for the reasons I outlined here:
>>
>> https://lore.kernel.org/[email protected]/
>
> I also see one of these with clang 17 even with KASAN disabled:
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:37:6:
> warning: stack frame size (2208) exceeds limit (2048) in 'dml32_recalculate'
> [-Wframe-larger-than]
> void dml32_recalculate(struct display_mode_lib *mode_lib)
>
> ^
> 1532/2208 (69.38%) spills, 676/2208 (30.62%) variables
>
> So I'm in favor of just raising the limit for these files for clang, like you
> suggested in the linked thread.

How about just adding a BUG_ON(IS_ENABLED(CONFIG_RISCV))
in that function? That should also avoid the build failure
but give a better indication of where the problem is
if someone actually runs into that function and triggers
a runtime stack overflow.

Arnd

2023-12-09 21:29:59

by Samuel Holland

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

Hi Arnd,

On 2023-12-09 2:38 PM, Arnd Bergmann wrote:
> On Fri, Dec 8, 2023, at 06:04, Samuel Holland wrote:
>> On 2023-11-29 6:42 PM, Nathan Chancellor wrote:
>>> On Thu, Nov 23, 2023 at 02:23:01PM +0000, Conor Dooley wrote:
>>>> On Tue, Nov 21, 2023 at 07:05:15PM -0800, Samuel Holland wrote:
>>>>> RISC-V uses kernel_fpu_begin()/kernel_fpu_end() like several other
>>>>> architectures. Enabling hardware FP requires overriding the ISA string
>>>>> for the relevant compilation units.
>>>>
>>>> Ah yes, bringing the joy of frame-larger-than warnings to RISC-V:
>>>> ../drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: warning: stack frame size (2416) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Wframe-larger-than]
>>>
>>> :(
>>>
>>>> Nathan, have you given up on these being sorted out?
>>>
>>> Does your configuration have KASAN (I don't think RISC-V supports
>>> KCSAN)? It is possible that dml/dcn32 needs something similar to commit
>>> 6740ec97bcdb ("drm/amd/display: Increase frame warning limit with KASAN
>>> or KCSAN in dml2")?
>>>
>>> I am not really interested in playing whack-a-mole with these warnings
>>> like I have done in the past for the reasons I outlined here:
>>>
>>> https://lore.kernel.org/[email protected]/
>>
>> I also see one of these with clang 17 even with KASAN disabled:
>>
>> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:37:6:
>> warning: stack frame size (2208) exceeds limit (2048) in 'dml32_recalculate'
>> [-Wframe-larger-than]
>> void dml32_recalculate(struct display_mode_lib *mode_lib)
>>
>> ^
>> 1532/2208 (69.38%) spills, 676/2208 (30.62%) variables
>>
>> So I'm in favor of just raising the limit for these files for clang, like you
>> suggested in the linked thread.
>
> How about just adding a BUG_ON(IS_ENABLED(CONFIG_RISCV))
> in that function? That should also avoid the build failure
> but give a better indication of where the problem is
> if someone actually runs into that function and triggers
> a runtime stack overflow.

Won't that break actual users of the driver, trading an unlikely but
theoretically possible stack overflow for a guaranteed crash? The intent of this
series is that I have one of these GPUs plugged in to a RISC-V board, and I want
to use it.

Regards,
Samuel

2023-12-09 21:56:41

by Arnd Bergmann

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

On Sat, Dec 9, 2023, at 22:29, Samuel Holland wrote:
> On 2023-12-09 2:38 PM, Arnd Bergmann wrote:
>> On Fri, Dec 8, 2023, at 06:04, Samuel Holland wrote:
>>> On 2023-11-29 6:42 PM, Nathan Chancellor wrote:
>>>>
>>>> https://lore.kernel.org/[email protected]/
>>>
>>> I also see one of these with clang 17 even with KASAN disabled:
>>>
>>> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:37:6:
>>> warning: stack frame size (2208) exceeds limit (2048) in 'dml32_recalculate'
>>> [-Wframe-larger-than]
>>> void dml32_recalculate(struct display_mode_lib *mode_lib)
>>>
>>> ^
>>> 1532/2208 (69.38%) spills, 676/2208 (30.62%) variables
>>>
>>> So I'm in favor of just raising the limit for these files for clang, like you
>>> suggested in the linked thread.
>>
>> How about just adding a BUG_ON(IS_ENABLED(CONFIG_RISCV))
>> in that function? That should also avoid the build failure
>> but give a better indication of where the problem is
>> if someone actually runs into that function and triggers
>> a runtime stack overflow.
>
> Won't that break actual users of the driver, trading an unlikely but
> theoretically possible stack overflow for a guaranteed crash? The intent of this
> series is that I have one of these GPUs plugged in to a RISC-V board, and I want
> to use it.

Ah, I thought you just wanted to get it to compile cleanly
so you could use some of the more common cards. If you
are trying to run the dcn32 code specifically, then you
should definitely fix the stack usage of that function
instead.

Arnd

2023-12-11 15:18:24

by Alex Deucher

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

On Sun, Dec 10, 2023 at 5:10 AM Samuel Holland
<[email protected]> wrote:
>
> Hi Arnd,
>
> On 2023-12-09 2:38 PM, Arnd Bergmann wrote:
> > On Fri, Dec 8, 2023, at 06:04, Samuel Holland wrote:
> >> On 2023-11-29 6:42 PM, Nathan Chancellor wrote:
> >>> On Thu, Nov 23, 2023 at 02:23:01PM +0000, Conor Dooley wrote:
> >>>> On Tue, Nov 21, 2023 at 07:05:15PM -0800, Samuel Holland wrote:
> >>>>> RISC-V uses kernel_fpu_begin()/kernel_fpu_end() like several other
> >>>>> architectures. Enabling hardware FP requires overriding the ISA string
> >>>>> for the relevant compilation units.
> >>>>
> >>>> Ah yes, bringing the joy of frame-larger-than warnings to RISC-V:
> >>>> ../drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: warning: stack frame size (2416) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Wframe-larger-than]
> >>>
> >>> :(
> >>>
> >>>> Nathan, have you given up on these being sorted out?
> >>>
> >>> Does your configuration have KASAN (I don't think RISC-V supports
> >>> KCSAN)? It is possible that dml/dcn32 needs something similar to commit
> >>> 6740ec97bcdb ("drm/amd/display: Increase frame warning limit with KASAN
> >>> or KCSAN in dml2")?
> >>>
> >>> I am not really interested in playing whack-a-mole with these warnings
> >>> like I have done in the past for the reasons I outlined here:
> >>>
> >>> https://lore.kernel.org/[email protected]/
> >>
> >> I also see one of these with clang 17 even with KASAN disabled:
> >>
> >> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:37:6:
> >> warning: stack frame size (2208) exceeds limit (2048) in 'dml32_recalculate'
> >> [-Wframe-larger-than]
> >> void dml32_recalculate(struct display_mode_lib *mode_lib)
> >>
> >> ^
> >> 1532/2208 (69.38%) spills, 676/2208 (30.62%) variables
> >>
> >> So I'm in favor of just raising the limit for these files for clang, like you
> >> suggested in the linked thread.
> >
> > How about just adding a BUG_ON(IS_ENABLED(CONFIG_RISCV))
> > in that function? That should also avoid the build failure
> > but give a better indication of where the problem is
> > if someone actually runs into that function and triggers
> > a runtime stack overflow.
>
> Won't that break actual users of the driver, trading an unlikely but
> theoretically possible stack overflow for a guaranteed crash? The intent of this
> series is that I have one of these GPUs plugged in to a RISC-V board, and I want
> to use it.

Does this patch address the issue?
https://gitlab.freedesktop.org/agd5f/linux/-/commit/72ada8603e36291ad91e4f40f10ef742ef79bc4e

Alex

2023-12-11 15:40:40

by Samuel Holland

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

Hi Alex,

On 2023-12-11 9:17 AM, Alex Deucher wrote:
> On Sun, Dec 10, 2023 at 5:10 AM Samuel Holland
> <[email protected]> wrote:
>>
>> Hi Arnd,
>>
>> On 2023-12-09 2:38 PM, Arnd Bergmann wrote:
>>> On Fri, Dec 8, 2023, at 06:04, Samuel Holland wrote:
>>>> On 2023-11-29 6:42 PM, Nathan Chancellor wrote:
>>>>> On Thu, Nov 23, 2023 at 02:23:01PM +0000, Conor Dooley wrote:
>>>>>> On Tue, Nov 21, 2023 at 07:05:15PM -0800, Samuel Holland wrote:
>>>>>>> RISC-V uses kernel_fpu_begin()/kernel_fpu_end() like several other
>>>>>>> architectures. Enabling hardware FP requires overriding the ISA string
>>>>>>> for the relevant compilation units.
>>>>>>
>>>>>> Ah yes, bringing the joy of frame-larger-than warnings to RISC-V:
>>>>>> ../drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: warning: stack frame size (2416) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Wframe-larger-than]
>>>>>
>>>>> :(
>>>>>
>>>>>> Nathan, have you given up on these being sorted out?
>>>>>
>>>>> Does your configuration have KASAN (I don't think RISC-V supports
>>>>> KCSAN)? It is possible that dml/dcn32 needs something similar to commit
>>>>> 6740ec97bcdb ("drm/amd/display: Increase frame warning limit with KASAN
>>>>> or KCSAN in dml2")?
>>>>>
>>>>> I am not really interested in playing whack-a-mole with these warnings
>>>>> like I have done in the past for the reasons I outlined here:
>>>>>
>>>>> https://lore.kernel.org/[email protected]/
>>>>
>>>> I also see one of these with clang 17 even with KASAN disabled:
>>>>
>>>> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:37:6:
>>>> warning: stack frame size (2208) exceeds limit (2048) in 'dml32_recalculate'
>>>> [-Wframe-larger-than]
>>>> void dml32_recalculate(struct display_mode_lib *mode_lib)
>>>>
>>>> ^
>>>> 1532/2208 (69.38%) spills, 676/2208 (30.62%) variables
>>>>
>>>> So I'm in favor of just raising the limit for these files for clang, like you
>>>> suggested in the linked thread.
>>>
>>> How about just adding a BUG_ON(IS_ENABLED(CONFIG_RISCV))
>>> in that function? That should also avoid the build failure
>>> but give a better indication of where the problem is
>>> if someone actually runs into that function and triggers
>>> a runtime stack overflow.
>>
>> Won't that break actual users of the driver, trading an unlikely but
>> theoretically possible stack overflow for a guaranteed crash? The intent of this
>> series is that I have one of these GPUs plugged in to a RISC-V board, and I want
>> to use it.
>
> Does this patch address the issue?
> https://gitlab.freedesktop.org/agd5f/linux/-/commit/72ada8603e36291ad91e4f40f10ef742ef79bc4e

No, I get the warning without any of these debugging options enabled. I can
reproduce with just defconfig + CONFIG_DRM_AMDGPU=m when built with clang 17.

Regards,
Samuel

2023-12-12 07:12:59

by Christoph Hellwig

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

On Thu, Dec 07, 2023 at 10:49:53PM -0600, Samuel Holland wrote:
> Actually tracking all possibly-FPU-tainted functions and their call sites is
> probably possible, but a much larger task.

I think objtool should be able to do that reasonably easily, it already
does it for checking section where userspace address access is enabled
or not, which is very similar.

2023-12-12 17:43:05

by Josh Poimboeuf

[permalink] [raw]

Subject: Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

On Mon, Dec 11, 2023 at 11:12:42PM -0800, Christoph Hellwig wrote:
> On Thu, Dec 07, 2023 at 10:49:53PM -0600, Samuel Holland wrote:
> > Actually tracking all possibly-FPU-tainted functions and their call sites is
> > probably possible, but a much larger task.
>
> I think objtool should be able to do that reasonably easily, it already
> does it for checking section where userspace address access is enabled
> or not, which is very similar.

Yeah, that might be doable. I can look into it.

--
Josh