2024-04-12 06:57:45

by Andy Chiu

Subject: [PATCH v4 2/9] riscv: smp: fail booting up smp if inconsistent vlen is detected

Currently we only support Vector on SMP platforms where all cores have
the same vlenb. If we detect a mismatching vlen on a core, it is better
to fail booting that core to prevent further race/scheduling issues.

Also, move .Lsecondary_park forward and change `tail smp_callin` into a
regular call in the early assembly, so that a core is parked right
after smp_callin returns. Note that a successful smp_callin does not
return.

Fixes: 7017858eb2d7 ("riscv: Introduce riscv_v_vsize to record size of Vector context")
Reported-by: Conor Dooley <[email protected]>
Closes: https://lore.kernel.org/linux-riscv/20240228-vicinity-cornstalk-4b8eb5fe5730@spud/
Signed-off-by: Andy Chiu <[email protected]>
---
Changelog v4:
- update comment also in the assembly code (Yunhui)
Changelog v2:
- update commit message to explain asm code change (Conor)
---
arch/riscv/kernel/head.S | 19 ++++++++++++-------
arch/riscv/kernel/smpboot.c | 14 +++++++++-----
2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 4236a69c35cb..a00f7523cb91 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -165,9 +165,20 @@ secondary_start_sbi:
 #endif
 	call .Lsetup_trap_vector
 	scs_load_current
-	tail smp_callin
+	call smp_callin
 #endif /* CONFIG_SMP */

+.align 2
+.Lsecondary_park:
+	/*
+	 * Park this hart if we:
+	 * - have too many harts on CONFIG_RISCV_BOOT_SPINWAIT
+	 * - receive an early trap, before setup_trap_vector finished
+	 * - fail in smp_callin(), as a successful one wouldn't return
+	 */
+	wfi
+	j .Lsecondary_park
+
 .align 2
 .Lsetup_trap_vector:
 	/* Set trap vector to exception handler */
@@ -181,12 +192,6 @@ secondary_start_sbi:
 	csrw CSR_SCRATCH, zero
 	ret

-.align 2
-.Lsecondary_park:
-	/* We lack SMP support or have too many harts, so park this hart */
-	wfi
-	j .Lsecondary_park
-
 SYM_CODE_END(_start)

 SYM_CODE_START(_start_kernel)
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index d41090fc3203..673437ccc13d 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -214,6 +214,15 @@ asmlinkage __visible void smp_callin(void)
 	struct mm_struct *mm = &init_mm;
 	unsigned int curr_cpuid = smp_processor_id();

+	if (has_vector()) {
+		/*
+		 * Return as early as possible so the hart with a mismatching
+		 * vlen won't boot.
+		 */
+		if (riscv_v_setup_vsize())
+			return;
+	}
+
 	/* All kernel threads share the same mm context. */
 	mmgrab(mm);
 	current->active_mm = mm;
@@ -226,11 +235,6 @@ asmlinkage __visible void smp_callin(void)
 	numa_add_cpu(curr_cpuid);
 	set_cpu_online(curr_cpuid, 1);

-	if (has_vector()) {
-		if (riscv_v_setup_vsize())
-			elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
-	}
-
 	riscv_user_isa_enable();

 	/*

--
2.44.0.rc2
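
For readers following the control flow, here is a minimal user-space
sketch in C of the behavior the patch aims for. It is not the kernel
code: the names model_setup_vsize, model_smp_callin and count_online are
hypothetical, and the model assumes that the first hart to check in
records its vlenb and that every later hart is compared against that
value. A hart whose check fails returns early from the callin step,
which in the patched head.S corresponds to falling through to
.Lsecondary_park.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state: vlenb recorded by the first hart, 0 if none yet. */
static unsigned long recorded_vlenb;

/*
 * Model of the consistency check. Assumption: the first caller records its
 * vlenb and later callers must match it. vlenb values are assumed non-zero.
 * Returns 0 on success, -1 on mismatch.
 */
static int model_setup_vsize(unsigned long this_vlenb)
{
	if (recorded_vlenb == 0) {
		recorded_vlenb = this_vlenb;
		return 0;
	}
	return (this_vlenb == recorded_vlenb) ? 0 : -1;
}

/*
 * Model of the patched callin step: returns false when the hart bails out
 * early (the caller would then park it), true when it would come online.
 */
static bool model_smp_callin(unsigned long this_vlenb)
{
	if (model_setup_vsize(this_vlenb))
		return false;
	return true;
}

/* Bring up n harts in order; parked[i] is set for each hart that fails. */
size_t count_online(const unsigned long *vlenb, bool *parked, size_t n)
{
	size_t online = 0;

	recorded_vlenb = 0; /* fresh boot for each run of the model */
	for (size_t i = 0; i < n; i++) {
		parked[i] = !model_smp_callin(vlenb[i]);
		if (!parked[i])
			online++;
	}
	return online;
}
```

With vlenb values {16, 16, 32, 16}, the model parks the third hart and
brings the other three online, which is the outcome the early return in
smp_callin is meant to produce.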



2024-04-18 10:18:04

by Conor Dooley

Subject: Re: [PATCH v4 2/9] riscv: smp: fail booting up smp if inconsistent vlen is detected

On Fri, Apr 12, 2024 at 02:48:58PM +0800, Andy Chiu wrote:
> Currently we only support Vector on SMP platforms where all cores have
> the same vlenb. If we detect a mismatching vlen on a core, it is better
> to fail booting that core to prevent further race/scheduling issues.
>
> Also, move .Lsecondary_park forward and change `tail smp_callin` into a
> regular call in the early assembly, so that a core is parked right
> after smp_callin returns. Note that a successful smp_callin does not
> return.
>
> Fixes: 7017858eb2d7 ("riscv: Introduce riscv_v_vsize to record size of Vector context")
> Reported-by: Conor Dooley <[email protected]>
> Closes: https://lore.kernel.org/linux-riscv/20240228-vicinity-cornstalk-4b8eb5fe5730@spud/
> Signed-off-by: Andy Chiu <[email protected]>

Reviewed-by: Conor Dooley <[email protected]>

Cheers,
Conor.



2024-04-19 06:09:31

by yunhui cui

Subject: Re: [External] [PATCH v4 2/9] riscv: smp: fail booting up smp if inconsistent vlen is detected

Hi Andy,

On Fri, Apr 12, 2024 at 2:50 PM Andy Chiu <[email protected]> wrote:
>
> Currently we only support Vector on SMP platforms where all cores have
> the same vlenb. If we detect a mismatching vlen on a core, it is better
> to fail booting that core to prevent further race/scheduling issues.
>
> Also, move .Lsecondary_park forward and change `tail smp_callin` into a
> regular call in the early assembly, so that a core is parked right
> after smp_callin returns. Note that a successful smp_callin does not
> return.
>
> Fixes: 7017858eb2d7 ("riscv: Introduce riscv_v_vsize to record size of Vector context")
> Reported-by: Conor Dooley <[email protected]>
> Closes: https://lore.kernel.org/linux-riscv/20240228-vicinity-cornstalk-4b8eb5fe5730@spud/
> Signed-off-by: Andy Chiu <[email protected]>
> ---
> Changelog v4:
> - update comment also in the assembly code (Yunhui)
> Changelog v2:
> - update commit message to explain asm code change (Conor)
> ---
> arch/riscv/kernel/head.S | 19 ++++++++++++-------
> arch/riscv/kernel/smpboot.c | 14 +++++++++-----
> 2 files changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index 4236a69c35cb..a00f7523cb91 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -165,9 +165,20 @@ secondary_start_sbi:
>  #endif
>  	call .Lsetup_trap_vector
>  	scs_load_current
> -	tail smp_callin
> +	call smp_callin
>  #endif /* CONFIG_SMP */
>
> +.align 2
> +.Lsecondary_park:
> +	/*
> +	 * Park this hart if we:
> +	 * - have too many harts on CONFIG_RISCV_BOOT_SPINWAIT
> +	 * - receive an early trap, before setup_trap_vector finished
> +	 * - fail in smp_callin(), as a successful one wouldn't return
> +	 */
> +	wfi
> +	j .Lsecondary_park
> +
>  .align 2
>  .Lsetup_trap_vector:
>  	/* Set trap vector to exception handler */
> @@ -181,12 +192,6 @@ secondary_start_sbi:
>  	csrw CSR_SCRATCH, zero
>  	ret
>
> -.align 2
> -.Lsecondary_park:
> -	/* We lack SMP support or have too many harts, so park this hart */
> -	wfi
> -	j .Lsecondary_park
> -
>  SYM_CODE_END(_start)
>
>  SYM_CODE_START(_start_kernel)
> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
> index d41090fc3203..673437ccc13d 100644
> --- a/arch/riscv/kernel/smpboot.c
> +++ b/arch/riscv/kernel/smpboot.c
> @@ -214,6 +214,15 @@ asmlinkage __visible void smp_callin(void)
>  	struct mm_struct *mm = &init_mm;
>  	unsigned int curr_cpuid = smp_processor_id();
>
> +	if (has_vector()) {
> +		/*
> +		 * Return as early as possible so the hart with a mismatching
> +		 * vlen won't boot.
> +		 */
> +		if (riscv_v_setup_vsize())
> +			return;
> +	}
> +
>  	/* All kernel threads share the same mm context. */
>  	mmgrab(mm);
>  	current->active_mm = mm;
> @@ -226,11 +235,6 @@ asmlinkage __visible void smp_callin(void)
>  	numa_add_cpu(curr_cpuid);
>  	set_cpu_online(curr_cpuid, 1);
>
> -	if (has_vector()) {
> -		if (riscv_v_setup_vsize())
> -			elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
> -	}
> -
>  	riscv_user_isa_enable();
>
>  	/*
>
> --
> 2.44.0.rc2
>
>

Reviewed-by: Yunhui Cui <[email protected]>


Thanks,
Yunhui

2024-04-24 20:01:43

by Alexandre Ghiti

Subject: Re: [PATCH v4 2/9] riscv: smp: fail booting up smp if inconsistent vlen is detected

Hi Andy,

On 12/04/2024 08:48, Andy Chiu wrote:
> Currently we only support Vector on SMP platforms where all cores have
> the same vlenb. If we detect a mismatching vlen on a core, it is better
> to fail booting that core to prevent further race/scheduling issues.
>
> Also, move .Lsecondary_park forward and change `tail smp_callin` into a
> regular call in the early assembly, so that a core is parked right
> after smp_callin returns. Note that a successful smp_callin does not
> return.
>
> Fixes: 7017858eb2d7 ("riscv: Introduce riscv_v_vsize to record size of Vector context")
> Reported-by: Conor Dooley <[email protected]>
> Closes: https://lore.kernel.org/linux-riscv/20240228-vicinity-cornstalk-4b8eb5fe5730@spud/
> Signed-off-by: Andy Chiu <[email protected]>
> ---
> Changelog v4:
> - update comment also in the assembly code (Yunhui)
> Changelog v2:
> - update commit message to explain asm code change (Conor)
> ---
> arch/riscv/kernel/head.S | 19 ++++++++++++-------
> arch/riscv/kernel/smpboot.c | 14 +++++++++-----
> 2 files changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index 4236a69c35cb..a00f7523cb91 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -165,9 +165,20 @@ secondary_start_sbi:
>  #endif
>  	call .Lsetup_trap_vector
>  	scs_load_current
> -	tail smp_callin
> +	call smp_callin
>  #endif /* CONFIG_SMP */
>
> +.align 2
> +.Lsecondary_park:
> +	/*
> +	 * Park this hart if we:
> +	 * - have too many harts on CONFIG_RISCV_BOOT_SPINWAIT
> +	 * - receive an early trap, before setup_trap_vector finished
> +	 * - fail in smp_callin(), as a successful one wouldn't return
> +	 */
> +	wfi
> +	j .Lsecondary_park
> +
>  .align 2
>  .Lsetup_trap_vector:
>  	/* Set trap vector to exception handler */
> @@ -181,12 +192,6 @@ secondary_start_sbi:
>  	csrw CSR_SCRATCH, zero
>  	ret
>
> -.align 2
> -.Lsecondary_park:
> -	/* We lack SMP support or have too many harts, so park this hart */
> -	wfi
> -	j .Lsecondary_park
> -
>  SYM_CODE_END(_start)
>
>  SYM_CODE_START(_start_kernel)
> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
> index d41090fc3203..673437ccc13d 100644
> --- a/arch/riscv/kernel/smpboot.c
> +++ b/arch/riscv/kernel/smpboot.c
> @@ -214,6 +214,15 @@ asmlinkage __visible void smp_callin(void)
>  	struct mm_struct *mm = &init_mm;
>  	unsigned int curr_cpuid = smp_processor_id();
>
> +	if (has_vector()) {
> +		/*
> +		 * Return as early as possible so the hart with a mismatching
> +		 * vlen won't boot.
> +		 */
> +		if (riscv_v_setup_vsize())
> +			return;
> +	}
> +
>  	/* All kernel threads share the same mm context. */
>  	mmgrab(mm);
>  	current->active_mm = mm;
> @@ -226,11 +235,6 @@ asmlinkage __visible void smp_callin(void)
>  	numa_add_cpu(curr_cpuid);
>  	set_cpu_online(curr_cpuid, 1);
>
> -	if (has_vector()) {
> -		if (riscv_v_setup_vsize())
> -			elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
> -	}
> -
>  	riscv_user_isa_enable();
>
>  	/*
>

So this should go into -fixes, would you mind sending a single patch for
this fix?

Your patch 8 is actually already fixed by Clement's patch
https://lore.kernel.org/linux-riscv/[email protected]/
and I already mentioned this one to Palmer.

Thanks,

Alex