2022-03-21 14:13:04

by Anup Patel

Subject: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

Currently, the range and default value of NR_CPUS are too restrictive
for high-end RISC-V systems with a large number of HARTs. The latest
QEMU virt machine supports up to 512 CPUs, so the current NR_CPUS is
restrictive for QEMU as well. Other major architectures (such as
ARM64, x86_64, MIPS, etc.) have a much higher range and default
value of NR_CPUS.

This patch increases NR_CPUS range to 2-512 and default value to
XLEN (i.e. 32 for RV32 and 64 for RV64).

Signed-off-by: Anup Patel <[email protected]>
---
Changes since v1:
- Updated NR_CPUS range to 2-512 which reflects maximum number of
CPUs supported by QEMU virt machine.
---
arch/riscv/Kconfig | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 5adcbd9b5e88..423ac17f598c 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -274,10 +274,11 @@ config SMP
 	  If you don't know what to do here, say N.

 config NR_CPUS
-	int "Maximum number of CPUs (2-32)"
-	range 2 32
+	int "Maximum number of CPUs (2-512)"
+	range 2 512
 	depends on SMP
-	default "8"
+	default "32" if 32BIT
+	default "64" if 64BIT

 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
--
2.25.1


2022-04-01 07:46:30

by Palmer Dabbelt

Subject: Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

On Sat, 19 Mar 2022 05:12:06 PDT (-0700), [email protected] wrote:
> Currently, the range and default value of NR_CPUS is too restrictive
> for high-end RISC-V systems with large number of HARTs. The latest
> QEMU virt machine supports upto 512 CPUs so the current NR_CPUS is
> restrictive for QEMU as well. Other major architectures (such as
> ARM64, x86_64, MIPS, etc) have a much higher range and default
> value of NR_CPUS.
>
> This patch increases NR_CPUS range to 2-512 and default value to
> XLEN (i.e. 32 for RV32 and 64 for RV64).
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> Changes since v1:
> - Updated NR_CPUS range to 2-512 which reflects maximum number of
> CPUs supported by QEMU virt machine.
> ---
> arch/riscv/Kconfig | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 5adcbd9b5e88..423ac17f598c 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -274,10 +274,11 @@ config SMP
> If you don't know what to do here, say N.
>
> config NR_CPUS
> - int "Maximum number of CPUs (2-32)"
> - range 2 32
> + int "Maximum number of CPUs (2-512)"
> + range 2 512
> depends on SMP
> - default "8"
> + default "32" if 32BIT
> + default "64" if 64BIT
>
> config HOTPLUG_CPU
> bool "Support for hot-pluggable CPUs"

I'm getting all sorts of boot issues with more than 32 CPUs, even on the
latest QEMU master. I'm not opposed to increasing the CPU count in
theory, but if we're going to have a setting that goes up to a huge
number it needs to at least boot. I've got 64 host threads, so it
shouldn't just be a scheduling thing.

If there was some hardware that actually boots on these I'd be happy to
take it, but given that it's just QEMU I'd prefer to sort out the bugs
first. It's probably just latent bugs somewhere, but allowing users to
turn on configs we know don't work just seems like the wrong way to go.

2022-04-06 15:20:24

by Heinrich Schuchardt

Subject: Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

On 3/31/22 21:42, Palmer Dabbelt wrote:
> On Sat, 19 Mar 2022 05:12:06 PDT (-0700), [email protected] wrote:
>> Currently, the range and default value of NR_CPUS is too restrictive
>> for high-end RISC-V systems with large number of HARTs. The latest
>> QEMU virt machine supports upto 512 CPUs so the current NR_CPUS is
>> restrictive for QEMU as well. Other major architectures (such as
>> ARM64, x86_64, MIPS, etc) have a much higher range and default
>> value of NR_CPUS.
>>
>> This patch increases NR_CPUS range to 2-512 and default value to
>> XLEN (i.e. 32 for RV32 and 64 for RV64).
>>
>> Signed-off-by: Anup Patel <[email protected]>
>> ---
>> Changes since v1:
>>  - Updated NR_CPUS range to 2-512 which reflects maximum number of
>>    CPUs supported by QEMU virt machine.
>> ---
>>  arch/riscv/Kconfig | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>> index 5adcbd9b5e88..423ac17f598c 100644
>> --- a/arch/riscv/Kconfig
>> +++ b/arch/riscv/Kconfig
>> @@ -274,10 +274,11 @@ config SMP
>>        If you don't know what to do here, say N.
>>
>>  config NR_CPUS
>> -    int "Maximum number of CPUs (2-32)"
>> -    range 2 32
>> +    int "Maximum number of CPUs (2-512)"
>> +    range 2 512

For SBI_V01=y there seems to be a hard constraint to XLEN bits.
See __sbi_v01_cpumask_to_hartmask() in arch/riscv/kernel/sbi.c.

So shouldn't this be something like:

range 2 512 if !SBI_V01
range 2 32 if SBI_V01 && 32BIT
range 2 64 if SBI_V01 && 64BIT
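
To illustrate the constraint mentioned above, here is a minimal sketch, assuming the SBI v0.1 hart mask is a single machine word. pack_hartmask() and WORD_BITS are hypothetical names used only for this example; this is not the kernel's __sbi_v01_cpumask_to_hartmask().

```c
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long; on RISC-V Linux this equals XLEN. */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper for illustration only: pack hart IDs into a
 * single word. Returns 0 on success, or -1 when a hart ID has no
 * bit available in the word.
 */
static int pack_hartmask(const unsigned int *hartids, size_t n,
                         unsigned long *mask)
{
    unsigned long m = 0;

    for (size_t i = 0; i < n; i++) {
        if (hartids[i] >= WORD_BITS)
            return -1;  /* hart ID above XLEN-1 cannot be encoded */
        m |= 1UL << hartids[i];
    }
    *mask = m;
    return 0;
}
```

With a one-word mask, any hart ID of XLEN or above simply cannot be represented, which is why the proposed ranges stop at 32 and 64 when SBI_V01 is enabled.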

>>      depends on SMP
>> -    default "8"
>> +    default "32" if 32BIT
>> +    default "64" if 64BIT
>>
>>  config HOTPLUG_CPU
>>      bool "Support for hot-pluggable CPUs"
>
> I'm getting all sorts of boot issues with more than 32 CPUs, even on the
> latest QEMU master.  I'm not opposed to increasing the CPU count in
> theory, but if we're going to have a setting that goes up to a huge
> number it needs to at least boot.  I've got 64 host threads, so it
> shouldn't just be a scheduling thing.

Currently high performing hardware for RISC-V is missing. So it makes
sense to build software via QEMU on x86_64 or arm64 with as many
hardware threads as available (128 is not uncommon).

OpenSBI currently is limited to 128 threads:
include/sbi/sbi_hartmask.h:22:
#define SBI_HARTMASK_MAX_BITS 128
This is just an arbitrary value that can be modified.
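
As a rough sketch of what such a compile-time limit implies, assuming the mask is stored as an array of machine words: struct hartmask_sketch, hartmask_set() and hartmask_test() are hypothetical names for this example, not OpenSBI's own definitions. Only the value 128 is taken from the header quoted above.

```c
#include <limits.h>
#include <stddef.h>

#define SBI_HARTMASK_MAX_BITS 128  /* value quoted above */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
/* Number of words needed to hold nbits bits, rounded up. */
#define BITS_TO_WORDS(nbits) (((nbits) + WORD_BITS - 1) / WORD_BITS)

/* Hypothetical type for illustration; storage grows with the limit. */
struct hartmask_sketch {
    unsigned long bits[BITS_TO_WORDS(SBI_HARTMASK_MAX_BITS)];
};

/* Mark a hart as present; returns -1 if it exceeds the limit. */
static int hartmask_set(struct hartmask_sketch *m, unsigned int hart)
{
    if (hart >= SBI_HARTMASK_MAX_BITS)
        return -1;
    m->bits[hart / WORD_BITS] |= 1UL << (hart % WORD_BITS);
    return 0;
}

/* Returns 1 if the hart is marked, 0 otherwise (or if out of range). */
static int hartmask_test(const struct hartmask_sketch *m, unsigned int hart)
{
    if (hart >= SBI_HARTMASK_MAX_BITS)
        return 0;
    return (int)((m->bits[hart / WORD_BITS] >> (hart % WORD_BITS)) & 1UL);
}
```

Raising the constant only enlarges the array, so in a layout like this the cost of a higher limit is memory for every mask instance rather than any change in logic.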

U-Boot v2022.04 qemu-riscv64_smode_defconfig has a value of
CONFIG_SYS_MALLOC_F_LEN that is too low. This leads to a boot failure for
more than 16 harts. A patch to correct this is pending:
[PATCH v2 1/1] riscv: alloc space exhausted
https://lore.kernel.org/u-boot/CAN5B=eKt=tFLZ2z3aNHJqsnJzpdA0oikcrC2i1_=ZDD=f+M0jA@mail.gmail.com/T/#t

With QEMU 7.0 and the U-Boot fix, booting into a 5.17 defconfig kernel
with 64 virtual cores worked fine for me.

Best regards

Heinrich

>
> If there was some hardware that actually boots on these I'd be happy to
> take it, but given that it's just QEMU I'd prefer to sort out the bugs
> first.  It's probably just latent bugs somewhere, but allowing users to
> turn on configs we know don't work just seems like the wrong way to go.
>
> _______________________________________________
> linux-riscv mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-riscv
>

2022-04-06 15:22:10

by Anup Patel

Subject: Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

On Wed, Apr 6, 2022 at 3:25 PM Heinrich Schuchardt
<[email protected]> wrote:
>
> On 3/31/22 21:42, Palmer Dabbelt wrote:
> > On Sat, 19 Mar 2022 05:12:06 PDT (-0700), [email protected] wrote:
> >> Currently, the range and default value of NR_CPUS is too restrictive
> >> for high-end RISC-V systems with large number of HARTs. The latest
> >> QEMU virt machine supports upto 512 CPUs so the current NR_CPUS is
> >> restrictive for QEMU as well. Other major architectures (such as
> >> ARM64, x86_64, MIPS, etc) have a much higher range and default
> >> value of NR_CPUS.
> >>
> >> This patch increases NR_CPUS range to 2-512 and default value to
> >> XLEN (i.e. 32 for RV32 and 64 for RV64).
> >>
> >> Signed-off-by: Anup Patel <[email protected]>
> >> ---
> >> Changes since v1:
> >> - Updated NR_CPUS range to 2-512 which reflects maximum number of
> >> CPUs supported by QEMU virt machine.
> >> ---
> >> arch/riscv/Kconfig | 7 ++++---
> >> 1 file changed, 4 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> >> index 5adcbd9b5e88..423ac17f598c 100644
> >> --- a/arch/riscv/Kconfig
> >> +++ b/arch/riscv/Kconfig
> >> @@ -274,10 +274,11 @@ config SMP
> >> If you don't know what to do here, say N.
> >>
> >> config NR_CPUS
> >> - int "Maximum number of CPUs (2-32)"
> >> - range 2 32
> >> + int "Maximum number of CPUs (2-512)"
> >> + range 2 512
>
> For SBI_V01=y there seems to be a hard constraint to XLEN bits.
> See __sbi_v01_cpumask_to_hartmask() in rch/riscv/kernel/sbi.c.
>
> So shouldn't this be something like:
>
> range 2 512 !SBI_V01
> range 2 32 SBI_V01 && 32BIT
> range 2 64 SBI_V01 && 64BIT

This just makes it unnecessarily complicated for the sake of
supporting SBI v0.1.

How about removing SBI v0.1 support and the spin-wait CPU
operations from arch/riscv?
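
For context, a minimal sketch of why only the v0.1 path carries the XLEN limit: from SBI v0.2 onwards, calls such as sbi_send_ipi take a hart_mask together with a hart_mask_base, so a large set of harts can be covered by several calls. struct hart_window and split_into_windows() are hypothetical names for this example, and the input is assumed to be sorted in ascending order.

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* One (hart_mask, hart_mask_base) pair, as taken by SBI v0.2+ calls. */
struct hart_window {
    unsigned long base;
    unsigned long mask;
};

/*
 * Hypothetical helper: split hart IDs (sorted ascending) into windows
 * of at most WORD_BITS harts each. Returns the number of windows
 * written, or -1 if out[] is too small.
 */
static int split_into_windows(const unsigned long *hartids, size_t n,
                              struct hart_window *out, size_t max_out)
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        /* Open a new window when the ID does not fit the current one. */
        if (used == 0 || hartids[i] - out[used - 1].base >= WORD_BITS) {
            if (used == max_out)
                return -1;
            out[used].base = hartids[i];
            out[used].mask = 0;
            used++;
        }
        out[used - 1].mask |= 1UL << (hartids[i] - out[used - 1].base);
    }
    return (int)used;
}
```

Each resulting window would map to one SBI call, so the number of harts is no longer bounded by the width of a single word.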

>
> >> depends on SMP
> >> - default "8"
> >> + default "32" if 32BIT
> >> + default "64" if 64BIT
> >>
> >> config HOTPLUG_CPU
> >> bool "Support for hot-pluggable CPUs"
> >
> > I'm getting all sorts of boot issues with more than 32 CPUs, even on the
> > latest QEMU master. I'm not opposed to increasing the CPU count in
> > theory, but if we're going to have a setting that goes up to a huge
> > number it needs to at least boot. I've got 64 host threads, so it
> > shouldn't just be a scheduling thing.
>
> Currently high performing hardware for RISC-V is missing. So it makes
> sense to build software via QEMU on x86_64 or arm64 with as many
> hardware threads as available (128 is not uncommon).
>
> OpenSBI currently is limited to 128 threads:
> include/sbi/sbi_hartmask.h:22:
> #define SBI_HARTMASK_MAX_BITS 128
> This is just an arbitrary value we can be modified.

Yes, this limit will be gradually increased with some improvements
to optimize runtime memory used by OpenSBI.

>
> U-Boot v2022.04 qemu-riscv64_smode_defconfig has a value of
> CONFIG_SYS_MALLOC_F_LEN that is to low. This leads to a boot failure for
> more than 16 harts. A patch to correct this is pending:
> [PATCH v2 1/1] riscv: alloc space exhausted
> https://lore.kernel.org/u-boot/CAN5B=eKt=tFLZ2z3aNHJqsnJzpdA0oikcrC2i1_=ZDD=f+M0jA@mail.gmail.com/T/#t
>
> With QEMU 7.0 and the U-Boot fix booting into a 5.17 defconfig kernel
> with 64 virtual cores worked fine for me.

Thanks for trying this patch.

Regards,
Anup

>
> Best regards
>
> Heinrich
>
> >
> > If there was some hardware that actually boots on these I'd be happy to
> > take it, but given that it's just QEMU I'd prefer to sort out the bugs
> > first. It's probably just latent bugs somewhere, but allowing users to
> > turn on configs we know don't work just seems like the wrong way to go.
> >
> >
>

2022-04-07 02:33:33

by Atish Patra

Subject: Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

On Wed, Apr 6, 2022 at 2:55 AM Heinrich Schuchardt
<[email protected]> wrote:
>
> On 3/31/22 21:42, Palmer Dabbelt wrote:
> > On Sat, 19 Mar 2022 05:12:06 PDT (-0700), [email protected] wrote:
> >> Currently, the range and default value of NR_CPUS is too restrictive
> >> for high-end RISC-V systems with large number of HARTs. The latest
> >> QEMU virt machine supports upto 512 CPUs so the current NR_CPUS is
> >> restrictive for QEMU as well. Other major architectures (such as
> >> ARM64, x86_64, MIPS, etc) have a much higher range and default
> >> value of NR_CPUS.
> >>
> >> This patch increases NR_CPUS range to 2-512 and default value to
> >> XLEN (i.e. 32 for RV32 and 64 for RV64).
> >>
> >> Signed-off-by: Anup Patel <[email protected]>
> >> ---
> >> Changes since v1:
> >> - Updated NR_CPUS range to 2-512 which reflects maximum number of
> >> CPUs supported by QEMU virt machine.
> >> ---
> >> arch/riscv/Kconfig | 7 ++++---
> >> 1 file changed, 4 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> >> index 5adcbd9b5e88..423ac17f598c 100644
> >> --- a/arch/riscv/Kconfig
> >> +++ b/arch/riscv/Kconfig
> >> @@ -274,10 +274,11 @@ config SMP
> >> If you don't know what to do here, say N.
> >>
> >> config NR_CPUS
> >> - int "Maximum number of CPUs (2-32)"
> >> - range 2 32
> >> + int "Maximum number of CPUs (2-512)"
> >> + range 2 512
>
> For SBI_V01=y there seems to be a hard constraint to XLEN bits.
> See __sbi_v01_cpumask_to_hartmask() in rch/riscv/kernel/sbi.c.
>
> So shouldn't this be something like:
>
> range 2 512 !SBI_V01
> range 2 32 SBI_V01 && 32BIT
> range 2 64 SBI_V01 && 64BIT
>

Yes. In addition to that, we should disable RISCV_BOOT_SPINWAIT as well,
so that you don't end up in the case where the -smp argument passed to
QEMU is less than NR_CPUS.

Palmer already has a patch for that.
https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=riscv-spinwait

> >> depends on SMP
> >> - default "8"
> >> + default "32" if 32BIT
> >> + default "64" if 64BIT
> >>
> >> config HOTPLUG_CPU
> >> bool "Support for hot-pluggable CPUs"
> >
> > I'm getting all sorts of boot issues with more than 32 CPUs, even on the
> > latest QEMU master. I'm not opposed to increasing the CPU count in
> > theory, but if we're going to have a setting that goes up to a huge
> > number it needs to at least boot. I've got 64 host threads, so it
> > shouldn't just be a scheduling thing.
>
> Currently high performing hardware for RISC-V is missing. So it makes
> sense to build software via QEMU on x86_64 or arm64 with as many
> hardware threads as available (128 is not uncommon).
>
> OpenSBI currently is limited to 128 threads:
> include/sbi/sbi_hartmask.h:22:
> #define SBI_HARTMASK_MAX_BITS 128
> This is just an arbitrary value we can be modified.
>
> U-Boot v2022.04 qemu-riscv64_smode_defconfig has a value of
> CONFIG_SYS_MALLOC_F_LEN that is to low. This leads to a boot failure for
> more than 16 harts. A patch to correct this is pending:
> [PATCH v2 1/1] riscv: alloc space exhausted
> https://lore.kernel.org/u-boot/CAN5B=eKt=tFLZ2z3aNHJqsnJzpdA0oikcrC2i1_=ZDD=f+M0jA@mail.gmail.com/T/#t
>
> With QEMU 7.0 and the U-Boot fix booting into a 5.17 defconfig kernel
> with 64 virtual cores worked fine for me.
>

For me, the OpenSBI -> Linux path works with 128 harts for a bunch of
configs from here:

https://github.com/palmer-dabbelt/riscv-systems-ci/tree/master/configs/linux

with RISCV_BOOT_SPINWAIT disabled.

> Best regards
>
> Heinrich
>
> >
> > If there was some hardware that actually boots on these I'd be happy to
> > take it, but given that it's just QEMU I'd prefer to sort out the bugs
> > first. It's probably just latent bugs somewhere, but allowing users to
> > turn on configs we know don't work just seems like the wrong way to go.
> >
> >
>


--
Regards,
Atish

2022-04-08 18:20:32

by Anup Patel

Subject: Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

On Fri, Apr 8, 2022 at 10:08 PM Heinrich Schuchardt
<[email protected]> wrote:
>
> On 4/6/22 12:10, Anup Patel wrote:
> > On Wed, Apr 6, 2022 at 3:25 PM Heinrich Schuchardt
> > <[email protected]> wrote:
> >>
> >> On 3/31/22 21:42, Palmer Dabbelt wrote:
> >>> On Sat, 19 Mar 2022 05:12:06 PDT (-0700), [email protected] wrote:
> >>>> Currently, the range and default value of NR_CPUS is too restrictive
> >>>> for high-end RISC-V systems with large number of HARTs. The latest
> >>>> QEMU virt machine supports upto 512 CPUs so the current NR_CPUS is
> >>>> restrictive for QEMU as well. Other major architectures (such as
> >>>> ARM64, x86_64, MIPS, etc) have a much higher range and default
> >>>> value of NR_CPUS.
> >>>>
> >>>> This patch increases NR_CPUS range to 2-512 and default value to
> >>>> XLEN (i.e. 32 for RV32 and 64 for RV64).
> >>>>
> >>>> Signed-off-by: Anup Patel <[email protected]>
> >>>> ---
> >>>> Changes since v1:
> >>>> - Updated NR_CPUS range to 2-512 which reflects maximum number of
> >>>> CPUs supported by QEMU virt machine.
> >>>> ---
> >>>> arch/riscv/Kconfig | 7 ++++---
> >>>> 1 file changed, 4 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> >>>> index 5adcbd9b5e88..423ac17f598c 100644
> >>>> --- a/arch/riscv/Kconfig
> >>>> +++ b/arch/riscv/Kconfig
> >>>> @@ -274,10 +274,11 @@ config SMP
> >>>> If you don't know what to do here, say N.
> >>>>
> >>>> config NR_CPUS
> >>>> - int "Maximum number of CPUs (2-32)"
> >>>> - range 2 32
> >>>> + int "Maximum number of CPUs (2-512)"
> >>>> + range 2 512
> >>
> >> For SBI_V01=y there seems to be a hard constraint to XLEN bits.
> >> See __sbi_v01_cpumask_to_hartmask() in rch/riscv/kernel/sbi.c.
> >>
> >> So shouldn't this be something like:
> >>
> >> range 2 512 !SBI_V01
> >> range 2 32 SBI_V01 && 32BIT
> >> range 2 64 SBI_V01 && 64BIT
> >
> > This is just making it unnecessarily complicated for supporting
> > SBI v0.1
> >
> > How about removing SBI v0.1 support and the spin-wait CPU
> > operations from arch/riscv ?
>
> The SBI v0.1 specification was only a draft. Only the v1.0 version has
> ever been ratified.
>
> It would be good to remove this legacy code from Linux and U-Boot.
>
> By the way, why does upstream OpenSBI claim to be conformant to SBI v0.3
> and not to v1.0?

The ratification process for SBI v1.0 was in its early stages when OpenSBI
v1.0 was being released, so we decided to keep the SBI v0.3 spec version.
The next OpenSBI v1.1 release (due in June 2022) will change to SBI v1.0.

Regards,
Anup

>
> include/sbi/sbi_ecall.h:16:
>
> #define SBI_ECALL_VERSION_MAJOR 0
> #define SBI_ECALL_VERSION_MINOR 3
>
> Best regards
>
> Heinrich
>
> >
> >>
> >>>> depends on SMP
> >>>> - default "8"
> >>>> + default "32" if 32BIT
> >>>> + default "64" if 64BIT
> >>>>
> >>>> config HOTPLUG_CPU
> >>>> bool "Support for hot-pluggable CPUs"
> >>>
> >>> I'm getting all sorts of boot issues with more than 32 CPUs, even on the
> >>> latest QEMU master. I'm not opposed to increasing the CPU count in
> >>> theory, but if we're going to have a setting that goes up to a huge
> >>> number it needs to at least boot. I've got 64 host threads, so it
> >>> shouldn't just be a scheduling thing.
> >>
> >> Currently high performing hardware for RISC-V is missing. So it makes
> >> sense to build software via QEMU on x86_64 or arm64 with as many
> >> hardware threads as available (128 is not uncommon).
> >>
> >> OpenSBI currently is limited to 128 threads:
> >> include/sbi/sbi_hartmask.h:22:
> >> #define SBI_HARTMASK_MAX_BITS 128
> >> This is just an arbitrary value we can be modified.
> >
> > Yes, this limit will be gradually increased with some improvements
> > to optimize runtime memory used by OpenSBI.
> >
> >>
> >> U-Boot v2022.04 qemu-riscv64_smode_defconfig has a value of
> >> CONFIG_SYS_MALLOC_F_LEN that is to low. This leads to a boot failure for
> >> more than 16 harts. A patch to correct this is pending:
> >> [PATCH v2 1/1] riscv: alloc space exhausted
> >> https://lore.kernel.org/u-boot/CAN5B=eKt=tFLZ2z3aNHJqsnJzpdA0oikcrC2i1_=ZDD=f+M0jA@mail.gmail.com/T/#t
> >>
> >> With QEMU 7.0 and the U-Boot fix booting into a 5.17 defconfig kernel
> >> with 64 virtual cores worked fine for me.
> >
> > Thanks for trying this patch.
> >
> > Regards,
> > Anup
> >
> >>
> >> Best regards
> >>
> >> Heinrich
> >>
> >>>
> >>> If there was some hardware that actually boots on these I'd be happy to
> >>> take it, but given that it's just QEMU I'd prefer to sort out the bugs
> >>> first. It's probably just latent bugs somewhere, but allowing users to
> >>> turn on configs we know don't work just seems like the wrong way to go.
> >>>

2022-04-12 07:00:01

by Heinrich Schuchardt

Subject: Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

On 4/6/22 12:10, Anup Patel wrote:
> On Wed, Apr 6, 2022 at 3:25 PM Heinrich Schuchardt
> <[email protected]> wrote:
>>
>> On 3/31/22 21:42, Palmer Dabbelt wrote:
>>> On Sat, 19 Mar 2022 05:12:06 PDT (-0700), [email protected] wrote:
>>>> Currently, the range and default value of NR_CPUS is too restrictive
>>>> for high-end RISC-V systems with large number of HARTs. The latest
>>>> QEMU virt machine supports upto 512 CPUs so the current NR_CPUS is
>>>> restrictive for QEMU as well. Other major architectures (such as
>>>> ARM64, x86_64, MIPS, etc) have a much higher range and default
>>>> value of NR_CPUS.
>>>>
>>>> This patch increases NR_CPUS range to 2-512 and default value to
>>>> XLEN (i.e. 32 for RV32 and 64 for RV64).
>>>>
>>>> Signed-off-by: Anup Patel <[email protected]>
>>>> ---
>>>> Changes since v1:
>>>> - Updated NR_CPUS range to 2-512 which reflects maximum number of
>>>> CPUs supported by QEMU virt machine.
>>>> ---
>>>> arch/riscv/Kconfig | 7 ++++---
>>>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>>>> index 5adcbd9b5e88..423ac17f598c 100644
>>>> --- a/arch/riscv/Kconfig
>>>> +++ b/arch/riscv/Kconfig
>>>> @@ -274,10 +274,11 @@ config SMP
>>>> If you don't know what to do here, say N.
>>>>
>>>> config NR_CPUS
>>>> - int "Maximum number of CPUs (2-32)"
>>>> - range 2 32
>>>> + int "Maximum number of CPUs (2-512)"
>>>> + range 2 512
>>
>> For SBI_V01=y there seems to be a hard constraint to XLEN bits.
>> See __sbi_v01_cpumask_to_hartmask() in rch/riscv/kernel/sbi.c.
>>
>> So shouldn't this be something like:
>>
>> range 2 512 !SBI_V01
>> range 2 32 SBI_V01 && 32BIT
>> range 2 64 SBI_V01 && 64BIT
>
> This is just making it unnecessarily complicated for supporting
> SBI v0.1
>
> How about removing SBI v0.1 support and the spin-wait CPU
> operations from arch/riscv ?

The SBI v0.1 specification was only a draft. Only the v1.0 version has
ever been ratified.

It would be good to remove this legacy code from Linux and U-Boot.

By the way, why does upstream OpenSBI claim to be conformant to SBI v0.3
and not to v1.0?

include/sbi/sbi_ecall.h:16:

#define SBI_ECALL_VERSION_MAJOR 0
#define SBI_ECALL_VERSION_MINOR 3

Best regards

Heinrich

>
>>
>>>> depends on SMP
>>>> - default "8"
>>>> + default "32" if 32BIT
>>>> + default "64" if 64BIT
>>>>
>>>> config HOTPLUG_CPU
>>>> bool "Support for hot-pluggable CPUs"
>>>
>>> I'm getting all sorts of boot issues with more than 32 CPUs, even on the
>>> latest QEMU master. I'm not opposed to increasing the CPU count in
>>> theory, but if we're going to have a setting that goes up to a huge
>>> number it needs to at least boot. I've got 64 host threads, so it
>>> shouldn't just be a scheduling thing.
>>
>> Currently high performing hardware for RISC-V is missing. So it makes
>> sense to build software via QEMU on x86_64 or arm64 with as many
>> hardware threads as available (128 is not uncommon).
>>
>> OpenSBI currently is limited to 128 threads:
>> include/sbi/sbi_hartmask.h:22:
>> #define SBI_HARTMASK_MAX_BITS 128
>> This is just an arbitrary value we can be modified.
>
> Yes, this limit will be gradually increased with some improvements
> to optimize runtime memory used by OpenSBI.
>
>>
>> U-Boot v2022.04 qemu-riscv64_smode_defconfig has a value of
>> CONFIG_SYS_MALLOC_F_LEN that is to low. This leads to a boot failure for
>> more than 16 harts. A patch to correct this is pending:
>> [PATCH v2 1/1] riscv: alloc space exhausted
>> https://lore.kernel.org/u-boot/CAN5B=eKt=tFLZ2z3aNHJqsnJzpdA0oikcrC2i1_=ZDD=f+M0jA@mail.gmail.com/T/#t
>>
>> With QEMU 7.0 and the U-Boot fix booting into a 5.17 defconfig kernel
>> with 64 virtual cores worked fine for me.
>
> Thanks for trying this patch.
>
> Regards,
> Anup
>
>>
>> Best regards
>>
>> Heinrich
>>
>>>
>>> If there was some hardware that actually boots on these I'd be happy to
>>> take it, but given that it's just QEMU I'd prefer to sort out the bugs
>>> first. It's probably just latent bugs somewhere, but allowing users to
>>> turn on configs we know don't work just seems like the wrong way to go.
>>>

2022-04-12 11:39:15

by Atish Patra

Subject: Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

On Fri, Apr 8, 2022 at 9:45 AM Anup Patel <[email protected]> wrote:
>
> On Fri, Apr 8, 2022 at 10:08 PM Heinrich Schuchardt
> <[email protected]> wrote:
> >
> > On 4/6/22 12:10, Anup Patel wrote:
> > > On Wed, Apr 6, 2022 at 3:25 PM Heinrich Schuchardt
> > > <[email protected]> wrote:
> > >>
> > >> On 3/31/22 21:42, Palmer Dabbelt wrote:
> > >>> On Sat, 19 Mar 2022 05:12:06 PDT (-0700), [email protected] wrote:
> > >>>> Currently, the range and default value of NR_CPUS is too restrictive
> > >>>> for high-end RISC-V systems with large number of HARTs. The latest
> > >>>> QEMU virt machine supports upto 512 CPUs so the current NR_CPUS is
> > >>>> restrictive for QEMU as well. Other major architectures (such as
> > >>>> ARM64, x86_64, MIPS, etc) have a much higher range and default
> > >>>> value of NR_CPUS.
> > >>>>
> > >>>> This patch increases NR_CPUS range to 2-512 and default value to
> > >>>> XLEN (i.e. 32 for RV32 and 64 for RV64).
> > >>>>
> > >>>> Signed-off-by: Anup Patel <[email protected]>
> > >>>> ---
> > >>>> Changes since v1:
> > >>>> - Updated NR_CPUS range to 2-512 which reflects maximum number of
> > >>>> CPUs supported by QEMU virt machine.
> > >>>> ---
> > >>>> arch/riscv/Kconfig | 7 ++++---
> > >>>> 1 file changed, 4 insertions(+), 3 deletions(-)
> > >>>>
> > >>>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > >>>> index 5adcbd9b5e88..423ac17f598c 100644
> > >>>> --- a/arch/riscv/Kconfig
> > >>>> +++ b/arch/riscv/Kconfig
> > >>>> @@ -274,10 +274,11 @@ config SMP
> > >>>> If you don't know what to do here, say N.
> > >>>>
> > >>>> config NR_CPUS
> > >>>> - int "Maximum number of CPUs (2-32)"
> > >>>> - range 2 32
> > >>>> + int "Maximum number of CPUs (2-512)"
> > >>>> + range 2 512
> > >>
> > >> For SBI_V01=y there seems to be a hard constraint to XLEN bits.
> > >> See __sbi_v01_cpumask_to_hartmask() in rch/riscv/kernel/sbi.c.
> > >>
> > >> So shouldn't this be something like:
> > >>
> > >> range 2 512 !SBI_V01
> > >> range 2 32 SBI_V01 && 32BIT
> > >> range 2 64 SBI_V01 && 64BIT
> > >
> > > This is just making it unnecessarily complicated for supporting
> > > SBI v0.1
> > >
> > > How about removing SBI v0.1 support and the spin-wait CPU
> > > operations from arch/riscv ?
> >
> > The SBI v0.1 specification was only a draft. Only the v1.0 version has
> > ever been ratified.
> >
> > It would be good to remove this legacy code from Linux and U-Boot.
> >
> > By the way, why does upstream OpenSBI claim to be conformant to SBI v0.3
> > and not to v1.0?
>
> The ratification process for SBI v1.0 was in early stages when OpenSBI v1.0
> was being released so we decided to keep the SBI v0.3 spec version. The
> next OpenSBI v1.1 release (due in June 2022) will change to SBI v1.0
>

Yes. We are in the final stages of the official ratification of SBI
v1.0. Once the ratified version is released, OpenSBI will be upgraded
to support it.

> Regards,
> Anup
>
> >
> > include/sbi/sbi_ecall.h:16:
> >
> > #define SBI_ECALL_VERSION_MAJOR 0
> > #define SBI_ECALL_VERSION_MINOR 3
> >
> > Best regards
> >
> > Heinrich
> >
> > >
> > >>
> > >>>> depends on SMP
> > >>>> - default "8"
> > >>>> + default "32" if 32BIT
> > >>>> + default "64" if 64BIT
> > >>>>
> > >>>> config HOTPLUG_CPU
> > >>>> bool "Support for hot-pluggable CPUs"
> > >>>
> > >>> I'm getting all sorts of boot issues with more than 32 CPUs, even on the
> > >>> latest QEMU master. I'm not opposed to increasing the CPU count in
> > >>> theory, but if we're going to have a setting that goes up to a huge
> > >>> number it needs to at least boot. I've got 64 host threads, so it
> > >>> shouldn't just be a scheduling thing.
> > >>
> > >> Currently high performing hardware for RISC-V is missing. So it makes
> > >> sense to build software via QEMU on x86_64 or arm64 with as many
> > >> hardware threads as available (128 is not uncommon).
> > >>
> > >> OpenSBI currently is limited to 128 threads:
> > >> include/sbi/sbi_hartmask.h:22:
> > >> #define SBI_HARTMASK_MAX_BITS 128
> > >> This is just an arbitrary value we can be modified.
> > >
> > > Yes, this limit will be gradually increased with some improvements
> > > to optimize runtime memory used by OpenSBI.
> > >
> > >>
> > >> U-Boot v2022.04 qemu-riscv64_smode_defconfig has a value of
> > >> CONFIG_SYS_MALLOC_F_LEN that is to low. This leads to a boot failure for
> > >> more than 16 harts. A patch to correct this is pending:
> > >> [PATCH v2 1/1] riscv: alloc space exhausted
> > >> https://lore.kernel.org/u-boot/CAN5B=eKt=tFLZ2z3aNHJqsnJzpdA0oikcrC2i1_=ZDD=f+M0jA@mail.gmail.com/T/#t
> > >>
> > >> With QEMU 7.0 and the U-Boot fix booting into a 5.17 defconfig kernel
> > >> with 64 virtual cores worked fine for me.
> > >
> > > Thanks for trying this patch.
> > >
> > > Regards,
> > > Anup
> > >
> > >>
> > >> Best regards
> > >>
> > >> Heinrich
> > >>
> > >>>
> > >>> If there was some hardware that actually boots on these I'd be happy to
> > >>> take it, but given that it's just QEMU I'd prefer to sort out the bugs
> > >>> first. It's probably just latent bugs somewhere, but allowing users to
> > >>> turn on configs we know don't work just seems like the wrong way to go.
> > >>>



--
Regards,
Atish