2019-02-08 01:53:22

Subject: [v3 PATCH 0/8] Various SMP related fixes

The existing upstream kernel doesn't boot in a non-smp
configuration. This patch series addresses various issues
with non-smp configurations.

The patch series is based on 5.0-rc5. Tested on QEMU and
HiFive Unleashed board using both OpenSBI & BBL.

Changes from v2->v3

1. Fixed spurious white space.
2. Added lockdep for smpboot completion variable.
3. Added a sanity check for hwcap.

Changes from v1->v2

1. Moved the cpuid to hartid map to smp.c from setup.c
2. Split 3rd patch into several small patches based on
logical grouping.
3. Added a new patch that fixes an issue in hwcap query.
4. Changed the title of the patch series.

Atish Patra (8):
RISC-V: Do not wait indefinitely in __cpu_up
RISC-V: Move cpuid to hartid mapping to SMP.
RISC-V: Remove NR_CPUs check during hartid search from DT
RISC-V: Allow hartid-to-cpuid function to fail.
RISC-V: Compare cpuid with NR_CPUS before mapping.
clocksource/drivers/riscv: Add required checks during clock source
init
irqchip/irq-sifive-plic:: Check and continue in case of an invalid
cpuid.
RISC-V: Assign hwcap only according to boot cpu.

arch/riscv/include/asm/smp.h | 14 ++++++++---
arch/riscv/kernel/cpu.c | 4 ---
arch/riscv/kernel/cpufeature.c | 52 +++++++++++++++++++++++++++------------
arch/riscv/kernel/setup.c | 9 -------
arch/riscv/kernel/smp.c | 10 +++++++-
arch/riscv/kernel/smpboot.c | 20 ++++++++++++---
drivers/clocksource/timer-riscv.c | 23 ++++++++++++++---
drivers/irqchip/irq-sifive-plic.c | 5 ++++
8 files changed, 98 insertions(+), 39 deletions(-)

--
2.7.4

2019-02-08 01:52:09

by Atish Patra

Subject: [v3 PATCH 4/8] RISC-V: Allow hartid-to-cpuid function to fail.

It is perfectly okay to call riscv_hartid_to_cpuid for a
hartid that is not mapped to a CPU id. This can happen
if the calling function retrieves the hartid from DT but
that hartid was never brought online by the firmware or
the kernel for any reason.

No need to BUG() in the above case. A negative error return
is sufficient, and the calling function should always check
the return value.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
arch/riscv/kernel/smp.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index b69883c6..ca99f0fb 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -60,7 +60,6 @@ int riscv_hartid_to_cpuid(int hartid)
return i;

pr_err("Couldn't find cpu id for hartid [%d]\n", hartid);
- BUG();
return i;
}

--
2.7.4
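
The contract described in the commit message above (return a negative error instead of calling BUG(), and let callers check) can be sketched in plain userspace C. This is only an illustration: the table contents, NR_CPUS value, the -ENOENT choice and the count_mapped() helper are assumptions made for the example, not code from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define NR_CPUS 4			/* hypothetical size for the example */
#define INVALID_HARTID (~0UL)

/* Stand-in for the logical cpu id -> hartid table. */
static unsigned long cpuid_to_hartid[NR_CPUS] = {
	7, 2, INVALID_HARTID, INVALID_HARTID
};

/* Return the logical cpu id for hartid, or -ENOENT if it is not mapped. */
static int hartid_to_cpuid(unsigned long hartid)
{
	if (hartid == INVALID_HARTID)
		return -ENOENT;

	for (int i = 0; i < NR_CPUS; i++)
		if (cpuid_to_hartid[i] == hartid)
			return i;

	fprintf(stderr, "Couldn't find cpu id for hartid [%lu]\n", hartid);
	return -ENOENT;
}

/* Caller pattern: a negative result means "skip this hart", not "crash". */
static int count_mapped(const unsigned long *harts, int n)
{
	int mapped = 0;

	assert(harts != NULL);
	for (int i = 0; i < n; i++) {
		if (hartid_to_cpuid(harts[i]) < 0)
			continue;	/* listed, but never brought up */
		mapped++;
	}
	return mapped;
}
```

With the table above, hartid 5 is reported and skipped, while hartids 7 and 2 resolve to logical ids 0 and 1.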

2019-02-08 01:52:19

by Atish Patra

Subject: [v3 PATCH 6/8] clocksource/drivers/riscv: Add required checks during clock source init

Currently, clocksource registration happens for an invalid cpu
for non-smp kernels. This leads to a kernel panic, as cpu hotplug
registration will fail for those cpus. Moreover,
riscv_hartid_to_cpuid can return errors now.

Do not proceed if hartid or cpuid is invalid. Take this opportunity
to print appropriate error strings for different failure cases.

Signed-off-by: Atish Patra <[email protected]>
---
drivers/clocksource/timer-riscv.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
index 43189220..3c7ea75b 100644
--- a/drivers/clocksource/timer-riscv.c
+++ b/drivers/clocksource/timer-riscv.c
@@ -95,13 +95,30 @@ static int __init riscv_timer_init_dt(struct device_node *n)
struct clocksource *cs;

hartid = riscv_of_processor_hartid(n);
+ if (hartid < 0) {
+ pr_warn("Not valid hartid for node [%pOF] error = [%d]\n",
+ n, hartid);
+ return hartid;
+ }
+
cpuid = riscv_hartid_to_cpuid(hartid);
+ if (cpuid < 0) {
+ pr_warn("Invalid cpuid for hartid [%d]\n", hartid);
+ return cpuid;
+ }

if (cpuid != smp_processor_id())
return 0;

+ pr_err("%s: Registering clocksource cpuid [%d] hartid [%d]\n",
+ __func__, cpuid, hartid);
cs = per_cpu_ptr(&riscv_clocksource, cpuid);
- clocksource_register_hz(cs, riscv_timebase);
+ error = clocksource_register_hz(cs, riscv_timebase);
+ if (error) {
+ pr_err("RISCV timer register failed [%d] for cpu = [%d]\n",
+ error, cpuid);
+ return error;
+ }

sched_clock_register(riscv_sched_clock,
BITS_PER_LONG, riscv_timebase);
@@ -110,8 +127,8 @@ static int __init riscv_timer_init_dt(struct device_node *n)
"clockevents/riscv/timer:starting",
riscv_timer_starting_cpu, riscv_timer_dying_cpu);
if (error)
- pr_err("RISCV timer register failed [%d] for cpu = [%d]\n",
- error, cpuid);
+ pr_err("cpu hp setup state failed for RISCV timer [%d]\n",
+ error);
return error;
}

--
2.7.4

2019-02-08 01:52:22

by Atish Patra

Subject: [v3 PATCH 5/8] RISC-V: Compare cpuid with NR_CPUS before mapping.

We should never have a cpuid greater than or equal to NR_CPUS.
Compare with NR_CPUS before creating the mapping between logical
and physical CPU ids. This is also mandatory, as the NR_CPUS
check has been removed from riscv_of_processor_hartid.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
arch/riscv/kernel/smpboot.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 669eb332..f120d325 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -66,6 +66,11 @@ void __init setup_smp(void)
found_boot_cpu = 1;
continue;
}
+ if (cpuid >= NR_CPUS) {
+ pr_warn("Invalid cpuid [%d] for hartid [%d]\n",
+ cpuid, hart);
+ break;
+ }

cpuid_to_hartid_map(cpuid) = hart;
set_cpu_possible(cpuid, true);
--
2.7.4
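
The bounds check added above can be illustrated with a small userspace sketch. It assumes a fixed-size table and a list of hart ids standing in for the DT walk; map_harts() and the NR_CPUS value are hypothetical names and numbers chosen for the example.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS 2			/* deliberately small for the example */
#define INVALID_HARTID (~0UL)

static unsigned long cpuid_to_hartid[NR_CPUS];

/*
 * Assign logical ids to the given harts. The boot hart is recorded as
 * id 0 when it is encountered; the others take 1, 2, ... and the loop
 * stops before writing past the end of the table.
 * Returns the number of harts mapped.
 */
static int map_harts(const unsigned long *harts, int n,
		     unsigned long boot_hart)
{
	int cpuid = 1, mapped = 0;

	for (int i = 0; i < NR_CPUS; i++)
		cpuid_to_hartid[i] = INVALID_HARTID;

	for (int i = 0; i < n; i++) {
		if (harts[i] == boot_hart) {
			cpuid_to_hartid[0] = boot_hart;
			mapped++;
			continue;
		}
		if (cpuid >= NR_CPUS) {
			fprintf(stderr, "Invalid cpuid [%d] for hartid [%lu]\n",
				cpuid, harts[i]);
			break;	/* no room left in the table */
		}
		cpuid_to_hartid[cpuid++] = harts[i];
		mapped++;
	}
	return mapped;
}
```

Without the check, the third hart in a four-hart list would be written to index 2 of a two-entry table.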

2019-02-08 01:52:35

by Atish Patra

Subject: [v3 PATCH 7/8] irqchip/irq-sifive-plic:: Check and continue in case of an invalid cpuid.

riscv_hartid_to_cpuid can return invalid cpuid for a hart
that is present in DT but was never brought up.

Print the appropriate warning message and continue.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
drivers/irqchip/irq-sifive-plic.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 357e9daf..254ecd76 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -237,6 +237,11 @@ static int __init plic_init(struct device_node *node,
}

cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0) {
+ pr_warn("Invalid cpuid for context %d\n", i);
+ continue;
+ }
+
handler = per_cpu_ptr(&plic_handlers, cpu);
handler->present = true;
handler->ctxid = i;
--
2.7.4

2019-02-08 01:53:09

by Atish Patra

Subject: [v3 PATCH 3/8] RISC-V: Remove NR_CPUs check during hartid search from DT

In a non-smp configuration, hartid can be higher than NR_CPUS.
riscv_of_processor_hartid should not compare hartid to
NR_CPUS in that case. Moreover, this function checks all the
DT properties of a hart node, so the NR_CPUS comparison seems
out of place.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
---
arch/riscv/kernel/cpu.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index f8fa2c63..19edaeae 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -34,10 +34,6 @@ int riscv_of_processor_hartid(struct device_node *node)
pr_warn("Found CPU without hart ID\n");
return -(ENODEV);
}
- if (hart >= NR_CPUS) {
- pr_info("Found hart ID %d, which is above NR_CPUs. Disabling this hart\n", hart);
- return -(ENODEV);
- }

if (of_property_read_string(node, "status", &status)) {
pr_warn("CPU with hartid=%d has no \"status\" property\n", hart);
--
2.7.4

2019-02-08 01:53:16

by Atish Patra

Subject: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

Currently, we set hwcap based on the first valid cpu from
DT. This may not always be correct, as that CPU might not
be the current boot cpu.

Set hwcap based on the boot cpu instead of the first
valid CPU from DT. Add a sanity check to identify whether
any hwcaps do not match.

Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kernel/cpufeature.c | 52 +++++++++++++++++++++++++++++-------------
1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index a6e369ed..ed8f0c28 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -20,6 +20,7 @@
#include <linux/of.h>
#include <asm/processor.h>
#include <asm/hwcap.h>
+#include <asm/smp.h>

unsigned long elf_hwcap __read_mostly;
#ifdef CONFIG_FPU
@@ -32,6 +33,8 @@ void riscv_fill_hwcap(void)
const char *isa;
size_t i;
static unsigned long isa2hwcap[256] = {0};
+ int hartid;
+ unsigned long temp_hwcap = 0, boot_hwcap = 0;

isa2hwcap['i'] = isa2hwcap['I'] = COMPAT_HWCAP_ISA_I;
isa2hwcap['m'] = isa2hwcap['M'] = COMPAT_HWCAP_ISA_M;
@@ -43,27 +46,44 @@ void riscv_fill_hwcap(void)
elf_hwcap = 0;

/*
- * We don't support running Linux on hertergenous ISA systems. For
- * now, we just check the ISA of the first "okay" processor.
+ * We don't support running Linux on hertergenous ISA systems.
+ * But first "okay" processor might not be the boot cpu.
+ * Check the ISA of boot cpu.
*/
- while ((node = of_find_node_by_type(node, "cpu")))
- if (riscv_of_processor_hartid(node) >= 0)
- break;
- if (!node) {
- pr_warning("Unable to find \"cpu\" devicetree entry");
- return;
- }
+ while ((node = of_find_node_by_type(node, "cpu"))) {
+ if (!node) {
+ pr_warn("Unable to find \"cpu\" devicetree entry");
+ return;
+ }
+
+ hartid = riscv_of_processor_hartid(node);
+ if (hartid < 0)
+ continue;

- if (of_property_read_string(node, "riscv,isa", &isa)) {
- pr_warning("Unable to find \"riscv,isa\" devicetree entry");
+ if (of_property_read_string(node, "riscv,isa", &isa)) {
+ pr_warn("Unable to find \"riscv,isa\" devicetree entry");
+ of_node_put(node);
+ return;
+ }
of_node_put(node);
- return;
- }
- of_node_put(node);

- for (i = 0; i < strlen(isa); ++i)
- elf_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
+ for (i = 0; i < strlen(isa); ++i)
+ temp_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
+ /*
+ * All "okay" hart should have same isa. We don't know how to
+ * handle if they don't. Throw a warning for now.
+ */
+ if (elf_hwcap && temp_hwcap != elf_hwcap)
+ pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
+ elf_hwcap, temp_hwcap);
+
+ if (hartid == boot_cpu_hartid)
+ boot_hwcap = temp_hwcap;
+ elf_hwcap = temp_hwcap;
+ temp_hwcap = 0;
+ }

+ elf_hwcap = boot_hwcap;
/* We don't support systems with F but without D, so mask those out
* here. */
if ((elf_hwcap & COMPAT_HWCAP_ISA_F) && !(elf_hwcap & COMPAT_HWCAP_ISA_D)) {
--
2.7.4
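
The lookup-table translation used in riscv_fill_hwcap can be tried out in isolation with the userspace sketch below. It assumes one capability bit per extension letter with 'A' at bit 0 and the letter set I, M, A, F, D, C; the hunk above only shows the 'i' and 'm' entries, so the rest of the set and the HWCAP_BIT() macro are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed encoding: one bit per extension letter, 'A' at bit 0. */
#define HWCAP_BIT(letter) (1UL << ((letter) - 'A'))

/* Translate an ISA string such as "rv64imafdc" into a capability mask. */
static unsigned long isa_to_hwcap(const char *isa)
{
	static const char known[] = "IMAFDC";
	unsigned long isa2hwcap[256] = {0};
	unsigned long hwcap = 0;

	assert(isa != NULL);

	/* Fill both the upper-case and lower-case slot for each letter. */
	for (size_t i = 0; known[i]; i++) {
		unsigned char upper = (unsigned char)known[i];
		unsigned char lower = (unsigned char)(known[i] + ('a' - 'A'));

		isa2hwcap[upper] = HWCAP_BIT(known[i]);
		isa2hwcap[lower] = HWCAP_BIT(known[i]);
	}

	/* Characters with no table entry ("rv", "64") contribute 0. */
	for (size_t i = 0, len = strlen(isa); i < len; i++)
		hwcap |= isa2hwcap[(unsigned char)isa[i]];

	return hwcap;
}
```

Two harts can then be compared by comparing the returned masks, which is the kind of check the "isa mismatch" warning in the patch performs.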

2019-02-08 01:54:17

by Atish Patra

Subject: [v3 PATCH 1/8] RISC-V: Do not wait indefinitely in __cpu_up

In the SMP path, __cpu_up waits indefinitely for the other CPU
to come online. This is wrong, as the other CPU might be disabled
in machine mode while the possible CPUs are set to the cpus
present in DT.

Introduce a completion variable and wait only for a second.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
---
arch/riscv/kernel/smpboot.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 18cda0e8..669eb332 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -39,6 +39,7 @@

void *__cpu_up_stack_pointer[NR_CPUS];
void *__cpu_up_task_pointer[NR_CPUS];
+static DECLARE_COMPLETION(cpu_running);

void __init smp_prepare_boot_cpu(void)
{
@@ -77,6 +78,7 @@ void __init setup_smp(void)

int __cpu_up(unsigned int cpu, struct task_struct *tidle)
{
+ int ret = 0;
int hartid = cpuid_to_hartid_map(cpu);
tidle->thread_info.cpu = cpu;

@@ -92,10 +94,16 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
task_stack_page(tidle) + THREAD_SIZE);
WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);

- while (!cpu_online(cpu))
- cpu_relax();
+ lockdep_assert_held(&cpu_running);
+ wait_for_completion_timeout(&cpu_running,
+ msecs_to_jiffies(1000));

- return 0;
+ if (!cpu_online(cpu)) {
+ pr_crit("CPU%u: failed to come online\n", cpu);
+ ret = -EIO;
+ }
+
+ return ret;
}

void __init smp_cpus_done(unsigned int max_cpus)
@@ -121,6 +129,7 @@ asmlinkage void __init smp_callin(void)
* a local TLB flush right now just in case.
*/
local_flush_tlb_all();
+ complete(&cpu_running);
/*
* Disable preemption before enabling interrupts, so we don't try to
* schedule a CPU that hasn't actually started yet.
--
2.7.4
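
The idea of replacing an unbounded busy-wait with a bounded wait on a completion can be sketched in userspace with POSIX threads primitives. This is an analogue only: struct completion, complete() and wait_for_completion_timeout_ms() below are hypothetical stand-ins written for this example and do not reproduce the kernel implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Minimal userspace analogue of a completion (illustrative type). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

#define COMPLETION_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Called by the side that has finished coming up. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Wait up to timeout_ms; returns true if completed, false on timeout. */
static bool wait_for_completion_timeout_ms(struct completion *c,
					   long timeout_ms)
{
	struct timespec deadline;
	bool ok;

	/* pthread_cond_timedwait() takes an absolute deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&c->lock);
	while (!c->done) {
		if (pthread_cond_timedwait(&c->cond, &c->lock,
					   &deadline) == ETIMEDOUT)
			break;
	}
	ok = c->done;
	c->done = false;	/* consume, so the object can be reused */
	pthread_mutex_unlock(&c->lock);
	return ok;
}
```

A caller in the role of __cpu_up would wait with a 1000 ms timeout and report an error if the function returns false, instead of spinning forever.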

2019-02-08 01:54:32

by Atish Patra

Subject: [v3 PATCH 2/8] RISC-V: Move cpuid to hartid mapping to SMP.

Currently, the logical CPU id to physical hartid mapping is
defined for both smp and non-smp configurations. This
is not required, as we need it only for the smp configuration.
The mapping function can directly return boot_cpu_hartid
for the non-smp use case.

The reverse mapping function, i.e. hartid to cpuid, can be called
for any valid but not booted hart. So it should return the default
cpu 0 only if it is the boot hartid.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
---
arch/riscv/include/asm/smp.h | 14 +++++++++++---
arch/riscv/kernel/setup.c | 9 ---------
arch/riscv/kernel/smp.c | 9 +++++++++
3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 41aa73b4..21fd2d75 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -22,12 +22,13 @@
/*
* Mapping between linux logical cpu index and hartid.
*/
-extern unsigned long __cpuid_to_hartid_map[NR_CPUS];
-#define cpuid_to_hartid_map(cpu) __cpuid_to_hartid_map[cpu]

+extern unsigned long boot_cpu_hartid;
struct seq_file;

#ifdef CONFIG_SMP
+extern unsigned long __cpuid_to_hartid_map[NR_CPUS];
+#define cpuid_to_hartid_map(cpu) __cpuid_to_hartid_map[cpu]

/* print IPI stats */
void show_ipi_stats(struct seq_file *p, int prec);
@@ -58,7 +59,14 @@ static inline void show_ipi_stats(struct seq_file *p, int prec)

static inline int riscv_hartid_to_cpuid(int hartid)
{
- return 0;
+ if (hartid == boot_cpu_hartid)
+ return 0;
+
+ return -1;
+}
+static inline unsigned long cpuid_to_hartid_map(int cpu)
+{
+ return boot_cpu_hartid;
}

static inline void riscv_cpuid_to_hartid_mask(const struct cpumask *in,
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 77564310..45e9a2f0 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -61,15 +61,6 @@ EXPORT_SYMBOL(empty_zero_page);
atomic_t hart_lottery;
unsigned long boot_cpu_hartid;

-unsigned long __cpuid_to_hartid_map[NR_CPUS] = {
- [0 ... NR_CPUS-1] = INVALID_HARTID
-};
-
-void __init smp_setup_processor_id(void)
-{
- cpuid_to_hartid_map(0) = boot_cpu_hartid;
-}
-
#ifdef CONFIG_BLK_DEV_INITRD
static void __init setup_initrd(void)
{
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 246635ea..b69883c6 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -36,6 +36,15 @@ enum ipi_message_type {
IPI_MAX
};

+unsigned long __cpuid_to_hartid_map[NR_CPUS] = {
+ [0 ... NR_CPUS-1] = INVALID_HARTID
+};
+
+void __init smp_setup_processor_id(void)
+{
+ cpuid_to_hartid_map(0) = boot_cpu_hartid;
+}
+
/* A collection of single bit ipi messages. */
static struct {
unsigned long stats[IPI_MAX] ____cacheline_aligned;
--
2.7.4

2019-02-08 09:02:14

by Christoph Hellwig

Subject: Re: [v3 PATCH 1/8] RISC-V: Do not wait indefinitely in __cpu_up

On Thu, Feb 07, 2019 at 05:51:14PM -0800, Atish Patra wrote:
> In SMP path, __cpu_up waits for other CPU to come online
> indefinitely. This is wrong as other CPU might be disabled
> in machine mode and possible CPU is set to the cpus present
> in DT.
>
> Introduce a completion variable and waits only for a second.
>
> Signed-off-by: Atish Patra <[email protected]>
> Reviewed-by: Anup Patel <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

2019-02-08 09:04:24

by Christoph Hellwig

Subject: Re: [v3 PATCH 2/8] RISC-V: Move cpuid to hartid mapping to SMP.

On Thu, Feb 07, 2019 at 05:51:15PM -0800, Atish Patra wrote:
> Currently, logical CPU id to physical hartid mapping is
> defined for both smp and non-smp configurations. This
> is not required as we need this only for smp configuration.
> The mapping function can define directly boot_cpu_hartid
> for non-smp use case.

Please use up your available 72 chars for the changelog. (probably also
in other patches).

>
> The reverse mapping function i.e. hartid to cpuid can be called
> for any valid but not booted harts. So it should return default
> cpu 0 only if it is a boot hartid.
>
> Signed-off-by: Atish Patra <[email protected]>
> Reviewed-by: Anup Patel <[email protected]>
> ---
> arch/riscv/include/asm/smp.h | 14 +++++++++++---
> arch/riscv/kernel/setup.c | 9 ---------
> arch/riscv/kernel/smp.c | 9 +++++++++
> 3 files changed, 20 insertions(+), 12 deletions(-)
>
> diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
> index 41aa73b4..21fd2d75 100644
> --- a/arch/riscv/include/asm/smp.h
> +++ b/arch/riscv/include/asm/smp.h
> @@ -22,12 +22,13 @@
> /*
> * Mapping between linux logical cpu index and hartid.
> */
> -extern unsigned long __cpuid_to_hartid_map[NR_CPUS];
> -#define cpuid_to_hartid_map(cpu) __cpuid_to_hartid_map[cpu]
>
> +extern unsigned long boot_cpu_hartid;
> struct seq_file;

We usually try to keep forward declarations at the top of the file.

Can you add the new external declaration below the forward one?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

2019-02-08 09:06:20

by Christoph Hellwig

Subject: Re: [v3 PATCH 6/8] clocksource/drivers/riscv: Add required checks during clock source init

On Thu, Feb 07, 2019 at 05:51:19PM -0800, Atish Patra wrote:
> Currently, clocksource registration happens for an invalid cpu
> for non-smp kernels. This lead to kernel panic as cpu hotplug
> registration will fail for those cpus. Moreover,
> riscv_hartid_to_cpuid can return errors now.
>
> Do not proceed if hartid or cpuid is invalid. Take this opprtunity
> to print appropriate error strings for different failure cases.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> drivers/clocksource/timer-riscv.c | 23 ++++++++++++++++++++---
> 1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
> index 43189220..3c7ea75b 100644
> --- a/drivers/clocksource/timer-riscv.c
> +++ b/drivers/clocksource/timer-riscv.c
> @@ -95,13 +95,30 @@ static int __init riscv_timer_init_dt(struct device_node *n)
> struct clocksource *cs;
>
> hartid = riscv_of_processor_hartid(n);
> + if (hartid < 0) {
> + pr_warn("Not valid hartid for node [%pOF] error = [%d]\n",
> + n, hartid);
> + return hartid;
> + }
> +
> cpuid = riscv_hartid_to_cpuid(hartid);
> + if (cpuid < 0) {
> + pr_warn("Invalid cpuid for hartid [%d]\n", hartid);
> + return cpuid;
> + }
>
> if (cpuid != smp_processor_id())
> return 0;
>
> + pr_err("%s: Registering clocksource cpuid [%d] hartid [%d]\n",
> + __func__, cpuid, hartid);

This does not look like an error case to me. At best it is info,
if not debug.

2019-02-08 09:13:45

by Christoph Hellwig

Subject: Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

> + * We don't support running Linux on hertergenous ISA systems.
> + * But first "okay" processor might not be the boot cpu.
> + * Check the ISA of boot cpu.

Please use up your available 80 characters per line in comments.

> + /*
> + * All "okay" hart should have same isa. We don't know how to
> + * handle if they don't. Throw a warning for now.
> + */
> + if (elf_hwcap && temp_hwcap != elf_hwcap)
> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
> + elf_hwcap, temp_hwcap);
> +
> + if (hartid == boot_cpu_hartid)
> + boot_hwcap = temp_hwcap;
> + elf_hwcap = temp_hwcap;

So we always set elf_hwcap to the capabilities of the previous cpu.

> + temp_hwcap = 0;

I think temp_hwcap should be declared and initialized inside the outer loop
instead of having to manually reset it like this.

> + }
>
> + elf_hwcap = boot_hwcap;

And then reset it here to the boot cpu.

Shouldn't we only report the features supported by all cores? Otherwise
we'll still have problems if the boot cpu supports a feature, but the
others don't.

Something like:

for () {
unsigned long this_hwcap = 0;

for (i = 0; i < strlen(isa); i++)
this_hwcap |= isa2hwcap[(unsigned char)(isa[i])];

if (elf_hwcap)
elf_hwcap &= this_hwcap;
else
elf_hwcap = this_hwcap;
}
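
A compilable variation on the sketch above, for experimenting with the "features common to all cores" approach in userspace. It takes the per-hart masks as an array rather than parsing the DT, and it uses an explicit first-iteration flag instead of testing elf_hwcap for zero, so that an empty intersection is not mistaken for "nothing seen yet". The function name and signature are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Combine per-hart capability masks so that only features present on
 * every hart survive. Returns 0 if no harts are given.
 */
static unsigned long common_hwcap(const unsigned long *per_hart, size_t n)
{
	unsigned long elf_hwcap = 0;
	bool first = true;

	for (size_t i = 0; i < n; i++) {
		unsigned long this_hwcap = per_hart[i];

		if (first) {
			elf_hwcap = this_hwcap;	/* seed with the first hart */
			first = false;
		} else {
			elf_hwcap &= this_hwcap;	/* keep only shared bits */
		}
	}
	return elf_hwcap;
}
```

For masks 0x7, 0x5 and 0xD the result is 0x5: only the bits set on all three harts remain.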

2019-02-08 22:56:45

by Atish Patra

Subject: Re: [v3 PATCH 6/8] clocksource/drivers/riscv: Add required checks during clock source init

On 2/8/19 1:04 AM, Christoph Hellwig wrote:
> On Thu, Feb 07, 2019 at 05:51:19PM -0800, Atish Patra wrote:
>> Currently, clocksource registration happens for an invalid cpu
>> for non-smp kernels. This lead to kernel panic as cpu hotplug
>> registration will fail for those cpus. Moreover,
>> riscv_hartid_to_cpuid can return errors now.
>>
>> Do not proceed if hartid or cpuid is invalid. Take this opprtunity
>> to print appropriate error strings for different failure cases.
>>
>> Signed-off-by: Atish Patra <[email protected]>
>> ---
>> drivers/clocksource/timer-riscv.c | 23 ++++++++++++++++++++---
>> 1 file changed, 20 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
>> index 43189220..3c7ea75b 100644
>> --- a/drivers/clocksource/timer-riscv.c
>> +++ b/drivers/clocksource/timer-riscv.c
>> @@ -95,13 +95,30 @@ static int __init riscv_timer_init_dt(struct device_node *n)
>> struct clocksource *cs;
>>
>> hartid = riscv_of_processor_hartid(n);
>> + if (hartid < 0) {
>> + pr_warn("Not valid hartid for node [%pOF] error = [%d]\n",
>> + n, hartid);
>> + return hartid;
>> + }
>> +
>> cpuid = riscv_hartid_to_cpuid(hartid);
>> + if (cpuid < 0) {
>> + pr_warn("Invalid cpuid for hartid [%d]\n", hartid);
>> + return cpuid;
>> + }
>>
>> if (cpuid != smp_processor_id())
>> return 0;
>>
>> + pr_err("%s: Registering clocksource cpuid [%d] hartid [%d]\n",
>> + __func__, cpuid, hartid);
>
> This does not look like an error case to me. At best it is info,
> if not debug.
>
Thanks for catching that. It was a typo. I will fix it in the next version.

Regards,
Atish

2019-02-08 22:58:00

by Atish Patra

Subject: Re: [v3 PATCH 2/8] RISC-V: Move cpuid to hartid mapping to SMP.

On 2/8/19 1:03 AM, Christoph Hellwig wrote:
> On Thu, Feb 07, 2019 at 05:51:15PM -0800, Atish Patra wrote:
>> Currently, logical CPU id to physical hartid mapping is
>> defined for both smp and non-smp configurations. This
>> is not required as we need this only for smp configuration.
>> The mapping function can define directly boot_cpu_hartid
>> for non-smp use case.
>
> Please use up your available 72 chars for the changelog. (probably also
> in other patches).
>

Sorry. I will fix all patches to use 72 chars.

>>
>> The reverse mapping function i.e. hartid to cpuid can be called
>> for any valid but not booted harts. So it should return default
>> cpu 0 only if it is a boot hartid.
>>
>> Signed-off-by: Atish Patra <[email protected]>
>> Reviewed-by: Anup Patel <[email protected]>
>> ---
>> arch/riscv/include/asm/smp.h | 14 +++++++++++---
>> arch/riscv/kernel/setup.c | 9 ---------
>> arch/riscv/kernel/smp.c | 9 +++++++++
>> 3 files changed, 20 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
>> index 41aa73b4..21fd2d75 100644
>> --- a/arch/riscv/include/asm/smp.h
>> +++ b/arch/riscv/include/asm/smp.h
>> @@ -22,12 +22,13 @@
>> /*
>> * Mapping between linux logical cpu index and hartid.
>> */
>> -extern unsigned long __cpuid_to_hartid_map[NR_CPUS];
>> -#define cpuid_to_hartid_map(cpu) __cpuid_to_hartid_map[cpu]
>>
>> +extern unsigned long boot_cpu_hartid;
>> struct seq_file;
>
> We usually try to keep forward declatations at the top of the file.
>
> Can you add the new external declaration below the forward one?
>
Sure.

Regards,
Atish
> Otherwise looks good:
>
> Reviewed-by: Christoph Hellwig <[email protected]>
>

2019-02-08 23:03:18

by Atish Patra

Subject: Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

On 2/8/19 1:11 AM, Christoph Hellwig wrote:
>> + * We don't support running Linux on hertergenous ISA systems.
>> + * But first "okay" processor might not be the boot cpu.
>> + * Check the ISA of boot cpu.
>
> Please use up your available 80 characters per line in comments.
>
I will fix it.

>> + /*
>> + * All "okay" hart should have same isa. We don't know how to
>> + * handle if they don't. Throw a warning for now.
>> + */
>> + if (elf_hwcap && temp_hwcap != elf_hwcap)
>> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
>> + elf_hwcap, temp_hwcap);
>> +
>> + if (hartid == boot_cpu_hartid)
>> + boot_hwcap = temp_hwcap;
>> + elf_hwcap = temp_hwcap;
>
> So we always set elf_hwcap to the capabilities of the previous cpu.
>
>> + temp_hwcap = 0;
>
> I think tmp_hwcap should be declared and initialized inside the outer loop
> instead having to manually reset it like this.
>
>> + }
>>
>> + elf_hwcap = boot_hwcap;
>
> And then reset it here to the boot cpu.
>
> Shoudn't we only report the features supported by all cores? Otherwise
> we'll still have problems if the boot cpu supports a feature, but not
> others.
>

Hmm. The other side of the argument is that the boot cpu may have a
feature that is not supported by another hart that didn't even boot.
User space may want to execute something based on a boot cpu capability,
but that capability won't be enabled.

At least this way we know that we are completely compatible with the
boot cpu capabilities. Thoughts?

Regards,
Atish
> Something like:
>
> for () {
> unsigned long this_hwcap = 0;
>
> for (i = 0; i < strlen(isa); i++)
> this_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
>
> if (elf_hwcap)
> elf_hwcap &= this_hwcap;
> else
> elf_hwcap = this_hwcap;
> }
>
>

2019-02-09 04:28:38

by David Abdurachmanov

Subject: Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

On Sat, Feb 9, 2019 at 12:03 AM Atish Patra <[email protected]> wrote:
>
> On 2/8/19 1:11 AM, Christoph Hellwig wrote:
> >> + * We don't support running Linux on hertergenous ISA systems.
> >> + * But first "okay" processor might not be the boot cpu.
> >> + * Check the ISA of boot cpu.
> >
> > Please use up your available 80 characters per line in comments.
> >
> I will fix it.
>
> >> + /*
> >> + * All "okay" hart should have same isa. We don't know how to
> >> + * handle if they don't. Throw a warning for now.
> >> + */
> >> + if (elf_hwcap && temp_hwcap != elf_hwcap)
> >> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
> >> + elf_hwcap, temp_hwcap);
> >> +
> >> + if (hartid == boot_cpu_hartid)
> >> + boot_hwcap = temp_hwcap;
> >> + elf_hwcap = temp_hwcap;
> >
> > So we always set elf_hwcap to the capabilities of the previous cpu.
> >
> >> + temp_hwcap = 0;
> >
> > I think tmp_hwcap should be declared and initialized inside the outer loop
> > instead having to manually reset it like this.
> >
> >> + }
> >>
> >> + elf_hwcap = boot_hwcap;
> >
> > And then reset it here to the boot cpu.
> >
> > Shoudn't we only report the features supported by all cores? Otherwise
> > we'll still have problems if the boot cpu supports a feature, but not
> > others.
> >
>
> Hmm. The other side of the argument is boot cpu does have a feature that
> is not supported by other hart that didn't even boot.
> The user space may execute something based on boot cpu capability but
> that won't be enabled.
>
> At least, in this way we know that we are compatible completely with
> boot cpu capabilities. Thoughts ?

There is one example on the market: the Samsung Exynos 9810.

Mongoose 3 (big cores) only supports ARMv8.0, while Cortex-A55
(little ones) supports ARMv8.2 (and that brings atomics support).
I think it's the only ARM SoC that supports different ISA extensions
between cores on the same package.

The kernel scheduler doesn't know that the big cores are missing atomics
support or that an application needs it, and moves the thread,
resulting in an illegal instruction.

E.g., see Golang issue: https://github.com/golang/go/issues/28431

I also recall Jon Masters (Computer Architect at Red Hat) advocating
against having cores with mismatched capabilities on the server market.

It just causes more problems down the line.

david

2019-02-09 16:12:33

by Marc Zyngier

Subject: Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

On Sat, 09 Feb 2019 04:26:07 +0000,
David Abdurachmanov <[email protected]> wrote:
>
> On Sat, Feb 9, 2019 at 12:03 AM Atish Patra <[email protected]> wrote:
> >
> > On 2/8/19 1:11 AM, Christoph Hellwig wrote:
> > >> + * We don't support running Linux on hertergenous ISA systems.
> > >> + * But first "okay" processor might not be the boot cpu.
> > >> + * Check the ISA of boot cpu.
> > >
> > > Please use up your available 80 characters per line in comments.
> > >
> > I will fix it.
> >
> > >> + /*
> > >> + * All "okay" hart should have same isa. We don't know how to
> > >> + * handle if they don't. Throw a warning for now.
> > >> + */
> > >> + if (elf_hwcap && temp_hwcap != elf_hwcap)
> > >> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
> > >> + elf_hwcap, temp_hwcap);
> > >> +
> > >> + if (hartid == boot_cpu_hartid)
> > >> + boot_hwcap = temp_hwcap;
> > >> + elf_hwcap = temp_hwcap;
> > >
> > > So we always set elf_hwcap to the capabilities of the previous cpu.
> > >
> > >> + temp_hwcap = 0;
> > >
> > > I think tmp_hwcap should be declared and initialized inside the outer loop
> > > instead having to manually reset it like this.
> > >
> > >> + }
> > >>
> > >> + elf_hwcap = boot_hwcap;
> > >
> > > And then reset it here to the boot cpu.
> > >
> > > Shoudn't we only report the features supported by all cores? Otherwise
> > > we'll still have problems if the boot cpu supports a feature, but not
> > > others.
> > >
> >
> > Hmm. The other side of the argument is boot cpu does have a feature that
> > is not supported by other hart that didn't even boot.
> > The user space may execute something based on boot cpu capability but
> > that won't be enabled.
> >
> > At least, in this way we know that we are compatible completely with
> > boot cpu capabilities. Thoughts ?
>
> There is one example on the market, e.g., Samsung Exynos 9810.
>
> Mongoose 3 (big cores) only support ARMv8.0, while Cortex-A55
> (little ones) support ARMv8.2 (and that brings atomics support).
> I think, it's the only ARM SOC that supports different ISA extensions
> between cores on the same package.
>
> Kernel scheduler doesn't know that big cores are missing atomics
> support or that applications needs it and moves the thread
> resulting in illegal instruction.

Not quite. The scheduler doesn't have to know (thankfully).

The problem is that the Samsung folks tampered with the detection
logic in the kernel, and ended up advertising the LSE atomics to
userspace (despite only being available on half the cores).

If you run a mainline kernel on this thing, it will just work, as the
LSE atomics are not advertised to userspace at all.

>
> E.g., see Golang issue: https://github.com/golang/go/issues/28431
>
> I also recall Jon Masters (Computer Architect at Red Hat) advocating
> against having cores with mismatched capabilities on the server
> market.

Well, nobody recommends that, server or not. That being said, it is
possible to handle it, and the arm64 kernel has been dealing with such
things from day 1. We can have CPUs with different PMUs, implemented
page sizes, VA and PA spaces... What it takes is some work in the
kernel to sanitize it, and care in what you expose to userspace.

The thing to realise is that people will build stupid systems, no
matter how loud you shout. You can either pretend they don't exist, or
try to deal with them.

Thanks,

M.

--
Jazz is not dead, it just smell funny.

2019-02-11 13:24:17

by Andreas Schwab

Subject: Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

On Feb 07 2019, Atish Patra <[email protected]> wrote:

> + while ((node = of_find_node_by_type(node, "cpu"))) {
> + if (!node) {

That can never be true.

> + pr_warn("Unable to find \"cpu\" devicetree entry");
> + return;
> + }
> +
> + hartid = riscv_of_processor_hartid(node);
> + if (hartid < 0)
> + continue;
>
> - if (of_property_read_string(node, "riscv,isa", &isa)) {
> - pr_warning("Unable to find \"riscv,isa\" devicetree entry");
> + if (of_property_read_string(node, "riscv,isa", &isa)) {
> + pr_warn("Unable to find \"riscv,isa\" devicetree entry");
> + of_node_put(node);
> + return;
> + }
> of_node_put(node);

[ 0.000000] OF: ERROR: Bad of_node_put() on /cpus/cpu@1
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc6-00020-g5903f30f1310 #12
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffe001076812>] walk_stackframe+0x0/0xa4
[ 0.000000] [<ffffffe001076a12>] show_stack+0x2a/0x34
[ 0.000000] [<ffffffe0015cf9ea>] dump_stack+0x62/0x7c
[ 0.000000] [<ffffffe00149fed4>] of_node_release+0xbe/0xc0
[ 0.000000] [<ffffffe0015d465a>] kobject_put+0xa6/0x1e8
[ 0.000000] [<ffffffe00149f44e>] of_node_put+0x16/0x20
[ 0.000000] [<ffffffe00149b45e>] of_find_node_by_type+0x66/0xa4
[ 0.000000] [<ffffffe0010755ca>] riscv_fill_hwcap+0x14c/0x1ce
[ 0.000000] [<ffffffe0000026d4>] 0xffffffe0000026d4
[ 0.000000] [<ffffffe0000006ec>] 0xffffffe0000006ec
[ 0.000000] [<ffffffe000000076>] 0xffffffe000000076
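A note on the trace above: of_find_node_by_type() returns the next matching node with its refcount raised and drops the reference on the node passed in as "from". When it is used as a loop iterator, an of_node_put() in the loop body therefore releases each node twice. The following is a minimal userspace sketch of that contract, not kernel code; toy_node, toy_get, toy_put, toy_find_next and walk are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a refcounted device tree node (hypothetical). */
struct toy_node {
	int refcount;
	struct toy_node *next;
};

static void toy_get(struct toy_node *n)
{
	if (n)
		n->refcount++;
}

static void toy_put(struct toy_node *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/*
 * Same reference contract as the iterator in the patch: the returned
 * node carries a new reference, and the reference on "from" is dropped.
 */
static struct toy_node *toy_find_next(struct toy_node *head,
				      struct toy_node *from)
{
	struct toy_node *next = from ? from->next : head;

	toy_get(next);
	toy_put(from);
	return next;
}

/* Walk the list; with extra_put set, repeat the mistake from the patch. */
static int walk(struct toy_node *head, int extra_put)
{
	struct toy_node *n = NULL;
	int visited = 0;

	while ((n = toy_find_next(head, n))) {
		visited++;
		if (extra_put)
			toy_put(n);	/* bug: the iterator puts n again */
	}
	return visited;
}
```

In this model a clean walk leaves every refcount where it started, while the walk with the extra put leaves each node one reference short, which is the condition the "Bad of_node_put()" message reports. A put in the loop body is only needed on paths that leave the loop early.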

Andreas.

--
Andreas Schwab, SUSE Labs, [email protected]
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

2019-02-11 20:40:46

by Palmer Dabbelt

Subject: Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

On Fri, 08 Feb 2019 20:26:07 PST (-0800), [email protected] wrote:
> On Sat, Feb 9, 2019 at 12:03 AM Atish Patra <[email protected]> wrote:
>>
>> On 2/8/19 1:11 AM, Christoph Hellwig wrote:
>> >> + * We don't support running Linux on hertergenous ISA systems.
>> >> + * But first "okay" processor might not be the boot cpu.
>> >> + * Check the ISA of boot cpu.
>> >
>> > Please use up your available 80 characters per line in comments.
>> >
>> I will fix it.
>>
>> >> + /*
>> >> + * All "okay" hart should have same isa. We don't know how to
>> >> + * handle if they don't. Throw a warning for now.
>> >> + */
>> >> + if (elf_hwcap && temp_hwcap != elf_hwcap)
>> >> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
>> >> + elf_hwcap, temp_hwcap);
>> >> +
>> >> + if (hartid == boot_cpu_hartid)
>> >> + boot_hwcap = temp_hwcap;
>> >> + elf_hwcap = temp_hwcap;
>> >
>> > So we always set elf_hwcap to the capabilities of the previous cpu.
>> >
>> >> + temp_hwcap = 0;
>> >
>> > I think tmp_hwcap should be declared and initialized inside the outer loop
>> > instead having to manually reset it like this.
>> >
>> >> + }
>> >>
>> >> + elf_hwcap = boot_hwcap;
>> >
>> > And then reset it here to the boot cpu.
>> >
>> > Shoudn't we only report the features supported by all cores? Otherwise
>> > we'll still have problems if the boot cpu supports a feature, but not
>> > others.
>> >
>>
>> Hmm. The other side of the argument is boot cpu does have a feature that
>> is not supported by other hart that didn't even boot.
>> The user space may execute something based on boot cpu capability but
>> that won't be enabled.
>>
>> At least, in this way we know that we are compatible completely with
>> boot cpu capabilities. Thoughts ?
>
> There is one example on the market, e.g., Samsung Exynos 9810.
>
> Mongoose 3 (big cores) only support ARMv8.0, while Cortex-A55
> (little ones) support ARMv8.2 (and that brings atomics support).
> I think, it's the only ARM SOC that supports different ISA extensions
> between cores on the same package.
>
> Kernel scheduler doesn't know that big cores are missing atomics
> support or that applications needs it and moves the thread
> resulting in illegal instruction.
>
> E.g., see Golang issue: https://github.com/golang/go/issues/28431
>
> I also recall Jon Masters (Computer Architect at Red Hat) advocating
> against having cores with mismatched capabilities on the server market.
>
> It just causes more problems down the line.

IMO the best bet is to only put extensions in HWCAP that are supported by all
the harts that userspace will be scheduled on.
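The suggestion above amounts to intersecting the per-hart feature masks. Below is a minimal userspace sketch of that idea, assuming each hart is described by a "riscv,isa" style string and that every lowercase letter after the four-character "rv32"/"rv64" prefix maps to one bit. That mapping is a simplification made for illustration, and isa_to_hwcap and common_hwcap are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Turn an ISA string such as "rv64imafdc" into a bitmask, one bit per
 * extension letter (bit 0 = 'a'). Simplified, assumed mapping.
 */
static unsigned long isa_to_hwcap(const char *isa)
{
	unsigned long mask = 0;
	size_t len, i;

	assert(isa != NULL);
	len = strlen(isa);
	/* Skip the 4-character "rv32"/"rv64" prefix. */
	for (i = 4; i < len; i++) {
		if (isa[i] >= 'a' && isa[i] <= 'z')
			mask |= 1UL << (isa[i] - 'a');
	}
	return mask;
}

/* Keep only the extensions that every listed hart supports. */
static unsigned long common_hwcap(const char *const isa[], size_t nharts)
{
	unsigned long common = ~0UL;
	size_t i;

	if (nharts == 0)
		return 0;
	for (i = 0; i < nharts; i++)
		common &= isa_to_hwcap(isa[i]);
	return common;
}
```

With one hart reporting "rv64imafdc" and another "rv64imac", the result contains only i, m, a and c, so userspace would never be told about f or d.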

2019-02-11 20:45:26

by Atish Patra

Subject: Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

On 2/11/19 11:02 AM, Palmer Dabbelt wrote:
> On Fri, 08 Feb 2019 20:26:07 PST (-0800), [email protected] wrote:
>> On Sat, Feb 9, 2019 at 12:03 AM Atish Patra <[email protected]> wrote:
>>>
>>> On 2/8/19 1:11 AM, Christoph Hellwig wrote:
>>>>> + * We don't support running Linux on hertergenous ISA systems.
>>>>> + * But first "okay" processor might not be the boot cpu.
>>>>> + * Check the ISA of boot cpu.
>>>>
>>>> Please use up your available 80 characters per line in comments.
>>>>
>>> I will fix it.
>>>
>>>>> + /*
>>>>> + * All "okay" hart should have same isa. We don't know how to
>>>>> + * handle if they don't. Throw a warning for now.
>>>>> + */
>>>>> + if (elf_hwcap && temp_hwcap != elf_hwcap)
>>>>> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
>>>>> + elf_hwcap, temp_hwcap);
>>>>> +
>>>>> + if (hartid == boot_cpu_hartid)
>>>>> + boot_hwcap = temp_hwcap;
>>>>> + elf_hwcap = temp_hwcap;
>>>>
>>>> So we always set elf_hwcap to the capabilities of the previous cpu.
>>>>
>>>>> + temp_hwcap = 0;
>>>>
>>>> I think tmp_hwcap should be declared and initialized inside the outer loop
>>>> instead having to manually reset it like this.
>>>>
>>>>> + }
>>>>>
>>>>> + elf_hwcap = boot_hwcap;
>>>>
>>>> And then reset it here to the boot cpu.
>>>>
>>>> Shoudn't we only report the features supported by all cores? Otherwise
>>>> we'll still have problems if the boot cpu supports a feature, but not
>>>> others.
>>>>
>>>
>>> Hmm. The other side of the argument is boot cpu does have a feature that
>>> is not supported by other hart that didn't even boot.
>>> The user space may execute something based on boot cpu capability but
>>> that won't be enabled.
>>>
>>> At least, in this way we know that we are compatible completely with
>>> boot cpu capabilities. Thoughts ?
>>
>> There is one example on the market, e.g., Samsung Exynos 9810.
>>
>> Mongoose 3 (big cores) only support ARMv8.0, while Cortex-A55
>> (little ones) support ARMv8.2 (and that brings atomics support).
>> I think, it's the only ARM SOC that supports different ISA extensions
>> between cores on the same package.
>>
>> Kernel scheduler doesn't know that big cores are missing atomics
>> support or that applications needs it and moves the thread
>> resulting in illegal instruction.
>>
>> E.g., see Golang issue: https://github.com/golang/go/issues/28431
>>
>> I also recall Jon Masters (Computer Architect at Red Hat) advocating
>> against having cores with mismatched capabilities on the server market.
>>
>> It just causes more problems down the line.
>
> IMO the best bet is to only put extensions in HWCAP that are supported by all
> the harts that userspace will be scheduled on.
>
Fair enough. Instead of setting HWCAP in setup_arch() once, we can set
it only for boot cpu. It will be updated after every cpu comes up online.

Thus, HWCAP will consist of all extensions supported by all cpus that are
online currently.

Regards,
Atish

2019-02-11 22:14:31

by Marc Zyngier

Subject: Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

On Mon, 11 Feb 2019 12:03:30 -0800
Atish Patra <[email protected]> wrote:

> On 2/11/19 11:02 AM, Palmer Dabbelt wrote:
> > On Fri, 08 Feb 2019 20:26:07 PST (-0800), [email protected] wrote:
> >> On Sat, Feb 9, 2019 at 12:03 AM Atish Patra <[email protected]> wrote:
> >>>
> >>> On 2/8/19 1:11 AM, Christoph Hellwig wrote:
> >>>>> + * We don't support running Linux on hertergenous ISA systems.
> >>>>> + * But first "okay" processor might not be the boot cpu.
> >>>>> + * Check the ISA of boot cpu.
> >>>>
> >>>> Please use up your available 80 characters per line in comments.
> >>>>
> >>> I will fix it.
> >>>
> >>>>> + /*
> >>>>> + * All "okay" hart should have same isa. We don't know how to
> >>>>> + * handle if they don't. Throw a warning for now.
> >>>>> + */
> >>>>> + if (elf_hwcap && temp_hwcap != elf_hwcap)
> >>>>> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
> >>>>> + elf_hwcap, temp_hwcap);
> >>>>> +
> >>>>> + if (hartid == boot_cpu_hartid)
> >>>>> + boot_hwcap = temp_hwcap;
> >>>>> + elf_hwcap = temp_hwcap;
> >>>>
> >>>> So we always set elf_hwcap to the capabilities of the previous cpu.
> >>>>
> >>>>> + temp_hwcap = 0;
> >>>>
> >>>> I think tmp_hwcap should be declared and initialized inside the outer loop
> >>>> instead having to manually reset it like this.
> >>>>
> >>>>> + }
> >>>>>
> >>>>> + elf_hwcap = boot_hwcap;
> >>>>
> >>>> And then reset it here to the boot cpu.
> >>>>
> >>>> Shoudn't we only report the features supported by all cores? Otherwise
> >>>> we'll still have problems if the boot cpu supports a feature, but not
> >>>> others.
> >>>>
> >>>
> >>> Hmm. The other side of the argument is boot cpu does have a feature that
> >>> is not supported by other hart that didn't even boot.
> >>> The user space may execute something based on boot cpu capability but
> >>> that won't be enabled.
> >>>
> >>> At least, in this way we know that we are compatible completely with
> >>> boot cpu capabilities. Thoughts ?
> >>
> >> There is one example on the market, e.g., Samsung Exynos 9810.
> >>
> >> Mongoose 3 (big cores) only support ARMv8.0, while Cortex-A55
> >> (little ones) support ARMv8.2 (and that brings atomics support).
> >> I think, it's the only ARM SOC that supports different ISA extensions
> >> between cores on the same package.
> >>
> >> Kernel scheduler doesn't know that big cores are missing atomics
> >> support or that applications needs it and moves the thread
> >> resulting in illegal instruction.
> >>
> >> E.g., see Golang issue: https://github.com/golang/go/issues/28431
> >>
> >> I also recall Jon Masters (Computer Architect at Red Hat) advocating
> >> against having cores with mismatched capabilities on the server market.
> >>
> >> It just causes more problems down the line.
> >
> > IMO the best bet is to only put extensions in HWCAP that are supported by all
> > the harts that userspace will be scheduled on.
> >
> Fair enough. Instead of setting HWCAP in setup_arch() once, we can set
> it only for boot cpu. It will be updated after every cpu comes up online.
>
> Thus, HWCAP will consists all extensions supported by all cpus that are
> online currently.

You must thus prevent CPUs that have a different set of capabilities
from coming up late (once userspace has started).

M.
--
Without deviation from the norm, progress is not possible.
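The constraint in the message above can be pictured as a gate applied when a CPU comes online after userspace has started. This is a minimal sketch under one stated assumption: a late CPU is acceptable as long as it implements at least every feature already advertised, which is a slightly looser rule than rejecting any CPU whose set differs. late_cpu_allowed is a hypothetical helper, not an existing kernel function.

```c
#include <stdbool.h>

/*
 * advertised: feature bits userspace has already been told about.
 * cpu_caps:   feature bits of the CPU that wants to come online.
 * The CPU may join only if it lacks none of the advertised bits.
 */
static bool late_cpu_allowed(unsigned long advertised, unsigned long cpu_caps)
{
	return (cpu_caps & advertised) == advertised;
}
```

For example, with advertised bits 0x5 a CPU reporting 0x7 passes and a CPU reporting 0x4 is refused, because the latter lacks a feature that running programs may already depend on.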

2019-02-11 22:23:55

by Palmer Dabbelt

Subject: Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

On Mon, 11 Feb 2019 14:13:25 PST (-0800), [email protected] wrote:
> On Mon, 11 Feb 2019 12:03:30 -0800
> Atish Patra <[email protected]> wrote:
>
>> On 2/11/19 11:02 AM, Palmer Dabbelt wrote:
>> > On Fri, 08 Feb 2019 20:26:07 PST (-0800), [email protected] wrote:
>> >> On Sat, Feb 9, 2019 at 12:03 AM Atish Patra <[email protected]> wrote:
>> >>>
>> >>> On 2/8/19 1:11 AM, Christoph Hellwig wrote:
>> >>>>> + * We don't support running Linux on hertergenous ISA systems.
>> >>>>> + * But first "okay" processor might not be the boot cpu.
>> >>>>> + * Check the ISA of boot cpu.
>> >>>>
>> >>>> Please use up your available 80 characters per line in comments.
>> >>>>
>> >>> I will fix it.
>> >>>
>> >>>>> + /*
>> >>>>> + * All "okay" hart should have same isa. We don't know how to
>> >>>>> + * handle if they don't. Throw a warning for now.
>> >>>>> + */
>> >>>>> + if (elf_hwcap && temp_hwcap != elf_hwcap)
>> >>>>> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
>> >>>>> + elf_hwcap, temp_hwcap);
>> >>>>> +
>> >>>>> + if (hartid == boot_cpu_hartid)
>> >>>>> + boot_hwcap = temp_hwcap;
>> >>>>> + elf_hwcap = temp_hwcap;
>> >>>>
>> >>>> So we always set elf_hwcap to the capabilities of the previous cpu.
>> >>>>
>> >>>>> + temp_hwcap = 0;
>> >>>>
>> >>>> I think tmp_hwcap should be declared and initialized inside the outer loop
>> >>>> instead having to manually reset it like this.
>> >>>>
>> >>>>> + }
>> >>>>>
>> >>>>> + elf_hwcap = boot_hwcap;
>> >>>>
>> >>>> And then reset it here to the boot cpu.
>> >>>>
>> >>>> Shoudn't we only report the features supported by all cores? Otherwise
>> >>>> we'll still have problems if the boot cpu supports a feature, but not
>> >>>> others.
>> >>>>
>> >>>
>> >>> Hmm. The other side of the argument is boot cpu does have a feature that
>> >>> is not supported by other hart that didn't even boot.
>> >>> The user space may execute something based on boot cpu capability but
>> >>> that won't be enabled.
>> >>>
>> >>> At least, in this way we know that we are compatible completely with
>> >>> boot cpu capabilities. Thoughts ?
>> >>
>> >> There is one example on the market, e.g., Samsung Exynos 9810.
>> >>
>> >> Mongoose 3 (big cores) only support ARMv8.0, while Cortex-A55
>> >> (little ones) support ARMv8.2 (and that brings atomics support).
>> >> I think, it's the only ARM SOC that supports different ISA extensions
>> >> between cores on the same package.
>> >>
>> >> Kernel scheduler doesn't know that big cores are missing atomics
>> >> support or that applications needs it and moves the thread
>> >> resulting in illegal instruction.
>> >>
>> >> E.g., see Golang issue: https://github.com/golang/go/issues/28431
>> >>
>> >> I also recall Jon Masters (Computer Architect at Red Hat) advocating
>> >> against having cores with mismatched capabilities on the server market.
>> >>
>> >> It just causes more problems down the line.
>> >
>> > IMO the best bet is to only put extensions in HWCAP that are supported by all
>> > the harts that userspace will be scheduled on.
>> >
>> Fair enough. Instead of setting HWCAP in setup_arch() once, we can set
>> it only for boot cpu. It will be updated after every cpu comes up online.
>>
>> Thus, HWCAP will consists all extensions supported by all cpus that are
>> online currently.
>
> You must thus prevent CPUs that have a different set of capabilities
> from coming up late (once userspace has started).

and we have no way to do that. I'd prefer if we just looked through the entire
device tree and only showed userspace the features that are on every possible
CPU from the start. Otherwise the HWCAP will shift around during a userspace
run, which seems odd.

2019-02-11 23:25:52

by Atish Patra

Subject: Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

> On Feb 11, 2019, at 2:23 PM, Palmer Dabbelt <[email protected]> wrote:
>
> On Mon, 11 Feb 2019 14:13:25 PST (-0800), [email protected] wrote:
>> On Mon, 11 Feb 2019 12:03:30 -0800
>> Atish Patra <[email protected]> wrote:
>>
>>> On 2/11/19 11:02 AM, Palmer Dabbelt wrote:
>>> > On Fri, 08 Feb 2019 20:26:07 PST (-0800), [email protected] wrote:
>>> >> On Sat, Feb 9, 2019 at 12:03 AM Atish Patra <[email protected]> wrote:
>>> >>>
>>> >>> On 2/8/19 1:11 AM, Christoph Hellwig wrote:
>>> >>>>> + * We don't support running Linux on hertergenous ISA systems.
>>> >>>>> + * But first "okay" processor might not be the boot cpu.
>>> >>>>> + * Check the ISA of boot cpu.
>>> >>>>
>>> >>>> Please use up your available 80 characters per line in comments.
>>> >>>>
>>> >>> I will fix it.
>>> >>>
>>> >>>>> + /*
>>> >>>>> + * All "okay" hart should have same isa. We don't know how to
>>> >>>>> + * handle if they don't. Throw a warning for now.
>>> >>>>> + */
>>> >>>>> + if (elf_hwcap && temp_hwcap != elf_hwcap)
>>> >>>>> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
>>> >>>>> + elf_hwcap, temp_hwcap);
>>> >>>>> +
>>> >>>>> + if (hartid == boot_cpu_hartid)
>>> >>>>> + boot_hwcap = temp_hwcap;
>>> >>>>> + elf_hwcap = temp_hwcap;
>>> >>>>
>>> >>>> So we always set elf_hwcap to the capabilities of the previous cpu.
>>> >>>>
>>> >>>>> + temp_hwcap = 0;
>>> >>>>
>>> >>>> I think tmp_hwcap should be declared and initialized inside the outer loop
>>> >>>> instead having to manually reset it like this.
>>> >>>>
>>> >>>>> + }
>>> >>>>>
>>> >>>>> + elf_hwcap = boot_hwcap;
>>> >>>>
>>> >>>> And then reset it here to the boot cpu.
>>> >>>>
>>> >>>> Shoudn't we only report the features supported by all cores? Otherwise
>>> >>>> we'll still have problems if the boot cpu supports a feature, but not
>>> >>>> others.
>>> >>>>
>>> >>>
>>> >>> Hmm. The other side of the argument is boot cpu does have a feature that
>>> >>> is not supported by other hart that didn't even boot.
>>> >>> The user space may execute something based on boot cpu capability but
>>> >>> that won't be enabled.
>>> >>>
>>> >>> At least, in this way we know that we are compatible completely with
>>> >>> boot cpu capabilities. Thoughts ?
>>> >>
>>> >> There is one example on the market, e.g., Samsung Exynos 9810.
>>> >>
>>> >> Mongoose 3 (big cores) only support ARMv8.0, while Cortex-A55
>>> >> (little ones) support ARMv8.2 (and that brings atomics support).
>>> >> I think, it's the only ARM SOC that supports different ISA extensions
>>> >> between cores on the same package.
>>> >>
>>> >> Kernel scheduler doesn't know that big cores are missing atomics
>>> >> support or that applications needs it and moves the thread
>>> >> resulting in illegal instruction.
>>> >>
>>> >> E.g., see Golang issue: https://github.com/golang/go/issues/28431
>>> >>
>>> >> I also recall Jon Masters (Computer Architect at Red Hat) advocating
>>> >> against having cores with mismatched capabilities on the server market.
>>> >>
>>> >> It just causes more problems down the line.
>>> >
>>> > IMO the best bet is to only put extensions in HWCAP that are supported by all
>>> > the harts that userspace will be scheduled on.
>>> >
>>> Fair enough. Instead of setting HWCAP in setup_arch() once, we can set
>>> it only for boot cpu. It will be updated after every cpu comes up online.
>>>
>>> Thus, HWCAP will consists all extensions supported by all cpus that are
>>> online currently.
>>
>> You must thus prevent CPUs that have a different set of capabilities
>> from coming up late (once userspace has started).
>
> and we have no way to do that. I'd prefer if we just looked through the entire
> device tree and only showed userspace the features that are on every possible
> CPU from the start. Otherwise the HWCAP will shift around during a userspace
> run, which seems odd.

ok. I will do this for now. Once we have cpu hotplug enabled, we can revisit this.

Regards,
Atish