2019-02-12 11:28:18

by Atish Patra

Subject: [v4 PATCH 0/8] Various SMP related fixes

The existing upstream kernel doesn't boot in a non-smp configuration.
This patch series addresses various issues with non-smp configurations.

The patch series is based on 5.0-rc5 + Johan's below mentioned patch
series. Tested on both QEMU and HiFive Unleashed board using both
OpenSBI & BBL.

https://lore.kernel.org/lkml/[email protected]/

Changes from v3->v4
1. Fixed commit text length issues.
2. Updated hwcap patch to use common capabilities of all harts.
3. Rebased on Johan's patch series.

Changes from v2->v3

1. Fixed spurious white space.
2. Added lockdep for smpboot completion variable.
3. Added a sanity check for hwcap.

Changes from v1->v2

1. Moved the cpuid to hartid map to smp.c from setup.c
2. Split 3rd patch into several small patches based on
logical grouping.
3. Added a new patch that fixes an issue in hwcap query.
4. Changed the title of the patch series.

Atish Patra (8):
RISC-V: Do not wait indefinitely in __cpu_up
RISC-V: Move cpuid to hartid mapping to SMP.
RISC-V: Remove NR_CPUs check during hartid search from DT
RISC-V: Allow hartid-to-cpuid function to fail.
RISC-V: Compare cpuid with NR_CPUS before mapping.
clocksource/drivers/riscv: Add required checks during clock source
init
irqchip/irq-sifive-plic: Check and continue in case of an invalid
cpuid.
RISC-V: Assign hwcap as per comman capabilities.

arch/riscv/include/asm/smp.h | 18 ++++++++++++-----
arch/riscv/kernel/cpu.c | 4 ----
arch/riscv/kernel/cpufeature.c | 41 +++++++++++++++++++++------------------
arch/riscv/kernel/setup.c | 9 ---------
arch/riscv/kernel/smp.c | 10 +++++++++-
arch/riscv/kernel/smpboot.c | 20 ++++++++++++++++---
drivers/clocksource/timer-riscv.c | 23 +++++++++++++++++++---
drivers/irqchip/irq-sifive-plic.c | 5 +++++
8 files changed, 86 insertions(+), 44 deletions(-)

--
2.7.4



2019-02-12 11:26:33

by Atish Patra

Subject: [v4 PATCH 3/8] RISC-V: Remove NR_CPUs check during hartid search from DT

In a non-smp configuration, hartid can be higher than NR_CPUS.
riscv_of_processor_hartid should not compare hartid to NR_CPUS in
that case. Moreover, this function checks all the DT properties of a
hart node, so the NR_CPUS comparison seems out of place.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
---
arch/riscv/kernel/cpu.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index d1d9bfd5..cf2fca12 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -34,10 +34,6 @@ int riscv_of_processor_hartid(struct device_node *node)
pr_warn("Found CPU without hart ID\n");
return -ENODEV;
}
- if (hart >= NR_CPUS) {
- pr_info("Found hart ID %d, which is above NR_CPUs. Disabling this hart\n", hart);
- return -ENODEV;
- }

if (!of_device_is_available(node)) {
pr_info("CPU with hartid=%d is not available\n", hart);
--
2.7.4


2019-02-12 11:26:34

by Atish Patra

Subject: [v4 PATCH 8/8] RISC-V: Assign hwcap as per comman capabilities.

Currently, we set hwcap based on the first valid hart from DT. This may
not always be correct, as that hart might not be the currently booting
cpu or may have different capabilities.

Set hwcap as the capabilities supported by all possible harts with "okay"
status.

Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kernel/cpufeature.c | 41 ++++++++++++++++++++++-------------------
1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index e7a4701f..a1e4fb34 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -20,6 +20,7 @@
#include <linux/of.h>
#include <asm/processor.h>
#include <asm/hwcap.h>
+#include <asm/smp.h>

unsigned long elf_hwcap __read_mostly;
#ifdef CONFIG_FPU
@@ -42,28 +43,30 @@ void riscv_fill_hwcap(void)

elf_hwcap = 0;

- /*
- * We don't support running Linux on hertergenous ISA systems. For
- * now, we just check the ISA of the first "okay" processor.
- */
for_each_of_cpu_node(node) {
- if (riscv_of_processor_hartid(node) >= 0)
- break;
- }
- if (!node) {
- pr_warn("Unable to find \"cpu\" devicetree entry\n");
- return;
- }
+ unsigned long this_hwcap = 0;

- if (of_property_read_string(node, "riscv,isa", &isa)) {
- pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
- of_node_put(node);
- return;
- }
- of_node_put(node);
+ if (riscv_of_processor_hartid(node) < 0)
+ continue;

- for (i = 0; i < strlen(isa); ++i)
- elf_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
+ if (of_property_read_string(node, "riscv,isa", &isa)) {
+ pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
+ return;
+ }
+
+ for (i = 0; i < strlen(isa); ++i)
+ this_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
+
+ /*
+ * All "okay" harts should have the same isa. Set HWCAP based on
+ * the common capabilities of every "okay" hart, in case they
+ * don't.
+ */
+ if (elf_hwcap)
+ elf_hwcap &= this_hwcap;
+ else
+ elf_hwcap = this_hwcap;
+ }

/* We don't support systems with F but without D, so mask those out
* here. */
--
2.7.4
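
The intersection this patch describes can be modelled outside the kernel. The following is a minimal userspace sketch, not the kernel code: isa_to_hwcap(), common_hwcap() and the bit layout (one bit per letter of "imafdc") are assumptions made for illustration, while the real values come from isa2hwcap[] and asm/hwcap.h. The sketch tracks "first hart seen" with an explicit flag, whereas the patch uses elf_hwcap being zero for that purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit layout: bit N is the Nth letter of "imafdc". */
static unsigned long isa_to_hwcap(const char *isa)
{
	static const char known[] = "imafdc";
	unsigned long caps = 0;

	assert(isa != NULL);
	for (; *isa; ++isa) {
		/* Letters outside "imafdc" (and digits) contribute nothing. */
		const char *p = strchr(known, *isa);

		if (p)
			caps |= 1UL << (p - known);
	}
	return caps;
}

/* Return the capabilities shared by every ISA string in isa[0..n-1]. */
static unsigned long common_hwcap(const char *const isa[], size_t n)
{
	unsigned long common = 0;
	int seen = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long this_hwcap = isa_to_hwcap(isa[i]);

		/* The first hart seeds the mask, later harts narrow it. */
		common = seen ? (common & this_hwcap) : this_hwcap;
		seen = 1;
	}
	return common;
}
```

With one hart reporting "rv64imafdc" and another "rv64imac", the result keeps only the I, M, A and C bits.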


2019-02-12 11:26:40

by Atish Patra

Subject: [v4 PATCH 6/8] clocksource/drivers/riscv: Add required checks during clock source init

Currently, clocksource registration happens for an invalid cpu on
non-smp kernels. This leads to a kernel panic, as cpu hotplug
registration will fail for those cpus. Moreover, riscv_hartid_to_cpuid
can return errors now.

Do not proceed if hartid or cpuid is invalid. Take this opprtunity to
print appropriate error strings for different failure cases.

Signed-off-by: Atish Patra <[email protected]>
---
drivers/clocksource/timer-riscv.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
index 43189220..e8163693 100644
--- a/drivers/clocksource/timer-riscv.c
+++ b/drivers/clocksource/timer-riscv.c
@@ -95,13 +95,30 @@ static int __init riscv_timer_init_dt(struct device_node *n)
struct clocksource *cs;

hartid = riscv_of_processor_hartid(n);
+ if (hartid < 0) {
+ pr_warn("Not valid hartid for node [%pOF] error = [%d]\n",
+ n, hartid);
+ return hartid;
+ }
+
cpuid = riscv_hartid_to_cpuid(hartid);
+ if (cpuid < 0) {
+ pr_warn("Invalid cpuid for hartid [%d]\n", hartid);
+ return cpuid;
+ }

if (cpuid != smp_processor_id())
return 0;

+ pr_info("%s: Registering clocksource cpuid [%d] hartid [%d]\n",
+ __func__, cpuid, hartid);
cs = per_cpu_ptr(&riscv_clocksource, cpuid);
- clocksource_register_hz(cs, riscv_timebase);
+ error = clocksource_register_hz(cs, riscv_timebase);
+ if (error) {
+ pr_err("RISCV timer register failed [%d] for cpu = [%d]\n",
+ error, cpuid);
+ return error;
+ }

sched_clock_register(riscv_sched_clock,
BITS_PER_LONG, riscv_timebase);
@@ -110,8 +127,8 @@ static int __init riscv_timer_init_dt(struct device_node *n)
"clockevents/riscv/timer:starting",
riscv_timer_starting_cpu, riscv_timer_dying_cpu);
if (error)
- pr_err("RISCV timer register failed [%d] for cpu = [%d]\n",
- error, cpuid);
+ pr_err("cpu hp setup state failed for RISCV timer [%d]\n",
+ error);
return error;
}

--
2.7.4


2019-02-12 11:26:47

by Atish Patra

Subject: [v4 PATCH 5/8] RISC-V: Compare cpuid with NR_CPUS before mapping.

We should never have a cpuid greater than NR_CPUS. Compare with NR_CPUS
before creating the mapping between logical and physical CPU ids. This
is also mandatory as the NR_CPUS check is removed from
riscv_of_processor_hartid.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
arch/riscv/kernel/smpboot.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index d369b669..eb533b5c 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -66,6 +66,11 @@ void __init setup_smp(void)
found_boot_cpu = 1;
continue;
}
+ if (cpuid >= NR_CPUS) {
+ pr_warn("Invalid cpuid [%d] for hartid [%d]\n",
+ cpuid, hart);
+ break;
+ }

cpuid_to_hartid_map(cpuid) = hart;
set_cpu_possible(cpuid, true);
--
2.7.4
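
The effect of the added bounds check can be sketched as a userspace model of the mapping loop. This is an illustration under assumptions: setup_map(), the small NR_CPUS value and the flat array of hart IDs are hypothetical stand-ins for the device tree walk in setup_smp().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_CPUS 2			/* hypothetical small limit for the demo */
#define INVALID_HARTID ULONG_MAX

static unsigned long cpuid_to_hartid_map[NR_CPUS];

/*
 * Assign logical cpu ids to harts. The boot hart is always cpu 0 and the
 * remaining harts get 1, 2, ... until NR_CPUS is reached. Returns the
 * number of cpu ids that were assigned.
 */
static int setup_map(const unsigned long *harts, size_t n,
		     unsigned long boot_hart)
{
	int cpuid = 1;
	size_t i;

	assert(harts != NULL || n == 0);

	for (i = 0; i < NR_CPUS; i++)
		cpuid_to_hartid_map[i] = INVALID_HARTID;
	cpuid_to_hartid_map[0] = boot_hart;

	for (i = 0; i < n; i++) {
		if (harts[i] == boot_hart)
			continue;
		/* No slot left: stop rather than write past the array. */
		if (cpuid >= NR_CPUS)
			break;
		cpuid_to_hartid_map[cpuid++] = harts[i];
	}
	return cpuid;
}
```

With five harts, NR_CPUS set to 2 and hart 1 as the boot hart, only the boot hart and hart 0 receive a cpu id. The remaining harts are left unmapped.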


2019-02-12 11:27:24

by Atish Patra

Subject: [v4 PATCH 7/8] irqchip/irq-sifive-plic: Check and continue in case of an invalid cpuid.

riscv_hartid_to_cpuid can return invalid cpuid for a hart that is
present in DT but was never brought up.

Print the appropriate warning message and continue.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
drivers/irqchip/irq-sifive-plic.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 357e9daf..254ecd76 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -237,6 +237,11 @@ static int __init plic_init(struct device_node *node,
}

cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0) {
+ pr_warn("Invalid cpuid for context %d\n", i);
+ continue;
+ }
+
handler = per_cpu_ptr(&plic_handlers, cpu);
handler->present = true;
handler->ctxid = i;
--
2.7.4


2019-02-12 11:27:27

by Atish Patra

Subject: [v4 PATCH 1/8] RISC-V: Do not wait indefinitely in __cpu_up

In the SMP path, __cpu_up waits indefinitely for the other CPU to come
online. This is wrong, as the other CPU might be disabled in machine
mode while the possible CPUs are set to the cpus present in DT.

Introduce a completion variable and wait only for a second.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
arch/riscv/kernel/smpboot.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 6e281325..d369b669 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -39,6 +39,7 @@

void *__cpu_up_stack_pointer[NR_CPUS];
void *__cpu_up_task_pointer[NR_CPUS];
+static DECLARE_COMPLETION(cpu_running);

void __init smp_prepare_boot_cpu(void)
{
@@ -77,6 +78,7 @@ void __init setup_smp(void)

int __cpu_up(unsigned int cpu, struct task_struct *tidle)
{
+ int ret = 0;
int hartid = cpuid_to_hartid_map(cpu);
tidle->thread_info.cpu = cpu;

@@ -92,10 +94,16 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
task_stack_page(tidle) + THREAD_SIZE);
WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);

- while (!cpu_online(cpu))
- cpu_relax();
+ lockdep_assert_held(&cpu_running);
+ wait_for_completion_timeout(&cpu_running,
+ msecs_to_jiffies(1000));

- return 0;
+ if (!cpu_online(cpu)) {
+ pr_crit("CPU%u: failed to come online\n", cpu);
+ ret = -EIO;
+ }
+
+ return ret;
}

void __init smp_cpus_done(unsigned int max_cpus)
@@ -121,6 +129,7 @@ asmlinkage void __init smp_callin(void)
* a local TLB flush right now just in case.
*/
local_flush_tlb_all();
+ complete(&cpu_running);
/*
* Disable preemption before enabling interrupts, so we don't try to
* schedule a CPU that hasn't actually started yet.
--
2.7.4
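
The change from an unbounded spin to a bounded wait can be illustrated in userspace. The patch itself sleeps on a kernel completion via wait_for_completion_timeout(). The sketch below swaps in a plain polling loop with a deadline measured by the C library's clock(), purely to show the control flow. wait_online_timeout() and its parameters are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Poll *online until it becomes nonzero or roughly timeout_ms of CPU
 * time has been spent in the loop. clock() counts processor time, which
 * tracks elapsed time closely here because the loop never sleeps.
 * Returns 0 on success and -EIO on timeout.
 */
static int wait_online_timeout(const volatile int *online, long timeout_ms)
{
	const clock_t start = clock();

	assert(online != NULL);
	while (!*online) {
		double spent_ms =
			1000.0 * (double)(clock() - start) / CLOCKS_PER_SEC;

		/* Give up instead of spinning forever on a hart that never starts. */
		if (spent_ms >= (double)timeout_ms)
			return -EIO;
	}
	return 0;
}
```

A caller that sees -EIO can report the failure and carry on with the remaining CPUs, which is the behaviour the patch introduces for __cpu_up.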


2019-02-12 11:27:29

by Atish Patra

Subject: [v4 PATCH 4/8] RISC-V: Allow hartid-to-cpuid function to fail.

It is perfectly okay to call riscv_hartid_to_cpuid for a hartid that is
not mapped to a CPU id. It can happen if the calling function
retrieves the hartid from DT, but that hartid was never brought
online by the firmware or kernel for any reason.

There is no need to BUG() in the above case. A negative error return is
sufficient, and the calling function should always check the return
value.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
arch/riscv/kernel/smp.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index b69883c6..ca99f0fb 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -60,7 +60,6 @@ int riscv_hartid_to_cpuid(int hartid)
return i;

pr_err("Couldn't find cpu id for hartid [%d]\n", hartid);
- BUG();
return i;
}

--
2.7.4
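
The calling convention described in this commit message can be sketched in userspace as a lookup that fails softly plus a caller that checks the result. The visible hunk only drops BUG(); the -ENOENT and -EINVAL return values, the unsigned long parameter type, the fixed map contents and the count_usable() helper are all assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>

#define NR_CPUS 4
#define INVALID_HARTID ULONG_MAX

/* Hypothetical state: harts 1 and 2 are mapped to cpu 0 and cpu 1. */
static unsigned long cpuid_to_hartid_map[NR_CPUS] = {
	1, 2, INVALID_HARTID, INVALID_HARTID
};

/* Return the logical cpu id for hartid, or a negative errno value. */
static int hartid_to_cpuid(unsigned long hartid)
{
	int i;

	if (hartid == INVALID_HARTID)
		return -EINVAL;	/* would otherwise match an empty slot */

	for (i = 0; i < NR_CPUS; i++)
		if (cpuid_to_hartid_map[i] == hartid)
			return i;

	return -ENOENT;
}

/* Caller pattern: skip unmapped harts instead of using an error as an index. */
static int count_usable(const unsigned long *harts, int n)
{
	int i, usable = 0;

	assert(harts != NULL);
	for (i = 0; i < n; i++) {
		int cpu = hartid_to_cpuid(harts[i]);

		if (cpu < 0) {
			fprintf(stderr, "no cpuid for hartid %lu, skipping\n",
				harts[i]);
			continue;
		}
		usable++;
	}
	return usable;
}
```

The check in count_usable() has the same shape as the ones added to the timer and PLIC drivers elsewhere in this series.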


2019-02-12 11:28:06

by Atish Patra

Subject: [v4 PATCH 2/8] RISC-V: Move cpuid to hartid mapping to SMP.

Currently, the logical CPU id to physical hartid mapping is defined for
both smp and non-smp configurations. This is not required, as we need it
only for the smp configuration. The mapping function can directly return
boot_cpu_hartid for the non-smp use case.

The reverse mapping function, i.e. hartid to cpuid, can be called for any
valid but not booted hart. So it should return the default cpu 0 only if
it is given the boot hartid.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
arch/riscv/include/asm/smp.h | 18 +++++++++++++-----
arch/riscv/kernel/setup.c | 9 ---------
arch/riscv/kernel/smp.c | 9 +++++++++
3 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 41aa73b4..636a934f 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -19,16 +19,17 @@
#include <linux/thread_info.h>

#define INVALID_HARTID ULONG_MAX
+
+struct seq_file;
+extern unsigned long boot_cpu_hartid;
+
+#ifdef CONFIG_SMP
/*
* Mapping between linux logical cpu index and hartid.
*/
extern unsigned long __cpuid_to_hartid_map[NR_CPUS];
#define cpuid_to_hartid_map(cpu) __cpuid_to_hartid_map[cpu]

-struct seq_file;
-
-#ifdef CONFIG_SMP
-
/* print IPI stats */
void show_ipi_stats(struct seq_file *p, int prec);

@@ -58,7 +59,14 @@ static inline void show_ipi_stats(struct seq_file *p, int prec)

static inline int riscv_hartid_to_cpuid(int hartid)
{
- return 0;
+ if (hartid == boot_cpu_hartid)
+ return 0;
+
+ return -1;
+}
+static inline unsigned long cpuid_to_hartid_map(int cpu)
+{
+ return boot_cpu_hartid;
}

static inline void riscv_cpuid_to_hartid_mask(const struct cpumask *in,
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index fb09e013..61c81616 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -61,15 +61,6 @@ EXPORT_SYMBOL(empty_zero_page);
atomic_t hart_lottery;
unsigned long boot_cpu_hartid;

-unsigned long __cpuid_to_hartid_map[NR_CPUS] = {
- [0 ... NR_CPUS-1] = INVALID_HARTID
-};
-
-void __init smp_setup_processor_id(void)
-{
- cpuid_to_hartid_map(0) = boot_cpu_hartid;
-}
-
#ifdef CONFIG_BLK_DEV_INITRD
static void __init setup_initrd(void)
{
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 246635ea..b69883c6 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -36,6 +36,15 @@ enum ipi_message_type {
IPI_MAX
};

+unsigned long __cpuid_to_hartid_map[NR_CPUS] = {
+ [0 ... NR_CPUS-1] = INVALID_HARTID
+};
+
+void __init smp_setup_processor_id(void)
+{
+ cpuid_to_hartid_map(0) = boot_cpu_hartid;
+}
+
/* A collection of single bit ipi messages. */
static struct {
unsigned long stats[IPI_MAX] ____cacheline_aligned;
--
2.7.4


2019-02-12 11:31:21

by Johan Hovold

Subject: Re: [v4 PATCH 8/8] RISC-V: Assign hwcap as per comman capabilities.

On Tue, Feb 12, 2019 at 03:10:12AM -0800, Atish Patra wrote:
> Currently, we set hwcap based on first valid hart from DT. This may not
> be correct always as that hart might not be current booting cpu or may
> have a different capability.
>
> Set hwcap as the capabilities supported by all possible harts with "okay"
> status.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/kernel/cpufeature.c | 41 ++++++++++++++++++++++-------------------
> 1 file changed, 22 insertions(+), 19 deletions(-)
>
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index e7a4701f..a1e4fb34 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -20,6 +20,7 @@
> #include <linux/of.h>
> #include <asm/processor.h>
> #include <asm/hwcap.h>
> +#include <asm/smp.h>
>
> unsigned long elf_hwcap __read_mostly;
> #ifdef CONFIG_FPU
> @@ -42,28 +43,30 @@ void riscv_fill_hwcap(void)
>
> elf_hwcap = 0;
>
> - /*
> - * We don't support running Linux on hertergenous ISA systems. For
> - * now, we just check the ISA of the first "okay" processor.
> - */
> for_each_of_cpu_node(node) {
> - if (riscv_of_processor_hartid(node) >= 0)
> - break;
> - }
> - if (!node) {
> - pr_warn("Unable to find \"cpu\" devicetree entry\n");
> - return;
> - }
> + unsigned long this_hwcap = 0;
>
> - if (of_property_read_string(node, "riscv,isa", &isa)) {
> - pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
> - of_node_put(node);
> - return;
> - }
> - of_node_put(node);
> + if (riscv_of_processor_hartid(node) < 0)
> + continue;
>
> - for (i = 0; i < strlen(isa); ++i)
> - elf_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
> + if (of_property_read_string(node, "riscv,isa", &isa)) {
> + pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
> + return;

Did you want "continue" here to continue processing the other harts?

Note that you currently leak the device node when returning.

Johan

2019-02-12 20:10:47

by Atish Patra

Subject: Re: [v4 PATCH 8/8] RISC-V: Assign hwcap as per comman capabilities.

On 2/12/19 3:25 AM, Johan Hovold wrote:
> On Tue, Feb 12, 2019 at 03:10:12AM -0800, Atish Patra wrote:
>> Currently, we set hwcap based on first valid hart from DT. This may not
>> be correct always as that hart might not be current booting cpu or may
>> have a different capability.
>>
>> Set hwcap as the capabilities supported by all possible harts with "okay"
>> status.
>>
>> Signed-off-by: Atish Patra <[email protected]>
>> ---
>> arch/riscv/kernel/cpufeature.c | 41 ++++++++++++++++++++++-------------------
>> 1 file changed, 22 insertions(+), 19 deletions(-)
>>
>> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
>> index e7a4701f..a1e4fb34 100644
>> --- a/arch/riscv/kernel/cpufeature.c
>> +++ b/arch/riscv/kernel/cpufeature.c
>> @@ -20,6 +20,7 @@
>> #include <linux/of.h>
>> #include <asm/processor.h>
>> #include <asm/hwcap.h>
>> +#include <asm/smp.h>
>>
>> unsigned long elf_hwcap __read_mostly;
>> #ifdef CONFIG_FPU
>> @@ -42,28 +43,30 @@ void riscv_fill_hwcap(void)
>>
>> elf_hwcap = 0;
>>
>> - /*
>> - * We don't support running Linux on hertergenous ISA systems. For
>> - * now, we just check the ISA of the first "okay" processor.
>> - */
>> for_each_of_cpu_node(node) {
>> - if (riscv_of_processor_hartid(node) >= 0)
>> - break;
>> - }
>> - if (!node) {
>> - pr_warn("Unable to find \"cpu\" devicetree entry\n");
>> - return;
>> - }
>> + unsigned long this_hwcap = 0;
>>
>> - if (of_property_read_string(node, "riscv,isa", &isa)) {
>> - pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
>> - of_node_put(node);
>> - return;
>> - }
>> - of_node_put(node);
>> + if (riscv_of_processor_hartid(node) < 0)
>> + continue;
>>

>> - for (i = 0; i < strlen(isa); ++i)
>> - elf_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
>> + if (of_property_read_string(node, "riscv,isa", &isa)) {
>> + pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
>> + return;
>
> Did you want "continue" here to continue processing the other harts?
>

Hmm. If a cpu node doesn't have isa in DT, that means DT is wrong. A
"continue" here will let user space use other harts just with a warning
message?

Returning here will not set elf_hwcap which forces the user to fix the
DT. I am not sure what should be the defined behavior in this case.

Any thoughts ?
> Note that you currently leak the device node when returning.
>
Ahh yes. I will fix it if we continue to return in the error case.

Regards,
Atish
> Johan
>


2019-02-13 09:42:19

by Anup Patel

Subject: Re: [v4 PATCH 6/8] clocksource/drivers/riscv: Add required checks during clock source init

On Tue, Feb 12, 2019 at 4:40 PM Atish Patra <[email protected]> wrote:
>
> Currently, clocksource registration happens for an invalid cpu for
> non-smp kernels. This lead to kernel panic as cpu hotplug registration
> will fail for those cpus. Moreover, riscv_hartid_to_cpuid can return
> errors now.
>
> Do not proceed if hartid or cpuid is invalid. Take this opprtunity to

s/opprtunity/opportunity

Otherwise, looks good to me.

Reviewed-by: Anup Patel <[email protected]>

Regards,
Anup

2019-02-13 13:12:00

by Johan Hovold

Subject: Re: [v4 PATCH 8/8] RISC-V: Assign hwcap as per comman capabilities.

On Tue, Feb 12, 2019 at 11:58:10AM -0800, Atish Patra wrote:
> On 2/12/19 3:25 AM, Johan Hovold wrote:
> > On Tue, Feb 12, 2019 at 03:10:12AM -0800, Atish Patra wrote:
> >> Currently, we set hwcap based on first valid hart from DT. This may not
> >> be correct always as that hart might not be current booting cpu or may
> >> have a different capability.
> >>
> >> Set hwcap as the capabilities supported by all possible harts with "okay"
> >> status.
> >>
> >> Signed-off-by: Atish Patra <[email protected]>
> >> ---
> >> arch/riscv/kernel/cpufeature.c | 41 ++++++++++++++++++++++-------------------
> >> 1 file changed, 22 insertions(+), 19 deletions(-)
> >>
> >> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> >> index e7a4701f..a1e4fb34 100644
> >> --- a/arch/riscv/kernel/cpufeature.c
> >> +++ b/arch/riscv/kernel/cpufeature.c
> >> @@ -20,6 +20,7 @@
> >> #include <linux/of.h>
> >> #include <asm/processor.h>
> >> #include <asm/hwcap.h>
> >> +#include <asm/smp.h>
> >>
> >> unsigned long elf_hwcap __read_mostly;
> >> #ifdef CONFIG_FPU
> >> @@ -42,28 +43,30 @@ void riscv_fill_hwcap(void)
> >>
> >> elf_hwcap = 0;
> >>
> >> - /*
> >> - * We don't support running Linux on hertergenous ISA systems. For
> >> - * now, we just check the ISA of the first "okay" processor.
> >> - */
> >> for_each_of_cpu_node(node) {
> >> - if (riscv_of_processor_hartid(node) >= 0)
> >> - break;
> >> - }
> >> - if (!node) {
> >> - pr_warn("Unable to find \"cpu\" devicetree entry\n");
> >> - return;
> >> - }
> >> + unsigned long this_hwcap = 0;
> >>
> >> - if (of_property_read_string(node, "riscv,isa", &isa)) {
> >> - pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
> >> - of_node_put(node);
> >> - return;
> >> - }
> >> - of_node_put(node);
> >> + if (riscv_of_processor_hartid(node) < 0)
> >> + continue;
> >>
>
> >> - for (i = 0; i < strlen(isa); ++i)
> >> - elf_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
> >> + if (of_property_read_string(node, "riscv,isa", &isa)) {
> >> + pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
> >> + return;
> >
> > Did you want "continue" here to continue processing the other harts?
>
> Hmm. If a cpu node doesn't have isa in DT, that means DT is wrong. A
> "continue" here will let user space use other harts just with a warning
> message?
>
> Returning here will not set elf_hwcap which forces the user to fix the
> DT. I am not sure what should be the defined behavior in this case.
>
> Any thoughts ?

The problem is that the proposed code might still set elf_hwcap -- it
all depends on the order of the hart nodes in dt (i.e. it will only be
left unset if the first node is malformed).

For that reason, I'd say it's better to either bail out (hard or at
least with elf_hwcap unset) or to continue processing the other nodes.

The former might break current systems with malformed dt, though.

And since the harts are expected to have the same ISA, continuing the
processing while warning and ignoring the malformed node might be
acceptable.

Johan
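
The order dependence described in this reply can be reproduced with a small userspace model. This is a sketch under assumptions: fill_hwcap(), its stop_on_missing switch and the bit layout are hypothetical, and a NULL entry stands in for a cpu node without "riscv,isa". The accumulation step mirrors the "if (elf_hwcap) ... else ..." logic of the v4 patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit layout: bit N is the Nth letter of "imafdc". */
static unsigned long isa_to_hwcap(const char *isa)
{
	static const char known[] = "imafdc";
	unsigned long caps = 0;

	for (; *isa; ++isa) {
		const char *p = strchr(known, *isa);

		if (p)
			caps |= 1UL << (p - known);
	}
	return caps;
}

/*
 * Fold per-hart ISA strings into one mask. With stop_on_missing set, the
 * function returns at the first node lacking an ISA string (the v4
 * behaviour); otherwise it skips that node and keeps going.
 */
static unsigned long fill_hwcap(const char *const isa[], size_t n,
				int stop_on_missing)
{
	unsigned long hwcap = 0;
	size_t i;

	assert(isa != NULL);
	for (i = 0; i < n; i++) {
		unsigned long this_hwcap;

		if (!isa[i]) {
			if (stop_on_missing)
				return hwcap;	/* whatever was gathered so far */
			continue;
		}

		this_hwcap = isa_to_hwcap(isa[i]);
		if (hwcap)
			hwcap &= this_hwcap;
		else
			hwcap = this_hwcap;
	}
	return hwcap;
}
```

With the early return, a malformed first node leaves the mask at zero, while a malformed second node leaves whatever the first hart reported, so later harts are never intersected. Skipping the node makes the result independent of where the malformed node sits.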

2019-02-13 13:14:35

by Daniel Lezcano

Subject: Re: [v4 PATCH 6/8] clocksource/drivers/riscv: Add required checks during clock source init

On 12/02/2019 12:10, Atish Patra wrote:
> Currently, clocksource registration happens for an invalid cpu for
> non-smp kernels. This lead to kernel panic as cpu hotplug registration
> will fail for those cpus. Moreover, riscv_hartid_to_cpuid can return
> errors now.
>
> Do not proceed if hartid or cpuid is invalid. Take this opprtunity to
> print appropriate error strings for different failure cases.
>
> Signed-off-by: Atish Patra <[email protected]>

Do you want me to take it through my tree ?


> ---
> drivers/clocksource/timer-riscv.c | 23 ++++++++++++++++++++---
> 1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
> index 43189220..e8163693 100644
> --- a/drivers/clocksource/timer-riscv.c
> +++ b/drivers/clocksource/timer-riscv.c
> @@ -95,13 +95,30 @@ static int __init riscv_timer_init_dt(struct device_node *n)
> struct clocksource *cs;
>
> hartid = riscv_of_processor_hartid(n);
> + if (hartid < 0) {
> + pr_warn("Not valid hartid for node [%pOF] error = [%d]\n",
> + n, hartid);
> + return hartid;
> + }
> +
> cpuid = riscv_hartid_to_cpuid(hartid);
> + if (cpuid < 0) {
> + pr_warn("Invalid cpuid for hartid [%d]\n", hartid);
> + return cpuid;
> + }
>
> if (cpuid != smp_processor_id())
> return 0;
>
> + pr_info("%s: Registering clocksource cpuid [%d] hartid [%d]\n",
> + __func__, cpuid, hartid);
> cs = per_cpu_ptr(&riscv_clocksource, cpuid);
> - clocksource_register_hz(cs, riscv_timebase);
> + error = clocksource_register_hz(cs, riscv_timebase);
> + if (error) {
> + pr_err("RISCV timer register failed [%d] for cpu = [%d]\n",
> + error, cpuid);
> + return error;
> + }
>
> sched_clock_register(riscv_sched_clock,
> BITS_PER_LONG, riscv_timebase);
> @@ -110,8 +127,8 @@ static int __init riscv_timer_init_dt(struct device_node *n)
> "clockevents/riscv/timer:starting",
> riscv_timer_starting_cpu, riscv_timer_dying_cpu);
> if (error)
> - pr_err("RISCV timer register failed [%d] for cpu = [%d]\n",
> - error, cpuid);
> + pr_err("cpu hp setup state failed for RISCV timer [%d]\n",
> + error);
> return error;
> }
>
>




2019-02-14 08:10:56

by Atish Patra

Subject: Re: [v4 PATCH 8/8] RISC-V: Assign hwcap as per comman capabilities.

On 2/13/19 12:44 AM, Johan Hovold wrote:
> On Tue, Feb 12, 2019 at 11:58:10AM -0800, Atish Patra wrote:
>> On 2/12/19 3:25 AM, Johan Hovold wrote:
>>> On Tue, Feb 12, 2019 at 03:10:12AM -0800, Atish Patra wrote:
>>>> Currently, we set hwcap based on first valid hart from DT. This may not
>>>> be correct always as that hart might not be current booting cpu or may
>>>> have a different capability.
>>>>
>>>> Set hwcap as the capabilities supported by all possible harts with "okay"
>>>> status.
>>>>
>>>> Signed-off-by: Atish Patra <[email protected]>
>>>> ---
>>>> arch/riscv/kernel/cpufeature.c | 41 ++++++++++++++++++++++-------------------
>>>> 1 file changed, 22 insertions(+), 19 deletions(-)
>>>>
>>>> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
>>>> index e7a4701f..a1e4fb34 100644
>>>> --- a/arch/riscv/kernel/cpufeature.c
>>>> +++ b/arch/riscv/kernel/cpufeature.c
>>>> @@ -20,6 +20,7 @@
>>>> #include <linux/of.h>
>>>> #include <asm/processor.h>
>>>> #include <asm/hwcap.h>
>>>> +#include <asm/smp.h>
>>>>
>>>> unsigned long elf_hwcap __read_mostly;
>>>> #ifdef CONFIG_FPU
>>>> @@ -42,28 +43,30 @@ void riscv_fill_hwcap(void)
>>>>
>>>> elf_hwcap = 0;
>>>>
>>>> - /*
>>>> - * We don't support running Linux on hertergenous ISA systems. For
>>>> - * now, we just check the ISA of the first "okay" processor.
>>>> - */
>>>> for_each_of_cpu_node(node) {
>>>> - if (riscv_of_processor_hartid(node) >= 0)
>>>> - break;
>>>> - }
>>>> - if (!node) {
>>>> - pr_warn("Unable to find \"cpu\" devicetree entry\n");
>>>> - return;
>>>> - }
>>>> + unsigned long this_hwcap = 0;
>>>>
>>>> - if (of_property_read_string(node, "riscv,isa", &isa)) {
>>>> - pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
>>>> - of_node_put(node);
>>>> - return;
>>>> - }
>>>> - of_node_put(node);
>>>> + if (riscv_of_processor_hartid(node) < 0)
>>>> + continue;
>>>>
>>
>>>> - for (i = 0; i < strlen(isa); ++i)
>>>> - elf_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
>>>> + if (of_property_read_string(node, "riscv,isa", &isa)) {
>>>> + pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
>>>> + return;
>>>
>>> Did you want "continue" here to continue processing the other harts?
>>
>> Hmm. If a cpu node doesn't have isa in DT, that means DT is wrong. A
>> "continue" here will let user space use other harts just with a warning
>> message?
>>
>> Returning here will not set elf_hwcap which forces the user to fix the
>> DT. I am not sure what should be the defined behavior in this case.
>>
>> Any thoughts ?
>
> The problem is that the proposed code might still set elf_hwcap -- it
> all depends on the order of the hart nodes in dt (i.e. it will only be
> left unset if the first node is malformed).
>
> For that reason, I'd say it's better to either bail out (hard or at
> least with elf_hwcap unset) or to continue processing the other nodes.
>
> The former might break current systems with malformed dt, though.
>
> And since the harts are expected to have the same ISA, continuing the
> processing while warning and ignoring the malformed node might be
> acceptable.
>

ok. I will change it to continue unless somebody else has objection.

Thanks for the review.

Regards,
Atish
> Johan
>


2019-02-14 10:07:20

by Palmer Dabbelt

Subject: Re: [v4 PATCH 6/8] clocksource/drivers/riscv: Add required checks during clock source init

On Wed, 13 Feb 2019 00:48:24 PST (-0800), [email protected] wrote:
> On 12/02/2019 12:10, Atish Patra wrote:
>> Currently, clocksource registration happens for an invalid cpu for
>> non-smp kernels. This lead to kernel panic as cpu hotplug registration
>> will fail for those cpus. Moreover, riscv_hartid_to_cpuid can return
>> errors now.
>>
>> Do not proceed if hartid or cpuid is invalid. Take this opprtunity to
>> print appropriate error strings for different failure cases.
>>
>> Signed-off-by: Atish Patra <[email protected]>
>
> Do you want me to take it through my tree ?

Works for me. Aside from the typo:

Reviewed-by: Palmer Dabbelt <[email protected]>

>
>
>> ---
>> drivers/clocksource/timer-riscv.c | 23 ++++++++++++++++++++---
>> 1 file changed, 20 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
>> index 43189220..e8163693 100644
>> --- a/drivers/clocksource/timer-riscv.c
>> +++ b/drivers/clocksource/timer-riscv.c
>> @@ -95,13 +95,30 @@ static int __init riscv_timer_init_dt(struct device_node *n)
>> struct clocksource *cs;
>>
>> hartid = riscv_of_processor_hartid(n);
>> + if (hartid < 0) {
>> + pr_warn("Not valid hartid for node [%pOF] error = [%d]\n",
>> + n, hartid);
>> + return hartid;
>> + }
>> +
>> cpuid = riscv_hartid_to_cpuid(hartid);
>> + if (cpuid < 0) {
>> + pr_warn("Invalid cpuid for hartid [%d]\n", hartid);
>> + return cpuid;
>> + }
>>
>> if (cpuid != smp_processor_id())
>> return 0;
>>
>> + pr_info("%s: Registering clocksource cpuid [%d] hartid [%d]\n",
>> + __func__, cpuid, hartid);

My only comment is that we flipped back and forth on whether or not to print
info on driver initialization. I like it because I have to go add the prints
in to debug things, but IIRC we removed it at some point (probably more than
once, as I have to keep adding these sorts of things back in and forget to
remove them).

The review stands either way :)

>> cs = per_cpu_ptr(&riscv_clocksource, cpuid);
>> - clocksource_register_hz(cs, riscv_timebase);
>> + error = clocksource_register_hz(cs, riscv_timebase);
>> + if (error) {
>> + pr_err("RISCV timer register failed [%d] for cpu = [%d]\n",
>> + error, cpuid);
>> + return error;
>> + }
>>
>> sched_clock_register(riscv_sched_clock,
>> BITS_PER_LONG, riscv_timebase);
>> @@ -110,8 +127,8 @@ static int __init riscv_timer_init_dt(struct device_node *n)
>> "clockevents/riscv/timer:starting",
>> riscv_timer_starting_cpu, riscv_timer_dying_cpu);
>> if (error)
>> - pr_err("RISCV timer register failed [%d] for cpu = [%d]\n",
>> - error, cpuid);
>> + pr_err("cpu hp setup state failed for RISCV timer [%d]\n",
>> + error);
>> return error;
>> }
>>
>>

2019-02-14 10:07:59

by Palmer Dabbelt

Subject: Re: [v4 PATCH 8/8] RISC-V: Assign hwcap as per comman capabilities.

On Wed, 13 Feb 2019 00:44:42 PST (-0800), [email protected] wrote:
> On Tue, Feb 12, 2019 at 11:58:10AM -0800, Atish Patra wrote:
>> On 2/12/19 3:25 AM, Johan Hovold wrote:
>> > On Tue, Feb 12, 2019 at 03:10:12AM -0800, Atish Patra wrote:
>> >> Currently, we set hwcap based on first valid hart from DT. This may not
>> >> be correct always as that hart might not be current booting cpu or may
>> >> have a different capability.
>> >>
>> >> Set hwcap as the capabilities supported by all possible harts with "okay"
>> >> status.
>> >>
>> >> Signed-off-by: Atish Patra <[email protected]>
>> >> ---
>> >> arch/riscv/kernel/cpufeature.c | 41 ++++++++++++++++++++++-------------------
>> >> 1 file changed, 22 insertions(+), 19 deletions(-)
>> >>
>> >> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
>> >> index e7a4701f..a1e4fb34 100644
>> >> --- a/arch/riscv/kernel/cpufeature.c
>> >> +++ b/arch/riscv/kernel/cpufeature.c
>> >> @@ -20,6 +20,7 @@
>> >> #include <linux/of.h>
>> >> #include <asm/processor.h>
>> >> #include <asm/hwcap.h>
>> >> +#include <asm/smp.h>
>> >>
>> >> unsigned long elf_hwcap __read_mostly;
>> >> #ifdef CONFIG_FPU
>> >> @@ -42,28 +43,30 @@ void riscv_fill_hwcap(void)
>> >>
>> >> elf_hwcap = 0;
>> >>
>> >> - /*
>> >> - * We don't support running Linux on hertergenous ISA systems. For
>> >> - * now, we just check the ISA of the first "okay" processor.
>> >> - */
>> >> for_each_of_cpu_node(node) {
>> >> - if (riscv_of_processor_hartid(node) >= 0)
>> >> - break;
>> >> - }
>> >> - if (!node) {
>> >> - pr_warn("Unable to find \"cpu\" devicetree entry\n");
>> >> - return;
>> >> - }
>> >> + unsigned long this_hwcap = 0;
>> >>
>> >> - if (of_property_read_string(node, "riscv,isa", &isa)) {
>> >> - pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
>> >> - of_node_put(node);
>> >> - return;
>> >> - }
>> >> - of_node_put(node);
>> >> + if (riscv_of_processor_hartid(node) < 0)
>> >> + continue;
>> >>
>>
>> >> - for (i = 0; i < strlen(isa); ++i)
>> >> - elf_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
>> >> + if (of_property_read_string(node, "riscv,isa", &isa)) {
>> >> + pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
>> >> + return;
>> >
>> > Did you want "continue" here to continue processing the other harts?
>>
>> Hmm. If a cpu node doesn't have isa in DT, that means DT is wrong. A
>> "continue" here will let user space use other harts just with a warning
>> message?
>>
>> Returning here will not set elf_hwcap which forces the user to fix the
>> DT. I am not sure what should be the defined behavior in this case.
>>
>> Any thoughts ?
>
> The problem is that the proposed code might still set elf_hwcap -- it
> all depends on the order of the hart nodes in dt (i.e. it will only be
> left unset if the first node is malformed).
>
> For that reason, I'd say it's better to either bail out (hard or at
> least with elf_hwcap unset) or to continue processing the other nodes.
>
> The former might break current systems with malformed dt, though.
>
> And since the harts are expected to have the same ISA, continuing the
> processing while warning and ignoring the malformed node might be
> acceptable.

Handling malformed device trees by providing a warning and an empty HWCAP seems
like the right way to go to me.
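
To illustrate the order dependence Johan describes, here is a minimal userspace
sketch, not the kernel code. It assumes the loop merges each hart's
capabilities into elf_hwcap by intersection (the quoted hunk is cut off before
that step) and models a cpu node without "riscv,isa" as a NULL string. The
helpers isa_to_hwcap() and fill_hwcap_early_return() are hypothetical names,
and the one-bit-per-letter layout is chosen only for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the isa2hwcap[] lookup: one bit per known letter. */
static unsigned long isa_to_hwcap(const char *isa)
{
	unsigned long hwcap = 0;

	assert(isa != NULL);
	for (; *isa; ++isa)
		if (strchr("imafdc", *isa))
			hwcap |= 1UL << (*isa - 'a');
	return hwcap;
}

/* isa[k] == NULL models a cpu node whose "riscv,isa" property is missing. */
static unsigned long fill_hwcap_early_return(const char *const isa[], size_t n)
{
	unsigned long elf_hwcap = 0;
	bool have_hwcap = false;
	size_t k;

	for (k = 0; k < n; ++k) {
		unsigned long this_hwcap;

		if (!isa[k])
			return elf_hwcap; /* keeps whatever earlier nodes contributed */

		this_hwcap = isa_to_hwcap(isa[k]);
		elf_hwcap = have_hwcap ? (elf_hwcap & this_hwcap) : this_hwcap;
		have_hwcap = true;
	}
	return elf_hwcap;
}
```

With the malformed node first the result is empty; with the same node last the
result is whatever the earlier nodes provided, so the outcome depends on node
order.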

>
> Johan

2019-02-14 23:12:19

by Marc Zyngier

Subject: Re: [v4 PATCH 7/8] irqchip/irq-sifive-plic: Check and continue in case of an invalid cpuid.

On Tue, 12 Feb 2019 11:10:11 +0000,
Atish Patra <[email protected]> wrote:
>
> riscv_hartid_to_cpuid can return invalid cpuid for a hart that is
> present in DT but was never brought up.
>
> Print the appropriate warning message and continue.
>
> Signed-off-by: Atish Patra <[email protected]>
> Reviewed-by: Anup Patel <[email protected]>
> Reviewed-by: Christoph Hellwig <[email protected]>

Queued to irqchip-next.

Thanks,

M.

--
Jazz is not dead, it just smell funny.

2019-02-15 02:25:04

by Atish Patra

Subject: Re: [v4 PATCH 8/8] RISC-V: Assign hwcap as per comman capabilities.

On 2/13/19 4:38 PM, Palmer Dabbelt wrote:
> On Wed, 13 Feb 2019 00:44:42 PST (-0800), [email protected] wrote:
>> On Tue, Feb 12, 2019 at 11:58:10AM -0800, Atish Patra wrote:
>>> On 2/12/19 3:25 AM, Johan Hovold wrote:
>>>> On Tue, Feb 12, 2019 at 03:10:12AM -0800, Atish Patra wrote:
>>>>> Currently, we set hwcap based on first valid hart from DT. This may not
>>>>> be correct always as that hart might not be current booting cpu or may
>>>>> have a different capability.
>>>>>
>>>>> Set hwcap as the capabilities supported by all possible harts with "okay"
>>>>> status.
>>>>>
>>>>> Signed-off-by: Atish Patra <[email protected]>
>>>>> ---
>>>>> arch/riscv/kernel/cpufeature.c | 41 ++++++++++++++++++++++-------------------
>>>>> 1 file changed, 22 insertions(+), 19 deletions(-)
>>>>>
>>>>> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
>>>>> index e7a4701f..a1e4fb34 100644
>>>>> --- a/arch/riscv/kernel/cpufeature.c
>>>>> +++ b/arch/riscv/kernel/cpufeature.c
>>>>> @@ -20,6 +20,7 @@
>>>>> #include <linux/of.h>
>>>>> #include <asm/processor.h>
>>>>> #include <asm/hwcap.h>
>>>>> +#include <asm/smp.h>
>>>>>
>>>>> unsigned long elf_hwcap __read_mostly;
>>>>> #ifdef CONFIG_FPU
>>>>> @@ -42,28 +43,30 @@ void riscv_fill_hwcap(void)
>>>>>
>>>>> elf_hwcap = 0;
>>>>>
>>>>> - /*
>>>>> - * We don't support running Linux on hertergenous ISA systems. For
>>>>> - * now, we just check the ISA of the first "okay" processor.
>>>>> - */
>>>>> for_each_of_cpu_node(node) {
>>>>> - if (riscv_of_processor_hartid(node) >= 0)
>>>>> - break;
>>>>> - }
>>>>> - if (!node) {
>>>>> - pr_warn("Unable to find \"cpu\" devicetree entry\n");
>>>>> - return;
>>>>> - }
>>>>> + unsigned long this_hwcap = 0;
>>>>>
>>>>> - if (of_property_read_string(node, "riscv,isa", &isa)) {
>>>>> - pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
>>>>> - of_node_put(node);
>>>>> - return;
>>>>> - }
>>>>> - of_node_put(node);
>>>>> + if (riscv_of_processor_hartid(node) < 0)
>>>>> + continue;
>>>>>
>>>
>>>>> - for (i = 0; i < strlen(isa); ++i)
>>>>> - elf_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
>>>>> + if (of_property_read_string(node, "riscv,isa", &isa)) {
>>>>> + pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
>>>>> + return;
>>>>
>>>> Did you want "continue" here to continue processing the other harts?
>>>
>>> Hmm. If a cpu node doesn't have isa in DT, that means DT is wrong. A
>>> "continue" here will let user space use other harts just with a warning
>>> message?
>>>
>>> Returning here will not set elf_hwcap which forces the user to fix the
>>> DT. I am not sure what should be the defined behavior in this case.
>>>
>>> Any thoughts ?
>>
>> The problem is that the proposed code might still set elf_hwcap -- it
>> all depends on the order of the hart nodes in dt (i.e. it will only be
>> left unset if the first node is malformed).
>>
>> For that reason, I'd say it's better to either bail out (hard or at
>> least with elf_hwcap unset) or to continue processing the other nodes.
>>
>> The former might break current systems with malformed dt, though.
>>
>> And since the harts are expected to have the same ISA, continuing the
>> processing while warning and ignoring the malformed node might be
>> acceptable.
>
> Handling malformed device trees by providing a warning and an empty HWCAP seems
> like the right way to go to me.
>

If I understand you correctly, you prefer the following to be done in
case of a malformed DT:

1. Print a warning message.
2. Unset the entire HWCAP.
3. Return without processing the other harts. This will most likely result
in a panic when user space starts.

Is this correct ?
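
The three steps above can be sketched in userspace as follows. This is only an
illustration under stated assumptions, not the proposed kernel change: a NULL
string stands in for a cpu node without "riscv,isa", fprintf() stands in for
pr_warn(), and isa_to_hwcap() and fill_hwcap_bail_out() are hypothetical
helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the isa2hwcap[] lookup: one bit per known letter. */
static unsigned long isa_to_hwcap(const char *isa)
{
	unsigned long hwcap = 0;

	assert(isa != NULL);
	for (; *isa; ++isa)
		if (strchr("imafdc", *isa))
			hwcap |= 1UL << (*isa - 'a');
	return hwcap;
}

/* isa[k] == NULL models a cpu node whose "riscv,isa" property is missing. */
static unsigned long fill_hwcap_bail_out(const char *const isa[], size_t n)
{
	unsigned long elf_hwcap = 0;
	bool have_hwcap = false;
	size_t k;

	for (k = 0; k < n; ++k) {
		unsigned long this_hwcap;

		if (!isa[k]) {
			/* Step 1: warn. Steps 2 and 3: drop everything and stop. */
			fprintf(stderr, "Unable to find \"riscv,isa\" devicetree entry\n");
			return 0;
		}

		this_hwcap = isa_to_hwcap(isa[k]);
		elf_hwcap = have_hwcap ? (elf_hwcap & this_hwcap) : this_hwcap;
		have_hwcap = true;
	}
	return elf_hwcap;
}
```

Unlike a plain early return, the result here is empty no matter where the
malformed node sits in the list.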

Regards,
Atish
>>
>> Johan
>


2019-02-22 19:22:00

by Atish Patra

Subject: Re: [v4 PATCH 8/8] RISC-V: Assign hwcap as per comman capabilities.

On 2/14/19 3:49 PM, Atish Patra wrote:
> On 2/13/19 4:38 PM, Palmer Dabbelt wrote:
>> On Wed, 13 Feb 2019 00:44:42 PST (-0800), [email protected] wrote:
>>> On Tue, Feb 12, 2019 at 11:58:10AM -0800, Atish Patra wrote:
>>>> On 2/12/19 3:25 AM, Johan Hovold wrote:
>>>>> On Tue, Feb 12, 2019 at 03:10:12AM -0800, Atish Patra wrote:
>>>>>> Currently, we set hwcap based on first valid hart from DT. This may not
>>>>>> be correct always as that hart might not be current booting cpu or may
>>>>>> have a different capability.
>>>>>>
>>>>>> Set hwcap as the capabilities supported by all possible harts with "okay"
>>>>>> status.
>>>>>>
>>>>>> Signed-off-by: Atish Patra <[email protected]>
>>>>>> ---
>>>>>> arch/riscv/kernel/cpufeature.c | 41 ++++++++++++++++++++++-------------------
>>>>>> 1 file changed, 22 insertions(+), 19 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
>>>>>> index e7a4701f..a1e4fb34 100644
>>>>>> --- a/arch/riscv/kernel/cpufeature.c
>>>>>> +++ b/arch/riscv/kernel/cpufeature.c
>>>>>> @@ -20,6 +20,7 @@
>>>>>> #include <linux/of.h>
>>>>>> #include <asm/processor.h>
>>>>>> #include <asm/hwcap.h>
>>>>>> +#include <asm/smp.h>
>>>>>>
>>>>>> unsigned long elf_hwcap __read_mostly;
>>>>>> #ifdef CONFIG_FPU
>>>>>> @@ -42,28 +43,30 @@ void riscv_fill_hwcap(void)
>>>>>>
>>>>>> elf_hwcap = 0;
>>>>>>
>>>>>> - /*
>>>>>> - * We don't support running Linux on hertergenous ISA systems. For
>>>>>> - * now, we just check the ISA of the first "okay" processor.
>>>>>> - */
>>>>>> for_each_of_cpu_node(node) {
>>>>>> - if (riscv_of_processor_hartid(node) >= 0)
>>>>>> - break;
>>>>>> - }
>>>>>> - if (!node) {
>>>>>> - pr_warn("Unable to find \"cpu\" devicetree entry\n");
>>>>>> - return;
>>>>>> - }
>>>>>> + unsigned long this_hwcap = 0;
>>>>>>
>>>>>> - if (of_property_read_string(node, "riscv,isa", &isa)) {
>>>>>> - pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
>>>>>> - of_node_put(node);
>>>>>> - return;
>>>>>> - }
>>>>>> - of_node_put(node);
>>>>>> + if (riscv_of_processor_hartid(node) < 0)
>>>>>> + continue;
>>>>>>
>>>>
>>>>>> - for (i = 0; i < strlen(isa); ++i)
>>>>>> - elf_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
>>>>>> + if (of_property_read_string(node, "riscv,isa", &isa)) {
>>>>>> + pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
>>>>>> + return;
>>>>>
>>>>> Did you want "continue" here to continue processing the other harts?
>>>>
>>>> Hmm. If a cpu node doesn't have isa in DT, that means DT is wrong. A
>>>> "continue" here will let user space use other harts just with a warning
>>>> message?
>>>>
>>>> Returning here will not set elf_hwcap which forces the user to fix the
>>>> DT. I am not sure what should be the defined behavior in this case.
>>>>
>>>> Any thoughts ?
>>>
>>> The problem is that the proposed code might still set elf_hwcap -- it
>>> all depends on the order of the hart nodes in dt (i.e. it will only be
>>> left unset if the first node is malformed).
>>>
>>> For that reason, I'd say it's better to either bail out (hard or at
>>> least with elf_hwcap unset) or to continue processing the other nodes.
>>>
>>> The former might break current systems with malformed dt, though.
>>>
>>> And since the harts are expected to have the same ISA, continuing the
>>> processing while warning and ignoring the malformed node might be
>>> acceptable.
>>
>> Handling malformed device trees by providing a warning and an empty HWCAP seems
>> like the right way to go to me.
>>
>
> If I understand you correctly, you prefer the following to be done in
> case of a malformed DT:
>
> 1. Print a warning message.
> 2. Unset the entire HWCAP.
> 3. Return without processing the other harts. This will most likely result
> in a panic when user space starts.
>
> Is this correct ?
>

As per our offline discussion, the kernel should avoid setting any value
for a cpu with an incorrect DT entry and continue with the other harts.
A warning is enough. This is fine as long as user space never sees that hart.

As the hart enumeration depends on riscv_of_processor_hartid, a hart
with a corrupted isa property will never boot. riscv_of_processor_hartid
will return -ENODEV if the "riscv,isa" property is not present.

Moreover, the conditional statement under discussion will not even be
executed unless there is memory corruption or somebody corrupts the DT on
the fly.

So we can continue with the patch as it is. I will just resend the
series (dropping the driver patches) for an easy merge.
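
For illustration, the "warn and continue" behaviour described above can be
sketched in userspace like this. It is not the patch itself: a NULL string
stands in for a cpu node that is skipped (in the kernel that would be a hart
rejected by riscv_of_processor_hartid or one without "riscv,isa"), fprintf()
stands in for pr_warn(), and isa_to_hwcap() and fill_hwcap_skip_malformed()
are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the isa2hwcap[] lookup: one bit per known letter. */
static unsigned long isa_to_hwcap(const char *isa)
{
	unsigned long hwcap = 0;

	assert(isa != NULL);
	for (; *isa; ++isa)
		if (strchr("imafdc", *isa))
			hwcap |= 1UL << (*isa - 'a');
	return hwcap;
}

/* isa[k] == NULL models a hart that must not contribute to the result. */
static unsigned long fill_hwcap_skip_malformed(const char *const isa[], size_t n)
{
	unsigned long elf_hwcap = 0;
	bool have_hwcap = false;
	size_t k;

	for (k = 0; k < n; ++k) {
		unsigned long this_hwcap;

		if (!isa[k]) {
			fprintf(stderr, "Unable to find \"riscv,isa\" devicetree entry\n");
			continue; /* ignore this hart, keep processing the others */
		}

		this_hwcap = isa_to_hwcap(isa[k]);
		/* Keep only the capabilities common to every usable hart. */
		elf_hwcap = have_hwcap ? (elf_hwcap & this_hwcap) : this_hwcap;
		have_hwcap = true;
	}
	return elf_hwcap;
}
```

The result is the intersection over the usable harts and does not depend on
where the skipped node appears.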

Regards,
Atish
> Regards,
> Atish
>>>
>>> Johan
>>
>
>