2018-09-06 08:12:39

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 00/12] SMP cleanup and new features

This patch series has updated the assorted cleanup series by palmer.
The original cleanup patch series can be found here.
http://lists.infradead.org/pipermail/linux-riscv/2018-August/001232.html

It also implemented following smp related features.
Some of the work has been inspired from ARM64.

1. Decouple linux logical cpu ids from hardware cpu id
2. Support cpu hotplug.

Tested on QEMU & HighFive Unleashed board with/without SMP enabled.

Both the patch series have been combined to avoid conflicts as a lot
of common code is changed in both the series. I have mostly addressed
review comments and fixed checkpatch errors from palmer's series.

Palmer: I hope it's not a problem. I would be happy to drop the patches
if you want to take over.

v1->v2:

1. Dropped cpu_ops patch.
2. Moved back IRQ cause definitions to irq.h
3. Keep boot cpu hart id and assign zero as the cpu id for boot cpu.
4. Renamed cpu id and hart id correctly.

v2-v3:

1. Added cleanup patches from palmer.
2. Moved the hotplug related functions to it's own file.
3. Updated stub functions as per coding guidelines.
4. Renamed __cpu_logical_map to a more coherent name.

Atish Patra (5):
RISC-V: Disable preemption before enabling interrupts
RISC-V: User WRITE_ONCE instead of direct access
RISC-V: Add logical CPU indexing for RISC-V
RISC-V: Use Linux logical cpu number instead of hartid
RISC-V: Support cpu hotplug.

Palmer Dabbelt (7):
RISC-V: Don't set cacheinfo.{physical_line_partition,attributes}
RISC-V: Filter ISA and MMU values in cpuinfo
RISC-V: Comment on the TLB flush in smp_callin()
RISC-V: Provide a cleaner raw_smp_processor_id()
RISC-V: Rename riscv_of_processor_hart to riscv_of_processor_hartid
RISC-V: Rename im_okay_therefore_i_am to found_boot_cpu
RISC-V: Use mmgrab()

arch/riscv/Kconfig | 12 +++++-
arch/riscv/include/asm/irq.h | 1 +
arch/riscv/include/asm/processor.h | 2 +-
arch/riscv/include/asm/smp.h | 66 +++++++++++++++++++++++++-----
arch/riscv/include/asm/tlbflush.h | 16 ++++++--
arch/riscv/kernel/Makefile | 1 +
arch/riscv/kernel/cacheinfo.c | 7 ----
arch/riscv/kernel/cpu-hotplug.c | 72 +++++++++++++++++++++++++++++++++
arch/riscv/kernel/cpu.c | 83 ++++++++++++++++++++++++++++++++------
arch/riscv/kernel/head.S | 17 +++++++-
arch/riscv/kernel/irq.c | 24 +++++++++++
arch/riscv/kernel/setup.c | 27 ++++++++++++-
arch/riscv/kernel/smp.c | 49 +++++++++++++++-------
arch/riscv/kernel/smpboot.c | 53 ++++++++++++++++--------
arch/riscv/kernel/traps.c | 6 +--
drivers/clocksource/riscv_timer.c | 12 ++++--
drivers/irqchip/irq-sifive-plic.c | 10 +++--
17 files changed, 379 insertions(+), 79 deletions(-)
create mode 100644 arch/riscv/kernel/cpu-hotplug.c

--
2.7.4



2018-09-06 08:07:01

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 08/12] RISC-V: Use mmgrab()

From: Palmer Dabbelt <[email protected]>

commit f1f1007644ff ("mm: add new mmgrab() helper") added a
helper that we missed out on.

Signed-off-by: Palmer Dabbelt <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kernel/smpboot.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 4a232600..17e74831 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -30,6 +30,7 @@
#include <linux/irq.h>
#include <linux/of.h>
#include <linux/sched/task_stack.h>
+#include <linux/sched/mm.h>
#include <asm/irq.h>
#include <asm/mmu_context.h>
#include <asm/tlbflush.h>
@@ -101,7 +102,7 @@ asmlinkage void __init smp_callin(void)
struct mm_struct *mm = &init_mm;

/* All kernel threads share the same mm context. */
- atomic_inc(&mm->mm_count);
+ mmgrab(mm);
current->active_mm = mm;

trap_init();
--
2.7.4


2018-09-06 08:07:29

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 10/12] RISC-V: Add logical CPU indexing for RISC-V

Currently, both linux cpu id and hardware cpu id are same.
This is not recommended as it will lead to discontinuous cpu
indexing in Linux. Moreover, kdump kernel will run from CPU0
which would be absent if we follow existing scheme.

Implement a logical mapping between Linux cpu id and hardware
cpuid to decouple these two. Always mark the boot processor as
cpu0 and all other cpus get the logical cpu id based on their
booting order.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by : Anup Patel <[email protected]>
---
arch/riscv/include/asm/smp.h | 24 +++++++++++++++++++++++-
arch/riscv/kernel/setup.c | 4 ++++
arch/riscv/kernel/smp.c | 19 +++++++++++++++++++
3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 85d7619e..fce312ce 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -18,6 +18,13 @@
#include <linux/irqreturn.h>
#include <linux/thread_info.h>

+#define INVALID_HARTID ULONG_MAX
+/*
+ * Mapping between linux logical cpu index and hartid.
+ */
+extern unsigned long __cpuid_to_hardid_map[NR_CPUS];
+#define cpuid_to_hardid_map(cpu) __cpuid_to_hardid_map[cpu]
+
#ifdef CONFIG_SMP

/* SMP initialization hook for setup_arch */
@@ -29,12 +36,27 @@ void arch_send_call_function_ipi_mask(struct cpumask *mask);
/* Hook for the generic smp_call_function_single() routine. */
void arch_send_call_function_single_ipi(int cpu);

+int riscv_hartid_to_cpuid(int hartid);
+void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
+
/*
* Obtains the hart ID of the currently executing task. This relies on
* THREAD_INFO_IN_TASK, but we define that unconditionally.
*/
#define raw_smp_processor_id() (current_thread_info()->cpu)

-#endif /* CONFIG_SMP */
+#else
+
+static inline int riscv_hartid_to_cpuid(int hartid)
+{
+ return 0;
+}

+static inline void riscv_cpuid_to_hartid_mask(const struct cpumask *in,
+ struct cpumask *out)
+{
+ cpumask_set_cpu(cpuid_to_hardid_map(0), out);
+}
+
+#endif /* CONFIG_SMP */
#endif /* _ASM_RISCV_SMP_H */
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index db20dc63..eef1b1a6 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -82,6 +82,10 @@ EXPORT_SYMBOL(empty_zero_page);
/* The lucky hart to first increment this variable will boot the other cores */
atomic_t hart_lottery;

+unsigned long __cpuid_to_hardid_map[NR_CPUS] = {
+ [0 ... NR_CPUS-1] = INVALID_HARTID
+};
+
#ifdef CONFIG_BLK_DEV_INITRD
static void __init setup_initrd(void)
{
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 906fe21e..5aba0107 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -38,7 +38,26 @@ enum ipi_message_type {
IPI_MAX
};

+int riscv_hartid_to_cpuid(int hartid)
+{
+ int i = -1;
+
+ for (i = 0; i < NR_CPUS; i++)
+ if (cpuid_to_hardid_map(i) == hartid)
+ return i;
+
+ pr_err("Couldn't find cpu id for hartid [%d]\n", hartid);
+ BUG();
+ return i;
+}

+void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out)
+{
+ int cpu;
+
+ for_each_cpu(cpu, in)
+ cpumask_set_cpu(cpuid_to_hardid_map(cpu), out);
+}
/* Unsupported */
int setup_profiling_timer(unsigned int multiplier)
{
--
2.7.4


2018-09-06 08:07:45

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 04/12] RISC-V: Disable preemption before enabling interrupts

Currently, irq is enabled before preemption disabling happens.
If the scheduler fired right here and cpu is scheduled then it
may blow up.

Signed-off-by: Palmer Dabbelt <[email protected]>
[Atish: Commit text and code comment formatting update]
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kernel/smpboot.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 712e9ca8..670749ec 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -111,7 +111,11 @@ asmlinkage void __init smp_callin(void)
* a local TLB flush right now just in case.
*/
local_flush_tlb_all();
- local_irq_enable();
+ /*
+ * Disable preemption before enabling interrupts, so we don't try to
+ * schedule a CPU that hasn't actually started yet.
+ */
preempt_disable();
+ local_irq_enable();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
}
--
2.7.4


2018-09-06 08:07:59

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 11/12] RISC-V: Use Linux logical cpu number instead of hartid

Setup the cpu_logical_map during boot. Moreover, every SBI call
and PLIC context are based on the physical hartid. Use the logical
cpu to hartid mapping to pass correct hartid to respective functions.

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by : Anup Patel <[email protected]>
---
arch/riscv/include/asm/tlbflush.h | 16 +++++++++++++---
arch/riscv/kernel/cpu.c | 8 +++++---
arch/riscv/kernel/head.S | 4 +++-
arch/riscv/kernel/setup.c | 6 ++++++
arch/riscv/kernel/smp.c | 24 +++++++++++++++---------
arch/riscv/kernel/smpboot.c | 25 ++++++++++++++++---------
drivers/clocksource/riscv_timer.c | 12 ++++++++----
drivers/irqchip/irq-sifive-plic.c | 8 +++++---
8 files changed, 71 insertions(+), 32 deletions(-)

diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 85c2d8ba..54fee0ca 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -16,6 +16,7 @@
#define _ASM_RISCV_TLBFLUSH_H

#include <linux/mm_types.h>
+#include <asm/smp.h>

/*
* Flush entire local TLB. 'sfence.vma' implicitly fences with the instruction
@@ -49,13 +50,22 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,

#include <asm/sbi.h>

+static inline void remote_sfence_vma(struct cpumask *cmask, unsigned long start,
+ unsigned long size)
+{
+ struct cpumask hmask;
+
+ cpumask_clear(&hmask);
+ riscv_cpuid_to_hartid_mask(cmask, &hmask);
+ sbi_remote_sfence_vma(hmask.bits, start, size);
+}
+
#define flush_tlb_all() sbi_remote_sfence_vma(NULL, 0, -1)
#define flush_tlb_page(vma, addr) flush_tlb_range(vma, addr, 0)
#define flush_tlb_range(vma, start, end) \
- sbi_remote_sfence_vma(mm_cpumask((vma)->vm_mm)->bits, \
- start, (end) - (start))
+ remote_sfence_vma(mm_cpumask((vma)->vm_mm), start, (end) - (start))
#define flush_tlb_mm(mm) \
- sbi_remote_sfence_vma(mm_cpumask(mm)->bits, 0, -1)
+ remote_sfence_vma(mm_cpumask(mm), 0, -1)

#endif /* CONFIG_SMP */

diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 04c0bcf2..bb28f4ab 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -14,6 +14,7 @@
#include <linux/init.h>
#include <linux/seq_file.h>
#include <linux/of.h>
+#include <asm/smp.h>

/*
* Returns the hart ID of the given device tree node, or -1 if the device tree
@@ -138,11 +139,12 @@ static void c_stop(struct seq_file *m, void *v)

static int c_show(struct seq_file *m, void *v)
{
- unsigned long hart_id = (unsigned long)v - 1;
- struct device_node *node = of_get_cpu_node(hart_id, NULL);
+ unsigned long cpu_id = (unsigned long)v - 1;
+ struct device_node *node = of_get_cpu_node(cpuid_to_hardid_map(cpu_id),
+ NULL);
const char *compat, *isa, *mmu;

- seq_printf(m, "hart\t: %lu\n", hart_id);
+ seq_printf(m, "hart\t: %lu\n", cpu_id);
if (!of_property_read_string(node, "riscv,isa", &isa))
print_isa(m, isa);
if (!of_property_read_string(node, "mmu-type", &mmu))
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index c4d2c63f..711190d4 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -47,6 +47,8 @@ ENTRY(_start)
/* Save hart ID and DTB physical address */
mv s0, a0
mv s1, a1
+ la a2, boot_cpu_hartid
+ REG_S a0, (a2)

/* Initialize page tables and relocate to virtual addresses */
la sp, init_thread_union + THREAD_SIZE
@@ -55,7 +57,7 @@ ENTRY(_start)

/* Restore C environment */
la tp, init_task
- sw s0, TASK_TI_CPU(tp)
+ sw zero, TASK_TI_CPU(tp)

la sp, init_thread_union
li a0, ASM_THREAD_SIZE
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index eef1b1a6..a5fac1b7 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -81,11 +81,17 @@ EXPORT_SYMBOL(empty_zero_page);

/* The lucky hart to first increment this variable will boot the other cores */
atomic_t hart_lottery;
+unsigned long boot_cpu_hartid;

unsigned long __cpuid_to_hardid_map[NR_CPUS] = {
[0 ... NR_CPUS-1] = INVALID_HARTID
};

+void __init smp_setup_processor_id(void)
+{
+ cpuid_to_hardid_map(0) = boot_cpu_hartid;
+}
+
#ifdef CONFIG_BLK_DEV_INITRD
static void __init setup_initrd(void)
{
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 5aba0107..89d95866 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -97,14 +97,18 @@ void riscv_software_interrupt(void)
static void
send_ipi_message(const struct cpumask *to_whom, enum ipi_message_type operation)
{
- int i;
+ int cpuid, hartid;
+ struct cpumask hartid_mask;

+ cpumask_clear(&hartid_mask);
mb();
- for_each_cpu(i, to_whom)
- set_bit(operation, &ipi_data[i].bits);
-
+ for_each_cpu(cpuid, to_whom) {
+ set_bit(operation, &ipi_data[cpuid].bits);
+ hartid = cpuid_to_hardid_map(cpuid);
+ cpumask_set_cpu(hartid, &hartid_mask);
+ }
mb();
- sbi_send_ipi(cpumask_bits(to_whom));
+ sbi_send_ipi(cpumask_bits(&hartid_mask));
}

void arch_send_call_function_ipi_mask(struct cpumask *mask)
@@ -146,7 +150,7 @@ void smp_send_reschedule(int cpu)
void flush_icache_mm(struct mm_struct *mm, bool local)
{
unsigned int cpu;
- cpumask_t others, *mask;
+ cpumask_t others, hmask, *mask;

preempt_disable();

@@ -164,9 +168,11 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
*/
cpumask_andnot(&others, mm_cpumask(mm), cpumask_of(cpu));
local |= cpumask_empty(&others);
- if (mm != current->active_mm || !local)
- sbi_remote_fence_i(others.bits);
- else {
+ if (mm != current->active_mm || !local) {
+ cpumask_clear(&hmask);
+ riscv_cpuid_to_hartid_mask(&others, &hmask);
+ sbi_remote_fence_i(hmask.bits);
+ } else {
/*
* It's assumed that at least one strongly ordered operation is
* performed on this hart between setting a hart's cpumask bit
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 1e478615..f44ae780 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -53,17 +53,23 @@ void __init setup_smp(void)
struct device_node *dn = NULL;
int hart;
bool found_boot_cpu = false;
+ int cpuid = 1;

while ((dn = of_find_node_by_type(dn, "cpu"))) {
hart = riscv_of_processor_hartid(dn);
- if (hart >= 0) {
- set_cpu_possible(hart, true);
- set_cpu_present(hart, true);
- if (hart == smp_processor_id()) {
- BUG_ON(found_boot_cpu);
- found_boot_cpu = true;
- }
+ if (hart < 0)
+ continue;
+
+ if (hart == cpuid_to_hardid_map(0)) {
+ BUG_ON(found_boot_cpu);
+ found_boot_cpu = 1;
+ continue;
}
+
+ cpuid_to_hardid_map(cpuid) = hart;
+ set_cpu_possible(cpuid, true);
+ set_cpu_present(cpuid, true);
+ cpuid++;
}

BUG_ON(!found_boot_cpu);
@@ -71,6 +77,7 @@ void __init setup_smp(void)

int __cpu_up(unsigned int cpu, struct task_struct *tidle)
{
+ int hartid = cpuid_to_hardid_map(cpu);
tidle->thread_info.cpu = cpu;

/*
@@ -81,9 +88,9 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
* the spinning harts that they can continue the boot process.
*/
smp_mb();
- WRITE_ONCE(__cpu_up_stack_pointer[cpu],
+ WRITE_ONCE(__cpu_up_stack_pointer[hartid],
task_stack_page(tidle) + THREAD_SIZE);
- WRITE_ONCE(__cpu_up_task_pointer[cpu], tidle);
+ WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);

while (!cpu_online(cpu))
cpu_relax();
diff --git a/drivers/clocksource/riscv_timer.c b/drivers/clocksource/riscv_timer.c
index ad7453fc..084e97dc 100644
--- a/drivers/clocksource/riscv_timer.c
+++ b/drivers/clocksource/riscv_timer.c
@@ -8,6 +8,7 @@
#include <linux/cpu.h>
#include <linux/delay.h>
#include <linux/irq.h>
+#include <asm/smp.h>
#include <asm/sbi.h>

/*
@@ -84,13 +85,16 @@ void riscv_timer_interrupt(void)

static int __init riscv_timer_init_dt(struct device_node *n)
{
- int cpu_id = riscv_of_processor_hartid(n), error;
+ int cpuid, hartid, error;
struct clocksource *cs;

- if (cpu_id != smp_processor_id())
+ hartid = riscv_of_processor_hartid(n);
+ cpuid = riscv_hartid_to_cpuid(hartid);
+
+ if (cpuid != smp_processor_id())
return 0;

- cs = per_cpu_ptr(&riscv_clocksource, cpu_id);
+ cs = per_cpu_ptr(&riscv_clocksource, cpuid);
clocksource_register_hz(cs, riscv_timebase);

error = cpuhp_setup_state(CPUHP_AP_RISCV_TIMER_STARTING,
@@ -98,7 +102,7 @@ static int __init riscv_timer_init_dt(struct device_node *n)
riscv_timer_starting_cpu, riscv_timer_dying_cpu);
if (error)
pr_err("RISCV timer register failed [%d] for cpu = [%d]\n",
- error, cpu_id);
+ error, cpuid);
return error;
}

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index c55eaa31..357e9daf 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -15,6 +15,7 @@
#include <linux/of_irq.h>
#include <linux/platform_device.h>
#include <linux/spinlock.h>
+#include <asm/smp.h>

/*
* This driver implements a version of the RISC-V PLIC with the actual layout
@@ -218,7 +219,7 @@ static int __init plic_init(struct device_node *node,
struct of_phandle_args parent;
struct plic_handler *handler;
irq_hw_number_t hwirq;
- int cpu;
+ int cpu, hartid;

if (of_irq_parse_one(node, i, &parent)) {
pr_err("failed to parse parent for context %d.\n", i);
@@ -229,12 +230,13 @@ static int __init plic_init(struct device_node *node,
if (parent.args[0] == -1)
continue;

- cpu = plic_find_hart_id(parent.np);
- if (cpu < 0) {
+ hartid = plic_find_hart_id(parent.np);
+ if (hartid < 0) {
pr_warn("failed to parse hart ID for context %d.\n", i);
continue;
}

+ cpu = riscv_hartid_to_cpuid(hartid);
handler = per_cpu_ptr(&plic_handlers, cpu);
handler->present = true;
handler->ctxid = i;
--
2.7.4


2018-09-06 08:08:00

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 06/12] RISC-V: Rename riscv_of_processor_hart to riscv_of_processor_hartid

From: Palmer Dabbelt <[email protected]>

It's a bit confusing exactly what this function does: it actually
returns the hartid of an OF processor node, failing with -1 on invalid
nodes. I've changed the name to _hartid() in order to make that a bit
more clear, as well as adding a comment.

Signed-off-by: Palmer Dabbelt <[email protected]>
[Atish: code comment formatting update]
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/processor.h | 2 +-
arch/riscv/kernel/cpu.c | 7 +++++--
arch/riscv/kernel/smpboot.c | 2 +-
drivers/clocksource/riscv_timer.c | 2 +-
drivers/irqchip/irq-sifive-plic.c | 2 +-
5 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 3fe4af81..50de774d 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -88,7 +88,7 @@ static inline void wait_for_interrupt(void)
}

struct device_node;
-extern int riscv_of_processor_hart(struct device_node *node);
+int riscv_of_processor_hartid(struct device_node *node);

extern void riscv_fill_hwcap(void);

diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index b10cf054..04c0bcf2 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -15,8 +15,11 @@
#include <linux/seq_file.h>
#include <linux/of.h>

-/* Return -1 if not a valid hart */
-int riscv_of_processor_hart(struct device_node *node)
+/*
+ * Returns the hart ID of the given device tree node, or -1 if the device tree
+ * node isn't a RISC-V hart.
+ */
+int riscv_of_processor_hartid(struct device_node *node)
{
const char *isa, *status;
u32 hart;
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 670749ec..cfb0b02d 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -53,7 +53,7 @@ void __init setup_smp(void)
int hart, im_okay_therefore_i_am = 0;

while ((dn = of_find_node_by_type(dn, "cpu"))) {
- hart = riscv_of_processor_hart(dn);
+ hart = riscv_of_processor_hartid(dn);
if (hart >= 0) {
set_cpu_possible(hart, true);
set_cpu_present(hart, true);
diff --git a/drivers/clocksource/riscv_timer.c b/drivers/clocksource/riscv_timer.c
index 4e8b347e..ad7453fc 100644
--- a/drivers/clocksource/riscv_timer.c
+++ b/drivers/clocksource/riscv_timer.c
@@ -84,7 +84,7 @@ void riscv_timer_interrupt(void)

static int __init riscv_timer_init_dt(struct device_node *n)
{
- int cpu_id = riscv_of_processor_hart(n), error;
+ int cpu_id = riscv_of_processor_hartid(n), error;
struct clocksource *cs;

if (cpu_id != smp_processor_id())
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 532e9d68..c55eaa31 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -176,7 +176,7 @@ static int plic_find_hart_id(struct device_node *node)
{
for (; node; node = node->parent) {
if (of_device_is_compatible(node, "riscv"))
- return riscv_of_processor_hart(node);
+ return riscv_of_processor_hartid(node);
}

return -1;
--
2.7.4


2018-09-06 08:08:11

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 09/12] RISC-V: User WRITE_ONCE instead of direct access

The secondary harts spin on couple of per cpu variables until both of
these are non-zero so it's not necessary to have any ordering here.
However, WRITE_ONCE should be used to avoid tearing.

Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kernel/smpboot.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 17e74831..1e478615 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -81,8 +81,9 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
* the spinning harts that they can continue the boot process.
*/
smp_mb();
- __cpu_up_stack_pointer[cpu] = task_stack_page(tidle) + THREAD_SIZE;
- __cpu_up_task_pointer[cpu] = tidle;
+ WRITE_ONCE(__cpu_up_stack_pointer[cpu],
+ task_stack_page(tidle) + THREAD_SIZE);
+ WRITE_ONCE(__cpu_up_task_pointer[cpu], tidle);

while (!cpu_online(cpu))
cpu_relax();
--
2.7.4


2018-09-06 08:08:20

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 07/12] RISC-V: Rename im_okay_therefore_i_am to found_boot_cpu

From: Palmer Dabbelt <[email protected]>

The old name was a bit odd.

Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kernel/smpboot.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index cfb0b02d..4a232600 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -50,7 +50,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
void __init setup_smp(void)
{
struct device_node *dn = NULL;
- int hart, im_okay_therefore_i_am = 0;
+ int hart;
+ bool found_boot_cpu = false;

while ((dn = of_find_node_by_type(dn, "cpu"))) {
hart = riscv_of_processor_hartid(dn);
@@ -58,13 +59,13 @@ void __init setup_smp(void)
set_cpu_possible(hart, true);
set_cpu_present(hart, true);
if (hart == smp_processor_id()) {
- BUG_ON(im_okay_therefore_i_am);
- im_okay_therefore_i_am = 1;
+ BUG_ON(found_boot_cpu);
+ found_boot_cpu = true;
}
}
}

- BUG_ON(!im_okay_therefore_i_am);
+ BUG_ON(!found_boot_cpu);
}

int __cpu_up(unsigned int cpu, struct task_struct *tidle)
--
2.7.4


2018-09-06 08:08:50

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 02/12] RISC-V: Filter ISA and MMU values in cpuinfo

From: Palmer Dabbelt <[email protected]>

We shouldn't be directly passing device tree values to userspace, both
because there could be mistakes in device trees and because the kernel
doesn't support arbitrary ISAs.

Signed-off-by: Palmer Dabbelt <[email protected]>
[Atish: checkpatch fix and code comment formatting update]
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kernel/cpu.c | 68 ++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 61 insertions(+), 7 deletions(-)

diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index ca6c81e5..b10cf054 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -58,6 +58,63 @@ int riscv_of_processor_hart(struct device_node *node)

#ifdef CONFIG_PROC_FS

+static void print_isa(struct seq_file *f, const char *orig_isa)
+{
+ static const char *ext = "mafdc";
+ const char *isa = orig_isa;
+ const char *e;
+
+ /*
+ * Linux doesn't support rv32e or rv128i, and we only support booting
+ * kernels on harts with the same ISA that the kernel is compiled for.
+ */
+#if defined(CONFIG_32BIT)
+ if (strncmp(isa, "rv32i", 5) != 0)
+ return;
+#elif defined(CONFIG_64BIT)
+ if (strncmp(isa, "rv64i", 5) != 0)
+ return;
+#endif
+
+ /* Print the base ISA, as we already know it's legal. */
+ seq_puts(f, "isa\t: ");
+ seq_write(f, isa, 5);
+ isa += 5;
+
+ /*
+ * Check the rest of the ISA string for valid extensions, printing those
+ * we find. RISC-V ISA strings define an order, so we only print the
+ * extension bits when they're in order.
+ */
+ for (e = ext; *e != '\0'; ++e) {
+ if (isa[0] == e[0]) {
+ seq_write(f, isa, 1);
+ isa++;
+ }
+ }
+
+ /*
+ * If we were given an unsupported ISA in the device tree then print
+ * a bit of info describing what went wrong.
+ */
+ if (isa[0] != '\0')
+ pr_info("unsupported ISA \"%s\" in device tree", orig_isa);
+}
+
+static void print_mmu(struct seq_file *f, const char *mmu_type)
+{
+#if defined(CONFIG_32BIT)
+ if (strcmp(mmu_type, "riscv,sv32") != 0)
+ return;
+#elif defined(CONFIG_64BIT)
+ if ((strcmp(mmu_type, "riscv,sv39") != 0)
+ && (strcmp(mmu_type, "riscv,sv48") != 0))
+ return;
+#endif
+
+ seq_printf(f, "mmu\t: %s\n", mmu_type+6);
+}
+
static void *c_start(struct seq_file *m, loff_t *pos)
{
*pos = cpumask_next(*pos - 1, cpu_online_mask);
@@ -83,13 +140,10 @@ static int c_show(struct seq_file *m, void *v)
const char *compat, *isa, *mmu;

seq_printf(m, "hart\t: %lu\n", hart_id);
- if (!of_property_read_string(node, "riscv,isa", &isa)
- && isa[0] == 'r'
- && isa[1] == 'v')
- seq_printf(m, "isa\t: %s\n", isa);
- if (!of_property_read_string(node, "mmu-type", &mmu)
- && !strncmp(mmu, "riscv,", 6))
- seq_printf(m, "mmu\t: %s\n", mmu+6);
+ if (!of_property_read_string(node, "riscv,isa", &isa))
+ print_isa(m, isa);
+ if (!of_property_read_string(node, "mmu-type", &mmu))
+ print_mmu(m, mmu);
if (!of_property_read_string(node, "compatible", &compat)
&& strcmp(compat, "riscv"))
seq_printf(m, "uarch\t: %s\n", compat);
--
2.7.4


2018-09-06 08:08:53

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 05/12] RISC-V: Provide a cleaner raw_smp_processor_id()

From: Palmer Dabbelt <[email protected]>

I'm not sure how I managed to miss this the first time, but this is much
better.

Signed-off-by: Palmer Dabbelt <[email protected]>
[Atish: code comment formatting and other fixes]
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/smp.h | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 36016845..85d7619e 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -14,13 +14,9 @@
#ifndef _ASM_RISCV_SMP_H
#define _ASM_RISCV_SMP_H

-/* This both needs asm-offsets.h and is used when generating it. */
-#ifndef GENERATING_ASM_OFFSETS
-#include <asm/asm-offsets.h>
-#endif
-
#include <linux/cpumask.h>
#include <linux/irqreturn.h>
+#include <linux/thread_info.h>

#ifdef CONFIG_SMP

@@ -34,12 +30,10 @@ void arch_send_call_function_ipi_mask(struct cpumask *mask);
void arch_send_call_function_single_ipi(int cpu);

/*
- * This is particularly ugly: it appears we can't actually get the definition
- * of task_struct here, but we need access to the CPU this task is running on.
- * Instead of using C we're using asm-offsets.h to get the current processor
- * ID.
+ * Obtains the hart ID of the currently executing task. This relies on
+ * THREAD_INFO_IN_TASK, but we define that unconditionally.
*/
-#define raw_smp_processor_id() (*((int*)((char*)get_current() + TASK_TI_CPU)))
+#define raw_smp_processor_id() (current_thread_info()->cpu)

#endif /* CONFIG_SMP */

--
2.7.4


2018-09-06 08:09:00

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 03/12] RISC-V: Comment on the TLB flush in smp_callin()

From: Palmer Dabbelt <[email protected]>

This isn't readily apparent from reading the code.

Signed-off-by: Palmer Dabbelt <[email protected]>
[Atish: code comment formatting update]
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kernel/smpboot.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 56abab6a..712e9ca8 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -106,6 +106,10 @@ asmlinkage void __init smp_callin(void)
trap_init();
notify_cpu_starting(smp_processor_id());
set_cpu_online(smp_processor_id(), 1);
+ /*
+ * Remote TLB flushes are ignored while the CPU is offline, so emit
+ * a local TLB flush right now just in case.
+ */
local_flush_tlb_all();
local_irq_enable();
preempt_disable();
--
2.7.4


2018-09-06 08:09:46

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 12/12] RISC-V: Support cpu hotplug.

This patch enable support for cpu hotplug in RISC-V.

In absence of generic cpu stop functions, WFI is used
to put the cpu in low power state during offline. An IPI
is sent to bring it out of WFI during online operation.

Tested both on QEMU and HighFive Unleashed board with
4 cpus. Test result follows.

$ echo 0 > /sys/devices/system/cpu/cpu2/online
[ 31.828562] CPU2: shutdown
$ cat /proc/cpuinfo
hart : 0
isa : rv64imafdc
mmu : sv39
uarch : sifive,rocket0

hart : 1
isa : rv64imafdc
mmu : sv39
uarch : sifive,rocket0

hart : 3
isa : rv64imafdc
mmu : sv39
uarch : sifive,rocket0

$ echo 0 > /sys/devices/system/cpu/cpu3/online
[ 52.968495] CPU3: shutdown
$ cat /proc/cpuinfo
hart : 0
isa : rv64imafdc
mmu : sv39
uarch : sifive,rocket0

hart : 2
isa : rv64imafdc
mmu : sv39
uarch : sifive,rocket0

$ echo 1 > /sys/devices/system/cpu/cpu3/online
[ 64.298250] CPU3: online
$ cat /proc/cpuinfo
hart : 0
isa : rv64imafdc
mmu : sv39
uarch : sifive,rocket0

hart : 1
isa : rv64imafdc
mmu : sv39
uarch : sifive,rocket0

hart : 3
isa : rv64imafdc
mmu : sv39
uarch : sifive,rocket0

Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/Kconfig | 12 ++++++-
arch/riscv/include/asm/irq.h | 1 +
arch/riscv/include/asm/smp.h | 28 ++++++++++++++++
arch/riscv/kernel/Makefile | 1 +
arch/riscv/kernel/cpu-hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++
arch/riscv/kernel/head.S | 13 ++++++++
arch/riscv/kernel/irq.c | 24 ++++++++++++++
arch/riscv/kernel/setup.c | 17 +++++++++-
arch/riscv/kernel/smp.c | 8 +----
arch/riscv/kernel/smpboot.c | 7 ++--
arch/riscv/kernel/traps.c | 6 ++--
11 files changed, 175 insertions(+), 14 deletions(-)
create mode 100644 arch/riscv/kernel/cpu-hotplug.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index a3449802..ea2e617e 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -21,7 +21,6 @@ config RISCV
select COMMON_CLK
select DMA_DIRECT_OPS
select GENERIC_CLOCKEVENTS
- select GENERIC_CPU_DEVICES
select GENERIC_IRQ_SHOW
select GENERIC_PCI_IOMAP
select GENERIC_STRNCPY_FROM_USER
@@ -167,6 +166,17 @@ config SMP

If you don't know what to do here, say N.

+config HOTPLUG_CPU
+ bool "Support for hot-pluggable CPUs"
+ depends on SMP
+ select GENERIC_IRQ_MIGRATION
+ help
+
+ Say Y here to experiment with turning CPUs off and on. CPUs
+ can be controlled through /sys/devices/system/cpu.
+
+ Say N if you want to disable CPU hotplug.
+
config NR_CPUS
int "Maximum number of CPUs (2-32)"
range 2 32
diff --git a/arch/riscv/include/asm/irq.h b/arch/riscv/include/asm/irq.h
index 996b6fbe..a873a72d 100644
--- a/arch/riscv/include/asm/irq.h
+++ b/arch/riscv/include/asm/irq.h
@@ -19,6 +19,7 @@

void riscv_timer_interrupt(void);
void riscv_software_interrupt(void);
+void wait_for_software_interrupt(void);

#include <asm-generic/irq.h>

diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index fce312ce..8145b865 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -25,8 +25,29 @@
extern unsigned long __cpuid_to_hardid_map[NR_CPUS];
#define cpuid_to_hardid_map(cpu) __cpuid_to_hardid_map[cpu]

+#if defined CONFIG_SMP && defined CONFIG_HOTPLUG_CPU
+void arch_send_call_wakeup_ipi(int cpu);
+bool can_hotplug_cpu(void);
+#else
+static inline bool can_hotplug_cpu(void)
+{
+ return 0;
+}
+static inline void arch_send_call_wakeup_ipi(int cpu) { }
+#endif
+
#ifdef CONFIG_SMP

+enum ipi_message_type {
+ IPI_RESCHEDULE,
+ IPI_CALL_FUNC,
+ IPI_CALL_WAKEUP,
+ IPI_MAX
+};
+
+void send_ipi_message(const struct cpumask *to_whom,
+ enum ipi_message_type operation);
+
/* SMP initialization hook for setup_arch */
void __init setup_smp(void);

@@ -45,6 +66,13 @@ void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
*/
#define raw_smp_processor_id() (current_thread_info()->cpu)

+#ifdef CONFIG_HOTPLUG_CPU
+int __cpu_disable(void);
+void __cpu_die(unsigned int cpu);
+void cpu_play_dead(void);
+void boot_sec_cpu(void);
+#endif /* CONFIG_HOTPLUG_CPU */
+
#else

static inline int riscv_hartid_to_cpuid(int hartid)
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index e1274fc0..62043204 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_SMP) += smpboot.o
obj-$(CONFIG_SMP) += smp.o
obj-$(CONFIG_MODULES) += module.o
obj-$(CONFIG_MODULE_SECTIONS) += module-sections.o
+obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o

obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o
obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o
diff --git a/arch/riscv/kernel/cpu-hotplug.c b/arch/riscv/kernel/cpu-hotplug.c
new file mode 100644
index 00000000..7a152972
--- /dev/null
+++ b/arch/riscv/kernel/cpu-hotplug.c
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 Western Digital Corporation or its affiliates.
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <linux/err.h>
+#include <linux/irq.h>
+#include <linux/cpu.h>
+#include <linux/sched/hotplug.h>
+#include <asm/irq.h>
+#include <asm/sbi.h>
+
+bool can_hotplug_cpu(void)
+{
+ return true;
+}
+
+void arch_cpu_idle_dead(void)
+{
+ cpu_play_dead();
+}
+
+/*
+ * __cpu_disable runs on the processor to be shutdown.
+ */
+int __cpu_disable(void)
+{
+ int ret = 0;
+ unsigned int cpu = smp_processor_id();
+
+ set_cpu_online(cpu, false);
+ irq_migrate_all_off_this_cpu();
+
+ return ret;
+}
+/*
+ * called on the thread which is asking for a CPU to be shutdown -
+ * waits until shutdown has completed, or it is timed out.
+ */
+void __cpu_die(unsigned int cpu)
+{
+ if (!cpu_wait_death(cpu, 5)) {
+ pr_err("CPU %u: didn't die\n", cpu);
+ return;
+ }
+ pr_notice("CPU%u: shutdown\n", cpu);
+ /*TODO: Do we need to verify is cpu is really dead */
+}
+
+/*
+ * Called from the idle thread for the CPU which has been shutdown.
+ *
+ */
+void cpu_play_dead(void)
+{
+ idle_task_exit();
+
+ (void)cpu_report_death();
+
+ /* Do not disable software interrupt to restart cpu after WFI */
+ csr_clear(sie, SIE_STIE | SIE_SEIE);
+ wait_for_software_interrupt();
+ boot_sec_cpu();
+}
+
+void arch_send_call_wakeup_ipi(int cpu)
+{
+ send_ipi_message(cpumask_of(cpu), IPI_CALL_WAKEUP);
+}
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 711190d4..07d9fb38 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -153,6 +153,19 @@ relocate:
j .Lsecondary_park
END(_start)

+#ifdef CONFIG_SMP
+.section .text
+.global boot_sec_cpu
+
+boot_sec_cpu:
+ /* clear all pending flags */
+ csrw sip, zero
+ /* Mask all interrupts */
+ csrw sie, zero
+ fence
+
+ tail smp_callin
+#endif
__PAGE_ALIGNED_BSS
/* Empty zero page */
.balign PAGE_SIZE
diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
index 0cfac48a..7e14b0d9 100644
--- a/arch/riscv/kernel/irq.c
+++ b/arch/riscv/kernel/irq.c
@@ -53,6 +53,30 @@ asmlinkage void __irq_entry do_IRQ(struct pt_regs *regs, unsigned long cause)
set_irq_regs(old_regs);
}

+/*
+ * This function doesn't return until a software interrupt is sent via IPI.
+ * Obviously, all the interrupts except software interrupt should be disabled
+ * before this function is called.
+ */
+void wait_for_software_interrupt(void)
+{
+ unsigned long sipval, sieval, scauseval;
+
+ /* clear all pending flags */
+ csr_write(sip, 0);
+ /* clear any previous scause data */
+ csr_write(scause, 0);
+
+ do {
+ wait_for_interrupt();
+ sipval = csr_read(sip);
+ sieval = csr_read(sie);
+ scauseval = csr_read(scause) & ~INTERRUPT_CAUSE_FLAG;
+ /* only break if wfi returns for an enabled interrupt */
+ } while ((sipval & sieval) == 0 &&
+ scauseval != INTERRUPT_CAUSE_SOFTWARE);
+}
+
void __init init_IRQ(void)
{
irqchip_init();
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index a5fac1b7..612f0c21 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -30,11 +30,11 @@
#include <linux/of_platform.h>
#include <linux/sched/task.h>
#include <linux/swiotlb.h>
+#include <linux/smp.h>

#include <asm/setup.h>
#include <asm/sections.h>
#include <asm/pgtable.h>
-#include <asm/smp.h>
#include <asm/sbi.h>
#include <asm/tlbflush.h>
#include <asm/thread_info.h>
@@ -82,6 +82,7 @@ EXPORT_SYMBOL(empty_zero_page);
/* The lucky hart to first increment this variable will boot the other cores */
atomic_t hart_lottery;
unsigned long boot_cpu_hartid;
+static DEFINE_PER_CPU(struct cpu, cpu_devices);

unsigned long __cpuid_to_hardid_map[NR_CPUS] = {
[0 ... NR_CPUS-1] = INVALID_HARTID
@@ -257,3 +258,17 @@ void __init setup_arch(char **cmdline_p)
riscv_fill_hwcap();
}

+static int __init topology_init(void)
+{
+ int i;
+
+ for_each_possible_cpu(i) {
+ struct cpu *cpu = &per_cpu(cpu_devices, i);
+
+ cpu->hotpluggable = can_hotplug_cpu();
+ register_cpu(cpu, i);
+ }
+
+ return 0;
+}
+subsys_initcall(topology_init);
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 89d95866..629456bb 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -32,12 +32,6 @@ static struct {
unsigned long bits ____cacheline_aligned;
} ipi_data[NR_CPUS] __cacheline_aligned;

-enum ipi_message_type {
- IPI_RESCHEDULE,
- IPI_CALL_FUNC,
- IPI_MAX
-};
-
int riscv_hartid_to_cpuid(int hartid)
{
int i = -1;
@@ -94,7 +88,7 @@ void riscv_software_interrupt(void)
}
}

-static void
+void
send_ipi_message(const struct cpumask *to_whom, enum ipi_message_type operation)
{
int cpuid, hartid;
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index f44ae780..a9138431 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -92,9 +92,12 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
task_stack_page(tidle) + THREAD_SIZE);
WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);

+ arch_send_call_wakeup_ipi(cpu);
while (!cpu_online(cpu))
cpu_relax();

+ pr_notice("CPU%u: online\n", cpu);
+
return 0;
}

@@ -105,7 +108,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
/*
* C entry point for a secondary processor.
*/
-asmlinkage void __init smp_callin(void)
+asmlinkage void smp_callin(void)
{
struct mm_struct *mm = &init_mm;

@@ -115,7 +118,7 @@ asmlinkage void __init smp_callin(void)

trap_init();
notify_cpu_starting(smp_processor_id());
- set_cpu_online(smp_processor_id(), 1);
+ set_cpu_online(smp_processor_id(), true);
/*
* Remote TLB flushes are ignored while the CPU is offline, so emit
* a local TLB flush right now just in case.
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 24a9333d..8b331619 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -153,7 +153,7 @@ int is_valid_bugaddr(unsigned long pc)
}
#endif /* CONFIG_GENERIC_BUG */

-void __init trap_init(void)
+void trap_init(void)
{
/*
* Set sup0 scratch register to 0, indicating to exception vector
@@ -162,6 +162,6 @@ void __init trap_init(void)
csr_write(sscratch, 0);
/* Set the exception vector address */
csr_write(stvec, &handle_exception);
- /* Enable all interrupts */
- csr_write(sie, -1);
+ /* Enable all interrupts but timer interrupt*/
+ csr_set(sie, SIE_SSIE | SIE_SEIE);
}
--
2.7.4


2018-09-06 08:09:51

by Atish Patra

[permalink] [raw]
Subject: [PATCH v3 01/12] RISC-V: Don't set cacheinfo.{physical_line_partition,attributes}

From: Palmer Dabbelt <[email protected]>

These are just hard coded in the RISC-V port, which doesn't make any
sense. We should probably be setting these from device tree entries
when they exist, but for now I think it's saner to just leave them all
as their default values.

Signed-off-by: Palmer Dabbelt <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Jeremy Linton <[email protected]>
---
arch/riscv/kernel/cacheinfo.c | 7 -------
1 file changed, 7 deletions(-)

diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
index 0bc86e5f..cb35ffd8 100644
--- a/arch/riscv/kernel/cacheinfo.c
+++ b/arch/riscv/kernel/cacheinfo.c
@@ -22,13 +22,6 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
{
this_leaf->level = level;
this_leaf->type = type;
- /* not a sector cache */
- this_leaf->physical_line_partition = 1;
- /* TODO: Add to DTS */
- this_leaf->attributes =
- CACHE_WRITE_BACK
- | CACHE_READ_ALLOCATE
- | CACHE_WRITE_ALLOCATE;
}

static int __init_cache_level(unsigned int cpu)
--
2.7.4


2018-09-06 10:24:04

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH v3 12/12] RISC-V: Support cpu hotplug.

Hi,

On Thu, Sep 06, 2018 at 01:05:35AM -0700, Atish Patra wrote:
> This patch enable support for cpu hotplug in RISC-V.
>
> In absence of generic cpu stop functions, WFI is used
> to put the cpu in low power state during offline. An IPI
> is sent to bring it out of WFI during online operation.

AFAICT, this doesn't actually return the CPU to firwmare, and the WFI and
return code are in-kernel. From experience on arm and arm64 I would *very*
strongly advise against this kind of pseudo-hotplug, as it causes many more
long-term problems than it solves.

For instance, this causes significant pain with kexec, since the prior kernel's
text must be kept around forever for the offline CPUs. Likewise with hibernate
suspend/resume.

Depending on how your architecture works, there can also be a number of
problems with cache/TLB maintenance for secondary CPUs which are unexpectedly
online.

I would suggest that someone should look at writing a FW standard for CPU
hotplug, akin to PSCI for arm/arm64. At minimum, you will want:

- A mechanism to determine the version of FW and/or features supported by the FW.

- a mechanism to hotplug-in a specific CPU to a runtime-specified address,
ideally with some unique token (e.g. so you can pass a pointer to its stack
or whatever).

- a mechanism to hotplug-out the calling CPU.

- a mechanism to determine when a specific CPU has been hotplugged out. This is
necessary for kexec and other things to be robust.

Thanks,
Mark.

>
> Tested both on QEMU and HighFive Unleashed board with
> 4 cpus. Test result follows.
>
> $ echo 0 > /sys/devices/system/cpu/cpu2/online
> [ 31.828562] CPU2: shutdown
> $ cat /proc/cpuinfo
> hart : 0
> isa : rv64imafdc
> mmu : sv39
> uarch : sifive,rocket0
>
> hart : 1
> isa : rv64imafdc
> mmu : sv39
> uarch : sifive,rocket0
>
> hart : 3
> isa : rv64imafdc
> mmu : sv39
> uarch : sifive,rocket0
>
> $ echo 0 > /sys/devices/system/cpu/cpu3/online
> [ 52.968495] CPU3: shutdown
> $ cat /proc/cpuinfo
> hart : 0
> isa : rv64imafdc
> mmu : sv39
> uarch : sifive,rocket0
>
> hart : 2
> isa : rv64imafdc
> mmu : sv39
> uarch : sifive,rocket0
>
> $ echo 1 > /sys/devices/system/cpu/cpu3/online
> [ 64.298250] CPU3: online
> $ cat /proc/cpuinfo
> hart : 0
> isa : rv64imafdc
> mmu : sv39
> uarch : sifive,rocket0
>
> hart : 1
> isa : rv64imafdc
> mmu : sv39
> uarch : sifive,rocket0
>
> hart : 3
> isa : rv64imafdc
> mmu : sv39
> uarch : sifive,rocket0
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/Kconfig | 12 ++++++-
> arch/riscv/include/asm/irq.h | 1 +
> arch/riscv/include/asm/smp.h | 28 ++++++++++++++++
> arch/riscv/kernel/Makefile | 1 +
> arch/riscv/kernel/cpu-hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++
> arch/riscv/kernel/head.S | 13 ++++++++
> arch/riscv/kernel/irq.c | 24 ++++++++++++++
> arch/riscv/kernel/setup.c | 17 +++++++++-
> arch/riscv/kernel/smp.c | 8 +----
> arch/riscv/kernel/smpboot.c | 7 ++--
> arch/riscv/kernel/traps.c | 6 ++--
> 11 files changed, 175 insertions(+), 14 deletions(-)
> create mode 100644 arch/riscv/kernel/cpu-hotplug.c
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index a3449802..ea2e617e 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -21,7 +21,6 @@ config RISCV
> select COMMON_CLK
> select DMA_DIRECT_OPS
> select GENERIC_CLOCKEVENTS
> - select GENERIC_CPU_DEVICES
> select GENERIC_IRQ_SHOW
> select GENERIC_PCI_IOMAP
> select GENERIC_STRNCPY_FROM_USER
> @@ -167,6 +166,17 @@ config SMP
>
> If you don't know what to do here, say N.
>
> +config HOTPLUG_CPU
> + bool "Support for hot-pluggable CPUs"
> + depends on SMP
> + select GENERIC_IRQ_MIGRATION
> + help
> +
> + Say Y here to experiment with turning CPUs off and on. CPUs
> + can be controlled through /sys/devices/system/cpu.
> +
> + Say N if you want to disable CPU hotplug.
> +
> config NR_CPUS
> int "Maximum number of CPUs (2-32)"
> range 2 32
> diff --git a/arch/riscv/include/asm/irq.h b/arch/riscv/include/asm/irq.h
> index 996b6fbe..a873a72d 100644
> --- a/arch/riscv/include/asm/irq.h
> +++ b/arch/riscv/include/asm/irq.h
> @@ -19,6 +19,7 @@
>
> void riscv_timer_interrupt(void);
> void riscv_software_interrupt(void);
> +void wait_for_software_interrupt(void);
>
> #include <asm-generic/irq.h>
>
> diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
> index fce312ce..8145b865 100644
> --- a/arch/riscv/include/asm/smp.h
> +++ b/arch/riscv/include/asm/smp.h
> @@ -25,8 +25,29 @@
> extern unsigned long __cpuid_to_hardid_map[NR_CPUS];
> #define cpuid_to_hardid_map(cpu) __cpuid_to_hardid_map[cpu]
>
> +#if defined CONFIG_SMP && defined CONFIG_HOTPLUG_CPU
> +void arch_send_call_wakeup_ipi(int cpu);
> +bool can_hotplug_cpu(void);
> +#else
> +static inline bool can_hotplug_cpu(void)
> +{
> + return 0;
> +}
> +static inline void arch_send_call_wakeup_ipi(int cpu) { }
> +#endif
> +
> #ifdef CONFIG_SMP
>
> +enum ipi_message_type {
> + IPI_RESCHEDULE,
> + IPI_CALL_FUNC,
> + IPI_CALL_WAKEUP,
> + IPI_MAX
> +};
> +
> +void send_ipi_message(const struct cpumask *to_whom,
> + enum ipi_message_type operation);
> +
> /* SMP initialization hook for setup_arch */
> void __init setup_smp(void);
>
> @@ -45,6 +66,13 @@ void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
> */
> #define raw_smp_processor_id() (current_thread_info()->cpu)
>
> +#ifdef CONFIG_HOTPLUG_CPU
> +int __cpu_disable(void);
> +void __cpu_die(unsigned int cpu);
> +void cpu_play_dead(void);
> +void boot_sec_cpu(void);
> +#endif /* CONFIG_HOTPLUG_CPU */
> +
> #else
>
> static inline int riscv_hartid_to_cpuid(int hartid)
> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
> index e1274fc0..62043204 100644
> --- a/arch/riscv/kernel/Makefile
> +++ b/arch/riscv/kernel/Makefile
> @@ -35,6 +35,7 @@ obj-$(CONFIG_SMP) += smpboot.o
> obj-$(CONFIG_SMP) += smp.o
> obj-$(CONFIG_MODULES) += module.o
> obj-$(CONFIG_MODULE_SECTIONS) += module-sections.o
> +obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o
>
> obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o
> obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o
> diff --git a/arch/riscv/kernel/cpu-hotplug.c b/arch/riscv/kernel/cpu-hotplug.c
> new file mode 100644
> index 00000000..7a152972
> --- /dev/null
> +++ b/arch/riscv/kernel/cpu-hotplug.c
> @@ -0,0 +1,72 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2018 Western Digital Corporation or its affiliates.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/mm.h>
> +#include <linux/sched.h>
> +#include <linux/err.h>
> +#include <linux/irq.h>
> +#include <linux/cpu.h>
> +#include <linux/sched/hotplug.h>
> +#include <asm/irq.h>
> +#include <asm/sbi.h>
> +
> +bool can_hotplug_cpu(void)
> +{
> + return true;
> +}
> +
> +void arch_cpu_idle_dead(void)
> +{
> + cpu_play_dead();
> +}
> +
> +/*
> + * __cpu_disable runs on the processor to be shutdown.
> + */
> +int __cpu_disable(void)
> +{
> + int ret = 0;
> + unsigned int cpu = smp_processor_id();
> +
> + set_cpu_online(cpu, false);
> + irq_migrate_all_off_this_cpu();
> +
> + return ret;
> +}
> +/*
> + * called on the thread which is asking for a CPU to be shutdown -
> + * waits until shutdown has completed, or it is timed out.
> + */
> +void __cpu_die(unsigned int cpu)
> +{
> + if (!cpu_wait_death(cpu, 5)) {
> + pr_err("CPU %u: didn't die\n", cpu);
> + return;
> + }
> + pr_notice("CPU%u: shutdown\n", cpu);
> + /*TODO: Do we need to verify is cpu is really dead */
> +}
> +
> +/*
> + * Called from the idle thread for the CPU which has been shutdown.
> + *
> + */
> +void cpu_play_dead(void)
> +{
> + idle_task_exit();
> +
> + (void)cpu_report_death();
> +
> + /* Do not disable software interrupt to restart cpu after WFI */
> + csr_clear(sie, SIE_STIE | SIE_SEIE);
> + wait_for_software_interrupt();
> + boot_sec_cpu();
> +}
> +
> +void arch_send_call_wakeup_ipi(int cpu)
> +{
> + send_ipi_message(cpumask_of(cpu), IPI_CALL_WAKEUP);
> +}
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index 711190d4..07d9fb38 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -153,6 +153,19 @@ relocate:
> j .Lsecondary_park
> END(_start)
>
> +#ifdef CONFIG_SMP
> +.section .text
> +.global boot_sec_cpu
> +
> +boot_sec_cpu:
> + /* clear all pending flags */
> + csrw sip, zero
> + /* Mask all interrupts */
> + csrw sie, zero
> + fence
> +
> + tail smp_callin
> +#endif
> __PAGE_ALIGNED_BSS
> /* Empty zero page */
> .balign PAGE_SIZE
> diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
> index 0cfac48a..7e14b0d9 100644
> --- a/arch/riscv/kernel/irq.c
> +++ b/arch/riscv/kernel/irq.c
> @@ -53,6 +53,30 @@ asmlinkage void __irq_entry do_IRQ(struct pt_regs *regs, unsigned long cause)
> set_irq_regs(old_regs);
> }
>
> +/*
> + * This function doesn't return until a software interrupt is sent via IPI.
> + * Obviously, all the interrupts except software interrupt should be disabled
> + * before this function is called.
> + */
> +void wait_for_software_interrupt(void)
> +{
> + unsigned long sipval, sieval, scauseval;
> +
> + /* clear all pending flags */
> + csr_write(sip, 0);
> + /* clear any previous scause data */
> + csr_write(scause, 0);
> +
> + do {
> + wait_for_interrupt();
> + sipval = csr_read(sip);
> + sieval = csr_read(sie);
> + scauseval = csr_read(scause) & ~INTERRUPT_CAUSE_FLAG;
> + /* only break if wfi returns for an enabled interrupt */
> + } while ((sipval & sieval) == 0 &&
> + scauseval != INTERRUPT_CAUSE_SOFTWARE);
> +}
> +
> void __init init_IRQ(void)
> {
> irqchip_init();
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index a5fac1b7..612f0c21 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -30,11 +30,11 @@
> #include <linux/of_platform.h>
> #include <linux/sched/task.h>
> #include <linux/swiotlb.h>
> +#include <linux/smp.h>
>
> #include <asm/setup.h>
> #include <asm/sections.h>
> #include <asm/pgtable.h>
> -#include <asm/smp.h>
> #include <asm/sbi.h>
> #include <asm/tlbflush.h>
> #include <asm/thread_info.h>
> @@ -82,6 +82,7 @@ EXPORT_SYMBOL(empty_zero_page);
> /* The lucky hart to first increment this variable will boot the other cores */
> atomic_t hart_lottery;
> unsigned long boot_cpu_hartid;
> +static DEFINE_PER_CPU(struct cpu, cpu_devices);
>
> unsigned long __cpuid_to_hardid_map[NR_CPUS] = {
> [0 ... NR_CPUS-1] = INVALID_HARTID
> @@ -257,3 +258,17 @@ void __init setup_arch(char **cmdline_p)
> riscv_fill_hwcap();
> }
>
> +static int __init topology_init(void)
> +{
> + int i;
> +
> + for_each_possible_cpu(i) {
> + struct cpu *cpu = &per_cpu(cpu_devices, i);
> +
> + cpu->hotpluggable = can_hotplug_cpu();
> + register_cpu(cpu, i);
> + }
> +
> + return 0;
> +}
> +subsys_initcall(topology_init);
> diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
> index 89d95866..629456bb 100644
> --- a/arch/riscv/kernel/smp.c
> +++ b/arch/riscv/kernel/smp.c
> @@ -32,12 +32,6 @@ static struct {
> unsigned long bits ____cacheline_aligned;
> } ipi_data[NR_CPUS] __cacheline_aligned;
>
> -enum ipi_message_type {
> - IPI_RESCHEDULE,
> - IPI_CALL_FUNC,
> - IPI_MAX
> -};
> -
> int riscv_hartid_to_cpuid(int hartid)
> {
> int i = -1;
> @@ -94,7 +88,7 @@ void riscv_software_interrupt(void)
> }
> }
>
> -static void
> +void
> send_ipi_message(const struct cpumask *to_whom, enum ipi_message_type operation)
> {
> int cpuid, hartid;
> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
> index f44ae780..a9138431 100644
> --- a/arch/riscv/kernel/smpboot.c
> +++ b/arch/riscv/kernel/smpboot.c
> @@ -92,9 +92,12 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
> task_stack_page(tidle) + THREAD_SIZE);
> WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);
>
> + arch_send_call_wakeup_ipi(cpu);
> while (!cpu_online(cpu))
> cpu_relax();
>
> + pr_notice("CPU%u: online\n", cpu);
> +
> return 0;
> }
>
> @@ -105,7 +108,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
> /*
> * C entry point for a secondary processor.
> */
> -asmlinkage void __init smp_callin(void)
> +asmlinkage void smp_callin(void)
> {
> struct mm_struct *mm = &init_mm;
>
> @@ -115,7 +118,7 @@ asmlinkage void __init smp_callin(void)
>
> trap_init();
> notify_cpu_starting(smp_processor_id());
> - set_cpu_online(smp_processor_id(), 1);
> + set_cpu_online(smp_processor_id(), true);
> /*
> * Remote TLB flushes are ignored while the CPU is offline, so emit
> * a local TLB flush right now just in case.
> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> index 24a9333d..8b331619 100644
> --- a/arch/riscv/kernel/traps.c
> +++ b/arch/riscv/kernel/traps.c
> @@ -153,7 +153,7 @@ int is_valid_bugaddr(unsigned long pc)
> }
> #endif /* CONFIG_GENERIC_BUG */
>
> -void __init trap_init(void)
> +void trap_init(void)
> {
> /*
> * Set sup0 scratch register to 0, indicating to exception vector
> @@ -162,6 +162,6 @@ void __init trap_init(void)
> csr_write(sscratch, 0);
> /* Set the exception vector address */
> csr_write(stvec, &handle_exception);
> - /* Enable all interrupts */
> - csr_write(sie, -1);
> + /* Enable all interrupts but timer interrupt*/
> + csr_set(sie, SIE_SSIE | SIE_SEIE);
> }
> --
> 2.7.4
>

2018-09-06 21:12:57

by Atish Patra

[permalink] [raw]
Subject: Re: [PATCH v3 12/12] RISC-V: Support cpu hotplug.

On 9/6/18 3:21 AM, Mark Rutland wrote:
> Hi,
>
> On Thu, Sep 06, 2018 at 01:05:35AM -0700, Atish Patra wrote:
>> This patch enable support for cpu hotplug in RISC-V.
>>
>> In absence of generic cpu stop functions, WFI is used
>> to put the cpu in low power state during offline. An IPI
>> is sent to bring it out of WFI during online operation.
>
> AFAICT, this doesn't actually return the CPU to firwmare, and the WFI and
> return code are in-kernel. From experience on arm and arm64 I would *very*
> strongly advise against this kind of pseudo-hotplug, as it causes many more
> long-term problems than it solves.
>
I completely agree with you. The idea was here to have a working
cpu-hotplug solution until we have a concrete implementation of cpu
stop/start via firmware. Once we have that, we just need to replace the
wakeup and wait_for_software_interrupt calls. The current implementation
via WFI was never meant to be a long term solution.


> For instance, this causes significant pain with kexec, since the prior kernel's
> text must be kept around forever for the offline CPUs. Likewise with hibernate
> suspend/resume.
>
> Depending on how your architecture works, there can also be a number of
> problems with cache/TLB maintenance for secondary CPUs which are unexpectedly
> online.
>
> I would suggest that someone should look at writing a FW standard for CPU
> hotplug, akin to PSCI for arm/arm64. At minimum, you will want:
>

I have already started working on it in parallel. As of now, I am
planning to strengthen SBI spec by making it more flexible/extensible
with some of the below features you mentioned. But we can come up with a
new spec altogether if that makes more sense.

> - A mechanism to determine the version of FW and/or features supported by the FW.
>
> - a mechanism to hotplug-in a specific CPU to a runtime-specified address,
> ideally with some unique token (e.g. so you can pass a pointer to its stack
> or whatever).
>
> - a mechanism to hotplug-out the calling CPU.
>
> - a mechanism to determine when a specific CPU has been hotplugged out. This is
> necessary for kexec and other things to be robust.
>

Thanks for the pointers. I had not looked into 2&4 earlier. I will go
over PSCI docs again.

Regards,
Atish

> Thanks,
> Mark.
>
>>
>> Tested both on QEMU and HighFive Unleashed board with
>> 4 cpus. Test result follows.
>>
>> $ echo 0 > /sys/devices/system/cpu/cpu2/online
>> [ 31.828562] CPU2: shutdown
>> $ cat /proc/cpuinfo
>> hart : 0
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> hart : 1
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> hart : 3
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> $ echo 0 > /sys/devices/system/cpu/cpu3/online
>> [ 52.968495] CPU3: shutdown
>> $ cat /proc/cpuinfo
>> hart : 0
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> hart : 2
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> $ echo 1 > /sys/devices/system/cpu/cpu3/online
>> [ 64.298250] CPU3: online
>> $ cat /proc/cpuinfo
>> hart : 0
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> hart : 1
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> hart : 3
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> Signed-off-by: Atish Patra <[email protected]>
>> ---
>> arch/riscv/Kconfig | 12 ++++++-
>> arch/riscv/include/asm/irq.h | 1 +
>> arch/riscv/include/asm/smp.h | 28 ++++++++++++++++
>> arch/riscv/kernel/Makefile | 1 +
>> arch/riscv/kernel/cpu-hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++
>> arch/riscv/kernel/head.S | 13 ++++++++
>> arch/riscv/kernel/irq.c | 24 ++++++++++++++
>> arch/riscv/kernel/setup.c | 17 +++++++++-
>> arch/riscv/kernel/smp.c | 8 +----
>> arch/riscv/kernel/smpboot.c | 7 ++--
>> arch/riscv/kernel/traps.c | 6 ++--
>> 11 files changed, 175 insertions(+), 14 deletions(-)
>> create mode 100644 arch/riscv/kernel/cpu-hotplug.c
>>
>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>> index a3449802..ea2e617e 100644
>> --- a/arch/riscv/Kconfig
>> +++ b/arch/riscv/Kconfig
>> @@ -21,7 +21,6 @@ config RISCV
>> select COMMON_CLK
>> select DMA_DIRECT_OPS
>> select GENERIC_CLOCKEVENTS
>> - select GENERIC_CPU_DEVICES
>> select GENERIC_IRQ_SHOW
>> select GENERIC_PCI_IOMAP
>> select GENERIC_STRNCPY_FROM_USER
>> @@ -167,6 +166,17 @@ config SMP
>>
>> If you don't know what to do here, say N.
>>
>> +config HOTPLUG_CPU
>> + bool "Support for hot-pluggable CPUs"
>> + depends on SMP
>> + select GENERIC_IRQ_MIGRATION
>> + help
>> +
>> + Say Y here to experiment with turning CPUs off and on. CPUs
>> + can be controlled through /sys/devices/system/cpu.
>> +
>> + Say N if you want to disable CPU hotplug.
>> +
>> config NR_CPUS
>> int "Maximum number of CPUs (2-32)"
>> range 2 32
>> diff --git a/arch/riscv/include/asm/irq.h b/arch/riscv/include/asm/irq.h
>> index 996b6fbe..a873a72d 100644
>> --- a/arch/riscv/include/asm/irq.h
>> +++ b/arch/riscv/include/asm/irq.h
>> @@ -19,6 +19,7 @@
>>
>> void riscv_timer_interrupt(void);
>> void riscv_software_interrupt(void);
>> +void wait_for_software_interrupt(void);
>>
>> #include <asm-generic/irq.h>
>>
>> diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
>> index fce312ce..8145b865 100644
>> --- a/arch/riscv/include/asm/smp.h
>> +++ b/arch/riscv/include/asm/smp.h
>> @@ -25,8 +25,29 @@
>> extern unsigned long __cpuid_to_hardid_map[NR_CPUS];
>> #define cpuid_to_hardid_map(cpu) __cpuid_to_hardid_map[cpu]
>>
>> +#if defined CONFIG_SMP && defined CONFIG_HOTPLUG_CPU
>> +void arch_send_call_wakeup_ipi(int cpu);
>> +bool can_hotplug_cpu(void);
>> +#else
>> +static inline bool can_hotplug_cpu(void)
>> +{
>> + return 0;
>> +}
>> +static inline void arch_send_call_wakeup_ipi(int cpu) { }
>> +#endif
>> +
>> #ifdef CONFIG_SMP
>>
>> +enum ipi_message_type {
>> + IPI_RESCHEDULE,
>> + IPI_CALL_FUNC,
>> + IPI_CALL_WAKEUP,
>> + IPI_MAX
>> +};
>> +
>> +void send_ipi_message(const struct cpumask *to_whom,
>> + enum ipi_message_type operation);
>> +
>> /* SMP initialization hook for setup_arch */
>> void __init setup_smp(void);
>>
>> @@ -45,6 +66,13 @@ void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
>> */
>> #define raw_smp_processor_id() (current_thread_info()->cpu)
>>
>> +#ifdef CONFIG_HOTPLUG_CPU
>> +int __cpu_disable(void);
>> +void __cpu_die(unsigned int cpu);
>> +void cpu_play_dead(void);
>> +void boot_sec_cpu(void);
>> +#endif /* CONFIG_HOTPLUG_CPU */
>> +
>> #else
>>
>> static inline int riscv_hartid_to_cpuid(int hartid)
>> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
>> index e1274fc0..62043204 100644
>> --- a/arch/riscv/kernel/Makefile
>> +++ b/arch/riscv/kernel/Makefile
>> @@ -35,6 +35,7 @@ obj-$(CONFIG_SMP) += smpboot.o
>> obj-$(CONFIG_SMP) += smp.o
>> obj-$(CONFIG_MODULES) += module.o
>> obj-$(CONFIG_MODULE_SECTIONS) += module-sections.o
>> +obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o
>>
>> obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o
>> obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o
>> diff --git a/arch/riscv/kernel/cpu-hotplug.c b/arch/riscv/kernel/cpu-hotplug.c
>> new file mode 100644
>> index 00000000..7a152972
>> --- /dev/null
>> +++ b/arch/riscv/kernel/cpu-hotplug.c
>> @@ -0,0 +1,72 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (C) 2018 Western Digital Corporation or its affiliates.
>> + */
>> +
>> +#include <linux/kernel.h>
>> +#include <linux/mm.h>
>> +#include <linux/sched.h>
>> +#include <linux/err.h>
>> +#include <linux/irq.h>
>> +#include <linux/cpu.h>
>> +#include <linux/sched/hotplug.h>
>> +#include <asm/irq.h>
>> +#include <asm/sbi.h>
>> +
>> +bool can_hotplug_cpu(void)
>> +{
>> + return true;
>> +}
>> +
>> +void arch_cpu_idle_dead(void)
>> +{
>> + cpu_play_dead();
>> +}
>> +
>> +/*
>> + * __cpu_disable runs on the processor to be shutdown.
>> + */
>> +int __cpu_disable(void)
>> +{
>> + int ret = 0;
>> + unsigned int cpu = smp_processor_id();
>> +
>> + set_cpu_online(cpu, false);
>> + irq_migrate_all_off_this_cpu();
>> +
>> + return ret;
>> +}
>> +/*
>> + * called on the thread which is asking for a CPU to be shutdown -
>> + * waits until shutdown has completed, or it is timed out.
>> + */
>> +void __cpu_die(unsigned int cpu)
>> +{
>> + if (!cpu_wait_death(cpu, 5)) {
>> + pr_err("CPU %u: didn't die\n", cpu);
>> + return;
>> + }
>> + pr_notice("CPU%u: shutdown\n", cpu);
>> + /*TODO: Do we need to verify is cpu is really dead */
>> +}
>> +
>> +/*
>> + * Called from the idle thread for the CPU which has been shutdown.
>> + *
>> + */
>> +void cpu_play_dead(void)
>> +{
>> + idle_task_exit();
>> +
>> + (void)cpu_report_death();
>> +
>> + /* Do not disable software interrupt to restart cpu after WFI */
>> + csr_clear(sie, SIE_STIE | SIE_SEIE);
>> + wait_for_software_interrupt();
>> + boot_sec_cpu();
>> +}
>> +
>> +void arch_send_call_wakeup_ipi(int cpu)
>> +{
>> + send_ipi_message(cpumask_of(cpu), IPI_CALL_WAKEUP);
>> +}
>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
>> index 711190d4..07d9fb38 100644
>> --- a/arch/riscv/kernel/head.S
>> +++ b/arch/riscv/kernel/head.S
>> @@ -153,6 +153,19 @@ relocate:
>> j .Lsecondary_park
>> END(_start)
>>
>> +#ifdef CONFIG_SMP
>> +.section .text
>> +.global boot_sec_cpu
>> +
>> +boot_sec_cpu:
>> + /* clear all pending flags */
>> + csrw sip, zero
>> + /* Mask all interrupts */
>> + csrw sie, zero
>> + fence
>> +
>> + tail smp_callin
>> +#endif
>> __PAGE_ALIGNED_BSS
>> /* Empty zero page */
>> .balign PAGE_SIZE
>> diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
>> index 0cfac48a..7e14b0d9 100644
>> --- a/arch/riscv/kernel/irq.c
>> +++ b/arch/riscv/kernel/irq.c
>> @@ -53,6 +53,30 @@ asmlinkage void __irq_entry do_IRQ(struct pt_regs *regs, unsigned long cause)
>> set_irq_regs(old_regs);
>> }
>>
>> +/*
>> + * This function doesn't return until a software interrupt is sent via IPI.
>> + * Obviously, all the interrupts except software interrupt should be disabled
>> + * before this function is called.
>> + */
>> +void wait_for_software_interrupt(void)
>> +{
>> + unsigned long sipval, sieval, scauseval;
>> +
>> + /* clear all pending flags */
>> + csr_write(sip, 0);
>> + /* clear any previous scause data */
>> + csr_write(scause, 0);
>> +
>> + do {
>> + wait_for_interrupt();
>> + sipval = csr_read(sip);
>> + sieval = csr_read(sie);
>> + scauseval = csr_read(scause) & ~INTERRUPT_CAUSE_FLAG;
>> + /* only break if wfi returns for an enabled interrupt */
>> + } while ((sipval & sieval) == 0 &&
>> + scauseval != INTERRUPT_CAUSE_SOFTWARE);
>> +}
>> +
>> void __init init_IRQ(void)
>> {
>> irqchip_init();
>> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
>> index a5fac1b7..612f0c21 100644
>> --- a/arch/riscv/kernel/setup.c
>> +++ b/arch/riscv/kernel/setup.c
>> @@ -30,11 +30,11 @@
>> #include <linux/of_platform.h>
>> #include <linux/sched/task.h>
>> #include <linux/swiotlb.h>
>> +#include <linux/smp.h>
>>
>> #include <asm/setup.h>
>> #include <asm/sections.h>
>> #include <asm/pgtable.h>
>> -#include <asm/smp.h>
>> #include <asm/sbi.h>
>> #include <asm/tlbflush.h>
>> #include <asm/thread_info.h>
>> @@ -82,6 +82,7 @@ EXPORT_SYMBOL(empty_zero_page);
>> /* The lucky hart to first increment this variable will boot the other cores */
>> atomic_t hart_lottery;
>> unsigned long boot_cpu_hartid;
>> +static DEFINE_PER_CPU(struct cpu, cpu_devices);
>>
>> unsigned long __cpuid_to_hardid_map[NR_CPUS] = {
>> [0 ... NR_CPUS-1] = INVALID_HARTID
>> @@ -257,3 +258,17 @@ void __init setup_arch(char **cmdline_p)
>> riscv_fill_hwcap();
>> }
>>
>> +static int __init topology_init(void)
>> +{
>> + int i;
>> +
>> + for_each_possible_cpu(i) {
>> + struct cpu *cpu = &per_cpu(cpu_devices, i);
>> +
>> + cpu->hotpluggable = can_hotplug_cpu();
>> + register_cpu(cpu, i);
>> + }
>> +
>> + return 0;
>> +}
>> +subsys_initcall(topology_init);
>> diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
>> index 89d95866..629456bb 100644
>> --- a/arch/riscv/kernel/smp.c
>> +++ b/arch/riscv/kernel/smp.c
>> @@ -32,12 +32,6 @@ static struct {
>> unsigned long bits ____cacheline_aligned;
>> } ipi_data[NR_CPUS] __cacheline_aligned;
>>
>> -enum ipi_message_type {
>> - IPI_RESCHEDULE,
>> - IPI_CALL_FUNC,
>> - IPI_MAX
>> -};
>> -
>> int riscv_hartid_to_cpuid(int hartid)
>> {
>> int i = -1;
>> @@ -94,7 +88,7 @@ void riscv_software_interrupt(void)
>> }
>> }
>>
>> -static void
>> +void
>> send_ipi_message(const struct cpumask *to_whom, enum ipi_message_type operation)
>> {
>> int cpuid, hartid;
>> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
>> index f44ae780..a9138431 100644
>> --- a/arch/riscv/kernel/smpboot.c
>> +++ b/arch/riscv/kernel/smpboot.c
>> @@ -92,9 +92,12 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
>> task_stack_page(tidle) + THREAD_SIZE);
>> WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);
>>
>> + arch_send_call_wakeup_ipi(cpu);
>> while (!cpu_online(cpu))
>> cpu_relax();
>>
>> + pr_notice("CPU%u: online\n", cpu);
>> +
>> return 0;
>> }
>>
>> @@ -105,7 +108,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
>> /*
>> * C entry point for a secondary processor.
>> */
>> -asmlinkage void __init smp_callin(void)
>> +asmlinkage void smp_callin(void)
>> {
>> struct mm_struct *mm = &init_mm;
>>
>> @@ -115,7 +118,7 @@ asmlinkage void __init smp_callin(void)
>>
>> trap_init();
>> notify_cpu_starting(smp_processor_id());
>> - set_cpu_online(smp_processor_id(), 1);
>> + set_cpu_online(smp_processor_id(), true);
>> /*
>> * Remote TLB flushes are ignored while the CPU is offline, so emit
>> * a local TLB flush right now just in case.
>> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
>> index 24a9333d..8b331619 100644
>> --- a/arch/riscv/kernel/traps.c
>> +++ b/arch/riscv/kernel/traps.c
>> @@ -153,7 +153,7 @@ int is_valid_bugaddr(unsigned long pc)
>> }
>> #endif /* CONFIG_GENERIC_BUG */
>>
>> -void __init trap_init(void)
>> +void trap_init(void)
>> {
>> /*
>> * Set sup0 scratch register to 0, indicating to exception vector
>> @@ -162,6 +162,6 @@ void __init trap_init(void)
>> csr_write(sscratch, 0);
>> /* Set the exception vector address */
>> csr_write(stvec, &handle_exception);
>> - /* Enable all interrupts */
>> - csr_write(sie, -1);
>> + /* Enable all interrupts but timer interrupt*/
>> + csr_set(sie, SIE_SSIE | SIE_SEIE);
>> }
>> --
>> 2.7.4
>>
>


2018-09-10 11:25:42

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 02/12] RISC-V: Filter ISA and MMU values in cpuinfo

On Thu, Sep 06, 2018 at 01:05:25AM -0700, Atish Patra wrote:
> +#elif defined(CONFIG_64BIT)
> + if ((strcmp(mmu_type, "riscv,sv39") != 0)
> + && (strcmp(mmu_type, "riscv,sv48") != 0))
> + return;

This should be:

if (strcmp(mmu_type, "riscv,sv39") != 0 &&
strcmp(mmu_type, "riscv,sv48") != 0)
return;

Otherwise looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

2018-09-10 11:26:24

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 03/12] RISC-V: Comment on the TLB flush in smp_callin()

On Thu, Sep 06, 2018 at 01:05:26AM -0700, Atish Patra wrote:
> From: Palmer Dabbelt <[email protected]>
>
> This isn't readily apparent from reading the code.
>
> Signed-off-by: Palmer Dabbelt <[email protected]>
> [Atish: code comment formatting update]
> Signed-off-by: Atish Patra <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

2018-09-10 11:26:31

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 04/12] RISC-V: Disable preemption before enabling interrupts

On Thu, Sep 06, 2018 at 01:05:27AM -0700, Atish Patra wrote:
> Currently, irq is enabled before preemption disabling happens.
> If the scheduler fired right here and cpu is scheduled then it
> may blow up.
>
> Signed-off-by: Palmer Dabbelt <[email protected]>
> [Atish: Commit text and code comment formatting update]
> Signed-off-by: Atish Patra <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

2018-09-10 11:27:56

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 06/12] RISC-V: Rename riscv_of_processor_hart to riscv_of_processor_hartid

On Thu, Sep 06, 2018 at 01:05:29AM -0700, Atish Patra wrote:
> From: Palmer Dabbelt <[email protected]>
>
> It's a bit confusing exactly what this function does: it actually
> returns the hartid of an OF processor node, failing with -1 on invalid
> nodes. I've changed the name to _hartid() in order to make that a bit
> more clear, as well as adding a comment.
>
> Signed-off-by: Palmer Dabbelt <[email protected]>

Looks generally good, but it is going to conflict with the dt iterators
series from Rob. Given that this is just a little cleanup it might be
worth deferring until the next merge window.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

2018-09-10 11:28:04

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 07/12] RISC-V: Rename im_okay_therefore_i_am to found_boot_cpu

On Thu, Sep 06, 2018 at 01:05:30AM -0700, Atish Patra wrote:
> From: Palmer Dabbelt <[email protected]>
>
> The old name was a bit odd.
>
> Signed-off-by: Palmer Dabbelt <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

2018-09-10 11:28:20

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 05/12] RISC-V: Provide a cleaner raw_smp_processor_id()

On Thu, Sep 06, 2018 at 01:05:28AM -0700, Atish Patra wrote:
> From: Palmer Dabbelt <[email protected]>
>
> I'm not sure how I managed to miss this the first time, but this is much
> better.
>
> Signed-off-by: Palmer Dabbelt <[email protected]>
> [Atish: code comment formatting and other fixes]
> Signed-off-by: Atish Patra <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

2018-09-10 11:30:04

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 10/12] RISC-V: Add logical CPU indexing for RISC-V

On Thu, Sep 06, 2018 at 01:05:33AM -0700, Atish Patra wrote:
> Currently, both linux cpu id and hardware cpu id are same.
> This is not recommended as it will lead to discontinuous cpu
> indexing in Linux. Moreover, kdump kernel will run from CPU0
> which would be absent if we follow existing scheme.
>
> Implement a logical mapping between Linux cpu id and hardware
> cpuid to decouple these two. Always mark the boot processor as
> cpu0 and all other cpus get the logical cpu id based on their
> booting order.
>
> Signed-off-by: Atish Patra <[email protected]>
> Reviewed-by : Anup Patel <[email protected]>

drop the two spaces before the ":", please.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

2018-09-10 11:30:52

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 11/12] RISC-V: Use Linux logical cpu number instead of hartid

On Thu, Sep 06, 2018 at 01:05:34AM -0700, Atish Patra wrote:
> Setup the cpu_logical_map during boot. Moreover, every SBI call
> and PLIC context are based on the physical hartid. Use the logical
> cpu to hartid mapping to pass correct hartid to respective functions.
>
> Signed-off-by: Atish Patra <[email protected]>
> Reviewed-by : Anup Patel <[email protected]>

Same whitespace issue here.

Otherwise looks fine:

Reviewed-by: Christoph Hellwig <[email protected]>

2018-09-10 11:31:31

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 09/12] RISC-V: User WRITE_ONCE instead of direct access

On Thu, Sep 06, 2018 at 01:05:32AM -0700, Atish Patra wrote:
> The secondary harts spin on couple of per cpu variables until both of
> these are non-zero so it's not necessary to have any ordering here.
> However, WRITE_ONCE should be used to avoid tearing.

We normally pair WRITE_ONCE with READ_ONCE. But it seems like the
reader side is in assembly code, so this should be ok:

Reviewed-by: Christoph Hellwig <[email protected]>

2018-09-11 01:35:28

by Atish Patra

[permalink] [raw]
Subject: Re: [PATCH v3 02/12] RISC-V: Filter ISA and MMU values in cpuinfo

On 9/10/18 4:24 AM, Christoph Hellwig wrote:
> On Thu, Sep 06, 2018 at 01:05:25AM -0700, Atish Patra wrote:
>> +#elif defined(CONFIG_64BIT)
>> + if ((strcmp(mmu_type, "riscv,sv39") != 0)
>> + && (strcmp(mmu_type, "riscv,sv48") != 0))
>> + return;
>
> This should be:
>
> if (strcmp(mmu_type, "riscv,sv39") != 0 &&
> strcmp(mmu_type, "riscv,sv48") != 0)
> return;
>
> Otherwise looks good:
>
> Reviewed-by: Christoph Hellwig <[email protected]>
>

Fixed.

Regards,
Atish

2018-09-11 01:36:44

by Atish Patra

[permalink] [raw]
Subject: Re: [PATCH v3 11/12] RISC-V: Use Linux logical cpu number instead of hartid

On 9/10/18 4:29 AM, Christoph Hellwig wrote:
> On Thu, Sep 06, 2018 at 01:05:34AM -0700, Atish Patra wrote:
>> Setup the cpu_logical_map during boot. Moreover, every SBI call
>> and PLIC context are based on the physical hartid. Use the logical
>> cpu to hartid mapping to pass correct hartid to respective functions.
>>
>> Signed-off-by: Atish Patra <[email protected]>
>> Reviewed-by : Anup Patel <[email protected]>
>
> Same whitespace issue here.
>
> Otherwise looks fine:
>
> Reviewed-by: Christoph Hellwig <[email protected]>
>

Fixed. Thanks for the review.

Regards,
Atish

2018-09-11 01:37:26

by Atish Patra

[permalink] [raw]
Subject: Re: [PATCH v3 06/12] RISC-V: Rename riscv_of_processor_hart to riscv_of_processor_hartid

On 9/10/18 4:26 AM, Christoph Hellwig wrote:
> On Thu, Sep 06, 2018 at 01:05:29AM -0700, Atish Patra wrote:
>> From: Palmer Dabbelt <[email protected]>
>>
>> It's a bit confusing exactly what this function does: it actually
>> returns the hartid of an OF processor node, failing with -1 on invalid
>> nodes. I've changed the name to _hartid() in order to make that a bit
>> more clear, as well as adding a comment.
>>
>> Signed-off-by: Palmer Dabbelt <[email protected]>
>
> Looks generally good, but it is going to conflict with the dt iterators
> series from Rob. Given that this is just a little cleanup it might be
> worth deferring until the next merge window.
>

Sure. np. I guess the entire series can be parked in for-next until the
next merge window opens.

Regards,
Atish
> Otherwise looks good:
>
> Reviewed-by: Christoph Hellwig <[email protected]>
>


2018-09-11 06:19:01

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 06/12] RISC-V: Rename riscv_of_processor_hart to riscv_of_processor_hartid

On Mon, Sep 10, 2018 at 06:36:52PM -0700, Atish Patra wrote:
> On 9/10/18 4:26 AM, Christoph Hellwig wrote:
> > On Thu, Sep 06, 2018 at 01:05:29AM -0700, Atish Patra wrote:
> > > From: Palmer Dabbelt <[email protected]>
> > >
> > > It's a bit confusing exactly what this function does: it actually
> > > returns the hartid of an OF processor node, failing with -1 on invalid
> > > nodes. I've changed the name to _hartid() in order to make that a bit
> > > more clear, as well as adding a comment.
> > >
> > > Signed-off-by: Palmer Dabbelt <[email protected]>
> >
> > Looks generally good, but it is going to conflict with the dt iterators
> > series from Rob. Given that this is just a little cleanup it might be
> > worth deferring until the next merge window.
> >
>
> Sure. np. I guess the entire series can be parked in for-next until the next
> merge window opens.

Actually more in your series also conflicts. I think we can defer the
patch from Rob as it is very small and simple and can be applied post
4.20-rc1.

2018-09-11 18:36:59

by Atish Patra

[permalink] [raw]
Subject: Re: [PATCH v3 12/12] RISC-V: Support cpu hotplug.

On 9/6/18 11:25 AM, Atish Patra wrote:
> On 9/6/18 3:21 AM, Mark Rutland wrote:
>> Hi,
>>
>> On Thu, Sep 06, 2018 at 01:05:35AM -0700, Atish Patra wrote:
>>> This patch enable support for cpu hotplug in RISC-V.
>>>
>>> In absence of generic cpu stop functions, WFI is used
>>> to put the cpu in low power state during offline. An IPI
>>> is sent to bring it out of WFI during online operation.
>>
>> AFAICT, this doesn't actually return the CPU to firwmare, and the WFI and
>> return code are in-kernel. From experience on arm and arm64 I would *very*
>> strongly advise against this kind of pseudo-hotplug, as it causes many more
>> long-term problems than it solves.
>>
> I completely agree with you. The idea was here to have a working
> cpu-hotplug solution until we have a concrete implementation of cpu
> stop/start via firmware. Once we have that, we just need to replace the
> wakeup and wait_for_software_interrupt calls. The current implementation
> via WFI was never meant to be a long term solution.
>
>

Any thoughts on this ? Is this good enough argument to keep the current
version of hotplug feature and make necessary changes after we have a
proper cpu start/stop via firmware (which may take some significant time)?

Regards,
Atish
>> For instance, this causes significant pain with kexec, since the prior kernel's
>> text must be kept around forever for the offline CPUs. Likewise with hibernate
>> suspend/resume.
>>
>> Depending on how your architecture works, there can also be a number of
>> problems with cache/TLB maintenance for secondary CPUs which are unexpectedly
>> online.
>>
>> I would suggest that someone should look at writing a FW standard for CPU
>> hotplug, akin to PSCI for arm/arm64. At minimum, you will want:
>>
>
> I have already started working on it in parallel. As of now, I am
> planning to strengthen SBI spec by making it more flexible/extensible
> with some of the below features you mentioned. But we can come up with a
> new spec altogether if that makes more sense.
>
>> - A mechanism to determine the version of FW and/or features supported by the FW.
>>
>> - a mechanism to hotplug-in a specific CPU to a runtime-specified address,
>> ideally with some unique token (e.g. so you can pass a pointer to its stack
>> or whatever).
>>
>> - a mechanism to hotplug-out the calling CPU.
>>
>> - a mechanism to determine when a specific CPU has been hotplugged out. This is
>> necessary for kexec and other things to be robust.
>>
>
> Thanks for the pointers. I had not looked into 2&4 earlier. I will go
> over PSCI docs again.
>
> Regards,
> Atish
>
>> Thanks,
>> Mark.
>>
>>>
>>> Tested both on QEMU and HighFive Unleashed board with
>>> 4 cpus. Test result follows.
>>>
>>> $ echo 0 > /sys/devices/system/cpu/cpu2/online
>>> [ 31.828562] CPU2: shutdown
>>> $ cat /proc/cpuinfo
>>> hart : 0
>>> isa : rv64imafdc
>>> mmu : sv39
>>> uarch : sifive,rocket0
>>>
>>> hart : 1
>>> isa : rv64imafdc
>>> mmu : sv39
>>> uarch : sifive,rocket0
>>>
>>> hart : 3
>>> isa : rv64imafdc
>>> mmu : sv39
>>> uarch : sifive,rocket0
>>>
>>> $ echo 0 > /sys/devices/system/cpu/cpu3/online
>>> [ 52.968495] CPU3: shutdown
>>> $ cat /proc/cpuinfo
>>> hart : 0
>>> isa : rv64imafdc
>>> mmu : sv39
>>> uarch : sifive,rocket0
>>>
>>> hart : 2
>>> isa : rv64imafdc
>>> mmu : sv39
>>> uarch : sifive,rocket0
>>>
>>> $ echo 1 > /sys/devices/system/cpu/cpu3/online
>>> [ 64.298250] CPU3: online
>>> $ cat /proc/cpuinfo
>>> hart : 0
>>> isa : rv64imafdc
>>> mmu : sv39
>>> uarch : sifive,rocket0
>>>
>>> hart : 1
>>> isa : rv64imafdc
>>> mmu : sv39
>>> uarch : sifive,rocket0
>>>
>>> hart : 3
>>> isa : rv64imafdc
>>> mmu : sv39
>>> uarch : sifive,rocket0
>>>
>>> Signed-off-by: Atish Patra <[email protected]>
>>> ---
>>> arch/riscv/Kconfig | 12 ++++++-
>>> arch/riscv/include/asm/irq.h | 1 +
>>> arch/riscv/include/asm/smp.h | 28 ++++++++++++++++
>>> arch/riscv/kernel/Makefile | 1 +
>>> arch/riscv/kernel/cpu-hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++
>>> arch/riscv/kernel/head.S | 13 ++++++++
>>> arch/riscv/kernel/irq.c | 24 ++++++++++++++
>>> arch/riscv/kernel/setup.c | 17 +++++++++-
>>> arch/riscv/kernel/smp.c | 8 +----
>>> arch/riscv/kernel/smpboot.c | 7 ++--
>>> arch/riscv/kernel/traps.c | 6 ++--
>>> 11 files changed, 175 insertions(+), 14 deletions(-)
>>> create mode 100644 arch/riscv/kernel/cpu-hotplug.c
>>>
>>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>>> index a3449802..ea2e617e 100644
>>> --- a/arch/riscv/Kconfig
>>> +++ b/arch/riscv/Kconfig
>>> @@ -21,7 +21,6 @@ config RISCV
>>> select COMMON_CLK
>>> select DMA_DIRECT_OPS
>>> select GENERIC_CLOCKEVENTS
>>> - select GENERIC_CPU_DEVICES
>>> select GENERIC_IRQ_SHOW
>>> select GENERIC_PCI_IOMAP
>>> select GENERIC_STRNCPY_FROM_USER
>>> @@ -167,6 +166,17 @@ config SMP
>>>
>>> If you don't know what to do here, say N.
>>>
>>> +config HOTPLUG_CPU
>>> + bool "Support for hot-pluggable CPUs"
>>> + depends on SMP
>>> + select GENERIC_IRQ_MIGRATION
>>> + help
>>> +
>>> + Say Y here to experiment with turning CPUs off and on. CPUs
>>> + can be controlled through /sys/devices/system/cpu.
>>> +
>>> + Say N if you want to disable CPU hotplug.
>>> +
>>> config NR_CPUS
>>> int "Maximum number of CPUs (2-32)"
>>> range 2 32
>>> diff --git a/arch/riscv/include/asm/irq.h b/arch/riscv/include/asm/irq.h
>>> index 996b6fbe..a873a72d 100644
>>> --- a/arch/riscv/include/asm/irq.h
>>> +++ b/arch/riscv/include/asm/irq.h
>>> @@ -19,6 +19,7 @@
>>>
>>> void riscv_timer_interrupt(void);
>>> void riscv_software_interrupt(void);
>>> +void wait_for_software_interrupt(void);
>>>
>>> #include <asm-generic/irq.h>
>>>
>>> diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
>>> index fce312ce..8145b865 100644
>>> --- a/arch/riscv/include/asm/smp.h
>>> +++ b/arch/riscv/include/asm/smp.h
>>> @@ -25,8 +25,29 @@
>>> extern unsigned long __cpuid_to_hardid_map[NR_CPUS];
>>> #define cpuid_to_hardid_map(cpu) __cpuid_to_hardid_map[cpu]
>>>
>>> +#if defined CONFIG_SMP && defined CONFIG_HOTPLUG_CPU
>>> +void arch_send_call_wakeup_ipi(int cpu);
>>> +bool can_hotplug_cpu(void);
>>> +#else
>>> +static inline bool can_hotplug_cpu(void)
>>> +{
>>> + return 0;
>>> +}
>>> +static inline void arch_send_call_wakeup_ipi(int cpu) { }
>>> +#endif
>>> +
>>> #ifdef CONFIG_SMP
>>>
>>> +enum ipi_message_type {
>>> + IPI_RESCHEDULE,
>>> + IPI_CALL_FUNC,
>>> + IPI_CALL_WAKEUP,
>>> + IPI_MAX
>>> +};
>>> +
>>> +void send_ipi_message(const struct cpumask *to_whom,
>>> + enum ipi_message_type operation);
>>> +
>>> /* SMP initialization hook for setup_arch */
>>> void __init setup_smp(void);
>>>
>>> @@ -45,6 +66,13 @@ void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
>>> */
>>> #define raw_smp_processor_id() (current_thread_info()->cpu)
>>>
>>> +#ifdef CONFIG_HOTPLUG_CPU
>>> +int __cpu_disable(void);
>>> +void __cpu_die(unsigned int cpu);
>>> +void cpu_play_dead(void);
>>> +void boot_sec_cpu(void);
>>> +#endif /* CONFIG_HOTPLUG_CPU */
>>> +
>>> #else
>>>
>>> static inline int riscv_hartid_to_cpuid(int hartid)
>>> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
>>> index e1274fc0..62043204 100644
>>> --- a/arch/riscv/kernel/Makefile
>>> +++ b/arch/riscv/kernel/Makefile
>>> @@ -35,6 +35,7 @@ obj-$(CONFIG_SMP) += smpboot.o
>>> obj-$(CONFIG_SMP) += smp.o
>>> obj-$(CONFIG_MODULES) += module.o
>>> obj-$(CONFIG_MODULE_SECTIONS) += module-sections.o
>>> +obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o
>>>
>>> obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o
>>> obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o
>>> diff --git a/arch/riscv/kernel/cpu-hotplug.c b/arch/riscv/kernel/cpu-hotplug.c
>>> new file mode 100644
>>> index 00000000..7a152972
>>> --- /dev/null
>>> +++ b/arch/riscv/kernel/cpu-hotplug.c
>>> @@ -0,0 +1,72 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * Copyright (C) 2018 Western Digital Corporation or its affiliates.
>>> + */
>>> +
>>> +#include <linux/kernel.h>
>>> +#include <linux/mm.h>
>>> +#include <linux/sched.h>
>>> +#include <linux/err.h>
>>> +#include <linux/irq.h>
>>> +#include <linux/cpu.h>
>>> +#include <linux/sched/hotplug.h>
>>> +#include <asm/irq.h>
>>> +#include <asm/sbi.h>
>>> +
>>> +bool can_hotplug_cpu(void)
>>> +{
>>> + return true;
>>> +}
>>> +
>>> +void arch_cpu_idle_dead(void)
>>> +{
>>> + cpu_play_dead();
>>> +}
>>> +
>>> +/*
>>> + * __cpu_disable runs on the processor to be shutdown.
>>> + */
>>> +int __cpu_disable(void)
>>> +{
>>> + int ret = 0;
>>> + unsigned int cpu = smp_processor_id();
>>> +
>>> + set_cpu_online(cpu, false);
>>> + irq_migrate_all_off_this_cpu();
>>> +
>>> + return ret;
>>> +}
>>> +/*
>>> + * called on the thread which is asking for a CPU to be shutdown -
>>> + * waits until shutdown has completed, or it is timed out.
>>> + */
>>> +void __cpu_die(unsigned int cpu)
>>> +{
>>> + if (!cpu_wait_death(cpu, 5)) {
>>> + pr_err("CPU %u: didn't die\n", cpu);
>>> + return;
>>> + }
>>> + pr_notice("CPU%u: shutdown\n", cpu);
>>> + /*TODO: Do we need to verify is cpu is really dead */
>>> +}
>>> +
>>> +/*
>>> + * Called from the idle thread for the CPU which has been shutdown.
>>> + *
>>> + */
>>> +void cpu_play_dead(void)
>>> +{
>>> + idle_task_exit();
>>> +
>>> + (void)cpu_report_death();
>>> +
>>> + /* Do not disable software interrupt to restart cpu after WFI */
>>> + csr_clear(sie, SIE_STIE | SIE_SEIE);
>>> + wait_for_software_interrupt();
>>> + boot_sec_cpu();
>>> +}
>>> +
>>> +void arch_send_call_wakeup_ipi(int cpu)
>>> +{
>>> + send_ipi_message(cpumask_of(cpu), IPI_CALL_WAKEUP);
>>> +}
>>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
>>> index 711190d4..07d9fb38 100644
>>> --- a/arch/riscv/kernel/head.S
>>> +++ b/arch/riscv/kernel/head.S
>>> @@ -153,6 +153,19 @@ relocate:
>>> j .Lsecondary_park
>>> END(_start)
>>>
>>> +#ifdef CONFIG_SMP
>>> +.section .text
>>> +.global boot_sec_cpu
>>> +
>>> +boot_sec_cpu:
>>> + /* clear all pending flags */
>>> + csrw sip, zero
>>> + /* Mask all interrupts */
>>> + csrw sie, zero
>>> + fence
>>> +
>>> + tail smp_callin
>>> +#endif
>>> __PAGE_ALIGNED_BSS
>>> /* Empty zero page */
>>> .balign PAGE_SIZE
>>> diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
>>> index 0cfac48a..7e14b0d9 100644
>>> --- a/arch/riscv/kernel/irq.c
>>> +++ b/arch/riscv/kernel/irq.c
>>> @@ -53,6 +53,30 @@ asmlinkage void __irq_entry do_IRQ(struct pt_regs *regs, unsigned long cause)
>>> set_irq_regs(old_regs);
>>> }
>>>
>>> +/*
>>> + * This function doesn't return until a software interrupt is sent via IPI.
>>> + * Obviously, all the interrupts except software interrupt should be disabled
>>> + * before this function is called.
>>> + */
>>> +void wait_for_software_interrupt(void)
>>> +{
>>> + unsigned long sipval, sieval, scauseval;
>>> +
>>> + /* clear all pending flags */
>>> + csr_write(sip, 0);
>>> + /* clear any previous scause data */
>>> + csr_write(scause, 0);
>>> +
>>> + do {
>>> + wait_for_interrupt();
>>> + sipval = csr_read(sip);
>>> + sieval = csr_read(sie);
>>> + scauseval = csr_read(scause) & ~INTERRUPT_CAUSE_FLAG;
>>> + /* only break if wfi returns for an enabled interrupt */
>>> + } while ((sipval & sieval) == 0 &&
>>> + scauseval != INTERRUPT_CAUSE_SOFTWARE);
>>> +}
>>> +
>>> void __init init_IRQ(void)
>>> {
>>> irqchip_init();
>>> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
>>> index a5fac1b7..612f0c21 100644
>>> --- a/arch/riscv/kernel/setup.c
>>> +++ b/arch/riscv/kernel/setup.c
>>> @@ -30,11 +30,11 @@
>>> #include <linux/of_platform.h>
>>> #include <linux/sched/task.h>
>>> #include <linux/swiotlb.h>
>>> +#include <linux/smp.h>
>>>
>>> #include <asm/setup.h>
>>> #include <asm/sections.h>
>>> #include <asm/pgtable.h>
>>> -#include <asm/smp.h>
>>> #include <asm/sbi.h>
>>> #include <asm/tlbflush.h>
>>> #include <asm/thread_info.h>
>>> @@ -82,6 +82,7 @@ EXPORT_SYMBOL(empty_zero_page);
>>> /* The lucky hart to first increment this variable will boot the other cores */
>>> atomic_t hart_lottery;
>>> unsigned long boot_cpu_hartid;
>>> +static DEFINE_PER_CPU(struct cpu, cpu_devices);
>>>
>>> unsigned long __cpuid_to_hardid_map[NR_CPUS] = {
>>> [0 ... NR_CPUS-1] = INVALID_HARTID
>>> @@ -257,3 +258,17 @@ void __init setup_arch(char **cmdline_p)
>>> riscv_fill_hwcap();
>>> }
>>>
>>> +static int __init topology_init(void)
>>> +{
>>> + int i;
>>> +
>>> + for_each_possible_cpu(i) {
>>> + struct cpu *cpu = &per_cpu(cpu_devices, i);
>>> +
>>> + cpu->hotpluggable = can_hotplug_cpu();
>>> + register_cpu(cpu, i);
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +subsys_initcall(topology_init);
>>> diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
>>> index 89d95866..629456bb 100644
>>> --- a/arch/riscv/kernel/smp.c
>>> +++ b/arch/riscv/kernel/smp.c
>>> @@ -32,12 +32,6 @@ static struct {
>>> unsigned long bits ____cacheline_aligned;
>>> } ipi_data[NR_CPUS] __cacheline_aligned;
>>>
>>> -enum ipi_message_type {
>>> - IPI_RESCHEDULE,
>>> - IPI_CALL_FUNC,
>>> - IPI_MAX
>>> -};
>>> -
>>> int riscv_hartid_to_cpuid(int hartid)
>>> {
>>> int i = -1;
>>> @@ -94,7 +88,7 @@ void riscv_software_interrupt(void)
>>> }
>>> }
>>>
>>> -static void
>>> +void
>>> send_ipi_message(const struct cpumask *to_whom, enum ipi_message_type operation)
>>> {
>>> int cpuid, hartid;
>>> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
>>> index f44ae780..a9138431 100644
>>> --- a/arch/riscv/kernel/smpboot.c
>>> +++ b/arch/riscv/kernel/smpboot.c
>>> @@ -92,9 +92,12 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
>>> task_stack_page(tidle) + THREAD_SIZE);
>>> WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);
>>>
>>> + arch_send_call_wakeup_ipi(cpu);
>>> while (!cpu_online(cpu))
>>> cpu_relax();
>>>
>>> + pr_notice("CPU%u: online\n", cpu);
>>> +
>>> return 0;
>>> }
>>>
>>> @@ -105,7 +108,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
>>> /*
>>> * C entry point for a secondary processor.
>>> */
>>> -asmlinkage void __init smp_callin(void)
>>> +asmlinkage void smp_callin(void)
>>> {
>>> struct mm_struct *mm = &init_mm;
>>>
>>> @@ -115,7 +118,7 @@ asmlinkage void __init smp_callin(void)
>>>
>>> trap_init();
>>> notify_cpu_starting(smp_processor_id());
>>> - set_cpu_online(smp_processor_id(), 1);
>>> + set_cpu_online(smp_processor_id(), true);
>>> /*
>>> * Remote TLB flushes are ignored while the CPU is offline, so emit
>>> * a local TLB flush right now just in case.
>>> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
>>> index 24a9333d..8b331619 100644
>>> --- a/arch/riscv/kernel/traps.c
>>> +++ b/arch/riscv/kernel/traps.c
>>> @@ -153,7 +153,7 @@ int is_valid_bugaddr(unsigned long pc)
>>> }
>>> #endif /* CONFIG_GENERIC_BUG */
>>>
>>> -void __init trap_init(void)
>>> +void trap_init(void)
>>> {
>>> /*
>>> * Set sup0 scratch register to 0, indicating to exception vector
>>> @@ -162,6 +162,6 @@ void __init trap_init(void)
>>> csr_write(sscratch, 0);
>>> /* Set the exception vector address */
>>> csr_write(stvec, &handle_exception);
>>> - /* Enable all interrupts */
>>> - csr_write(sie, -1);
>>> + /* Enable all interrupts but timer interrupt*/
>>> + csr_set(sie, SIE_SSIE | SIE_SEIE);
>>> }
>>> --
>>> 2.7.4
>>>
>>
>
>